
U.S. Ethnic/Race Differences in Aptitude by Generation
(b) That's a problem because I'm actually trying to replicate those numbers but I can't. For example, when I factor analyze (PAF) all the variables you use in your syntax, the gap for Hispanic 1st gen vs. White 3rd gen is 0.45, not the tremendous gap of -3.86. A factor score is expressed in z-score units, no? A gap of nearly 4 SD is just too much for me to believe.


The -3.86 was based on reported English ability, not on the literacy/numeracy scores. I explained this in the text. Do a PAF using "ability to..."

(c) I said I did not understand why you use 2003-1995 for grade 4 but 1999-1995 for grade 8, without reporting the d gaps for the year 2003. It's just to see whether you get different values using 2003 instead of the other years.


I used the earliest two years and the latest two.

So where are we on this?
I replicated table 9 column A.

You should add the following to the "syntax" sheet of your spreadsheet.

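* Parentheses added so that OR is grouped before AND; without them SPSS evaluates AND first.
IF ((J_Q06a_T = 2 OR J_Q07a_T = 2) AND J_Q04a = 1) Immigrant = 2.
IF (J_Q06a_T = 2 AND J_Q07a_T = 2 AND J_Q04a = 2) Immigrant = 1.
IF (J_Q06a_T = 1 AND J_Q07a_T = 1 AND J_Q04a = 1) Immigrant = 3.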

SET MXCELLS=9000.

FREQUENCIES VARIABLES=J_Q05cUSX3a J_Q05cUSX3b J_Q05cUSX3d J_Q05cUSX3e
/ORDER=ANALYSIS.

FACTOR
/VARIABLES J_Q05cUSX3a J_Q05cUSX3b J_Q05cUSX3d J_Q05cUSX3e
/MISSING LISTWISE
/ANALYSIS J_Q05cUSX3a J_Q05cUSX3b J_Q05cUSX3d J_Q05cUSX3e
/PRINT UNIVARIATE INITIAL EXTRACTION
/CRITERIA MINEIGEN(1) ITERATE(25)
/EXTRACTION PAF
/ROTATION NOROTATE
/SAVE REG(ALL)
/METHOD=CORRELATION.

SORT CASES BY RACETHN_5CAT.
SPLIT FILE SEPARATE BY RACETHN_5CAT.

MEANS TABLES=PVLIT2 PVLIT3 PVLIT4 PVNUM1 PVNUM2 PVNUM3 PVNUM4 PVNUM5 PVNUM6 PVNUM7 PVNUM8 PVNUM9 PVNUM10 PVLIT5 PVLIT6 PVLIT7 PVLIT8 PVLIT10 PVLIT1 PVLIT9 BY Immigrant
/CELLS=MEAN COUNT STDDEV.

MEANS TABLES=FAC1_1 PVLIT2 BY Immigrant
/CELLS=MEAN COUNT STDDEV.

WEIGHT BY SPFWT0.

MEANS TABLES=PVLIT2 PVLIT3 PVLIT4 PVNUM1 PVNUM2 PVNUM3 PVNUM4 PVNUM5 PVNUM6 PVNUM7 PVNUM8 PVNUM9 PVNUM10 PVLIT5 PVLIT6 PVLIT7 PVLIT8 PVLIT10 PVLIT1 PVLIT9 BY Immigrant
/CELLS=MEAN COUNT STDDEV.

MEANS TABLES=FAC1_1 PVLIT2 BY Immigrant
/CELLS=MEAN COUNT STDDEV.
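
The generation coding at the top of the syntax can also be sketched outside SPSS. This is a minimal illustration, not the code used for the paper. It assumes that the three variables are coded 1 = born in the U.S. and 2 = born elsewhere, and that the first argument corresponds to J_Q04a (the respondent) and the other two to J_Q06a_T and J_Q07a_T (the parents); the function and argument names are hypothetical.

```python
from typing import Optional


def immigrant_generation(respondent: int, parent_a: int, parent_b: int) -> Optional[int]:
    """Classify a respondent as generation 1, 2, or 3 (third-plus).

    Assumed coding for all three inputs: 1 = born in the U.S., 2 = born elsewhere.
    Returns None for combinations the syntax in the thread leaves unclassified.
    """
    respondent_native = respondent == 1
    respondent_foreign = respondent == 2
    both_parents_foreign = parent_a == 2 and parent_b == 2
    any_parent_foreign = parent_a == 2 or parent_b == 2
    both_parents_native = parent_a == 1 and parent_b == 1

    if respondent_foreign and both_parents_foreign:
        return 1  # first generation
    if respondent_native and any_parent_foreign:
        return 2  # second generation: the OR is grouped before the AND
    if respondent_native and both_parents_native:
        return 3  # third generation or later
    return None  # e.g. foreign-born respondent with U.S.-born parents
```

Writing the second-generation rule with a named intermediate (`any_parent_foreign`) avoids relying on operator precedence between AND and OR.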

Also, you must say explicitly what the sign of the d gap means here, because it's not what one would expect. The highest score means "not at all", so 1st generation people have a higher score on the English variable, which is not even an ability test, as said (wrongly) in your text, because it's just a questionnaire. It's a categorical variable (4 values). I don't think it's a good method to perform PCA or PAF on this type of variable.

Look here
https://nces.ed.gov/surveys/piaac/final_en_bq.htm

[J_R05cUSX3a] With regard to English, how well do you ...
[J_Q05cUSX3a] understand it when it is spoken to you? Would you say ...
Responses
[ layout = radioButton ]
01 Very well
02 Well
03 Not well
04 Not at all
DK
RF

[J_Q05cUSX3b] speak it? Would you say ...
[J_Q05cUSX3d] read it? Would you say ...
[J_Q05cUSX3e] write it? Would you say ...

That's very, very wrong...

If you want to add it nonetheless, I want you to report the sample sizes in your text, or at the bottom of Table 9.

For English "ability", the sample sizes for the 1st, 2nd, and 3rd generations, respectively, are:
Hispanic: 249, 160, 131
White: 92, 185, 3005
Black: 57, 28, 544
Asian: 157, 49, 22

For the other scales (literacy/numeracy), the sample is the same for every group, except for the Asian 1st generation (N=158).

I will need to figure out how you did columns B and C. But now I'm exhausted. Too much.

Concerning grade 8 TIMSS 2003, I just said it's better to display the information and let the reader know that the d gaps in this year have the same pattern as the others. The same goes for the (unweighted) sample sizes of the various (sub)groups, which must always be reported when the numbers are available.

----
----

EDIT: I attach my replication (xls spreadsheet). If you really want to add that analysis (which, in my opinion, would be better deleted), I want you to add the numbers from my file to yours. We need transparency about the method, the analysis and all.
In the text, I said:

"To explore the possible magnitude of linguistic bias with regards to first generation Hispanics, we looked at PIAAC 2012 literacy and numeracy scores and self-reported English ability by race/ethnicity and generation. Results are shown in Table 9. English ability scores correlated as expected which allowed us to extract a principle factor. The mean differences, relative to third+ generation non-Hispanic Whites, in this factor are shown in column A. The correlation between this factor and numeracy/literacy for each group are shown in column B ..."

This was clear enough. I wanted to see what effect language ability had. This was the least bad method I could think of. What would you propose?

A factor analysis on a categorical variable is passably acceptable and SPSS doesn't allow for polychoric correlations.
http://www-01.ibm.com/support/docview.wss?uid=swg21477550

You said: "Also, you must say explicitly what the sign of the d gap means here, because it's not what one would expect. The highest score means "not at all", so 1st generation people have a higher score on the English variable, which is not even an ability test, as said (wrongly) in your text, because it's just a questionnaire. It's a categorical variable (4 values). I don't think it's a good method to perform PCA or PAF on this type of variable."

When presented as d-values, poor ability has a negative sign. So the 1st generation Hispanic score is -3.86. I think that makes intuitive sense.
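
The sign convention under discussion can be illustrated with a small sketch. It assumes a standardized mean difference with a pooled standard deviation, which may not be the exact formula used for Table 9, and the response values are made up purely for illustration; `cohens_d` and `reverse_code` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group, reference):
    """Standardized mean difference (group minus reference), pooled SD."""
    n1, n2 = len(group), len(reference)
    pooled_var = ((n1 - 1) * variance(group) + (n2 - 1) * variance(reference)) / (n1 + n2 - 2)
    return (mean(group) - mean(reference)) / sqrt(pooled_var)


def reverse_code(responses, low=1, high=4):
    """Flip a low..high scale so that a higher value means better ability."""
    return [low + high - r for r in responses]


# Made-up responses on the questionnaire scale (1 = "Very well", 4 = "Not at all").
first_gen = [3, 4, 3, 2, 4, 3]
reference = [1, 1, 2, 1, 1, 2]

d_raw = cohens_d(first_gen, reference)  # positive: higher raw score = poorer English
d_reversed = cohens_d(reverse_code(first_gen), reverse_code(reference))  # same size, negative
```

Reverse-coding leaves the magnitude of the gap unchanged and only flips its sign, so stating which direction the scale runs is enough for a reader to interpret the table.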
----
Meh, I don't even want to waste my time. I just deleted the section. Are we good now?
A factor analysis on a categorical variable is passably acceptable and SPSS doesn't allow for polychoric correlations.
http://www-01.ibm.com/support/docview.wss?uid=swg21477550


Your link says something I was already aware of: it's difficult to use an ordinal variable in factor analysis and treat it as if it were a continuous variable. The last time I checked for studies on this topic (nearly one year ago), I found that it has complications, and your link says as much. I wasn't sure (and I'm still not sure today) exactly how to deal with it. I don't know the details of the difficulties, but I believe the warning should be taken seriously.
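
One of the known complications can be shown with a small simulation. This is only a sketch under stated assumptions: two latent normal variables with a correlation of 0.7 are cut into four ordered categories at hypothetical, skewed thresholds (most respondents landing in the first category), and ordinary Pearson correlations are computed before and after the cut. None of the numbers come from PIAAC.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def to_four_categories(z, cutpoints=(0.5, 1.2, 1.8)):
    """Map a latent score to 1..4 using hypothetical, skewed thresholds."""
    return 1 + sum(z > c for c in cutpoints)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
latent_rho = 0.7        # assumed latent correlation, for illustration only

x_latent, y_latent = [], []
for _ in range(20000):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    x_latent.append(a)
    y_latent.append(latent_rho * a + sqrt(1 - latent_rho ** 2) * b)

x_ord = [to_four_categories(v) for v in x_latent]
y_ord = [to_four_categories(v) for v in y_latent]

r_latent = pearson(x_latent, y_latent)  # close to the assumed 0.7
r_ordinal = pearson(x_ord, y_ord)       # attenuated by the coarse, skewed categories
```

Pearson correlations computed on the four-category versions come out lower than the latent correlation, and a factor analysis built on them inherits that attenuation. Polychoric correlations are designed to estimate the latent correlation instead.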

Meh, I don't even want to waste my time. I just deleted the section. Are we good now?


It's not a waste of time, and I think your way of doing things (i.e., rushing) is no good, such as removing the analysis because it would otherwise delay the date of publication. We must ask whether it adds something important; if yes, perhaps include it. I'm not saying you can't do it, but you have to say that the variable is a questionnaire item and not really an "ability". Even though you haven't said explicitly whether it was a test or not, the word "ability" really gives me the impression that it was a test of English skills, with questions on comprehension and knowledge of English. Of course, questionnaires asking whether you think you're good at English can be more or less accurate; people saying they do not understand English well are surely telling the truth, and you can say it has less reliability than a true test of English. The problem is that if it is not a test, it is unlikely to be subject to measurement bias. In a real English test, you can have measurement bias, even between English tests, assuming the two tests differ in their items (one test contains a lot of words biased against minorities, and the other does not). It's a distinction that should be made.

So, I just said you shouldn't treat this variable as an "ability". If you add this clarification, I'm not against the inclusion of this analysis (you have to make clear which variables you're using, either in the article or in the spreadsheet). But you're the one to decide whether you want it or not.

In any case, if you can justify why you dropped it (other than "waste of your time"), I have nothing more to say or complain about. The article is in very good shape. Remember, also, that you already have 4 reviewers agreeing. So my final opinion would not change anything, unless someone is worried about my involvement:
http://www.openpsych.net/forum/showthread.php?tid=75

P.S.: If you want to keep the latest version, you should probably update all references to the tables after page 16, because you're now jumping from table 8 to table 10 (since you've deleted the 9th). So, on every page of your text after page 16, you should replace table 17 with 16, 16 with 15, etc.
Admin
I can just call it "self-reported English ability". Does that work for you?
It's not a waste of time, and I think your way of doing things (i.e., rushing) is no good, such as removing the analysis because it would otherwise delay the date of publication.... In any case, if you can justify why you dropped it (other than "waste of your time"), I have nothing more to say or complain about. The article is in very good shape. Remember, also, that you already have 4 reviewers agreeing. So my final opinion would not change anything, unless someone is worried about my involvement


All of the analyses in the discussion are tangential to the main body of the work. Ordinarily, they would not be allowed. Therefore, each analysis in the discussion needs only minimal justification. I think that what was analysis 9 could have been defended, insofar as it was treated as a sort of informational passing remark. But, as you noted, the method was questionable. Since you made an issue of that point, I deleted it instead of trying to defend the method, which, while not impossible, would be difficult given your level of expectation for such an analysis. That is, analysis 9 added something in terms of information but subtracted something in terms of quality of analysis; I originally retained it because I had a low quality standard for what is presented in the discussion; I treated it as a passing thought-analysis. Given your expressed standards, to mitigate the methodological problems, I would have to add caveat after caveat and explain the methodological issues, which is a waste of time and effort given the scope and objective of the paper, since this is a second-order tangent.

To put this another way: the extended discussion, with all the sub analyses, is both unorthodox and unnecessary. Therefore, justification is not needed to cut material, especially when that material is questionable.

...

I think that we are good here.
Admin
You already have 4 reviewers agreeing, I think. So you can proceed with publication any time you wish.
You already have 4 reviewers agreeing, I think. So you can proceed with publication any time you wish.


I am waiting for Meng Hu's explicit approval, because I value his opinion.

Fixed formatting -Emil
I compared the latest version with the one given one day ago (file #302 vs #303), and I see that all the changes related to the numbering of tables have been made correctly. Also, I have no complaint about your decision to remove table 9 (of the older files), given your explanation.

I approve the publication of course.
Here is the final edited version.
Admin
http://openpsych.net/ODP/2014/07/ethnicrace-differences-in-aptitude-by-generation-in-the-united-states-an-exploratory-meta-analysis/

I did not have time to publish this yesterday, so now the date in the paper and on the website are off by 1 day. It matters very little I think.

Please check to see if there is something wrong. If not, let me know and I will move this thread to the post-review forum.
http://openpsych.net/ODP/2014/07/ethnicrace-differences-in-aptitude-by-generation-in-the-united-states-an-exploratory-meta-analysis/

Please check to see if there is something wrong. If not, let me know and I will move this thread to the post-review forum.


Could you change:

Published: July 26th, 2014,
to
Published: July 26th, 2014

I did not have time to publish this yesterday, so now the date in the paper and on the website are off by 1 day. It matters very little I think.


Nonetheless, I attached a version that says "published July 26th" -- in case that you wish to use that instead.

Otherwise, we are good.