
[ODP] The Global Hereditarian Hypothesis and the NLSF
I don't really like being a reviewer, but the newer version is noticeably


New version attached.
[attachment=87]


Newer version attached.
Thanks to our authors for another revision and humoring this persnickety reviewer. The paper keeps improving.


p1

"<b>It has been pointed out</b>, [H]owever, <b>that</b> the existence of g differences on the national level doesn't logically imply the existence of aggregate individual level g differences [4]. It could be, for example, that the national level g differences represent an emergent property which do not characterize differences on the individual and subgroup level."

Bolded portion unnecessary. Also I'm not sure this reference to Wicherts and Wilhelm is necessary since your paper does not quote them or address their concerns about Rindermann's national g factor (i.e. psychometric equivalence).

“The assumption underlying the national migrant research program is that national differences in g represent aggregate individual ones”

The real assumption underlying the national migrant research program is that national differences in IQ really are differences in g instead of some illusionary ecological correlate with test performance. That was Wicherts’ and Wilhelm’s concern. It is indisputable that national IQ averages are aggregate individual differences, because that is precisely how the numbers are calculated!



p2

“… IQs of their parents' nations of origin”

“Furthermore, the correlation between the crime rate by nation of origin in Denmark and in Norway was very high (at about .7-.8, N=20)”

Should remove. Non-IQ correlations are beyond the scope of your paper.


“As noted above Lynn and Vanhanen (2002) not only proposed that the National g differences represent aggregate individual ones, but also that these differences have a hereditarian basis.”

Again the issue is not aggregate vs. individual, but IQ vs. g. Lynn and Vanhanen reported IQ scores and assumed that the disparate measurements were capturing a common mental trait and not, e.g., a Test Familiarity Quotient.


p3

“Using this data set we test, at least to a limited extent, both the spatial transferability hypothesis and the generational transferability hypothesis, both of which are presupposed by the global hereditarian one.”

Giving every sub-hypothesis a name is awkward, especially since your title emphasizes the GHH. There is added ambiguity because you refer to “our hypothesis” (singular) in other parts of the paper as if you are testing one general hypothesis. How about ‘we use this dataset to test the global hereditarian hypothesis by looking at the spatial and generational transferability of migrant IQ scores.’


“if the migrant group cognitive ability differences represented differences in general intelligence owing to national differences in the same trait.”


p4

“(g) Are migrant group entrance test scores associated with migrant group skin
color scores?

(h) If so, do measures of national cognitive ability substantially mediate this
association?”

This is the third time I’m asking you to explain the reasoning behind these analyses in your paper. Nowhere in your Introduction or Research Question do you discuss the relevance of skin color to your hypotheses. (In contrast, you do explain the reasoning behind skin color question (f): establishing representativity.)



“The National Longitudinal Survey of Freshman is described as follows[14]:”

Awkward sentence. The quote repeats the name of the dataset as well as the word ‘follows’.


“this is not an ideal sample for testing our hypothesis“

Hypotheses?

“and, thus, make for a more robust test of the ST hypothesis.”

What about the “generational transferability hypothesis”?


“… is provided in the supplementary material.”

Do you have this written yet?


“This analysis yielded few interesting results.”

I think you mean a few! Either way, this sentence is unnecessary.




p5

You added variables for Table 5, but Table 3 is still not clearly labeled. ‘One U.S. born’. One what? ‘Score mean’. What score?

Table 4 is a little messed up. ‘Reported Race’ is cut off by ‘Gen1’. There is no opening parenthesis before ‘Gen 1’. ‘(Vrs. Gen 3)’ is unclear and apparently includes a nonstandard abbreviation.

Please add some text explaining specifically what is shown in Table 3 and Table 4.


p8


“…questions in the affirmative.”

“This finding is consistent with previous research.”

Citations?


I can make most of the other changes. I can't change this:

“As noted above Lynn and Vanhanen (2002) not only proposed that the National g differences represent aggregate individual ones, but also that these differences have a hereditarian basis.”

You said:

Again the issue is not aggregate vs. individual, but IQ vs. g. Lynn and Vanhanen reported IQ scores and assumed that the disparate measurements were capturing a common mental trait and not, e.g., a Test Familiarity Quotient.

The issue is aggregate individual g versus national emergent G. I thought that I was pretty clear on that. And we do address Wicherts' concern in a roundabout way. If national g differences were emergent g, and if individuals between nations and migrants did not differ in g but rather only in non-g, then you would not expect migrant IQs to predict like IQ within populations, since, within populations, the predictive backbone of IQ is g.


Corrections made. Supplemental and SPSS file attached.
A few final suggestions.

On page 6 you describe skin reflectance as "a measure of national skin color". On page 4 you might clarify 'NationalSkinReflect -- Country's average skin reflectance (a skin color measure)'.

p9
"These correlations were statistically significant yet lower than the previously reported ones."

Perhaps a few supporting citations.

"performance of third (or more) generation immigrants.."

Double period.

RE: " I can't change this"

That's fine. It's not how I would choose to operationalize the problem, but it's not my paper.

John+Emil,
Thank you for making the requested changes. It's a good and important paper, and I think it is ready for publication.


The newly edited version is attached above. Some analyses were redone and tables were edited accordingly.

Thank you for the patient and helpful review.
I went over the paper again.

p1

"even previously critical researchers have come to acknowledge the reality of the national differences[2]"

Needs period.

p2

"the spatial transferability hypothesis (ST)".

Opening quotation mark backwards.

"... suggesting differences in g between the individual citizens of several nations."

Unclear sentence.


"As noted, above Lynn and Vanhanen (2002)"

Bad comma.

"On the Use of Teleological Principles in Philosophy (1788)"

Opening quotation mark backwards.

p3

"A hereditarian hypothesis then predicts, with regards to national, spatial transferability (ST) and generational transferability (GT)."

Incomplete sentence.

"Migrants bring their national general intelligence with them"

Problematic phrasing.


"but a hereditarian hypothesis requires it to be present."

Bolded portion unnecessary.

"spatial transferability hypothesis and the generational transferability hypothesis.."

Double period.

p5

The red color for your updated table 3 is incongruent and distracting.

p9-10

"the possibility that our migrant cognitive ability differences are predictive non-g differences."

Incomplete sentence.
Admin
The backwards opening quotation mark is a really minor thing. That is how standard LaTeX renders it, but it's probably fixable if one looks for a solution.


Again, thank you for your diligent editorial help. Corrections were made. The title was also changed. Edit8 is attached.
Admin
The caption text for tables from analysis 3 and 4 needs to be changed. "3a" implies that there is at least a 3b too, but there isn't (anymore).
The paper was worth reading. But before validating your publication, I have a few questions and comments.

1) What I like :

Your robustness check, such as

To compute selectivity we took the difference between the parents' standardized mean educational levels as reported in the NLSF survey and the standardized average schooling years for the origin countries.


2) What I do not like :

MC2014NGMAT -- Meng Hu and Chuck's (2014) National GMAT scores.


It will not be clear to everyone, especially those who do not follow the Human Varieties blog. Instead of "Meng Hu & Chuck" you should point to reference n°8 or n°10 (your "quick post on L&V").

In the vast majority of instances, both parents hailed from the same country; when not, though, we effectively split their representation.


The sentence is not clear. What do you mean by "split their representation" ? Personally, I would just average the parents' scores. That's what you did, no ?

Since our per national group sample sizes varied widely, ranging from 0.5 to 136.5, we reran the analyses with minimal per group migrant sample sizes of 5, 10, 15, 20; doing so generally nontrivially increased the correlations. This suggests that our correlations are nontrivially attenuated by sampling error.


Are you sure it's not something like range restriction artifacts ? I would like to know which countries were excluded. If they have some of the most extreme scores, you can get some restriction in the score range. Also, the sentence is not clear. When you say "doing so generally nontrivially increased the correlations", are you saying that increases in the minimal N lead to increases in the correlations ? If so, I would agree, because the reverse would just be odd.

You've written :

The national cognitive ability x GPA associations were significantly mediated by migrant test scores.


I would change it as follows : "were partially mediated by migrant test scores.".

And here :

This process resulted in each individual being assigned a maternal national IQ, a paternal national IQ, a maternal national color score, and a paternal national color score.


The parents' skin color is from the skin reflectance (national) data. I may be skeptical. You use skin color for nations, not the individuals (here, the parents). There might be some degree of inaccuracy that must be kept in mind. It's not a criticism. I just ask myself. I have no certainty.

In table 9, you put "test score" two times, instead of test score and GPA.

In your last reference (n°17), the referenced authors should be, "Thomas R. Coyle, Jason M. Purcell, Anissa C. Snyder" (Purcell is second, not first.)

3) Finally, some recommendations :

If I were you, I would display the variables used and the syntax as well. This can ease replication if someone here (me?) wanted to check your analysis.

The log transformation of GPA, skin color, etc. Is there no other way ? I ask because the interpretability of an unstandardized regression coefficient on transformed data is difficult (to me). Perhaps you can transform the variable back into its original scale, but I was never able to do this while keeping the variable normally distributed when the original variable was not normal (tell me if you found a way to do it). I focus on unstandardized coefficients because the more I use them, the more I like them, while the more I use standardized regression coefficients, the more I hate them. Their only advantage is comparability among the independent variables, but the unstandardized coefficient gives you a better approximation of the real-world effect of your independent variable. It seems to me that, more often than not, the standardized regression coefficient (or correlation) tends to underestimate the effects. Generally, a 10% correlation is thought to be extremely small, and yet in some instances I was able to get meaningful or at least non-trivial effects. It's not good news that a lot of researchers focus so much (and sometimes exclusively) on standardized regression coefficients.
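
On reading coefficients from log-transformed variables back in original units, here is a minimal sketch. It assumes natural logs and an ordinary linear regression; the function names and the coefficient values are hypothetical and are not taken from the paper.

```python
import math


def pct_change_in_y_per_unit_x(b: float) -> float:
    """Model ln(y) = a + b*x: percent change in y for a one-unit rise in x."""
    return 100.0 * (math.exp(b) - 1.0)


def change_in_y_per_pct_x(b: float, pct: float = 1.0) -> float:
    """Model y = a + b*ln(x): change in y (original units) when x rises by pct percent."""
    return b * math.log(1.0 + pct / 100.0)


# Hypothetical coefficients, for illustration only.
print(round(pct_change_in_y_per_unit_x(0.05), 2))   # logged outcome
print(round(change_in_y_per_pct_x(2.0, 10.0), 3))   # logged predictor
```

When the outcome is logged, the coefficient describes a multiplicative (percentage) change; when the predictor is logged, it describes the change in the outcome per proportional change in the predictor. Neither reading requires the back-transformed variable to be normally distributed.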

The correlation of 70% between the two skin color measures gives an approximation of the reliability of the skin color used in NLSF. Unreliability reduces correlation and effect sizes in general. I believe this reliability level (70%) is close to my expectation. See for example :

Measurement and perception of skin colour in a skin cancer survey
https://genepi.qimr.edu.au/contents/p/staff/CV087.pdf

It gives the correlation of self-assessed skin color with dermatologist-assessed natural skin color. I have no idea what the latter is, but the correlation is 65%/66% for males and females.

Comparing Alternative Methods of Measuring Skin Color and Damage
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2702995/pdf/nihms-110193.pdf

That study may not be relevant, because it gives the correlation of self reported "natural" skin color (i.e., without tanning) with "natural" skin color rated by an observer (i.e., a research assistant) but given table 2, the correlation is not high, only 0.16 or 0.10, depending on whether you look at the natural or current skin color. However, if I'm not mistaken, table 2 also gives the correlation of current skin color between self-report and that evaluated by the observer. The correlation is 0.73. Not bad.

Questionnaire Items to Assess Skin Color and Erythemal Sensitivity: Reliability, Validity, and “the Dark Shift”
http://cebp.aacrjournals.org/content/19/5/1167.full

This one gives the test-retest agreement of self-assessed skin color: 95%. To quote them : "For skin color, there was 95.5% agreement (versus expected agreement of 78.9%) between assessments (n = 264, k = 0.78, P < 0.001)." That seems large. However, that study doesn't use the same skin color variable as yours, which is a color card (1-10). Perhaps you can find the time to search for studies that give reliability estimates for a skin color card using the exact same scale (1-10)...
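
The 95.5% figure is raw agreement; the kappa reported next to it is the chance-corrected version. A minimal sketch of the standard Cohen's kappa formula, applied to the two percentages quoted above (the function name is hypothetical):

```python
def cohens_kappa(observed: float, expected: float) -> float:
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    return (observed - expected) / (1.0 - expected)


# Proportions taken from the sentence quoted above.
kappa = cohens_kappa(observed=0.955, expected=0.789)
print(round(kappa, 2))
```

This gives roughly 0.79, which matches the reported k = 0.78 up to rounding of the published percentages. It is that value, not the 95%, that is comparable to the 65-73% correlations in the other studies.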

Also, if I were you, each time you use Lynn & Vanhanen national IQ estimates, I would preferably use Wicherts estimates as well. You know the criticisms, right ? If you add this robustness check, you can both add support for the strength of your analysis and validate the L&V data. Remember, this is what Christopher Eppig and Garett Jones have done, and they confirmed that it doesn't change their estimates and conclusions. You should really do it.

Wicherts data is available here :
http://wicherts.socsci.uva.nl/

Just look for :
Wicherts, J. M., Borsboom, D., & Dolan, C. V. (2010). Why national IQs do not support evolutionary theories of intelligence. Personality and Individual Differences, 48, 91-96. nationalIQPAID.pdf DATASET in SPSS format (use "save as" to download)
Admin
Are you sure it's not something like range restriction artifacts ? I would like to know which countries were excluded. If they have some of the most extreme scores, you can get some restriction in the score range. Also, the sentence is not clear. When you say "doing so generally nontrivially increased the correlations", are you saying that increases in the minimal N lead to increases in the correlations ? If so, I would agree, because the reverse would just be odd.


Restriction of range would decrease correlations (by reducing variance), but removal of sampling error by removing more error prone data points will increase the correlation, just as was found.
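
The range restriction half of this point can be illustrated with a minimal sketch. The data are made up (a fixed slope plus the same spread of residuals at every x value) and the helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y rises 10 points per unit of x, with identical residuals at each x.
RESIDUALS = (-30, -10, 10, 30)
data = [(x, 10 * x + e) for x in range(1, 11) for e in RESIDUALS]


def r_for_range(lo, hi):
    """Correlation using only the cases whose x lies in [lo, hi]."""
    kept = [(x, y) for x, y in data if lo <= x <= hi]
    xs, ys = zip(*kept)
    return pearson_r(xs, ys)


full_r = r_for_range(1, 10)        # whole range of x
restricted_r = r_for_range(4, 7)   # extremes dropped
print(round(full_r, 3), round(restricted_r, 3))
```

The slope and the residual spread are the same in both runs; only the variance of x shrinks, and the correlation drops from about 0.79 to about 0.45.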

The log transformation of GPA, skin color, etc. Is there no other way ? I ask because the interpretability of an unstandardized regression coefficient on transformed data is difficult (to me). Perhaps you can transform the variable back into its original scale, but I was never able to do this while keeping the variable normally distributed when the original variable was not normal (tell me if you found a way to do it). I focus on unstandardized coefficients because the more I use them, the more I like them, while the more I use standardized regression coefficients, the more I hate them. Their only advantage is comparability among the independent variables, but the unstandardized coefficient gives you a better approximation of the real-world effect of your independent variable. It seems to me that, more often than not, the standardized regression coefficient (or correlation) tends to underestimate the effects. Generally, a 10% correlation is thought to be extremely small, and yet in some instances I was able to get meaningful or at least non-trivial effects. It's not good news that a lot of researchers focus so much (and sometimes exclusively) on standardized regression coefficients.


Standardized coefficients can be compared from study to study for the purpose of a meta-analysis. Unstandardized generally cannot.

Also, if I were you, each time you use Lynn & Vanhanen national IQ estimates, I would preferably use Wicherts estimates as well. You know the criticisms, right ? If you add this robustness check, you can both add support for the strength of your analysis and validate the L&V data. Remember, this is what Christopher Eppig and Garett Jones have done, and they confirmed that it doesn't change their estimates and conclusions. You should really do it.
Wicherts data is available here :
http://wicherts.socsci.uva.nl/
Just look for :
Wicherts, J. M., Borsboom, D., & Dolan, C. V. (2010). Why national IQs do not support evolutionary theories of intelligence. Personality and Individual Differences, 48, 91-96. nationalIQPAID.pdf DATASET in SPSS format (use "save as" to download)


I downloaded the file. However, it does not contain Wicherts' IQ estimates, just LV2006's.

But you raise a good point. A good way to compare LV's NIQs and Wicherts et al.'s is to test them against different datasets. I have access to plenty of cross-cultural data by now, so we can compare the two NIQ datasets.

I will ask Wicherts if he can give me his NIQ's.
Admin
Wicherts told me that his data are only those found in Table 5:

Wicherts, J. M., Dolan, C. V., & Van der Maas, H. L. J. (2010). A systematic literature review of the average IQ of sub-Saharan Africans. Intelligence, 38, 1-20. wicherts2010AFR.pdf

Most of these countries do not feature in my earlier analyses (crime in DK and Norway). However, I added them to the dataset file from here.

http://www.openpsych.net/forum/showthread.php?tid=33

The correlations are not changed much. They are slightly lower with Wicherts' corrections to the LV12 (r's .807, .816 to .786, .801).
Standardized coeff. are indeed useful for meta-analysis. But my contention is not invalidated here. An unstandardized coeff. is superior to a standardized one if you want a better look at the real effect. As for Wicherts' data, I remember having a file with his estimates, along with all the other variables contained in the file referred to above. So I thought it contained his national IQ estimates as well. My memory is not good. I have probably used the data contained in the article you referred to, but my memory is surely faulty...

The correlations are not changed much. They are slightly lower with Wicherts' corrections to the LV12 (r's .807, .816 to .786, .801).


Are you referring to your article, or the one I reviewed here ? Normally, Wicherts' data is just a correction of L&V's African IQs. But since John must have used African IQs, I believe you should have used both the L&V and Wicherts data.


Well, no one can accuse my reviews of being lackadaisical, can they?

Criticisms:

MH: It will not be clear to everyone, especially those who do not follow the Human Varieties blog. Instead of "Meng Hu & Chuck" you should point to reference n°8 or n°10 (your "quick post on L&V").

Reference added.

MH: The sentence is not clear. What do you mean by "split their representation" ? Personally, I would just average the parents' scores. That's what you did, no ?

That is what I did. I rewrote the passage as: We next looked at the association between three measures of national cognitive ability, first generation test scores, second generation test scores, combined first and second generation test scores, and second generation cumulative GPA scores. With regards to migrants, we decomposed test scores and GPA scores separately by biological mother's and biological father's nation of origin; we then averaged the mother's and father's nation of origin scores. In the vast majority of instances, both parents hailed from the same country; when not, though, we effectively split their representation. Readers are referred to the supplementary file for an example of the method employed.

MH: Are you sure it's not something like range restriction artifacts ? I would like to know which countries were excluded. If they have some of the most extreme scores, you can get some restriction in the score range. Also, the sentence is not clear. When you say "doing so generally nontrivially increased the correlations", are you saying that increases in the minimal N lead to increases in the correlations ? If so, I would agree, because the reverse would just be odd.

More national groups, less range restriction, more sampling error. I attached the supplementary and SPSS file here:
http://www.openpsych.net/forum/showthread.php?tid=17&pid=283#pid283

I rewrote this as: Since our per national group sample sizes varied widely, ranging from 0.5 to 136.5, we reran the analyses with minimal per group migrant sample sizes of 5, 10, 15, 20; an increase in the minimal per group migrant sample size generally led to an increase in the correlations. This suggests that our correlations are nontrivially attenuated by sampling error.
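
The rerun described in this passage can be sketched as follows. Everything in the table is made up for illustration (only the 0.5 and 136.5 endpoints echo the range stated above), the small groups are deliberately placed off the trend, and the helper names are hypothetical. It shows the mechanics of the check and is not the analysis from the paper.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# (group sample size, national score, migrant group score) -- illustrative values only.
groups = [
    (0.5, 70, 100),
    (3.0, 105, 85),
    (7.0, 80, 90),
    (12.0, 90, 88),
    (25.0, 85, 85),
    (40.0, 95, 95),
    (60.0, 75, 75),
    (136.5, 100, 100),
]


def r_with_min_n(min_n):
    """Correlation across national groups with at least min_n migrants."""
    kept = [(nat, mig) for n, nat, mig in groups if n >= min_n]
    xs, ys = zip(*kept)
    return pearson_r(xs, ys)


for min_n in (0, 5, 10, 15, 20):
    print(min_n, round(r_with_min_n(min_n), 3))
```

Listing which groups drop out at each threshold, alongside the correlation, would also show whether the excluded countries sit at the extremes of the score range.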

MH: I would change it as follows : "were partially mediated by migrant test scores."

Done.

MH: "The parents' skin color is from the skin reflectance (national) data. I may be skeptical. You use skin color for nations, not the individuals (here, the parents). There might be some degree of inaccuracy that must be kept in mind. It's not a criticism. I just ask myself. I have no certainty."

We used the NLSF measure of student skin color AND skin color (reflectance) based on the parents' country of origin. We wanted to see to what extent the two corresponded, i.e., whether the students who reported having Rwandan parents were of a color similar to that of the typical Rwandan. The national-student skin color correlation was used, in turn, to judge the reliability of our method -- as the test score correlation was bound to be lower than this -- and the representativity of our sample.

MH: "In table 9, you put "test score" two times, instead of test score and GPA."

Fixed.

MH: In your last reference (n°17), the referenced authors should be, "Thomas R. Coyle, Jason M. Purcell, Anissa C. Snyder" (Purcell is second, not first.)

That's a mystery. I kept trying to change the bib file but nothing changed when I ran LaTeX. So I decided to leave it as is. I'm not going to waste any more time on the issue.

MH: If I were you, I would display the variables used and the syntax as well. This can ease replication if someone here (me?) wanted to check your analysis.

Refer to the supplemental file.

MH: The log transformation of GPA, skin color, etc. Is there no other way ?

No, not at this time. I am trying to figure out robust regression with R. But that would have to be for another time, if ever.

MH: "Measurement and perception of skin colour in a skin cancer survey"

We weren't primarily interested in skin color. We just used it as a rough index of group representativity. So I don't think any adjustments need to be made. I'll keep the references in mind, though.

MH: "Also, if I were you, each time you use Lynn & Vanhanen national IQ estimates, I would preferably use Wicherts estimates as well."

This is why we included Altinok et al.’s and M&C's estimates. And we also attached the SPSS file in case someone wants to try the analysis with alternative values. I don't plan on adding other estimates.

But for future analyses we should probably try to get Rindermann's 2014 estimates.

Edited files attached below.
[hr]
Are you referring to your article, or the one I reviewed here ? Normally, Wicherts' data is just a correction of L&V's African IQs. But since John must have used African IQs, I believe you should have used both the L&V and Wicherts data.


I don't think that Wicherts' data are particularly exceptional. For one, this data set doesn't include recent African achievement data.

Maybe Emil could write H. Rindermann and request his recent estimates. We could then use those scores from now on as "the official national cognitive scores" -- since they are the most comprehensive (IQ+Achievement), recent (2014), and meticulously compiled (H.R.).

That said, we did include multiple data sets and we noted: "Since our national cognitive measures were similarly predictive, for the remainder of the discussion, we simply report result based on L\&V's (2012) national IQs."
I'm OK with your response, and your changes.

Besides, I took the opportunity to look at your data. The distribution of skin color in the NLSF is not dangerously skewed. When your correlations don't differ after transformation, my recommendation is not to transform the data, because it loses its original metric. And personally, I don't know how to interpret unit changes in logs, whereas it's much easier in units of skin color (original scale).

In general, my criticism is that you rely too much on correlational analysis. As you noted, the normality assumption is sometimes difficult or impossible to meet. Thus, another method can be used. You can just compute the mean of the cognitive scores for each category of, say, skin color, parental education, or something else, and derive your conclusion about how a shift in categories would change the outcome variable (could be GPA or ACT/SAT...). If you have enough categories, e.g., 10 or more, you can compute that correlation, even in Excel. This is what Orlich & Gifford (2006) did.

Test Score, Poverty and Ethnicity - The New American Dilemma (Orlich, Gifford, 2006)
http://www.cha.wa.gov/?q=files/Highstakestesting_poverty_ethnicity.pdf

Look at their tables 1 & 2. Their text says that the correlation between SAT and parental income is near 100%. How so ? Impossible, you say ? At first glance, that was also my thought, but after a closer look, their magic trick is easy to catch. They've done exactly what I recommend you do (if you can, of course). They derived the SAT scores by income category (10 in total) and computed the correlations, generally around 0.97 and 0.98. If you don't believe me, try to calculate the correlation with the SAT scores by income category in table 2; for the (categorical) income variable, the values must range from 1 to 10. Two columns with 10 rows each. And you'll see a correlation this high. I verified it before. That's a trick you can easily do. As you've said, using groups greatly increases the correlations, probably because it removes the random fluctuations due to measurement error.

So, instead of using individuals' skin color "scores", you can compute the mean SAT, GPA, or other outcome by skin color category. If the skin color scale ranges from 1 to 10, that gives you 10 rows, enough for a correlational analysis with this approach.
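
A minimal sketch of this category-means approach, using made-up data (a 1-10 category with four hypothetical individuals in each) and hypothetical helper names:

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up individuals: (category 1-10, outcome score).
# The outcome rises 10 points per category, plus individual deviations.
DEVIATIONS = (-30, -10, 10, 30)
individuals = [(c, 10 * c + d) for c in range(1, 11) for d in DEVIATIONS]

# Individual-level correlation.
cats, scores = zip(*individuals)
individual_r = pearson_r(cats, scores)

# Group-level correlation: one row per category, using the category mean.
by_category = defaultdict(list)
for c, s in individuals:
    by_category[c].append(s)
categories = sorted(by_category)
means = [sum(by_category[c]) / len(by_category[c]) for c in categories]
group_r = pearson_r(categories, means)

print(round(individual_r, 3), round(group_r, 3))
```

Averaging within categories removes the individual deviations, so the same data give about 0.79 at the individual level and essentially 1.0 across the ten category means. The two numbers describe different things and should not be compared directly.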

The fact that you don't want to use Wicherts' data may not protect you from criticism. If adding more tables is a lot of work, why not just compute the correlations and say (in the text) that it didn't change your results ? Also, the fact that you use the Altinok achievement scores is good, but they are not IQ tests. Although that can protect you somewhat, it's more definitive to use Wicherts' data.

These are only suggestions, however, that could improve the article in my opinion. They are optional, because your article in its current form (version 8 if I'm not mistaken) is already good enough for publication. I don't have any serious criticism, either of the statistics or of the conclusions and interpretation you derived from the analysis. Thus, no objection to its publication. You have my vote and agreement. That means you can either publish it now or add some more analyses before publishing it. In both cases, you have my vote.
This is what Orlich & Gifford (2006) did.


And here is what I did...

So, forget what I said. Sorry for bothering you.
Admin
Group level correlations always go to 1.0 when there are 1) no other relevant variables and 2) no error sources; see Lubinski and Humphreys (1996).

Lubinski, David, and Lloyd G. Humphreys. "Seeing the forest from the trees: When predicting the behavior or status of groups, correlate means." Psychology, Public Policy, and Law 2.2 (1996): 363.


I found some other easy to use survey data: http://www.princeton.edu/cmd/data/cils-1/
The survey has first and second generation immigrant data and easy to use achievement test scores along with all sorts of other variables. Do you think we should just add another analysis to the one above?

I simply dread having to write up another paper.

Generally, for the US, at least, there is no shortage of data. Off the top of my head, all of the following have country of origin, race and/or skin color, test score, and outcome data: the New Immigrant Survey, General Social Survey, NLSY97, and Add Health. There are also many surveys specifically about immigrants: http://www.princeton.edu/cmd/data/ Some of these, such as the "Latin American Migration Project (LAMP)" (if I recall correctly), give pre-migration measures of ability relative to the home populations.

We are limited then by time. And much of that is going to writing up the results. So I wonder if we should try to squeeze another study into the one we have.