Income and Education Disparities Track Genetic Ancestry

Submission status
Published

Submission Editor
Noah Carl

Authors
Meng Hu
Emil O. W. Kirkegaard
John Fuerst

Title
Income and Education Disparities Track Genetic Ancestry

Abstract

Structural racism has often been invoked to explain observed disparities in social outcomes, such as in educational attainment and income, among different American racial/ethnic groups. Theorists of structural racism typically argue that racial categories are socially constructed and do not correspond with genetic ancestry; additionally, they argue that social outcome differences are a result of discriminatory social norms, policies, and laws that adversely affect members of non-White race/ethnic groups. Since the examples of social norms and policies commonly provided target individuals based on socially-defined race/ethnicity, and not on genetic ancestry, a logical inference is that social disparities will be related to socially-defined race/ethnicity independent of genetically-identified continental ancestry. In order to evaluate this hypothesis, we employ admixture-regression analysis and examine the independent influences of socially-identified race/ethnicity and genetically-defined ancestry on the educational attainment and income of parents, using data from a large sample of US children. Our study focuses on self-identified Whites, Blacks, Hispanics, and East Asians in the United States. Analyses generally show that the association between socially-identified race/ethnicity and outcomes is mediated by genetic ancestry and that non-White race/ethnicity is unrelated to worse outcomes when controlling for genetic ancestry. For example, conditioned on European genetic ancestry, Americans socially-identified as Black and as Hispanic exhibit equivalent or better social outcomes in both education and income as compared to non-Hispanic Whites. These results are seemingly incongruent with the notion that social outcome differences are due to social policy, norms, and practices which adversely affect individuals primarily based on socially-constructed group status.

Keywords
race, income, educational attainment, structural racism, genetic ancestry

Supplemental materials link
https://osf.io/ys5db/

Reviewers ( 0 / 0 / 2 )
Peter Frost: Accept
Auguste Comte: Accept

Fri 16 Jun 2023 20:51

Reviewer

This is an excellent study, and I hope to see more like it. Just a few comments:

1. The Introduction is too long and reads like a prequel to the Discussion. Ideally, the Introduction to a paper should be as concise as possible, being essentially a review of the literature and how the current paper relates to previous studies.

2. Previous studies have two findings that are not replicated in the current one:

a) European Americans with some Amerindian ancestry have higher cognitive ability, on average, than those who do not. This is probably due to the fact that Amerindian ancestry is higher in the earliest wave of European immigrants, i.e., settlers of British, French, and Dutch origin. They had more opportunities for intermixture with Amerindian groups than later immigrants. This is particularly the case in the study by Fuerst et al. (2021) in Pittsburgh, where European Americans are, for the most part, either "old stock" with some Amerindian admixture or Italian Americans who originally came almost entirely from the south of Italy, a region that has a lower polygenic score for educational attainment (Piffer and Lynn, 2022).

b) African Americans with no European admixture have a higher polygenic score than those who have some, i.e., the Igbo effect (Fuerst and Hu, 2021).

Why were these two findings not replicated? Is it because the sample had a more diverse European American population and fewer immigrants from Nigeria?

References

Fuerst, J., E.O.W. Kirkegaard and D. Piffer. (2021). More research needed: There is a robust causal vs. confounding problem for intelligence-associated polygenic scores in context to admixed American populations. Mankind Quarterly 62(1): 151-185. http://doi.org/10.46469/mq.2021.62.1.10

Fuerst, J. and Hu, M. (2021). Genetic Ancestry and General Cognitive Ability in a Sample of American Youths. Mankind Quarterly 62(1), 186-216.

Piffer, D., and R. Lynn. (2022). In Italy, North-South Differences in Student Performance Are Mirrored by Differences in Polygenic Scores for Educational Attainment. Mankind Quarterly 62(4), Article 2. https://doi.org/10.46469/mq.2022.62.4.2   

Author | Admin

Thank you for your reply. We agree with your first remark, and we have shortened the introduction section accordingly.

Regarding the second point, a) you're correct that the association between cognitive ability and Amerindian ancestry was slightly positive among Whites in Fuerst et al.'s (2021) 'More Research Needed' study. However, these authors noted that the variance in Amerindian ancestry is very low, which results in large standard errors. It may be the case, as you suggest, that those Philadelphia Whites with Amerindian ancestry were a selected group. However, given the trivial amount of Amerindian ancestry in the White group and the nonsignificant association, the authors maintained that the results to date are ambiguous.

Regarding the second point, b) the study you're referring to, Fuerst, Hu, & Connor (2021) "Genetic Ancestry and General Cognitive Ability in a Sample of American Youths", examined the effect of genetic ancestry on SES outcomes, not polygenic scores. Perhaps you were referring to another paper?

Finally, I want to stress that our paper merely evaluates the effect of SIRE (self-identified race/ethnicity), while controlling for ancestry, on educational attainment and income, not intelligence. While it is true that intelligence, education, and income are positively correlated, this does not mean the regression will produce similar effects with these variables as dependent outcomes. Still, in Fuerst, Hu & Connor (2021) we were able to show that SIRE categories were not related to cognitive abilities, while African and Amerindian ancestry were negatively associated with cognitive abilities. This is similar to the result in the present paper.

Bot

Authors have updated the submission to version #2

Reviewer
Sorry, you're right about Fuerst and Hu (2021). They were referring to cognitive ability, not polygenic scores: 

Figure 2 shows the regression plot for European ancestry and g-scores among Black children. European ancestry is significantly (r = .10, N = 1690) associated with g scores. ... Additionally, the Loess regression line indicated a possible curvilinear relation with a slight uptick in scores at the lowest European ancestry decile. Further analysis showed that this was due to relatively high scores of individuals from African immigrant families.

Wouldn't income and educational attainment also show this same uptick among African Americans who have no European ancestry? According to Table 1, about a quarter of African Americans are of immigrant origin. Are income and educational attainment higher among them than among non-immigrant African Americans? If so, how does this factor affect your findings if you look solely at non-immigrant African Americans? I imagine it would strengthen your findings.

Author | Admin

We agree this is an interesting question.

We must admit, though, that we no longer have the relevant raw data to run supplemental analyses, so this possibility will have to be explored in another study. However, we had previously generated regression plots (along with the correlations) for the subgroups; they show a positive association between European ancestry and outcomes among Blacks. We have now added those to the supplementary file: https://osf.io/ys5db/

Unfortunately, in these original plots, 1) the Black samples included all Black families with two biological parents, irrespective of immigrant status, and 2) there were no loess lines to show exactly whether there was an uptick in outcome values at the lowest European ancestry decile. From eyeballing these plots, though, there does not seem to be an uptick in either outcome.

Finally, we have added this note to the paper about the bivariate correlations between outcome and ancestry for Blacks and for Hispanics: "In the supplementary file, we also provided the bivariate correlation of European ancestry with income and with educational attainment for two biological Black families (respectively, .19 and .18) and for two biological Hispanic families (respectively, .43 and .54)".
 

Reviewer
I received an email asking me to comment again on this submission. I have little more to add, other than to say that it suffers from excessive wordiness. Why, for instance, do you write "Yet, nonetheless," on page 3? Use one word or the other but not both. I'm also skeptical of the claim that a person with 2% African ancestry would be considered "Black American." A lot of White Americans would meet that threshold, e.g., Jimmy Carter, Hillary Clinton, Carol Channing, Ava Gardner, J. Edgar Hoover (by a long shot), Babe Ruth (by a long shot), and perhaps Dwight D. Eisenhower.
 
In any case, I approve this manuscript for publication.
Author | Admin

Thanks for your reply. I have fixed some of the excessive wordiness, though I could only find a few instances.

About your comment here :

I'm also skeptical of the claim that a person with 2% African ancestry would be considered "Black American." 

It's possible that Table 1 is confusing. For us, the table was easy to follow. If you think otherwise, we might eventually modify it. The 2% value for African ancestry concerned non-Hispanic Whites. The racial and ethnic groups are displayed across the columns and the rows display the variables, not the opposite. For non-Hispanic Blacks, it's actually 25% European and 70% African ancestry.

EDIT: I misread your comment here, as you said [considered "blacks"] not whites.

Reviewer

Minor issue:

East Asian ancestry shows a weak non-statistically significant positive association with education as compared to European ancestry (b = 0.13, p = .363).

A non-statistically significant association shouldn't be described as "weak" or "positive"—no association was found. Should be something like: "East Asian ancestry shows no statistically significant association with education..."

East Asian ancestry is also non-statistically significantly negatively associated with income as compared to European ancestry (b = -0.23, p = .102).

Same comment probably applies to p = .102.

Author | Admin

Thank you for your observation. Indeed, the phrasing may sound a bit confusing. We are going to change these sentences as follows, for the next update: 

"East Asian ancestry shows a weak positive association (but non-significant) with education"

"East Asian ancestry also shows a negative association (but non-significant) with income as compared to European ancestry"

Reviewer
Replying to Meng Hu

You're still describing it as a "weak positive association." p = .363 means no association was found.

Author | Admin

Actually, the p value does not say or imply anything about association. It's all about the probability of the found association.

Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31, 337-350.

  1. A null-hypothesis P value greater than 0.05 means that no effect was observed, or that absence of an effect was shown or demonstrated. No! Observing P > 0.05 for the null hypothesis only means that the null is one among the many hypotheses that have P > 0.05. Thus, unless the point estimate (observed association) equals the null value exactly, it is a mistake to conclude from P > 0.05 that a study found “no association” or “no evidence” of an effect. If the null P value is less than 1 some association must be present in the data, and one must look at the point estimate to determine the effect size most compatible with the data under the assumed model.

Dahiru, T. (2008). P-value, a true test of statistical significance? A cautionary note. Annals of Ibadan postgraduate medicine, 6(1), 21-26.

The P value is defined as the probability under the assumption of no effect or no difference (null hypothesis), of obtaining a result equal to or more extreme than what was actually observed.

...

Statistical significance implies clinical importance. (No. Statistical significance says very little about the clinical importance of relation. There is a big gulf of difference between statistical significance and clinical significance. By statistical definition at α = 0.05, it means that 1 in 20 comparisons in which null hypothesis is true will result in P < 0.05!)

Reviewer
Replying to Meng Hu

If you regress anything on anything you will virtually always find an "association" in the data. Greenland et al. obviously aren't saying that all findings where p < 1 count as something that should be reported as "evidence." If that's how it worked, you could publish anything. They say "one must look at the point estimate to determine the effect size most compatible with the data under the assumed model." Suppose 1 person in a population of 10,000 jumps out of an airplane without a parachute and dies. A logistic regression says jumping out of airplanes without a parachute leads to death with p = .1. The fact that p > .05 doesn't reflect the totality of the evidence re what happens when you jump out of an airplane. That doesn't mean that anything goes and that Greenland et al. abandon the concept of statistical inference. In this context, your p value of .363 says that we can't reject the null hypothesis at anything close to the level of conviction that is accepted as meaningful in social science.

The second quote by Dahiru is warning against confusing p < .05 with evidence for an effect, which is not the issue here at all.

Author | Admin

Greenland et al. did not say that anything counts as evidence but, rather, that any reporting of an association should rely on point estimates (effect size) and certainly not on the p value. This is because p is merely a test of a hypothesis. What they tried to explain in the paragraph is that statistical significance should not be taken as practical significance.

That doesn't mean that anything goes and that Greenland et al. abandon the concept of statistical inference.

Of course not. But they said it is dangerous to interpret non significance as saying "there is no association". The quote earlier explains this very clearly

it is a mistake to conclude from P > 0.05 that a study found “no association” or “no evidence” of an effect.

Now with respect to Dahiru:

The second quote by Dahiru is warning against confusing p < .05 with evidence for an effect, which is not the issue here at all.

He said more than this. He explained two things: That p stands for probability and measures how likely it is that any observed difference between groups is due to chance. Again, it has nothing to do with affirming there is "no association". The main point with the second paragraph is that statistical significance should not be confused with practical significance. If you think p>0.05 means no association, then you are confusing statistical and practical significance. 

Remember that p is due to two factors: sample size and effect size. Increasing one of these will decrease p. Merely increasing sample size will make p reach significance, but it doesn't tell anything about its practical significance.
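
A minimal sketch of this point, assuming a simple correlation tested with the Fisher z approximation: the effect size is held fixed while only the sample size changes. The function name and the values of r and n are illustrative assumptions, not figures from the paper.

```python
import math


def p_value_for_correlation(r: float, n: int) -> float:
    """Two-sided p value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal variate.
    return math.erfc(abs(z) / math.sqrt(2))


effect = 0.05  # hypothetical effect size, held constant
for n in (100, 1_000, 10_000):
    print(n, round(p_value_for_correlation(effect, n), 4))
```

The same small correlation moves from non-significant to significant purely because n grows, which is why p alone says little about the size of an association.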

Another point to be remembered. I usually see people (even some researchers) interpreting non significance as if it confirms the null. This is wrong. Dahiru for instance explains clearly:

Failure to reject null hypothesis leads to its acceptance. (No. When you fail to reject null hypothesis it means there is insufficient evidence to reject)

Claiming that there is indeed "no association" merely based on p simply means that you accept the null. Failing to reject the null does not mean accepting it. Let's say I use a Granger causality test, which evaluates the "null hypothesis of no causation", and I succeed in rejecting the null: can I conclude that there is indeed causality? No, it means I reject the hypothesis of the proponents who argued that there was no causality. Likewise, if I can't reject the null, it does not mean it is true, only that I failed to demonstrate it is false.

My point is: it is well known that the null can only be rejected, not proven. Anderson et al. (2000) go even deeper and argue that null hypothesis testing is inconsistent and makes no sense when sample size changes. They advocate relying instead on effect size and a measure of its precision. The p value is rather uninformative. Their conclusion reads as follows: "We urge researchers to avoid using the words significant and nonsignificant as if these terms meant something of biological importance". A claim of "no association" is a claim about practical importance, and this is wrong.

Anderson, D. R., Burnham, K. P., & Thompson, W. L. (2000). Null hypothesis testing: problems, prevalence, and an alternative. The journal of wildlife management, 912-923.

Reviewer
I've already approved this manuscript, but I can't help but agree with Reviewer #2. If p = .363, the chances are greater than one in three that there is no association at all. Occasionally, in the literature, the level of significance is set at p = .1, but that's really as far as one should go. One can get away with saying: "this association approaches significance." But even that wording cannot be used in this case.

 

Author | Admin

I must thank you for the feedback but the point I was discussing earlier with reviewer 2 is whether p>0.05 must be interpreted as implying there is "no association". The authors cited above disagree. I can cite more authors if need be, but I don't think there is an added value in doing so.

With respect to your specific point, Greenland et al., for instance, said that the number associated with the p value (be it larger or smaller) does not allow one to make a clear inference about the probability of the null hypothesis. It's more about how closely the data conform to the model prediction.

The P value is the probability that the test hypothesis is true; for example, if a test of the null hypothesis gave P = 0.01, the null hypothesis has only a 1 % chance of being true; if instead it gave P = 0.40, the null hypothesis has a 40 % chance of being true. No! The P value assumes the test hypothesis is true—it is not a hypothesis probability and may be far from any reasonable probability for the test hypothesis. The P value simply indicates the degree to which the data conform to the pattern predicted by the test hypothesis and all the other assumptions used in the test (the underlying statistical model). Thus P = 0.01 would indicate that the data are not very close to what the statistical model (including the test hypothesis) predicted they should be, while P = 0.40 would indicate that the data are much closer to the model prediction, allowing for chance variation.

A large P value is evidence in favor of the test hypothesis. No! In fact, any P value less than 1 implies that the test hypothesis is not the hypothesis most compatible with the data, because any other hypothesis with a larger P value would be even more compatible with the data. A P value cannot be said to favor the test hypothesis except in relation to those hypotheses with smaller P values. Furthermore, a large P value often indicates only that the data are incapable of discriminating among many competing hypotheses (as would be seen immediately by examining the range of the confidence interval). For example, many authors will misinterpret P = 0.70 from a test of the null hypothesis as evidence for no effect, when in fact it indicates that, even though the null hypothesis is compatible with the data under the assumptions used to compute the P value, it is not the hypothesis most compatible with the data—that honor would belong to a hypothesis with P = 1. But even if P = 1, there will be many other hypotheses that are highly consistent with the data, so that a definitive conclusion of “no association” cannot be deduced from a P value, no matter how large.

The last sentence of the second paragraph, which again addresses the mistaken "no association" interpretation, should clear up any doubt.
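
To make the confidence-interval point concrete, here is a rough sketch that backs out an approximate standard error and 95% interval from the two estimates quoted earlier in this thread (b = 0.13, p = .363 and b = -0.23, p = .102). It assumes the p values come from two-sided tests under a large-sample normal approximation; that assumption, and the function name, are not taken from the paper.

```python
from statistics import NormalDist


def interval_from_estimate_and_p(b: float, p: float, level: float = 0.95):
    """Back out an approximate SE and confidence interval from b and a two-sided p.

    Assumes a two-sided z test (normal approximation) and 0 < p < 1.
    """
    std_normal = NormalDist()
    z_obs = std_normal.inv_cdf(1 - p / 2)  # |b| / SE implied by the p value
    se = abs(b) / z_obs
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    return se, (b - z_crit * se, b + z_crit * se)


# Estimates for East Asian ancestry as quoted in this thread.
for label, b, p in [("education", 0.13, 0.363), ("income", -0.23, 0.102)]:
    se, (lo, hi) = interval_from_estimate_and_p(b, p)
    print(f"{label}: b = {b:+.2f}, SE ~ {se:.2f}, 95% CI ~ ({lo:+.2f}, {hi:+.2f})")
```

Under these assumptions both intervals include zero, but they also cover a range of non-zero values, so the data do not discriminate well between "no association" and a modest one.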

Admin

 

Although the submission editor has not asked me to be one of the peer reviewers, one of the authors has asked if I could provide additional comments for the paper. As such my suggestions can be taken at the authors’ discretion.



 

The authors suppose that if structural racism is causing SES differences between races then socially-defined race should predict lower SES outcomes for groups discriminated against, independent of the effect of genetic ancestry. The authors test this prediction by regressing two SES variables (household income and educational attainment) on measures of genetic ancestry and socially identified race. After finding no significant effect size of socially identified race, or effects in the opposite direction from what theories of structural racism predict, the authors conclude that their result is inconsistent with the theory of structural racism. The use of ancestry regression to tease apart the effects of discrimination and genetics on SES is novel and commendable.
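
The logic of this kind of regression can be sketched with synthetic data. The example below is only an illustration under stated assumptions: the variable names, the data-generating process, and the coefficient values are hypothetical and are not the authors' model or data. It shows that, when the model is correctly specified, ordinary least squares recovers whichever direct effect was built into the simulation, even though the group label and the ancestry proportion are correlated.

```python
import numpy as np


def fit_admixture_regression(b_ancestry: float, b_sire: float,
                             n: int = 5_000, seed: int = 0) -> dict:
    """Simulate data with known direct effects and recover them by OLS.

    All variables are synthetic: `ancestry` is a proportion in [0, 1] and
    `sire` is a binary group indicator correlated with it.
    """
    rng = np.random.default_rng(seed)
    ancestry = rng.beta(2, 2, size=n)
    # The group label depends on ancestry plus noise, so the two are
    # correlated but not identical.
    sire = (ancestry + rng.normal(0, 0.15, size=n) < 0.5).astype(float)
    outcome = b_ancestry * ancestry + b_sire * sire + rng.normal(0, 1, size=n)

    # Design matrix: intercept, group indicator, ancestry proportion.
    X = np.column_stack([np.ones(n), sire, ancestry])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return {"sire": float(beta[1]), "ancestry": float(beta[2])}


# Scenario A: only ancestry is given a direct effect.
print(fit_admixture_regression(b_ancestry=1.0, b_sire=0.0))
# Scenario B: only the group label is given a direct effect.
print(fit_admixture_regression(b_ancestry=0.0, b_sire=-0.5))
```

The sketch says nothing about confounding, selection, or the use of offspring ancestry as a proxy; those are the design issues discussed in the rest of this comment.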

 

I would like to have seen more discussion of potential problems with the methodology, their effects and whether or not they are important. Genetic ancestry is not randomly assigned and socially identified race may correlate with the SES outcomes for reasons other than racial discrimination (SES causing racial identification, collider bias etc.). In particular the problem of assortative mating is not discussed – who selects into mixed race relationships. It is probably worth being explicit that these variables are not exogenous and how potential confounds and colliders may or may not complicate the test of structural racism.

 

An unusual aspect of the approach is that offspring genetic ancestry is used as a proxy for the parents’ genetic ancestry. The authors highlight this as a “major limitation”, but don’t discuss what effect this approach might have. Yes, the authors use the household’s SES instead of the respondent parent’s, ensuring that self-identified race does not independently tell us anything much about the genetic ancestry of the parents, but there are still problems with the approach.

 

Upon conditioning on the child’s genetic ancestry, the racial identification of the parent tells us whether or not they are in a mixed race relationship. I think this will induce a bias in the estimated effect of socially identified race, especially since the respondent parent is (I suspect) most likely the mother. Thus a self-reported Black mother, given mixed race offspring, may imply a higher SES of the household than if the mother reported being White. This might also be why the authors find that Black identification predicts higher SES conditional on the genetic ancestry of the offspring.

 

Regardless of what causes black identification to predict higher SES, it is perhaps surprising and should be discussed e.g. do the authors think it shows structural racism in favour of Black people, or is it explained by some sort of bias like selection into mixed race relationships? An effect size of 0.24 does seem quite substantial and worth considering.

 

The authors do not seem to report the sex ratio of respondent parents. I suspect they will find it is mothers who respond to scientific surveys. This is important information since we know the correlates of mixed race relationships depend upon which partner is White and which is Black. An interesting test would be to interact sex with identified race. This might be able to help rule out the weird biases created by using offspring genetic data when there is also selection into mixed race relationships.

 

Do the authors have the racial identification of both partners? I think not, but if so this information should be incorporated into the design. I think this might help remove some biases from sorting into mixed race relationships.

 

To conclude, making potential biases clearer to the reader could be helpful for interpreting the strength and importance of the study. Moreover, a tested interaction between sex and identified race might help exclude biases from selection into mixed race relationships. Nevertheless, the paper is an important methodological step forwards despite limitations of the design.

 

Below I list specific suggestions I had whilst going through the paper.




 

Other Suggestions:

 

Analyses generally show that the association between socially-identified race/ethnicity and outcomes is mediated by genetic ancestry and that non-White race/ethnicity is unrelated to worse outcomes when controlling for genetic ancestry

 

I think the term confounded should be used here instead of mediated. Socially-identified race is a potential mediator of the effect of genetic ancestry rather than the other way around.



 

In families with multiple children, we used the genetic ancestry estimates of the first biological child

 

Only using data from one child per family is wasteful. If we are interested in the effect of the child’s ancestry it would be best to include each child and to cluster standard errors by family. If we are interested in using offspring ancestry as an indicator for the parents, I would suggest averaging genetic ancestry across individuals.
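
Both options in this suggestion can be sketched with NumPy alone. The code below is a hedged illustration on synthetic data: the function name, the two-children-per-family setup, and all coefficients are assumptions made for the example. It fits OLS with cluster-robust (CR0 sandwich) standard errors grouped by family, compares them with conventional standard errors that treat siblings as independent, and then computes the family-average ancestry as the alternative.

```python
import numpy as np


def ols_with_cluster_se(X, y, groups):
    """OLS coefficients with cluster-robust (CR0 sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        rows = groups == g
        # Residual-weighted predictors summed within one family.
        score = X[rows].T @ resid[rows]
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Synthetic example: 500 families, two children each, one shared parental outcome.
rng = np.random.default_rng(0)
n_fam = 500
family_id = np.repeat(np.arange(n_fam), 2)
family_ancestry = rng.uniform(0, 1, size=n_fam)
child_ancestry = np.clip(
    family_ancestry[family_id] + rng.normal(0, 0.02, size=2 * n_fam), 0, 1
)
parental_outcome = (0.5 * family_ancestry + rng.normal(0, 1, size=n_fam))[family_id]

X = np.column_stack([np.ones(2 * n_fam), child_ancestry])
beta, se_cluster = ols_with_cluster_se(X, parental_outcome, family_id)

# Conventional SEs that wrongly treat siblings as independent observations.
resid = parental_outcome - X @ beta
sigma2 = resid @ resid / (len(resid) - X.shape[1])
se_naive = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
print("clustered SE:", se_cluster[1], "naive SE:", se_naive[1])

# Alternative: average the children's ancestry within each family.
family_mean_ancestry = (
    np.bincount(family_id, weights=child_ancestry) / np.bincount(family_id)
)
```

Because the parental outcome is identical for siblings, adding children without clustering would overstate precision; the clustered standard error corrects for that.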



 

2.2.2 Educational attainment and income

 

In this section the authors describe their coding of education and income. If the coding scheme for education has been used before elsewhere, it may be useful to provide a citation to prove the coding is non-arbitrary. For example, stating that you have copied the coding scheme in the EA GWAS would provide more credibility. 

 

Do the authors winsorize high and low values? I can see why we might expect low values to be erroneous (i.e., no school or no income), but high values (+$200,000 or PhD education) are probably credible? If high values are winsorized I would consider changing that. It might also be helpful to know what levels of income and education correspond to + or - 3 SD.

 

In general it’s a good idea to take the logarithm of income. Income is typically log-normally distributed and causes of income tend to be linearly related to log income. It is also nice to have normally distributed error terms if possible. But in practice this transformation probably will not lead to substantially different results. 
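
As a small illustration of why the log transform is usually suggested, the sketch below draws a hypothetical log-normal income variable and compares its skewness before and after taking logs. The distribution parameters are made up for the example; bracketed survey income would first need to be converted to numeric values (for instance bracket midpoints), and zero incomes would need separate handling because log(0) is undefined.

```python
import numpy as np


def skewness(x) -> float:
    """Sample skewness (third standardized moment)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))


rng = np.random.default_rng(0)
# Hypothetical household incomes, log-normally distributed by construction.
income = rng.lognormal(mean=11.0, sigma=0.8, size=10_000)
log_income = np.log(income)

print("skewness of income:    ", round(skewness(income), 2))
print("skewness of log income:", round(skewness(log_income), 2))
```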


 

Firstly, as a robustness check, we reran the analyses excluding all cases with values of education and income 3 standard deviations (SDs) or more below the mean. This was done in order to ensure that our results were not primarily driven by the extremely low values of education and income

 

In the results it is suggested that winsorizing is only used as a robustness check. However, the methods suggest it is the standard approach. This contradiction should be resolved. I like the idea of just using winsorizing as a robustness test since there is little reason to suppose that observations 3 SDs above the mean are anomalous.

 

A report of summary statistics or a histogram might help to justify whatever approach you take with unusual values.
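
Since the passage quoted from the paper describes excluding low cases while these comments speak of winsorizing, a short sketch of the difference may help. The helper names and the 3 SD default are assumptions for illustration and are not taken from the authors' code: exclusion drops observations at or below the cutoff, whereas winsorizing keeps them and clips their values to the cutoff.

```python
import numpy as np


def trim_low(x, k: float = 3.0) -> np.ndarray:
    """Exclude values that lie k SDs or more below the mean."""
    x = np.asarray(x, dtype=float)
    cutoff = x.mean() - k * x.std()
    return x[x > cutoff]


def winsorize_low(x, k: float = 3.0) -> np.ndarray:
    """Keep every case but raise values below mean - k*SD up to that cutoff."""
    x = np.asarray(x, dtype=float)
    cutoff = x.mean() - k * x.std()
    return np.maximum(x, cutoff)


# Synthetic standardized scores with one extreme low value appended.
rng = np.random.default_rng(0)
scores = np.append(rng.normal(0, 1, size=1_000), -8.0)

for name, values in [("raw", scores),
                     ("trimmed", trim_low(scores)),
                     ("winsorized", winsorize_low(scores))]:
    print(name, len(values), round(values.min(), 2), round(values.mean(), 3))
```

Printing the count, minimum, and mean for each version is one simple form of the summary statistics suggested here.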



 

This finding indicates that ancestry strongly mediates the effect of socially-defined race/ethnicity

 

Genetic ancestry is not caused by self-identification, thus genetic ancestry cannot be a mediator of the effect of racial identification. Racial identification is a mediator of genetic ancestry.



 

Table 2. Admixture Regression Results for Educational Attainment.

 

I would note that this is for parental educational attainment, as in Table 3, where the dependent variable is written as “parental income”.

 

Bot

Authors have updated the submission to version #3

Author | Admin

Thank you for the feedback. We carefully processed your comments and have incorporated changes into the manuscript. Please see our detailed reply to your specific points. 

You noted:

Genetic ancestry is not randomly assigned and socially identified race may correlate with the SES outcomes for reasons other than racial discrimination (SES causing racial identification, collider bias etc.) In particular the problem of assortative mating is not discussed – who selects into mixed race relationships.

We now discuss this concern in the discussion section:

One limitation of the current research design is that factors other than discrimination, such as assortative mating and selective ethnic attrition, can potentially induce associations between social outcomes and socially defined race. For example, mixed-race couples could be socioeconomically selective, and this selectivity could lead to social race being correlated with SES independent of genetic ancestry.  Were these processes influencing our results, we would generally expect socially identified races to be associated with outcomes independent of ancestry. However, we generally do not find this to be the case. Nonetheless, it is theoretically possible that the effects of discrimination could be moderated by countervailing effects of assortative mating and selective ethnic attrition. Whether this is an empirically plausible scenario depends on the pattern of mating and identification in American race/ethnic groups and is a subject for future research.

You noted that:

An unusual aspect of the approach is that offspring genetic ancestry is used as a proxy for the parents’ genetic ancestry. The authors highlight this as a “major limitation”, but don’t discuss what effect this approach might have.

We agree that not having information on both parents' socially defined race may bias the results; we noted this in the methods and discussion. This is why we also ran robustness analyses using the child's race as identified by the parents, which should reflect the average of both parents' identified race. Using the child's race as a proxy for both parents' race did not change the results in an interpretively important way. That said, we agree, as we noted in the paper, that it would be best to use adult genetic ancestry and adult social outcomes for this type of analysis.

Regardless of what causes black identification to predict higher SES, it is perhaps surprising and should be discussed e.g. do the authors think it shows structural racism in favour of Black people, or is it explained by some sort of bias like selection into mixed race relationships? An effect size of 0.24 does seem quite substantial and worth considering.

Thank you for the comment. We now note:

As seen in Table 4, when ancestry is controlled for, Black social race is statistically significantly positively associated with outcomes in the majority of models. Since a large portion of the structural racism literature focuses on discrimination against individuals socially identified as Black it would be worthwhile to see if these results replicate on other samples. Given these counterintuitive results, before such replication is done (our Black sample is small in our analysis), we are reluctant to speculate on possible causes.

You asked about the sex of the parents: 

An interesting test would be to interact sex with identified race. This might be able to help rule out the weird biases created by using offspring genetic data when there is also selection into mixed race relationships. Do the authors have the racial identification of both partners?

As noted to a prior reviewer, we no longer have the original dataset and so cannot run new analyses. As noted above, we additionally ran the analysis using the parent-identified race of the child, which should account for the race of both parents. Doing so did not have an interpretively important impact on the results. Ultimately, though, such analyses should be run using adult genetic ancestry and outcomes. We have now stated clearly in the method section 2.2.5 that we do not know the race of the second parent, and we also emphasize this point by noting in the discussion:

This leaves open the question as to whether the pattern of mating could have explained the positive coefficient of Black identity on socio-economic outcomes. It might be argued that black mothers who intermarry typically achieve higher SES levels than white mothers who intermarry, therefore weighing up the impact of Black identity. This requires a rate high enough to create a positive effect of  b=0.24 (as found in this study). At the same time, not having information about the racial identification of the spouse of the responding parent does not help to address this issue.

Thank you for the last comments. We have tried to be clearer about potential biases so as to better guide future research.

Now responding to your specific points:

We have now changed the word "mediated" to "confounded" in the abstract and agree that it may be more accurate.

Only using data from one child per family is wasteful. If we are interested in the effect of the child’s ancestry it would be best to include each child and to cluster standard errors by family. If we are interested in using offspring ancestry as an indicator for the parents, I would suggest averaging genetic ancestry across individuals.

Thank you for the comment. Since we no longer have the original dataset, we cannot rerun the analyses. It should be noted that we used the ancestry of the first biological child of both parents. Were we to average ancestries, we would average the ancestry of full siblings, who are either MZ twins, DZ twins, or other full siblings. The ancestry of these siblings is expected to be very similar, and so we do not expect the ancestry estimated from only one biological child to depart substantially from estimates based on two or more biological children.
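As a rough illustration of the comparison described above, the following sketch contrasts the first child's ancestry with the family average across full siblings. It is not the analysis run in the paper: the function names, the family identifiers, and the ancestry fractions are all hypothetical toy values chosen only to show the mechanics.

```python
from collections import defaultdict
from statistics import mean


def family_mean_ancestry(records):
    """Average an ancestry fraction across all children in each family.

    records: iterable of (family_id, ancestry_fraction) pairs, one per child.
    """
    by_family = defaultdict(list)
    for family_id, ancestry in records:
        by_family[family_id].append(ancestry)
    return {fid: mean(vals) for fid, vals in by_family.items()}


def first_child_ancestry(records):
    """Keep only the first listed child's ancestry for each family."""
    first = {}
    for family_id, ancestry in records:
        first.setdefault(family_id, ancestry)
    return first


# Hypothetical toy data: (family id, European ancestry fraction of a child).
children = [
    ("fam1", 0.22), ("fam1", 0.24),
    ("fam2", 0.95),
    ("fam3", 0.51), ("fam3", 0.49), ("fam3", 0.50),
]

means = family_mean_ancestry(children)
firsts = first_child_ancestry(children)

# Largest absolute gap between the one-child proxy and the sibling average.
max_gap = max(abs(means[fid] - firsts[fid]) for fid in means)
print(f"largest first-child vs. family-mean gap: {max_gap:.3f}")
```

With real data, a small maximum gap would support the expectation that the one-child estimate is a close stand-in for the sibling average; a large gap would suggest otherwise.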

Regarding the coding scheme of our education variable, we now cite Fuerst, Hu, & Connor (2021) and Fuerst, Shibaev, & Kirkegaard (2023). These papers include detailed supplementary materials specifying the variables used. We additionally note in section 2.2.2:

Detailed descriptions of the educational and income variables are provided by Fuerst, Hu, & Connor (2021) and Fuerst, Shibaev, & Kirkegaard (2023) who used the same coding scheme. 

You also commented on the use of winsorization. We winsorized the SES values in our main analysis. We understand your concern about preserving the original values at the top of the distribution. This is why, in one of our robustness analyses, we removed extremely low values (3 SD below the mean) while leaving the highest values untouched.
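For readers less familiar with the two treatments, here is a minimal sketch of the difference between winsorizing both tails and dropping only low outliers. The winsorizing bounds, the toy scores, and the function names are assumptions made for illustration; they are not the cutoffs or data used in the paper.

```python
from statistics import mean, stdev


def winsorize(values, lower, upper):
    """Clamp every value into [lower, upper]; no observations are dropped."""
    return [min(max(v, lower), upper) for v in values]


def drop_low_outliers(values, n_sd=3.0):
    """Remove values more than n_sd sample SDs below the mean.

    High values are left untouched, however extreme they are.
    """
    cutoff = mean(values) - n_sd * stdev(values)
    return [v for v in values if v >= cutoff]


# Hypothetical SES scores with one extreme low value.
ses = [8, 9, 10, 11, 12] * 4 + [-40]

clamped = winsorize(ses, lower=9, upper=11)  # illustrative bounds only
trimmed = drop_low_outliers(ses)

print(len(clamped), min(clamped), max(clamped))
print(len(trimmed), min(trimmed), max(trimmed))
```

Winsorizing keeps the sample size but alters both tails, whereas the one-sided trim shrinks the sample slightly and keeps the top of the distribution exactly as observed.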

Finally, we changed the wording "This finding indicates that ancestry strongly mediates the effect of socially-defined race/ethnicity" to "statistically explains". We also specified Parental Educational Attainment in the title of Table 2.

Author | Admin

I forgot to reply to your comment on log transformation. I agree that it makes interpreting income quite convenient, but I remember that the distribution of the income variable here was slightly negatively skewed (which a log transformation cannot fix), though not really non-normal, so we did not feel the need, at the time, to transform the variable.
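To illustrate why a log transformation does not help with negative skew, here is a small sketch. The income values are hypothetical and deliberately left-skewed; they are not taken from our data. Because the logarithm is concave, it compresses the upper tail and stretches the lower one, so a left-skewed variable tends to become more left-skewed after the transformation.

```python
import math
from statistics import mean


def skewness(values):
    """Population (moment-based) skewness: m3 / m2**1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, strictly positive, left-skewed income scores.
income = [1, 6, 8, 9, 9, 10, 10, 10]
log_income = [math.log(v) for v in income]

raw_skew = skewness(income)
log_skew = skewness(log_income)

print(f"skewness before log: {raw_skew:.2f}")
print(f"skewness after log:  {log_skew:.2f}")
```

In this toy example the skewness is negative before the transformation and more negative after it, which is the pattern described in the reply above.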