Spearman’s g Explains Black-White but not Sex Differences in Cognitive Abilities in the Project Talent

Submission status
Reviewing

Submission Editor
Emil O. W. Kirkegaard

Author
Meng Hu

Title
Spearman’s g Explains Black-White but not Sex Differences in Cognitive Abilities in the Project Talent

Abstract

The weak form of Spearman’s Hypothesis, which states that the racial group differences are primarily due to differences in the general factor (g), was tested and confirmed in this analysis of the Project Talent data, based on 34 aptitude tests among 9th-12th grade students. Multi-Group Confirmatory Factor Analysis (MGCFA) detected small to modest bias with respect to race but strong bias with respect to within-race sex differences in cognitive abilities. After establishing partial measurement equivalence, SH was tested by comparing the model fit of a correlated factors (non-g) model with that of a bifactor (g) model, as well as by comparing the relative contribution of the g factor means with that of the specific factor means. While g was the main source of the Black-White differences, this was not the case for the within-race sex differences. The evidence of measurement bias in the sex analysis may cause ambiguity in interpreting SH for sex differences. Results from MGCFA were somewhat corroborated by the Method of Correlated Vectors, with high correlations of the subtests’ g-loadings with Black-White differences but near-zero correlations with sex differences. This finding replicates earlier MGCFA studies supporting SH with respect to the Black-White cognitive gap, as well as earlier MGCFA studies revealing stronger gender bias than racial bias.

Keywords
measurement invariance, MCV, Spearman’s Hypothesis, MGCFA, Black-White IQ gap, Project Talent, Sex IQ gap

Supplemental materials link
https://osf.io/qn67k/


Reviewers ( 0 / 1 / 1 )
Reviewer 1: Considering / Revise
Reviewer 2: Accept

Sat 16 Sep 2023 22:27

Bot

Author has updated the submission to version #2

Reviewer

The paper here contains very important findings. However, at present it is written in a very technical way, to the extent that anyone without intimate knowledge of the literature would be entirely lost.

 

Some suggested edits: 

A very brief and simple definition of Spearman's Hypothesis should be provided in the Abstract

 

On page 2: “One comes from Scheiber (2016b) who found strong measurement bias in the analysis of the WISC-V between 777/830 […]” it is unclear what the fractions in this sentence mean.

 

On page 5: “When within-factor correlated residuals are misspecified, all fit indices favor the correlated factors model regardless of conditions, except for SRMR, show a bias in favor of the correlated factors model (Greene et al., 2019)”. This sentence needs rewording, it does not make sense at the minute.

 

In the analysis section: it would be useful to provide diagrams of what the CF, HOF and BF models look like. This will make it easier for a reader to understand what hypotheses are being tested.

 

Table 1: There could be another column which states, in plain English, what each of these models is used to test for.

 

Table 2: please indicate, for each fit measure, what is considered a better fit. E.g.: CFI, higher is better; RMSEA, lower is better.

 

As far as I can tell the model specification is the same wherever it is stated. In which case it should only be stated once, with a phrase along the lines of “for all of our models the model specification is:”

 

Providing a table of g-loadings for math, speed etc would be useful information.

 

The horizontal axis on figures should be “average g-loading (from Black and White male sample)” or similar. This is practically the most important part of the paper, as these graphs are easy to digest for laymen. They need to be as easy to understand as possible.

Bot

Author has updated the submission to version #3

Author

Thank you for the review.

I understand that the analysis is complex. In reality, MGCFA can be much easier if the data are ideal (clean factor structure, near-equivalent group samples, no Heywood cases, no pro-bifactor bias, assumption of no cross-loadings for computing effect sizes of bias). Unfortunately, the data usually do not fulfill most of these ideal conditions. And in the case of the Project Talent, the large number of subtests, subgroups, and models complicates the situation even more. I wish I could simplify as much as possible, but at the same time it is necessary to explain and address the problems that are often ignored in MGCFA studies.

I modified my article according to your suggestions, clarifying and fixing whenever necessary. I also updated my supplementary file.

The weak form of Spearman’s Hypothesis, which states that the racial group differences are primarily due to differences in the general factor (g), was tested and confirmed in this analysis of the Project Talent data, based on 34 aptitude tests among 9th-12th grade students. 

...

One comes from Scheiber (2016b), who found strong measurement bias in the analysis of the WISC-V between 777 White males and 830 White females, 188 Black males and 221 Black females, and 308 Hispanic males and 313 Hispanic females.

...

When within-factor correlated residuals are misspecified, all fit indices correctly favor the correlated factors model regardless of conditions, except for SRMR, which incorrectly favors the bifactor model (Greene et al., 2019, Table 4).

I now provided a new Figure 1, along with the following text:

Figure 1 displays the hypothetical competing CFA models that are investigated in the present analysis: 1) the correlated factors model, which specifies that the first-order specific factors are correlated without the existence of a general factor; 2) the higher order factor model, which specifies that the second-order general factor operates through the first-order specific factors and thus only indirectly influences the subtests; 3) the bifactor model, which, unlike the higher order factor model, specifies that both the general and the specific factors have direct influences on the subtests.
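
To make the contrast concrete, the three structures can be sketched in lavaan syntax as follows. This is a minimal illustration only, with hypothetical, non-overlapping indicators; the actual specification in the paper uses the 34 subtests and includes cross-loadings.

library(lavaan)

# 1) Correlated factors (CF): first-order factors covary freely; no g.
cf_model <- '
  verbal  =~ S1 + S2 + S3
  math    =~ S4 + S5 + S6
  spatial =~ S7 + S8 + S9
'

# 2) Higher order factor (HOF): g influences the subtests only
#    indirectly, through the first-order factors.
hof_model <- '
  verbal  =~ S1 + S2 + S3
  math    =~ S4 + S5 + S6
  spatial =~ S7 + S8 + S9
  g       =~ verbal + math + spatial
'

# 3) Bifactor (BF): g and the specific factors all load directly on the
#    subtests; the latent factors are kept orthogonal.
bf_model <- '
  g       =~ S1 + S2 + S3 + S4 + S5 + S6 + S7 + S8 + S9
  verbal  =~ S1 + S2 + S3
  math    =~ S4 + S5 + S6
  spatial =~ S7 + S8 + S9
'

cf_fit  <- cfa(cf_model,  data = dat)   # dat is a placeholder data frame
hof_fit <- cfa(hof_model, data = dat)
bf_fit  <- cfa(bf_model,  data = dat, orthogonal = TRUE)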

I also added a note under the model fit tables:

Note: higher values of CFI and Mc indicate better fit, while lower values of χ2, RMSEA, RMSEAD, SRMR indicate better fit.
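
As a side note, all of these indices except RMSEAD can be extracted directly from a single fitted lavaan model; RMSEAD, being a difference-based index, is computed from the comparison of nested models rather than from one fit. A minimal sketch, where fit is a placeholder for any fitted model and "mfi" is lavaan's name for McDonald's Mc:

# Extract the reported fit indices from a fitted lavaan model.
fitMeasures(fit, c("chisq", "cfi", "mfi", "rmsea", "srmr"))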

I, however, found one of your requests difficult to fulfill. Specifically, this one:

Table 1: There could be another column which states, in plain English, what each of these models is used to test for.

This is because summarizing the purpose of each model in just one or two words is extremely difficult. Considering that the specification column is already loaded with information, I believe that adding another column filled with more information would make the table more tedious to read.

The models in Table 1 were already summarized earlier in the text; that summary has now been expanded a bit more, with a reference to Table 1 as well.

MGCFA starts by adding constraints to the initial configural model in the following incremental steps: metric, scalar, strict. A rejection of configural invariance implies that the groups use different latent abilities to solve the same set of item variables. A rejection of metric (loading) invariance implies that the indicators of a latent factor are unequally weighted across groups. A rejection of scalar (intercept) invariance implies that the subtest scores differ across groups when their latent factor means are equalized. A rejection of strict (residual) invariance implies that there is a group difference in specific variance and/or measurement error. When invariance is rejected, partial invariance is applied: parameters are released until acceptable fit is achieved, and these freed parameters are carried over to the next levels of the MGCFA models. The variances of the latent factors are then constrained to be equal across groups to examine whether the groups use the same range of abilities to answer the subtests. The final step is to determine which latent factors can have their mean differences constrained to zero without deteriorating the model fit: a worsening of the model fit indicates that the factor is needed to account for the group differences. These model specifications are presented in Table 1 further below.
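
For readers who want to see how these steps translate into code, here is a minimal lavaan sketch (not the paper's exact code; the data frame, grouping variable, and released intercept are hypothetical):

# Incremental invariance steps via lavaan's group.equal argument.
configural <- cfa(model, data = dat, group = "group")
metric     <- cfa(model, data = dat, group = "group",
                  group.equal = "loadings")
scalar     <- cfa(model, data = dat, group = "group",
                  group.equal = c("loadings", "intercepts"))
strict     <- cfa(model, data = dat, group = "group",
                  group.equal = c("loadings", "intercepts", "residuals"))

# Partial invariance: release an offending parameter (here, hypothetically,
# the intercept of S13) while keeping the remaining constraints.
partial_scalar <- cfa(model, data = dat, group = "group",
                      group.equal = c("loadings", "intercepts"),
                      group.partial = "S13 ~ 1")

# Latent variances and means are constrained in the later steps the same
# way, via "lv.variances" and "means". Nested models are then compared:
lavTestLRT(configural, metric, scalar, strict)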

I provided the R output displaying all parameter values for the best model, in Tables 2 through 13 of the supplementary file. The output is in fact so large that, even if I displayed only the group factor loadings, it would drastically increase the number of pages, and the article is already very long. Note that I did not originally display the general factor loadings anywhere in the paper. This is because, again, there were too many models and subgroups: loadings, for both g and the specific factors, would need to be displayed for each subgroup (Black men, White men, White women, Black women) for each g model.

The X-axis title in Figures 2-5 has been modified as per your suggestion.
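
For readers skimming those figures: each one plots the subtests' g-loadings against their standardized group differences, and the MCV summary statistic is simply the correlation between the two vectors. A minimal sketch, where g_loadings and d are hypothetical vectors over the 34 subtests:

# Method of Correlated Vectors: correlate the vector of subtest
# g-loadings with the vector of standardized group differences.
cor(g_loadings, d)                        # Pearson correlation
cor(g_loadings, d, method = "spearman")   # rank-order version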

One final remark. I do not understand this sentence.

As far as I can tell the model specification is the same wherever it is stated. In which case it should only be stated once, with a phrase along the lines of “for all of our models the model specification is:”

Are you referring to the competing models (CF, HOF, BF) or rather to the model constraints (M1-M6)? I suspect the latter, though I'm not sure. If so, each subgroup shows a different pattern of non-invariance, so I have to discuss them separately rather than make general and rather imprecise statements about the results. I understand it may be tedious for readers, but I believe it is necessary.

Reviewer
Replying to Meng Hu

Thank you for your reply. I will review the revised version. But I will quickly clarify that last part. I am referring to the bit that states: "The model specification is displayed as follows:

english =~ S1 + S13 + S19 + S20 + S21 + S22 + S23 + S24 + S25 + S26 + S31 + S34

math =~ S5 + S25 + S32 + S33 + S34

speed =~ S19 + S34 + S35 + S36 + S37

info =~ S1 + S2 + S3 + S4 + S7 + S8 + S11 + S12 + S13 + S14 + S15 + S16 + S19 + S26

science =~ S1 + S6 + S7 + S8 + S9 + S10

spatial =~ S28 + S29 + S30 + S31 + S37"

This seems to be repeated exactly several times in the paper, e.g. on pages 16, 21, and 24.

Author

I understand now. At first glance, it seems the models are identical across subgroups. In fact, they differ a little by subgroup: usually, some subtests have additional cross-loadings in some subgroups (e.g., S13 Health, S32 Arithmetic Reasoning, etc.).