Mixed evidence for Lynn's developmental theory of sex differences using aptitude tests

Submission status
Reviewing

Submission Editor
Emil O. W. Kirkegaard

Author
Meng Hu

Title
Mixed evidence for Lynn's developmental theory of sex differences using aptitude tests

Abstract

This study investigates sex differences in the general factor of intelligence and their interaction with age, using Multiple Group Confirmatory Factor Analysis (MGCFA). It aims to test Lynn’s developmental theory of sex differences in intelligence, which states that the male advantage widens over the course of development, especially from age 16 onwards. The results provide some evidence for Lynn’s hypothesis in the NLSY79 and NLSY97 but not in Project Talent. Results from the Higher Order Factor (HOF) model showed that in the NLSY79, the male advantage in g increases from 1.21 to 5.53 points in the entire sample, while in the NLSY97, the male advantage increases from 0.18 to 2.46 points in the entire sample. Similarly, results from the Bifactor (BF) model showed a greater increase in g scores across ages among males. However, the BF model often produced substantially different score gaps in g in all three datasets. This discrepancy between the HOF and BF models highlights the influence of test composition on latent scores. A sibling pair analysis in the NLSY datasets yielded ambiguous results. In Project Talent, sex differences remained stable across ages 14-18 in the White sample, but a slight increase in female advantage was observed in the Black sample, contradicting Lynn’s hypothesis.

Keywords
IQ, measurement invariance, sex differences, Spearman’s Hypothesis, MGCFA, aptitude tests

Supplemental materials link
https://osf.io/892e3/

Reviewers ( 0 / 1 / 0 )
Reviewer 1: Considering / Revise

Sun 09 Mar 2025 20:31

The issue of self-selection has been raised as a confounding factor for testing Lynn’s
hypothesis because women are more voluntary than men to take surveys.

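Language issue: "more voluntary than men to take surveys" is unidiomatic. Consider "women are more likely than men to volunteer for surveys".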

Lynn, R. (2017). Sex Differences in Intelligence: The Developmental Theory. Mankind
Quarterly, 58(1), 9–42. doi: 10.46469/mq.2017.58.1.2

There is a more up-to-date review in:

Lynn, R. (2021). Sex Differences in Intelligence: The Developmental Theory. Arktos Media Ltd. ISBN 978-1914208652.

In the ASVAB, tests of crystallized
ability are overly represented (Roberts et al., 2001), whereas in the Project Talent,
culture-loaded knowledge tests are overly represented (Jensen, 1985, p. 218).

Use "over-represented" or "overrepresented" instead of "overly represented".

"Note: standard errors in parentheses, non-significant values are highlighted,"

The values are underlined, so the note should say "underlined" rather than "highlighted".

"-fixed to 0- "

Better: 0 (fixed)

I think the best improvement to make here is to move the model fit tables (Tables 2-7) to the appendix. The reader is not likely to care about the detailed fit statistics for the numerous models. The reader is, however, very interested in the gap sizes in Tables A1-A3. I think you should make an overall figure from the gap tables (A1-A3). My takeaway from this study is that gap sizes are very difficult to estimate and depend on modeling decisions as well as test composition. You may want to quote Jensen (1998):

Research on sex differences in mental abilities has generated hundreds of
articles in the psychological literature, with the number of studies and articles
increasing at an accelerating rate in the last decade. As there now exist many
general reviews of this literature, I will focus here on what has proved to be
the most problematic question in this field: whether, on average, males and
females differ in g.

It is noteworthy that this question, which is technically the most difficult to
answer, has been the least investigated, the least written about, and, indeed, even
the least often asked.

To examine composition effects, you could subset the tests in the batteries (e.g. dropping 1 at a time) and see how this changes gap sizes. Sometimes this may be difficult because a group factor would be left with fewer than 3 tests. E.g. in NLSY79, the speed factor only has 2 tests, so removing one of them would render the latent factor identical to the remaining single test.
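
The leave-one-test-out idea can be sketched roughly as follows. This is only an illustration under strong simplifying assumptions: it uses a unit-weighted composite of z-scored tests rather than the latent g from an MGCFA model, the data are simulated, and the test names and effect sizes are hypothetical, not values from the paper.

```python
import numpy as np

def standardized_gap(scores, is_male):
    """Cohen's d (male minus female) on a unit-weighted composite of z-scored tests."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    composite = z.mean(axis=1)
    m, f = composite[is_male], composite[~is_male]
    pooled_var = ((len(m) - 1) * m.var(ddof=1) + (len(f) - 1) * f.var(ddof=1)) / (
        len(m) + len(f) - 2
    )
    return (m.mean() - f.mean()) / np.sqrt(pooled_var)

def leave_one_test_out(scores, is_male, test_names):
    """Return the full-battery gap and the change in gap when each test is dropped."""
    full_gap = standardized_gap(scores, is_male)
    shifts = {}
    for j, name in enumerate(test_names):
        reduced = np.delete(scores, j, axis=1)  # battery without test j
        shifts[name] = standardized_gap(reduced, is_male) - full_gap
    return full_gap, shifts

# Simulated data: one common factor plus hypothetical test-specific sex effects.
rng = np.random.default_rng(0)
n = 2000
is_male = rng.random(n) < 0.5
g = rng.normal(size=n)
test_names = ["verbal", "math", "speed", "mechanical"]  # hypothetical labels
tilt = np.array([0.0, 0.1, -0.3, 0.6])                  # hypothetical male-minus-female shifts
scores = 0.7 * g[:, None] + rng.normal(scale=0.7, size=(n, 4)) + is_male[:, None] * tilt

full_gap, shifts = leave_one_test_out(scores, is_male, test_names)
for name, shift in shifts.items():
    print(f"drop {name}: gap changes by {shift:+.3f}")
```

In this toy setup, dropping the test that favors males lowers the composite gap and dropping the test that favors females raises it, which is the kind of sensitivity the subsetting exercise is meant to expose. With latent models the same loop would refit the model for each reduced battery.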

Would you say your findings are more congruent with no gaps or with a male advantage? If you take a Bayesian approach to this, you could analyze the male g advantage across all datasets and model specifications and look at the distribution. It looks like this distribution has a non-zero tendency towards male advantage.

I ran a meta-analysis of your tables:

- Naive: mean = 0.19, weighted mean = 0.24, median = 0.12.
- Frequentist: 0.19; with covariates: -0.16, increasing by 0.027 per test included.
- Bayesian: 0.28; with covariates: -0.08, increasing by 0.03 per test included.

https://rpubs.com/EmilOWK/meng_hu_2025_sex_diffs_g
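
For readers who want to see what the naive summaries involve, here is a minimal sketch. It assumes inverse-variance weights for the weighted mean, which may differ from the weighting used in the linked analysis, and the input numbers are placeholders for illustration, not values from the paper's tables.

```python
import numpy as np

def summarize_gaps(gaps, std_errors):
    """Naive pooled summaries of gap estimates: mean, inverse-variance weighted mean, median."""
    gaps = np.asarray(gaps, dtype=float)
    weights = 1.0 / np.asarray(std_errors, dtype=float) ** 2  # assumed weighting scheme
    return {
        "mean": gaps.mean(),
        "weighted_mean": np.sum(weights * gaps) / np.sum(weights),
        "median": np.median(gaps),
    }

# Placeholder estimates and standard errors (hypothetical, for illustration only).
example_gaps = [0.10, 0.30, -0.05, 0.25]
example_ses = [0.05, 0.10, 0.05, 0.20]

summary = summarize_gaps(example_gaps, example_ses)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The three summaries can disagree noticeably when a few precise estimates sit far from the rest, which is one reason to report all of them.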

Models with the full set of covariates are very underpowered (n=16, 4 predictors), but the one-at-a-time models are somewhat better powered. It looks like HOF is much more consistent in general, and gives higher means for men.

You could fit additional models under different analytic decisions and run a multiverse analysis. https://cran.r-project.org/web/packages/multiverse/readme/README.html
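
The core of a multiverse analysis is enumerating every combination of analytic decisions and recording the estimate under each. A rough Python sketch of that bookkeeping is below. It does not use or mirror the API of the R package linked above, the decision grid is only an assumed example, and `placeholder_fit` is a hypothetical stand-in for whatever routine actually fits the model.

```python
from itertools import product

# Assumed example grid of analytic decisions; the real grid is up to the author.
decisions = {
    "model": ["HOF", "BF"],
    "dataset": ["NLSY79", "NLSY97", "Project Talent"],
    "sample": ["White", "Black"],
}

def build_multiverse(decisions):
    """Return one specification (dict) per combination of decision options."""
    keys = list(decisions)
    return [dict(zip(keys, combo)) for combo in product(*decisions.values())]

def run_multiverse(decisions, fit_fn):
    """Apply fit_fn to every specification and attach the resulting gap estimate."""
    return [{**spec, "gap": fit_fn(spec)} for spec in build_multiverse(decisions)]

def placeholder_fit(spec):
    """Hypothetical stand-in: a real version would fit the model and return the g gap."""
    return float("nan")

results = run_multiverse(decisions, placeholder_fit)
print(len(results), "specifications")
print(results[0])
```

The resulting list of records can then be summarized or plotted as a distribution of gap estimates across specifications.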

In general, though, my reading is that the results are more congruent with a male advantage, but because the gap depends so strongly on the study covariates (model, race, sample, tests), it is hard to say anything for sure. Unfortunately, that means we didn't learn much from this major undertaking, but at least now we know that. :)