Diversity in STEM: Merit or Discrimination via Inaccurate Stereotype?

Submission status
Published

Submission Editor
Noah Carl

Authors
Joseph Bronski
Emil O. W. Kirkegaard

Title
Diversity in STEM: Merit or Discrimination via Inaccurate Stereotype?

Abstract

Leslie et al. (2015) advocated a model in which a stereotype that a given field requires brilliance to succeed scares women away from the field, resulting in a self-fulfilling prophecy similar to stereotype threat. Leslie et al., however, ignored decades of findings from stereotype accuracy research, in which stereotypes are generally found to track real existing differences accurately. As such, a simpler explanation for the data is that the brilliance stereotype results from real existing differences in academic ability between fields of study, which is also the variable that explains the different distribution of demographic groups in these fields due to differences in academic abilities. Chiefly, men's superior mathematical ability explains why they are overrepresented in fields that require strong mathematical talent to succeed (e.g. physics). We present an analysis suggesting that the proportion of a field that is female is better predicted by that field's average math GRE score (r = −0.79) than by Leslie et al.'s Brilliance stereotype (r = −0.65), and that the proportion of a field that is Black is predicted about equally well by that field's average GRE score (r = −0.49) and by Leslie et al.'s Brilliance stereotype (r = −0.53). We furthermore show that a field's Brilliance stereotype is closely associated with its average GRE score (r = 0.58). Additionally, we show that a field's scientificness stereotype score is predicted by its GRE math tilt (r = 0.36), while a field's conservativeness stereotype score is associated with the actual percentage of registered Republicans in that field (r = 0.55). We conclude that Leslie et al.'s uncritical reliance on inaccurate stereotypes to explain disparities in racial and gender diversity across academic fields is deeply flawed. Finally, their results failed to replicate among the doctorate-holding public: GRE math was a better predictor of the percentage of a field that is female than the Brilliance stereotype among doctorate holders (r = −0.79 vs. r = −0.39).

Keywords
intelligence, sex differences, stereotype, female, brilliance, male


Reviewers ( 0 / 0 / 2 )
Adam Perkins: Accept
George Francis: Accept

Tue 25 Jul 2023 13:39

Reviewer

This manuscript describes a study investigating the cause of group disparities in academic fields. For example, groups such as women and blacks are underrepresented in STEM domains of academia such as physics, chemistry, biology, mathematics, and engineering. Moreover, the disparities tend to be larger in more egalitarian nations. Previous researchers have tried to explain away such disparities as the result of stereotypes about the abilities (e.g., brilliance) required for STEM academia scaring off under-represented groups. In the present research the authors test an alternative explanation, namely that “the brilliance stereotype results from real existing differences in academic ability between fields of study, which is also the variable that explains the different distribution of demographic groups in these fields due to differences in academic abilities.”

Results support the latter explanation. For example, the authors show that a field's Brilliance stereotype is closely associated with its average GRE score (r = 0.58) and that a field's scientificness stereotype score is predicted by its GRE math tilt (r = 0.36).

The manuscript is a nice piece of work that is generally well written, and the study appears to have been conducted competently, meeting all the usual methodological and statistical requirements for high-quality research. My only substantive comment is that the authors could perhaps blow their own trumpet a bit more. They could make more of their findings than they presently do, as I think they are too shy with regard to the importance of group differences in general IQ scores (aka the g-factor). For example, in their introduction they say that “Sex differences in systemization capacity and interest appear in infancy and persist to adulthood [8], indicating that women should be less likely to earn high math GRE scores and STEM graduate degrees in the absence of any discrimination.”

I am sure such sex differences in systematization and science interest are real and do have some influence on the success of a STEM career, but I would suggest they are dwarfed by a simpler and more important causal factor, namely that men are more intelligent than women on average. The difference is small at the mean (about 4 IQ points), but it translates into women being significantly rarer in the extreme right-hand tail of the IQ bell curve, which is where outstanding scientists are usually located. In my opinion, g-loaded IQ scores are far more important in determining STEM talent than differences in relatively minor personality aspects such as science interest or systematising. The authors do cite Nyborg (2015) as proof that men are on average more intelligent than women, but I think it would improve the paper if they expanded upon this aspect of the topic. This seems particularly likely to me as there is evidence that personality's effect on academic performance gradually shrinks as an individual's education progresses from elementary school onwards. Hence, at the most rarefied levels of STEM academia, brilliance would seem likely to me to be almost entirely caused by possessing ridiculously high IQ scores, with personality differences washing out. https://onlinelibrary.wiley.com/doi/10.1111/jopy.12663
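Purely to illustrate this tail effect, here is a minimal sketch under stated assumptions: a normal model with equal SDs of 15, a 4-point mean gap centred on 100, and an arbitrary cutoff of 145; none of these figures come from the paper.

from scipy.stats import norm

# Illustration only: a small mean gap amplifies in the right tail of a normal model.
# Assumes equal SDs of 15 and a 4-point gap centred on 100; the cutoff is arbitrary.
threshold = 145  # roughly +3 SD above the overall mean
p_men = norm.sf(threshold, loc=102, scale=15)
p_women = norm.sf(threshold, loc=98, scale=15)
print(p_men / p_women)  # about 2.4 under these assumptions

Real score distributions (and any variance differences) would change the exact ratio, but the direction of the effect is the same.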

The same argument applies to the lack of black people in STEM academia, but to an even greater extent, as on average black people score 15 points lower on IQ tests than white people. If the g-factor of IQ is the most important determinant of STEM brilliance, we should see even fewer black people emerging as brilliant STEM academics in the Nobel Laureate pantheon etc. than women (at least before the era of diversity hiring).

Therefore, in summary I think this manuscript is already a good contribution to the scientific literature and could be made even better by some light touch revision to spell out even more clearly the hugely important role of general IQ differences in underpinning observed group differences in STEM brilliance, especially at the extreme right-hand end of the IQ bell curve where geniuses like Bill Hamilton and Paul Dirac can be found.

Reviewer | Admin

 

The paper documents stereotypes about academic fields and the fields' actual characteristics with regard to their intellectual and political qualities. It was greatly enjoyable to look through the graphs and see what people believe about the fields and what the truth is. The disparity between how liberal Theology actually is and what people expect of the field is very interesting.

 

The authors frame their paper as a rebuttal of Leslie et al. (2015), who claim that the perception that a field requires brilliance turns away women and certain minorities. The authors show that a simpler model fits the data much better: women and minorities are underrepresented in fields in proportion to the mathematical ability those fields require, which is consistent with our knowledge of these groups' mathematical abilities. The authors nonetheless note that they cannot falsify Leslie et al.'s hypothesis, since the correlation between the mathematical ability of fields and their composition is consistent with sorting by ability as well as with stereotype threat and discrimination.

 

The paper makes an important contribution and is a good rebuttal to Leslie et al. I welcome its publication after some minor changes to improve the explanation of a few issues. Below I list my recommendations to the authors.



 

“across the academic spectrum, women are underrepresented in fields whose practitioners believe raw, innate that talent”

 

This is a misquotation of Leslie et al. (2015); there is no “that” in the original quote.



 

“Data was arranged by field and correlations between the metrics were computed. We expect to see that 1) GRE scores correlate more with percent female than brilliance; 2) percent female correlates negatively with percent Republican; and 3) the best predictor of the conservativeness stereotype is the proportion of Republicans in a field.” 

 

The Methods section mentions expected correlations between the other variables and the Republican proportion in each field. This is not mentioned in the hypotheses, the introduction, or the conclusion, leaving the reader unsure of the reasoning behind this expectation and how it relates to the rest of the paper. I think presenting the data and its correlations is interesting but not directly relevant, and it may be more suitable for an appendix. Alternatively, it should be linked to the issues the authors are tackling: stereotypes, stereotype threat, and discrimination. What does it tell us if conservatives are in the smarter or more mathematical fields? That conservatives are not underrepresented just because of their intelligence? Perhaps that stereotypes of brilliance cannot explain their underrepresentation? That conservatives have a math tilt?

 

The authors do not seem to explain in the Methods section how they measure the conservativeness stereotype or the scientificness stereotype.




 

The introduction states regarding Leslie et al. “At no point did they consider the association between each field’s mean GRE score and each field’s proportion of women – instead, they chose to only consider the mean GREs of applicants to each subject, instead of the mean GRE scores of successful PhD earners by subject.” 

 

When I read this, I presumed the authors were then using the GREs of PhD earners. But this wasn't explicitly confirmed, and my understanding is that ETS only has GRE scores by test-takers' intended subject, not average GRE scores for people who completed a degree in the subject. And even then, the GRE applicants in the data include those who only go on to get master's degrees, right? The authors should make explicit whether the data come from applicants or PhD holders.



 

Model fit should probably use a capital R^2 rather than the lowercase r^2; more generally, the coefficient of determination should be written as R^2.



 

“The squared semi-partial correlations, sr^2, for verbal GRE, Brilliance, and math GRE can be computed from the tables above: sr^2_Brilliance = 0.75 − 0.63 = 0.12, sr^2_GRE-M = 0.75 − 0.46 = 0.29, and sr^2_GRE-V = 0.75 − 0.75 = 0.00, from Table 1. Additionally, we can throw out the verbal GRE because its sr^2 = 0. Doing so allows us to compute new sr^2 values for the remaining two factors: sr^2_Brilliance = 0.75 − 0.63 = 0.12 (it stays the same), and sr^2_GRE-M = 0.75 − 0.42 = 0.33.”

 

I can’t see the 0.42 figure in the table. 

 

I don't understand the formula the author uses. Standardized betas are semi-partial correlations. To get the squared semi-partial correlation, you would just square the regression beta; you would not do any subtraction. I think writing it as \beta^2 would also make more sense to readers who aren't familiar with sr^2. The author is instead subtracting the model fits of different models from each other. This would be the incremental variance explained if one model were nested in another. Instead, the different models use different control variables, so the calculation doesn't make much sense to me.
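For concreteness, here is a minimal sketch of the nested-model logic at issue, using simulated data and illustrative variable names (not the paper's dataset or code): the squared semi-partial correlation of a predictor equals the drop in R^2 when that predictor alone is removed from the otherwise identical model.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
brilliance = rng.normal(size=n)
gre_math = 0.5 * brilliance + rng.normal(size=n)  # correlated predictors
gre_verbal = rng.normal(size=n)
pct_female = -0.4 * gre_math - 0.2 * brilliance + rng.normal(size=n)

def r2(y, *predictors):
    # R^2 of an OLS model with an intercept and the given predictors
    X = sm.add_constant(np.column_stack(predictors))
    return sm.OLS(y, X).fit().rsquared

full = r2(pct_female, brilliance, gre_math, gre_verbal)
# sr^2 of each predictor: full-model R^2 minus the R^2 of the model that drops
# only that predictor, so the reduced model is nested within the full model.
sr2_brilliance = full - r2(pct_female, gre_math, gre_verbal)
sr2_gre_math = full - r2(pct_female, brilliance, gre_verbal)
sr2_gre_verbal = full - r2(pct_female, brilliance, gre_math)
print(sr2_brilliance, sr2_gre_math, sr2_gre_verbal)

If the reduced models instead differ in which control variables they include, the subtraction no longer has this interpretation.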

 

The authors should make clear what they are doing and probably change their analysis here.

 

The constant in model three of the first regression table looks off. Were all the variables standardized beforehand? Make it explicit whether or not this was done. If they are all standardized, then the intercept should be negligible.
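A minimal sketch of the check being requested, using simulated data rather than the paper's (variable names are illustrative): when both the outcome and the predictors are z-scored before fitting, the OLS intercept is zero up to floating-point error, so a clearly nonzero constant suggests the variables were not standardized.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 2))
y = 3.0 + X @ np.array([1.5, -0.7]) + rng.normal(size=200)

def zscore(a):
    # standardize columns to mean 0 and SD 1
    return (a - a.mean(axis=0)) / a.std(axis=0)

fit = sm.OLS(zscore(y), sm.add_constant(zscore(X))).fit()
print(fit.params[0])  # intercept is ~0 (on the order of 1e-16)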

 

Author | Admin

Reviewer 1: I have added a paragraph explaining that IQ differences between the sexes likely account for most of the achievement gap, with personality explaining the remainder.

Reviewer 2:

Here is a google doc link with my notes. I have fixed everything you commented on. https://docs.google.com/document/d/1Nt6bLdfg9XuEEcROB0NUz7wgh5iUNgTfydY4bQFGODA/edit?usp=sharing

 

Bot

Authors have updated the submission to version #2

Reviewer | Admin

Replying to Joseph Bronski

Reviewer 1: I have added a paragraph explaining that IQ differences between the sexes likely account for most of the achievement gap, with personality explaining the remainder.

Reviewer 2:

Here is a google doc link with my notes. I have fixed everything you commented on. https://docs.google.com/document/d/1Nt6bLdfg9XuEEcROB0NUz7wgh5iUNgTfydY4bQFGODA/edit?usp=sharing

I'm happy for the paper to be published with these edits. I would encourage the authors to post the comments of the Google Doc here for posterity.

Author | Admin

 

Here is the doc paste:

  1. “across the academic spectrum, women are underrepresented in fields whose practitioners believe raw, innate that talent” This is a misquotation of Leslie et al. (2015); there is no “that” in the original quote.

Fixed

 

  2. “Data was arranged by field and correlations between the metrics were computed. We expect to see that 1) GRE scores correlate more with percent female than brilliance; 2) percent female correlates negatively with percent Republican; and 3) the best predictor of the conservativeness stereotype is the proportion of Republicans in a field.” The Methods section mentions expected correlations between the other variables and the Republican proportion in each field. This is not mentioned in the hypotheses, the introduction, or the conclusion, leaving the reader unsure of the reasoning behind this expectation and how it relates to the rest of the paper. I think presenting the data and its correlations is interesting but not directly relevant, and it may be more suitable for an appendix. Alternatively, it should be linked to the issues the authors are tackling: stereotypes, stereotype threat, and discrimination. What does it tell us if conservatives are in the smarter or more mathematical fields? That conservatives are not underrepresented just because of their intelligence? Perhaps that stereotypes of brilliance cannot explain their underrepresentation? That conservatives have a math tilt?

 

Clarified this at the beginning of the Methods; the theme is that stereotypes are accurate in general.

Basically that stereotypes exist because they reflect reality.

I would argue against moving the corr plot to the appendix. It reports key statistics about stereotype accuracy.

 

  3. The authors do not seem to explain in the Methods section how they measure the conservativeness stereotype or the scientificness stereotype.

Added a sentence on this; it was just Likert scales.

 

  4. The introduction states regarding Leslie et al.: “At no point did they consider the association between each field’s mean GRE score and each field’s proportion of women – instead, they chose to only consider the mean GREs of applicants to each subject, instead of the mean GRE scores of successful PhD earners by subject.” When I read this, I presumed the authors were then using the GREs of PhD earners. But this wasn't explicitly confirmed, and my understanding is that ETS only has GRE scores by test-takers' intended subject, not average GRE scores for people who completed a degree in the subject. And even then, the GRE applicants in the data include those who only go on to get master's degrees, right? The authors should make explicit whether the data come from applicants or PhD holders.

I have updated this. They used the same data as we did but ignored GRE math, while also only testing a small number of fields. I have made this clear in the text.



 

  5. Model fit should probably use a capital R^2 rather than the lowercase r^2; more generally, the coefficient of determination should be written as R^2.

Fixed

 

  6. “The squared semi-partial correlations, sr^2, for verbal GRE, Brilliance, and math GRE can be computed from the tables above: sr^2_Brilliance = 0.75 − 0.63 = 0.12, sr^2_GRE-M = 0.75 − 0.46 = 0.29, and sr^2_GRE-V = 0.75 − 0.75 = 0.00, from Table 1. Additionally, we can throw out the verbal GRE because its sr^2 = 0. Doing so allows us to compute new sr^2 values for the remaining two factors: sr^2_Brilliance = 0.75 − 0.63 = 0.12 (it stays the same), and sr^2_GRE-M = 0.75 − 0.42 = 0.33.” I can't see the 0.42 figure in the table. I don't understand the formula the author uses. Standardized betas are semi-partial correlations. To get the squared semi-partial correlation, you would just square the regression beta; you would not do any subtraction. I think writing it as \beta^2 would also make more sense to readers who aren't familiar with sr^2. The author is instead subtracting the model fits of different models from each other. This would be the incremental variance explained if one model were nested in another. Instead, the different models use different control variables, so the calculation doesn't make much sense to me. The authors should make clear what they are doing and probably change their analysis here.

 

I just removed the sr^2 discussion, since anyone who knows what semi-partial correlations are can compute them, and I keep getting complaints about it.

 

  7. The constant in model three of the first regression table looks off. Were all the variables standardized beforehand? Make it explicit whether or not this was done. If they are all standardized, then the intercept should be negligible.

I checked the code and the data were indeed standardized; I'm guessing the intercept on model 3 was just like that by chance. There is about a 10% chance of that happening to one of the models in the table according to the p-value, so it's not that rare.

 

 

Bot

The submission was accepted for publication.

Bot

Authors have updated the submission to version #4