
[ODP] The international general socioeconomic factor: Factor analyzing international
I agree with you concerning the countries; sometimes naming them is awful. I always hated working with national data. I once (2 years ago) gathered a large amount of data, notably GDP, economic freedom indices, and savings, for all years and from diverse sources, but screwed up at the end with rows that don't match. I have given up on completing the data set since then. It's a lot of stress and I don't need that. However, can you send me (by email, preferably) the data sets before they are matched using the procedure you describe? I just want to see if the matching/merging is correct.

Moving one step up, one of us showed that among 71 Danish immigrant groups ranked on 4 different measures of socioeconomic variables (crime, use of social benefits, income, education attainment) there was a large (40% of variance explained) general socioeconomic factor[2].


I have a problem with that. When you say "one of us" you refer to [2], a study by Kirkegaard and Fuerst. That seems confusing to me.

The general mental ability factor at the individual level has been termed "g" (often italicized "g"[3]), while the national-level group equivalent has been termed "G" ("big g factor").[4] Keeping in line with this terminology, one might refer to the general socioeconomic factor at the individual level as "s factor" and the group level version "S factor" (or "big s").


In that case, shouldn't it be "national level" instead of "group level"?

He mentions that in 26% of a sample of studies using principal components in PsychINFO, the case-to-var ratio was between 2 and 5 as it is with our two datasets.


Do you know what the recommended ratio is? I think you should say it explicitly, as it would help some people lost here (like me).

Since you use KMO, why not add a little sentence about what it is? See Andy Field's book "Discovering statistics using SPSS: Introducing statistical method" (p. 647).

Another alternative is to use the Kaiser–Meyer–Olkin measure of sampling adequacy (KMO) (Kaiser, 1970). The KMO can be calculated for individual and multiple variables and represents the ratio of the squared correlation between variables to the squared partial correlation between variables. The KMO statistic varies between 0 and 1. A value of 0 indicates that the sum of partial correlations is large relative to the sum of correlations, indicating diffusion in the pattern of correlations (hence, factor analysis is likely to be inappropriate). A value close to 1 indicates that patterns of correlations are relatively compact and so factor analysis should yield distinct and reliable factors. Kaiser (1974) recommends accepting values greater than 0.5 as barely acceptable (values below this should lead you to either collect more data or rethink which variables to include). Furthermore, values between 0.5 and 0.7 are mediocre, values between 0.7 and 0.8 are good, values between 0.8 and 0.9 are great and values above 0.9 are superb (Hutcheson & Sofroniou, 1999).

Since I found that regardless of method and dataset used, the first factor was a general factor accounting for about 40-47% of the variance, it was interesting to know how many components one needed to measure it well. To find out, I sampled subsets of components at random from the datasets, extracted the first factor, and then correlated this with the first factor using all the components. I repeated the sampling 1000 times to reduce sampling error to almost zero.


I do not understand what's in bold.

Often authors will also argue for a causal connection from national IQ/G to country-level variables. The typical example of this is wealth (e.g. [17, 18, 19, 20, 21]). Since I know that g causes greater wealth at the individual level, and that nations can generally be considered a large group of individuals, it would be very surprising, though not impossible, if there was no causation at the group level as well.


The word "causation" is too strong. Given the annoted references, only 20 and 21 talk a little bit about causation, but that's not even clear there is strong evidence for this pattern of causation. Instead, there is evidence that the causation wealth->IQ is not well established. Perhaps my best conclusion is that it would be best to argue, at least for now, that we don't have strong evidence for either of these two pattern of causation (i.e., wealth cause IQ or the reverse). You can argue there is probably some very indirect suggestion that IQ causes wealth more than the reverse, as your articles on immigrations, notably with John, seems to show this, but as i said, it's very indirect proof.

If population differences in G is a main cause of national differences in many socioeconomic areas, then aggregating measures should increase the correlation with G, since measurement specificity averages out.


Even if G is not causal here, aggregation would also improve the correlation, no?

I think table 2 should better be made as the table 1.

Not related, but can you tell me how you managed to generate the nice graphs in figures 2-5? Concerning figures 2-5, again, have you contacted Major and his co-authors to ask for their opinion? I bet he (they) will be very interested.
Admin
Hi Meng Hu,

I agree with you concerning the countries; sometimes naming them is awful. I always hated working with national data. I once (2 years ago) gathered a large amount of data, notably GDP, economic freedom indices, and savings, for all years and from diverse sources, but screwed up at the end with rows that don't match. I have given up on completing the data set since then. It's a lot of stress and I don't need that. However, can you send me (by email, preferably) the data sets before they are matched using the procedure you describe? I just want to see if the matching/merging is correct.


It is easy to verify that it works. You can just create some datasets and merge them yourself. The code is open source, so it is available to anyone who can figure out how to install Python (use the Anaconda distribution).

You can also merge some actual datasets, e.g., different measures of inequality. Wikipedia has lots of these national rankings for all kinds of stuff.

https://en.wikipedia.org/wiki/List_of_international_rankings
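To make the suggestion concrete: the merging step can be tried on toy data in a few lines of pandas (which ships with Anaconda). Everything below (country names, values, the alias table) is made up for illustration; this is not the paper's actual merging code:

```python
import pandas as pd

# Two hypothetical national rankings with inconsistently named
# countries -- the usual matching problem with national data.
gini = pd.DataFrame({"country": ["Denmark", "Germany", "South Korea"],
                     "gini": [28.7, 31.9, 31.6]})
iq = pd.DataFrame({"country": ["Denmark", "Germany", "Korea, South"],
                   "iq": [98, 99, 106]})

# A small alias table maps variant spellings to one canonical name.
aliases = {"Korea, South": "South Korea"}
iq["country"] = iq["country"].replace(aliases)

# An outer merge keeps every country from both sources, so any
# remaining mismatches show up as missing values instead of
# silently dropping rows.
merged = gini.merge(iq, on="country", how="outer")
print(merged)
```

Rows that failed to match are then easy to spot with `merged[merged.isna().any(axis=1)]`.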

I have a problem with that. When you say "one of us" you refer to [2], a study by Kirkegaard and Fuerst. That seems confusing to me.


It was because when I wrote the paper, I had imagined that I would have a co-author. I ended up doing all the analyses myself, so there is no co-author. For that reason, some of the language was off, e.g. using "we", "our". You found another remnant of that. I have fixed it.

In that case, shouldn't it be "national level" instead of "group level"?


The group-level versions are capitalized. The national-level versions are a subset of the group-level versions. For instance, in my previous paper on immigrant groups in Denmark, the analysis was group-level but not national-level.

Do you know what the recommended ratio is? I think you should say it explicitly, as it would help some people lost here (like me).

Since you use KMO, why not add a little sentence about what it is? See Andy Field's book "Discovering statistics using SPSS: Introducing statistical method" (p. 647).

Another alternative is to use the Kaiser–Meyer–Olkin measure of sampling adequacy (KMO) (Kaiser, 1970). The KMO can be calculated for individual and multiple variables and represents the ratio of the squared correlation between variables to the squared partial correlation between variables. The KMO statistic varies between 0 and 1. A value of 0 indicates that the sum of partial correlations is large relative to the sum of correlations, indicating diffusion in the pattern of correlations (hence, factor analysis is likely to be inappropriate). A value close to 1 indicates that patterns of correlations are relatively compact and so factor analysis should yield distinct and reliable factors. Kaiser (1974) recommends accepting values greater than 0.5 as barely acceptable (values below this should lead you to either collect more data or rethink which variables to include). Furthermore, values between 0.5 and 0.7 are mediocre, values between 0.7 and 0.8 are good, values between 0.8 and 0.9 are great and values above 0.9 are superb (Hutcheson & Sofroniou, 1999).


The answer is: there are lots of different recommendations. Read the source I refer to; you will like it.

[11] Nathan Zhao. The minimum sample size in factor analysis, 2009. URL https://www.encorewiki.org/display/~nzhao/The+Minimum+Sample+Size+in+Factor+Analysis

I have added a short explanation of KMO and Bartlett's test.

My rewording:

I performed principal components analyses (PCA) on both the reduced and the means-imputed datasets to examine the effect of the procedure. The correlation of factor loadings was 0.996, indicating that the procedure did not alter the structure of the data much. I performed KMO tests (a measure of sampling adequacy) on both samples, which showed that reducing the sample reduced the KMO (0.899 to 0.809). In comparison, KMO in the DR dataset was 0.884. All values are considered ‘meritorious’.\cite[p. 225]{hutcheson1999multivariate}

Bartlett's test (which tests whether the data are suitable for factor analysis) was extremely significant in all three datasets (p$<$0.00001).
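For readers who want the mechanics rather than just the verdict, the KMO statistic can be computed directly from the correlation matrix: the sum of squared correlations over the sum of squared correlations plus squared partial correlations (off-diagonal elements only). A minimal numpy sketch on toy data with one common factor; this is not the code used in the paper:

```python
import numpy as np

def kmo(x):
    """Kaiser-Meyer-Olkin measure of sampling adequacy:
    sum of squared off-diagonal correlations divided by that sum
    plus the sum of squared off-diagonal partial correlations."""
    r = np.corrcoef(x, rowvar=False)
    inv = np.linalg.inv(r)
    # partial correlations from the inverse correlation matrix
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d
    off = ~np.eye(r.shape[0], dtype=bool)
    r2 = np.sum(r[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

# toy data: 6 indicators sharing one strong common factor
rng = np.random.default_rng(1)
f = rng.normal(size=500)
x = np.column_stack([f + 0.5 * rng.normal(size=500) for _ in range(6)])
print(round(kmo(x), 3))
```

On data with a strong common factor the partial correlations are small, so the value lands near 1; on independent noise variables it drifts toward 0.5 or below, matching Kaiser's cutoffs quoted above.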


I do not understand what's in bold.


It means that I did the random sampling 1000 times in my R code. In other words: 1000 times, I picked N random variables, performed a factor analysis on the data with those variables, and then correlated the scores on the first factor from this analysis with those found from the analysis of all variables.
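The procedure is easy to sketch. The version below uses synthetic data and plain PCA scores in place of the paper's R code and fa() scores, so it is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(42)

# toy stand-in for the dataset: 100 cases, 20 indicators
# sharing one general factor (not the paper's data)
n, k = 100, 20
g = rng.normal(size=n)
data = g[:, None] + rng.normal(size=(n, k))

def first_pc_scores(x):
    # scores on the first principal component of standardized data
    z = (x - x.mean(0)) / x.std(0)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[0]

full = first_pc_scores(data)

# 1000 times: draw 5 of the 20 indicators, extract the first
# component, correlate its scores with the full-data scores
rs = []
for _ in range(1000):
    cols = rng.choice(k, size=5, replace=False)
    sub = first_pc_scores(data[:, cols])
    r = np.corrcoef(full, sub)[0, 1]
    rs.append(abs(r))  # the sign of a component is arbitrary

print(round(float(np.mean(rs)), 2))
```

With 1000 repetitions the sampling error of the mean correlation is negligible, which is the point of the resampling.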

The word "causation" is too strong. Given the annotated references, only 20 and 21 talk a little about causation, and it's not even clear there is strong evidence for this pattern of causation. Instead, there is evidence that the causation wealth→IQ is not well established. Perhaps my best conclusion is that it would be best to argue, at least for now, that we don't have strong evidence for either of these two patterns of causation (i.e., wealth causes IQ or the reverse). You can argue there is probably some very indirect suggestion that IQ causes wealth more than the reverse, as your articles on immigration, notably with John, seem to show, but as I said, it's very indirect proof.


It is well established at the individual level that the causation runs g→wealth (income), not much the other way. At the national level, it is more contentious. I don't have to argue for that here, since I'm not making such a claim in the paper.

Even if G is not causal here, aggregation would also improve the correlation, no?


Yes. It is a statistical phenomenon (cf. the Spearman–Brown formula).
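The Spearman–Brown prophecy formula makes the point quantitatively: aggregating n parallel measures raises reliability, and hence correlations with the common factor, regardless of any causal story. A one-function illustration:

```python
def spearman_brown(r, n):
    """Predicted reliability after aggregating n parallel measures,
    each with reliability r (Spearman-Brown prophecy formula)."""
    return n * r / (1 + (n - 1) * r)

# Aggregation alone lifts a mediocre measure substantially:
print(round(spearman_brown(0.4, 1), 2))  # -> 0.4
print(round(spearman_brown(0.4, 5), 2))  # -> 0.77
```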

I think table 2 should better be made as the table 1.


What do you mean? There is a table called "Table 1". It is in section 5.

Not related, but can you tell me how you managed to generate the nice graphs in figures 2-5? Concerning figures 2-5, again, have you contacted Major and his co-authors to ask for their opinion? I bet he (they) will be very interested.


Yes. Look in the file Loadings analysis.ods. It is just made with LibreOffice.

I have not contacted Major et al. His email is J.T.Major@ed.ac.uk. Perhaps we should contact him. He could be an external reviewer (journal policy is that authors are allowed to recruit one external reviewer).

---

Here's a new version. I have made the small change noted above and fixed some other small issues. Due to a wrong setting in LibreOffice, the previous results figures were a little bit off. There were some slight errors with the names of some references (involving use of italics in their titles).
Admin
I contacted Major. No response so far.
1) I'd like to see more details of the factor/PC analyses. Were the sub-component intercorrelations explained by a single factor by the usual standards (e.g., only one factor with eigenvalue>1)? If not, how many other factors were there, can you give a substantive interpretation to them, and are they correlated with national IQ? KMO and Bartlett's test are pretty superfluous and aren't usually reported; I would mention them in a footnote only.

2) "individuals have a general socioeconomic factor"

This language is confusing. 'Factor' refers to a source of variance among individuals, so individuals cannot have factors. Individuals are located at different points on a factor, or have factor scores.

3) "national measures of country well-doing or well-being"

excise "well-doing"

4) "(see review national in [5]."

reword that

5) Figure 1 describes the structure of the SPI, why is there no corresponding figure describing the DP?

6) "principle components analyses", "principle axis factoring"

principal, not principle

7) "In some cases PCA can show a general factor where none exists (Jensen and Weng, 1994 [13]). For this reason, I compared the first factor extracted via PCA to the first factors using minimum residuals, weighted least squares, generalized least squares, principle axis factoring and maximum likelihood estimation"

I don't see how the similarity of factor loadings based on different extraction methods can tell us anything about the existence of a general factor. You don't say what a 'general factor' is, but I assume it means a factor with all-positive indicator loadings regardless of extraction method. There is no such factor in your data, as indicated by the many negative loadings listed in the Appendix. Even if you reverse coded the variables so that higher values on all variables would have positive valence (e.g., "Adequate nourishment" instead of "Undernourishment"), which I think would be a good thing to do, there'd still be negative loadings on the first factor/PC (e.g., suicide rate).

8) How did you compute the correlations between factor loadings? The congruence coefficient rather than Pearson's r should be used: http://en.wikipedia.org/wiki/Congruence_coefficient (Or did you use factor scores?)

9) Sections 4-6 have nice graphs, but I don't see their purpose. The fact that the correlation between two linear combinations of correlated elements gets higher the more there are shared elements is self-evident. The graphs might be of use if there was a practical need to estimate the S factor using only a limited number of components, but I don't see why anyone would want to do that.

I assume that the results in sections 4-6 are based on correlations of factor/component scores, but it's nowhere specified. Are the factors extracted with replacement?

Are the results from all the 54 components based on PCA? If so, the higher correlations with PCA components could be artefactual, due to common method variance.

10) MCV correlations of 0.99 are unusual in my experience. I'd like to see the scatter plots to ascertain that there's a linear association across the range. It's possible that the high correlations are due to outliers. What happens to the MCV correlations if you reverse score the variables with negative valence?

11) "The analyses carried out in this paper suggest that the S factor is not quite like g. Correlations between the first factor from different subsets did not reach unity, even when extracted from 10 non-overlapping randomly picked tests (mean r’s = .874 and .902)."

Your analysis is based on different sets of observed variables, while the g studies that found perfect or nearly perfect correlations were based on analyses of latent factors which contain no error or specific variance, or on analyses of the same data set with different methods. So the results aren't comparable.

12) The biggest problem in the paper is that it seems to be pretty pointless. Yes, most indicators of national well-being are highly correlated with each other and with national IQ, which means that the first PC from national well-being data must be highly correlated with national IQ, but so what? That was obvious from the outset.

What is the S factor? Do you believe it is a unitary causal factor influencing socioeconomic variables? That interpretation would give some meaning to the paper, but I think it's not a very promising idea. Socioeconomic status is normally thought of as a non-causal index reflecting the influence of various factors. Clark speaks of a "social competence" that is inherited across generations, but I don't think he views it as a unitary causal force but rather as a composite of different influences (such as IQ and personality).

I think national IQs and national socioeconomic indices are so thoroughly causally intermingled that attempting to say anything about causes and effects would require longitudinal data.

13) "It is worth noting that group-level correlations need not be the same or even in the same direction as individual-level correlations. In the case of suicide, there does appear to be a negative correlation at the individual level as well."

This means that the s and S factors aren't the same. Why is suicide an indicator of socioeconomic status anyway?

14) I couldn't find a correlation matrix of all the variables in the supplementary material. It would be useful (as an Excel file).

15) PCA and factor analysis are really different methods, and PCA shouldn't be called factor analysis.
Admin
Hi Dalliard. Thank you for a thorough review.

1) I'd like to see more details of the factor/PC analyses. Were the sub-component intercorrelations explained by a single factor by the usual standards (e.g., only one factor with eigenvalue>1)? If not, how many other factors were there, can you give a substantive interpretation to them, and are they correlated with national IQ? KMO and Bartlett's test are pretty superfluous and aren't usually reported; I would mention them in a footnote only.


You can view all the details in the R code file.

The number of factors to extract was in all cases set to 1 ("nfactors=1" in the code), so no criterion for determining the number of factors to keep was used.

I report KMO and Bartlett's because a reviewer (Piffer) had previously requested that I include them. Clearly I cannot satisfy both of you. It seems best just to include them. They don't take up much space.

2) "individuals have a general socioeconomic factor"

This language is confusing. 'Factor' refers to a source of variance among individuals, so individuals cannot have factors. Individuals are located at different points on a factor, or have factor scores.


What I meant is that if one analyses the data at the individual level, one will also find a general socioeconomic factor (i.e. s factor).

Changed text to:
Gregory Clark argued that there is a general socioeconomic factor which underlies their socioeconomic performance at the individual-level.\cite{clark2014}

3) "national measures of country well-doing or well-being"

excise "well-doing"


I prefer to keep both because these variables measure all kinds of things, some related to well-being (e.g. health, longevity) others to well-doing (income, number of universities).

4) "(see review national in [5]."

reword that


Changed text to:
Previous studies have correlated some of these with national IQs but not in a systematic manner (see review in \cite{lynn2012intelligence}).

5) Figure 1 describes the structure of the SPI, why is there no corresponding figure describing the DP?


It did not seem necessary. It is less complicated and the user can find it in the manual referenced. Do you want me to include an overview of it?

http://democracyranking.org/?page_id=590

As far as I can tell, they have 6 'dimensions', which have the weights 50, 10, 10, 10, 10, 10. I'm not sure what they do within each 'dimension'; probably they average the indexes. Like the DR and SPI, the HDI also has some idiosyncratic way of combining its variables into a higher construct. For reference, the HDI is based on a geometric mean (a what?) of three indicators.
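To answer the parenthetical: the geometric mean is the n-th root of the product of n values. Unlike the arithmetic mean, it punishes imbalance, so a collapse on any one HDI dimension drags the whole index down. A quick sketch with made-up sub-index values:

```python
def geometric_mean(values):
    # n-th root of the product of the n values
    prod = 1.0
    for v in values:
        prod *= v
    return prod ** (1.0 / len(values))

# Hypothetical HDI-style sub-indices (health, education, income):
print(round(geometric_mean([0.9, 0.8, 0.7]), 3))  # -> 0.796
# A collapse on one dimension dominates the result:
print(round(geometric_mean([0.9, 0.8, 0.1]), 3))
```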

6) "principle components analyses", "principle axis factoring"

principal, not principle


Fixed.

7) "In some cases PCA can show a general factor where none exists (Jensen and Weng, 1994 [13]). For this reason, I compared the first factor extracted via PCA to the first factors using minimum residuals, weighted least squares, generalized least squares, principle axis factoring and maximum likelihood estimation"

I don't see how the similarity of factor loadings based on different extraction methods can tell us anything about the existence of a general factor. You don't say what a 'general factor' is, but I assume it means a factor with all-positive indicator loadings regardless of extraction method. There is no such factor in your data, as indicated by the many negative loadings listed in the Appendix. Even if you reverse coded the variables so that higher values on all variables would have positive valence (e.g, "Adequate nourishment" instead of "Undernourishment"), which I think would be a good thing to do, there'd still be negative loadings on the first factor/PC (e.g., suicide rate).


A general factor need not be a perfectly general factor, just a very large one. As you mention, there are a few variables that load in the 'wrong' direction, which is not found in the analysis of cognitive data.

A general factor would be disproved if there were a lot of variables that didn't load on the first factor (i.e., with very low loadings, say <.10). This isn't the case with these national data; the mean absolute loading was high (.6-.65).

8) How did you compute the correlations between factor loadings? The congruence coefficient rather than Pearson's r should be used: http://en.wikipedia.org/wiki/Congruence_coefficient (Or did you use factor scores?)


The congruence coefficient has some bias (e.g.). I used Pearson's r.

I found an error. I had forgotten to add PCA to the comparison with the 5 other methods. It made little difference.

One cannot (easily) compare loadings when doing subset x whole/subset analysis; for those I used scores.

I compared scores in the full datasets before. They have a mean around .99 for both datasets using Pearson. I have now written more code to compare the loadings too, also with Pearson's. They are also .99 in both datasets.

Clearly, if the score correlations are very high, the loading correlations will also be, and vice versa. So in that sense doing both analyses is unnecessary.

I doubt using Spearman's or CC instead will change these results much. They are almost identical for every method in the full datasets.

However, I used the CC as requested. Results are rounded to three digits. I used this function.

Results:
> #for SPI
> factor.congruence(list(y_all.loadings),digits=3)
       PC1   MR1  WLS1  GLS1   PA1   ML1
PC1  1.000 0.997 1.000 1.000 1.000 0.997
MR1  0.997 1.000 0.997 0.997 0.997 1.000
WLS1 1.000 0.997 1.000 1.000 1.000 0.997
GLS1 1.000 0.997 1.000 1.000 1.000 0.997
PA1  1.000 0.997 1.000 1.000 1.000 0.997
ML1  0.997 1.000 0.997 0.997 0.997 1.000
> #for DR
> factor.congruence(list(z_all.loadings),digits=3)
       PC1   MR1  WLS1  GLS1   PA1   ML1
PC1  1.000 0.997 1.000 1.000 1.000 1.000
MR1  0.997 1.000 0.997 0.997 0.997 0.998
WLS1 1.000 0.997 1.000 1.000 1.000 1.000
GLS1 1.000 0.997 1.000 1.000 1.000 1.000
PA1  1.000 0.997 1.000 1.000 1.000 1.000
ML1  1.000 0.998 1.000 1.000 1.000 1.000
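For reference, Tucker's congruence coefficient used above is essentially a Pearson correlation computed about zero rather than about the variable means, which is why the two agree so closely when all loadings have similar sign and magnitude. A minimal sketch with hypothetical loading vectors:

```python
import numpy as np

def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors:
    the sum of cross-products divided by the product of the vector
    norms (i.e., an uncentered Pearson correlation)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Hypothetical loading vectors from two extraction methods:
a = [0.7, 0.6, 0.8, 0.5]
b = [0.6, 0.7, 0.7, 0.6]
print(round(congruence(a, b), 3))  # -> 0.988
```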


9) Sections 4-6 have nice graphs, but I don't see their purpose. The fact that the correlation between two linear combinations of correlated elements gets higher the more there are shared elements is self-evident. The graphs might be of use if there was a practical need to estimate the S factor using only a limited number of components, but I don't see why anyone would want to do that.

I assume that the results in sections 4-6 are based on correlations of factor/component scores, but it's nowhere specified. Are the factors extracted with replacement?

Are the results from all the 54 components based on PCA? If so, the higher correlations with PCA components could be artefactual, due to common method variance.


As you can see, there have been some methodological studies concerning the interpretation of loadings from small subsets of variables, as well as correlations between factors from different methods. These results are clearly relevant to those matters.

I changed the text in section 4 to:
Since I found that regardless of method and dataset used, the first factor was a general factor accounting for about 40-47\% of the variance, it was interesting to know how many components one needed to measure it well. To find out, I sampled subsets of components at random from the datasets, extracted the first factor, and then correlated the scores from it with the scores of the first factor using all the components. I repeated the sampling 1000 times to reduce sampling error to almost zero. Since there has recently been interest in comparing g factors from different factor extraction methods, I used the 6 different methods mentioned before.

I don't know what you mean by "with replacement". I used the regression method to get scores, via the fa() function from the psych package. http://www.inside-r.org/packages/cran/psych/docs/fa

The result from all the 54 variables is based on the same method used to extract from the subset. So, when using maximum likelihood (ML) to extract S from the subset of vars, the comparison S is also extracted via ML.

The subset x subset analyses are only based on PCA, however. I can replicate these with the other 5 methods if necessary. I think the differences will be slight.

10) MCV correlations of 0.99 are unusual in my experience. I'd like to see the scatter plots to ascertain that there's a linear association across the range. It's possible that the high correlations are due to outliers. What happens to the MCV correlations if you reverse score the variables with negative valence?


The first thing I did when working this out was to plot them. You can do this yourself with the plot() command (the code is around lines 293-327). They are indeed very linear. I have attached one plot for each with national IQs. They are almost the same as Altinok's.

I don't know what you mean with the other question.

11) "The analyses carried out in this paper suggest that the S factor is not quite like g. Correlations between the first factor from different subsets did not reach unity, even when extracted from 10 non-overlapping randomly picked tests (mean r’s = .874 and .902)."

Your analysis is based on different sets of observed variables, while the g studies that found perfect or nearly perfect correlations were based on analyses of latent factors which contain no error or specific variance, or on analyses of the same data set with different methods. So the results aren't comparable.


I am referring to the two Johnson studies, which found that g factors extracted from different IQ batteries without overlapping tests did reach near-unity (1 or .99). This wasn't the case for these data. One problem in interpretation is that IQ batteries are deliberately put together so as to sample a broad spectrum of ability variance. Randomly chosen subsets of such tests are not. The best way to proceed is to obtain the MISTRA data and repeat my analyses on them. So it comes down to whether they want to share the data or not.

12) The biggest problem in the paper is that it seems to be pretty pointless. Yes, most indicators of national well-being are highly correlated with each other and with national IQ, which means that the first PC from national well-being data must be highly correlated with national IQ, but so what? That was obvious from the outset.

What is the S factor? Do you believe it is a unitary causal factor influencing socioeconomic variables? That interpretation would give some meaning to the paper, but I think it's not a very promising idea. Socioeconomic status is normally thought of as a non-causal index reflecting the influence of various factors. Clark speaks of a "social competence" that is inherited across generations, but I don't think he views it as a unitary causal force but rather as a composite of different influences (such as IQ and personality).

I think national IQs and national socioeconomic indices are so thoroughly causally intermingled that attempting to say anything about causes and effects would require longitudinal data.


Strangely, this was also the criticism that Gould offered of the g factor (cf. Davis' review). I disagree: the results are not obvious. Neither g, G, s nor S is obvious. Especially not the MCV results.

I have no particular opinions to offer on the causal interpretations. I don't want my paper to get stuck in review due to speculative discussion of causation. I think the descriptive data are very interesting by themselves.

13) "It is worth noting that group-level correlations need not be the same or even in the same direction as individual-level correlations. In the case of suicide, there does appear to be a negative correlation at the individual level as well."

This means that the s and S factors aren't the same. Why is suicide an indicator of socioeconomic status anyway?


Not socioeconomic status; the general socioeconomic factor. I'm not sure what the specific complaint is.

14) I couldn't find a correlation matrix of all the variables in the supplementary material. It would be useful (as an Excel file).


They would be very large (54x54 and 42x42). Anyone curious can easily obtain them in R by typing:

write.csv(cor(y), file="y_matrix.csv")
write.csv(cor(z), file="z_matrix.csv")


For your ease, I have attached both files.

15) PCA and factor analysis are really different methods, and PCA shouldn't be called factor analysis.


Semantics. Some authors use "factor analysis" as a general term, as I do in this paper. Others prefer "dimensional reduction method" or "latent trait analysis" or some other term, and then limit "factor analysis" to non-PCA methods.

--

Attached is also a new PDF with the above fixes and a new code file with the changes I made.
Admin
It wasn't possible to attach more than 5 files to a post.
1) I'd like to see more details of the factor/PC analyses. Were the sub-component intercorrelations explained by a single factor by the usual standards (e.g., only one factor with eigenvalue>1)? If not, how many other factors were there, can you give a substantive interpretation to them, and are they correlated with national IQ? KMO and Bartlett's test are pretty superfluous and aren't usually reported; I would mention them in a footnote only.


You can view all the details in the R code file.

The number of factors to extract was in all cases set to 1 ("nfactors=1" in the code), so no criterion for determining the number of factors to keep was used.


So the criterion used was to extract just one factor. You should state that explicitly in the paper and justify the decision. At the limit, given that your variance explained is <50%, it is possible (though extremely unlikely) that there's a second factor that explains almost as much. In that case, denoting one of them as a general factor would be arbitrary. At the very least, you should tell how many factors with eigenvalues>1 there are.
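The eigenvalue count the reviewer asks for is one line once the correlation matrix is in hand. A toy sketch (synthetic data with one built-in common factor, not the paper's dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# 300 cases, 8 indicators sharing one common factor plus noise
g = rng.normal(size=(300, 1))
x = g + 0.8 * rng.normal(size=(300, 8))

# Kaiser's criterion: count eigenvalues of the correlation matrix > 1
eig = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))
print(int(np.sum(eig > 1)))  # one dominant factor here
```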

I report KMO and Bartlett's because a reviewer (Piffer) had previously requested that I include them. Clearly I cannot satisfy both of you. It seems best just to include them. They don't take up much space.


OK, but information on the decisions made on the number of factors to be extracted is much more important.

2) "individuals have a general socioeconomic factor"

This language is confusing. 'Factor' refers to a source of variance among individuals, so individuals cannot have factors. Individuals are located at different points on a factor, or have factor scores.


What I meant is that if one analyses the data at the individual level, one will also find a general socioeconomic factor (i.e. s factor).

Changed text to:
Gregory Clark argued that there is a general socioeconomic factor which underlies their socioeconomic performance at the individual-level.\cite{clark2014}


Still badly worded. Who are the 'they' referenced? Clark does not write about a "general socioeconomic factor." He writes about "social competence." Perhaps the Social Competence Factor would be a better label for your factor, given that some of its indicators are not usually thought of as indicating socioeconomic status.

3) "national measures of country well-doing or well-being"

excise "well-doing"


I prefer to keep both because these variables measure all kinds of things, some related to well-being (e.g. health, longevity) others to well-doing (income, number of universities).


'Well-doing' is archaic-sounding and does not mean what you think it does:

well-doing (uncountable)
1.The practice of doing good; virtuousness, good conduct.


'Well-being' implies material prosperity, too, and is quite sufficient for your purposes:

well-being (uncountable)
1.a state of health, happiness and/or prosperity


5) Figure 1 describes the structure of the SPI, why is there no corresponding figure describing the DP?


It did not seem necessary. It is less complicated and the user can find it in the manual referenced. Do you want me to include an overview of it?

http://democracyranking.org/?page_id=590

As far as I can tell, they have 6 'dimensions', which have the weights 50, 10, 10, 10, 10, 10. I'm not sure what they do within each 'dimension'; probably they average the indexes. As with the DR and the SPI, the HDI also has some idiosyncratic way of combining its variables into a higher-order construct. For reference, the HDI is based on a geometric mean (a what?) of three indicators.
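For readers unfamiliar with the term: the geometric mean is the nth root of the product of n values. A minimal sketch with made-up subindex values (not actual HDI inputs):

```r
# Geometric mean: the nth root of the product of n values. A low value
# on any one subindex drags the composite down more than it would with
# an arithmetic mean. Inputs here are illustrative only.
geo_mean <- function(x) prod(x)^(1 / length(x))

geo_mean(c(0.9, 0.8, 0.7))   # HDI-style composite of three 0-1 subindices
mean(c(0.9, 0.8, 0.7))       # arithmetic mean for comparison: 0.8
```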


If you are going to include an itemized description of SPI, you should tell more about DP, too.

7) "In some cases PCA can show a general factor where none exists (Jensen and Weng, 1994 [13]). For this reason, I compared the first factor extracted via PCA to the first factors using minimum residuals, weighted least squares, generalized least squares, principal axis factoring and maximum likelihood estimation"

I don't see how the similarity of factor loadings based on different extraction methods can tell us anything about the existence of a general factor. You don't say what a 'general factor' is, but I assume it means a factor with all-positive indicator loadings regardless of extraction method. There is no such factor in your data, as indicated by the many negative loadings listed in the Appendix. Even if you reverse coded the variables so that higher values on all variables would have positive valence (e.g., "Adequate nourishment" instead of "Undernourishment"), which I think would be a good thing to do, there'd still be negative loadings on the first factor/PC (e.g., suicide rate).


A general factor need not be a perfectly general factor, just a very large one. As you mention, there are a few variables that load in the 'wrong' direction, which is not found in analyses of cognitive data.

A general factor would be disproved if there were many variables that did not load on the first factor (i.e., with very low loadings, say <.10). This isn't the case with these national data: the mean absolute loading was high (.60-.65).


Most factor analyses will produce first factors that are substantially larger than the subsequent ones. What is the standard for "a very large" factor? The g factor is a general factor because all cognitive abilities do load positively on it.

More than a third of the loadings on the SPI factor are negative, so it's not a general factor. Similarly, if you had a cognitive test battery where some subtests loaded positively on the first unrotated factor and other subtests loaded negatively on it, there'd be no general factor, no matter how much variance the first factor explained.

Of course, in the case of the SPI factor the negative loadings are mostly an artefact of your failing to reverse code the negatively valenced variables. It's possible that this decision has some effect on all loadings.

8) How did you compute the correlations between factor loadings? The congruence coefficient rather than Pearson's r should be used: http://en.wikipedia.org/wiki/Congruence_coefficient (Or did you use factor scores?)


The congruence coefficient has some bias (e.g.). I used Pearson's r.


I cannot access that paper, but whatever bias the CC has is minuscule compared to the bias that Pearson's r can produce when it is used to compare factor loadings. Look at the example on p. 100 in The g Factor by Jensen. The CC is the standard method for comparing factor loadings in EFA, and you should use it.
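For concreteness, the congruence coefficient is an uncentered correlation: it is computed about zero rather than about the vector means, which is why it and Pearson's r can diverge on loading vectors. A sketch with made-up loadings:

```r
# Tucker's congruence coefficient between two loading vectors:
# an uncentered correlation, computed about zero, not about the means.
congruence <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

a <- c(0.8, 0.7, 0.6, 0.5)   # made-up loadings, battery 1
b <- c(0.9, 0.6, 0.7, 0.4)   # made-up loadings, battery 2
congruence(a, b)             # high: the profiles are nearly proportional
cor(a, b)                    # Pearson's r centers first, so it is lower
```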

One cannot compare loadings (easily) when using subset x whole/subset analysis, for those I used scores.


State explicitly in the paper what you are doing. Computing correlations between factors can be done in many different ways (e.g., factor scores, congruence coefficient, CFA latent factor correlations).

9) Sections 4-6 have nice graphs, but I don't see their purpose. The fact that the correlation between two linear combinations of correlated elements gets higher the more there are shared elements is self-evident. The graphs might be of use if there was a practical need to estimate the S factor using only a limited number of components, but I don't see why anyone would want to do that.

I assume that the results in sections 4-6 are based on correlations of factor/component scores, but it's nowhere specified. Are the factors extracted with replacement?

Are the results from all the 54 components based on PCA? If so, the higher correlations with PCA components could be artefactual, due to common method variance.


As you can see, there have been some methodological studies concerning the interpretation of loadings from small subsets of variables, as well as correlations between factors from different methods. These results are clearly relevant to these matters.


If you interpret a factor in a realist manner, as g is usually interpreted, then such studies are relevant, but your S factor seems completely artefactual.

I don't know what you mean by "replacement". I used the regression method to get scores. I used the fa() function from the psych package. http://www.inside-r.org/packages/cran/psych/docs/fa


Ignore the replacement issue; with 1000 iterations you of course have to reuse the variables.

The result from all the 54 variables is based on the same method used to extract from the subset. So, when using maximum likelihood (ML) to extract S from the subset of vars, the comparison S is also extracted via ML.

The subset x subset analyses are only based on PCA, however. I can replicate these with the other 5 methods if necessary. I think the differences will be slight.


I think that's OK as it is, but, again, you should make it explicit in the paper.

10) MCV correlations of 0.99 are unusual in my experience. I'd like to see the scatter plots to ascertain that there's a linear association across the range. It's possible that the high correlations are due to outliers. What happens to the MCV correlations if you reverse score the variables with negative valence?


The first thing I did when working this out was to plot them. You can do this yourself with the plot() command (the code is around lines 293-327). They are indeed very linear. I have attached one plot for each with national IQs. They are almost the same with Altinok's.

I don't know what you mean with the other question.


OK, it's very linear. I'd include at least one of the scatter plots in the paper.

The other question refers to the fact that the variables with negative loadings may have an outsized influence on the MCV correlations because they somewhat artificially increase the range of values analyzed.

11) "The analyses carried out in this paper suggest that the S factor is not quite like g. Correlations between the first factor from different subsets did not reach unity, even when extracted from 10 non-overlapping randomly picked tests (mean r’s = .874 and .902)."

Your analysis is based on different sets of observed variables, while the g studies that found perfect or nearly perfect correlations were based on analyses of latent factors which contain no error or specific variance, or on analyses of the same data set with different methods. So the results aren't comparable.


I am referring to the two Johnson studies which found that g factors extracted from different IQ batteries without overlapping tests did reach near-unity (1 or .99). This wasn't the case for these data. One problem in interpretation is that IQ batteries are deliberately put together so as to sample a broad spectrum of ability variance. Randomly chosen subsets of such tests are not.


The Johnson studies used CFA and latent factors to test if g factors from different test batteries were equivalent. You did not use CFA and latent factors, so you're comparing apples and oranges.

The best way to proceed is to obtain the MISTRA data and repeat my analyses on them. So it comes down to whether they want to share the data or not.


The MISTRA IQ correlation matrices have been published: http://www.newtreedesign2.com/isironline.org/wp-content/uploads/2014/04/MISTRAData.pdf

12) The biggest problem in the paper is that it seems to be pretty pointless. Yes, most indicators of national well-being are highly correlated with each other and with national IQ, which means that the first PC from national well-being data must be highly correlated with national IQ, but so what? That was obvious from the outset.

What is the S factor? Do you believe it is a unitary causal factor influencing socioeconomic variables? That interpretation would give some meaning to the paper, but I think it's not a very promising idea. Socioeconomic status is normally thought of as a non-causal index reflecting the influence of various factors. Clark speaks of a "social competence" that is inherited across generations, but I don't think he views it as a unitary causal force but rather as a composite of different influences (such as IQ and personality).

I think national IQs and national socioeconomic indices are so thoroughly causally intermingled that attempting to say anything about causes and effects would require longitudinal data.


Strangely, this was the criticism that Gould also offered of the g factor (cf. Davis' review). I disagree; the results are NOT obvious. Neither g, G, s nor S is obvious. Especially not the MCV results.


I agree that the MCV results are perhaps of some interest.

It is well known that nations that are richer also have better health care, less malnutrition, higher life expectancy, better educational systems, better infrastructure, etc., so the S factor is unsurprising. The g factor is surprising because many assume that various domains of intelligence are strongly differentiated. Moreover, the realist interpretation of g does not rely just on the positive manifold, but on many completely independent lines of evidence (e.g., from multivariate behavioral genetics). So my criticisms are quite unlike Gould's.

What is this general socioeconomic factor? Why is it worth studying? Is it a formative or reflective factor?

14) I couldn't find a correlation matrix of all the variables in the supplementary material. It would be useful (as an Excel file).


They would be very large (54x54 and 42x42). Anyone curious can easily obtain them in R by typing:

write.csv(cor(y),file="y_matrix.csv")
write.csv(cor(z),file="z_matrix.csv")


For your ease, I have attached both files.


There's all kinds of stuff in your supplemental materials, but if someone wants to replicate or extend your analysis, correlation matrices are what they need. I also think it's unfortunate that you have not reversed the scoring of the negative variables.

15) PCA and factor analysis are really different methods, and PCA shouldn't be called factor analysis.


Semantics. Some authors use "factor analysis" as a general term as I do in this paper. Others prefer "dimensional reduction method" or "latent trait analysis" or some other term and then limit "factor analysis" to non-PCA methods.


Methodologists are adamant about the fact that PCA and FA are quite different animals. But I'm not going to insist on this.
Yes. It is a statistical phenomenon (cf. the Spearman-Brown formula).
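The Spearman-Brown prophecy formula mentioned above can be written as a one-liner; the inputs below are illustrative:

```r
# Spearman-Brown prophecy: the reliability of a composite of k parallel
# parts, each with reliability r, rises purely as a statistical matter.
spearman_brown <- function(r, k) k * r / (1 + (k - 1) * r)

spearman_brown(0.5, 1)   # 0.5: a single part
spearman_brown(0.5, 4)   # 0.8: aggregating four parts
```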


I understand, but my problem is with your wording, because it sounds as if, were G not causal, aggregation would not increase the correlation with G. That entire sentence should be reworded.

It is well-established at the personal level that the causation is g→wealth (income), not much the other way. At the national level, it is more contentious. I don't have to argue for that here, since I'm not making such a claim in the paper.


OK with the last sentence, but I wanted to say it because even though at the individual level the wealth variable may have a moderate causal effect, things can be different at the national level, where there is more environmental variation, and so you can (and should) expect a larger effect of environmental factors.

Concerning congruence coefficient, after reading this article...

Davenport, E. C. (1990). Significance testing of congruence coefficients: A good idea?. Educational and psychological measurement, 50(2), 289-296.

... I am left with the impression that it's a very bad method. You should be careful with it. (The version of the paper I have doesn't allow copy-paste, but check pages 293-295.) The congruence coefficient seems to consistently give very high values, even in situations where they should theoretically be small, or not high at all.

Regarding the question of negative loadings, I don't understand your discussion here, either of you. My opinion is that when you have small loadings (such as 0.20 or less) on the first unrotated factor, regardless of direction, you should remove the variable because it's a poor measure of that factor.

The same can be said about rotated factor analysis. For instance, suppose you have 3 interpretable factors and a 4th one that is not interpretable at all, and one of your variables has a meaningful loading only on the 4th but not on the others. In that case, remove it and re-do the factor analysis, and so on, until you get something neat.

Concerning whether a variable with no loading on the first unrotated factor is a problem and should be removed (i.e., keeping only variables with large or at least modest loadings), I would appreciate it if someone here could find me some articles on that subject, because I don't remember whether I have any.

---
---

EMIL OW K :

Each time I post here, I get the following:

Please correct the following errors before continuing:
The subject is too long. Please enter a subject shorter than 85 characters (currently 88).


Can you try to fix that?

edit2: I forgot to thank you for the files.
Admin
Meng Hu. It means the title of your post is too long. I think the limit is hardcoded. It is easily solved by shortening the title of the post. It's useful to give them meaningful titles. My post above (#25) is called "Reply to Dalliard" since it's a reply to his criticism.
Admin
Dalliard,

I will reply to most of your criticism later. I am currently visiting my girlfriend in Leipzig and I'm working from my laptop which isn't well-suited for statistical analyses.

However, one point. Yes, it is possible that the 2nd factor is almost the same size as the first. I had actually checked this because I initially did some analyses in SPSS before moving to R (it's my first time using R for a project). Here's what one can do in R:

y_ml = fa(y,nfactors=1,rotate="none",scores="regression",fm="ml") #FA with 1 factor, for comparison
y_ml.2 = fa(y,nfactors=2,rotate="none",scores="regression",fm="ml") #FA with 2 factors
y_ml.2 #display results
plot(y_ml.2$loadings[1:54],y_ml$loadings) #plot the two first factors' loadings
cor(y_ml.2$loadings[1:54],y_ml$loadings) #correlation between the loading vectors

y_ml.3 = fa(y,nfactors=3,rotate="none",scores="regression",fm="ml") #same as above just for 3 factors
y_ml.3
plot(y_ml.3$loadings[1:54],y_ml$loadings)
cor(y_ml.3$loadings[1:54],y_ml$loadings)


One will get the 2-factor and 3-factor solutions using maximum likelihood. Apparently the first factor is not completely identical across the number of factors extracted, but almost so: ML1 (with nfactors=1) correlated .999 with ML1 from both nfactors=2 and nfactors=3.

With nfactors=2, the 2nd factor was much smaller: Var% for ML1 is about 41%, for ML2 about 11%.

With nfactors=3: ML1=41%, ML2=10%, ML3=5%.
Concerning congruence coefficient, after reading this article...

Davenport, E. C. (1990). Significance testing of congruence coefficients: A good idea?. Educational and psychological measurement, 50(2), 289-296.

... I am left with the impression that it's a very bad method. You should be careful with it. (The version of the paper I have doesn't allow copy-paste, but check pages 293-295.) The congruence coefficient seems to consistently give very high values, even in situations where they should theoretically be small, or not high at all.


Can you upload that paper or send it to me? There are other sources that are more sanguine about the CC, e.g., https://media.psy.utexas.edu/sandbox/groups/goslinglab/wiki/05604/attachments/549fc/Factor%20Congruence.pdf In any case, the problems with using Pearson's r in the analysis of factor loadings are even greater.

Regarding the question of negative loadings, I don't understand your discussion here, either of you. My opinion is that when you have small loadings (such as 0.20 or less) on the first unrotated factor, regardless of direction, you should remove the variable because it's a poor measure of that factor.


I don't think the size of the loadings is that important here, only the sign. In scale development, it makes sense to remove indicator variables with small loadings (e.g., <0.3) because the purpose is to come up with a reliable measurement instrument. However, the purpose of Emil's paper is not to develop a scale but to investigate the correlation structure of international socioeconomic differences.

However, one point. Yes, it is possible that the 2nd factor is almost the same size as the first. I had actually checked this because I initially did some analyses in SPSS before moving to R (it's my first time using R for a project). Here's what one can do in R:

y_ml = fa(y,nfactors=1,rotate="none",scores="regression",fm="ml") #FA with 1 factor, for comparison
y_ml.2 = fa(y,nfactors=2,rotate="none",scores="regression",fm="ml") #FA with 2 factors
y_ml.2 #display results
plot(y_ml.2$loadings[1:54],y_ml$loadings) #plot the two first factors' loadings
cor(y_ml.2$loadings[1:54],y_ml$loadings) #correlation between the loading vectors

y_ml.3 = fa(y,nfactors=3,rotate="none",scores="regression",fm="ml") #same as above just for 3 factors
y_ml.3
plot(y_ml.3$loadings[1:54],y_ml$loadings)
cor(y_ml.3$loadings[1:54],y_ml$loadings)


One will get the 2-factor and 3-factor solutions using maximum likelihood. Apparently the first factor is not completely identical across the number of factors extracted, but almost so: ML1 (with nfactors=1) correlated .999 with ML1 from both nfactors=2 and nfactors=3.

With nfactors=2, the 2nd factor was much smaller: Var% for ML1 is about 41%, for ML2 about 11%.

With nfactors=3: ML1=41%, ML2=10%, ML3=5%.


Discuss that in the paper, and explain why you disregard the other factors. As to the number of factors, use Kaiser's rule (eigenvalue>1) or, if you can, parallel analysis. Are the other factors interpretable based on which variables load strongly on them?
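For what it's worth, parallel analysis can be sketched in a few lines of base R: observed eigenvalues are compared with the mean eigenvalues of random data of the same dimensions. `mtcars` is a hypothetical placeholder for the actual dataset:

```r
# Parallel analysis sketch: retain factors whose observed eigenvalues
# exceed the mean eigenvalues of random normal data of the same size.
# 'mtcars' is placeholder data, not the paper's dataset.
set.seed(42)
dat  <- mtcars
obs  <- eigen(cor(dat))$values
rand <- replicate(200, eigen(cor(matrix(rnorm(nrow(dat) * ncol(dat)),
                                        nrow = nrow(dat))))$values)
sum(obs > rowMeans(rand))   # suggested number of factors to retain
```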
Admin
I will update the paper with many changes based on your last substantive review. No worries. I will include a section on the number of factors to use, and their interpretability.

I also ran an oblimin rotation. With nfactors=5, all extracted factors correlate with each other in the right direction (note that ML2 is reversed). This indicates a general, higher-order factor, yes?

Loadings:
ML1 ML3 ML4 ML2 ML5
Undernourishmentofpop 1.01
Depthoffooddeficitcaloriesundernourishedperson 1.01
Maternalmortalityratedeaths100000livebirths -0.47 0.24 -0.40
Stillbirthratedeaths1000livebirths -0.57 -0.14 0.15 -0.20
Childmortalityratedeaths1000livebirths -0.53 0.15 -0.43
Deathsfrominfectiousdiseasesdeaths100000 -0.38 0.16 0.22 -0.56
Accesstopipedwaterofpop 0.43 0.11 -0.26 0.36
Ruralvs.urbanaccesstoimprovedwatersourceabsolutediffer -0.33 -0.20 0.21 -0.20
Accesstoimprovedsanitationfacilitiesofpop 0.62 -0.11 0.10 -0.18 0.25
Availabilityofaffordablehousingsatisfied 0.17 0.14 0.18
Accesstoelectricityofpop 0.52 -0.13 -0.33 0.35
Qualityofelectricitysupply1low7high 0.36 0.45 0.18
Indoorairpollutionattributabledeathsdeaths100000 -0.17 -0.10 0.37 -0.30
Homiciderate12100000520100000 -0.11 0.26 -0.59 0.21 -0.21
Levelofviolentcrime1low5high -0.18 -0.72
Perceivedcriminality1low5high -0.16 -0.14 -0.56
Politicalterror1low5high -0.17 -0.57 -0.32 0.13 0.11
Trafficdeathsdeaths100000 -0.23 -0.19 -0.41 -0.19
Adultliteracyrateofpop.aged15 0.97
Primaryschoolenrollmentofchildren 0.52 0.11 0.21 0.35
Lowersecondaryschoolenrollmentofchildren 0.57 -0.17 0.21
Uppersecondaryschoolenrollmentofchildren 0.59 0.20 -0.23
Genderparityinsecondaryenrollmentgirlsboys 0.60 0.17 -0.24
Mobiletelephonesubscriptionssubscriptions100people 0.45 0.11 -0.31
Internetusersofpop 0.38 0.17 0.38 -0.24 0.15
PressFreedomIndex0mostfree100leastfree -0.81 -0.11 0.12
Lifeexpectancyyears 0.28 0.14 -0.19 0.60
Noncommunicablediseasedeathsbetweentheagesof30and70p 0.12 -0.30 -0.19 0.11 -0.69
Obesityrateofpop 0.46 -0.17 -0.36
Outdoorairpollutionattributabledeathsdeaths100000 0.41 -0.34 -0.24 -0.23
Suicideratedeaths100000 0.50 0.11 0.21 -0.20
GreenhousegasemissionsCO2equivalentsperGDP 0.21 -0.23
Waterwithdrawalsasapercentofresources 0.25 -0.38 -0.13 0.14
Biodiversityandhabitat0noprotection100highprotection 0.31 0.14
Politicalrights1fullrights7norights -0.70 0.14 -0.18
Freedomofspeech0low2high 0.58 0.11
Freedomofassemblyassociation0low2high 0.13 0.74
Freedomofmovement0low4high 0.13 0.69 -0.14
Privatepropertyrights0none100full 0.40 0.57 0.14
Freedomoverlifechoicessatisfied -0.13 0.41 0.27 0.19
Freedomofreligion1low4high 0.89 -0.19
Modernslaveryhumantraffickingandchildmarriage1low100 -0.41 -0.27
Satisfieddemandforcontraceptionofwomen 0.64 0.30
Corruption0high100low 0.41 0.63 0.11
Womentreatedwithrespect0low100high -0.14 -0.27 0.78 -0.16
Toleranceforimmigrants0low100high -0.33 0.53 0.23
Toleranceforhomosexuals0low100high 0.42 0.23 0.39
Discriminationandviolenceagainstminorities0low10high -0.19 -0.61 -0.27 0.11
Religioustolerance1low4high 0.15 0.55 -0.16 0.12 -0.10
Communitysafetynet0low100high 0.44 0.30 -0.13
Yearsoftertiaryschooling 0.27 0.31 -0.16 0.12
Womensaverageyearsinschool 0.87 0.10 -0.13
Inequalityintheattainmentofeducation0low1high -0.87 -0.13 -0.19 0.14
Numberofgloballyrankeduniversities0none550 0.11 0.50 0.25

ML1 ML3 ML4 ML2 ML5
SS loadings 8.23 6.25 4.21 3.47 3.14
Proportion Var 0.15 0.12 0.08 0.06 0.06
Cumulative Var 0.15 0.27 0.35 0.41 0.47

Factor intercorrelations:

[,1] [,2] [,3] [,4] [,5]
[1,] 1.00 0.19 0.33 -0.55 0.53
[2,] 0.19 1.00 0.27 -0.14 0.19
[3,] 0.33 0.27 1.00 -0.35 0.27
[4,] -0.55 -0.14 -0.35 1.00 -0.49
[5,] 0.53 0.19 0.27 -0.49 1.00


To reproduce use:

y.oblimin.ml = fa(y,nfactors=5,rotate="oblimin",scores="regression",fm="ml")
print(y.oblimin.ml$loadings,digits=2)
print(y.oblimin.ml$Phi,digits=2)
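One way to probe the 'higher-order factor' question directly is to factor-analyze the 5x5 factor intercorrelation matrix itself. A sketch in base R using the Phi matrix printed above (rounded to two digits, so results are approximate):

```r
# Sketch: fit one factor to the reported factor intercorrelation matrix
# (Phi from the oblimin run above, rounded to 2 digits). All five
# first-order factors loading on it, with ML2 reversed, would support a
# higher-order general factor.
phi <- matrix(c( 1.00,  0.19,  0.33, -0.55,  0.53,
                 0.19,  1.00,  0.27, -0.14,  0.19,
                 0.33,  0.27,  1.00, -0.35,  0.27,
                -0.55, -0.14, -0.35,  1.00, -0.49,
                 0.53,  0.19,  0.27, -0.49,  1.00), 5, 5,
              dimnames = list(paste0("ML", c(1, 3, 4, 2, 5)),
                              paste0("ML", c(1, 3, 4, 2, 5))))
hof <- factanal(covmat = phi, factors = 1)   # higher-order factor
print(hof$loadings, cutoff = 0)
```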
Concerning congruence coefficient, after reading this article...

Davenport, E. C. (1990). Significance testing of congruence coefficients: A good idea?. Educational and psychological measurement, 50(2), 289-296.

... I am left with the impression that it's a very bad method. You should be careful with it. (The version of the paper I have doesn't allow copy-paste, but check pages 293-295.) The congruence coefficient seems to consistently give very high values, even in situations where they should theoretically be small, or not high at all.


Can you upload that paper or send it to me? There are other sources that are more sanguine about the CC, e.g., https://media.psy.utexas.edu/sandbox/groups/goslinglab/wiki/05604/attachments/549fc/Factor%20Congruence.pdf In any case, the problems with using Pearson's r in the analysis of factor loadings are even greater.


I can't access your link; it says:

gateway incorrect
error 502


Anyway, I attach the documents you asked for. I heard of the Davenport study in this paper:

http://www.iapsych.com/iqmr/fe/LinkedDocuments/wicherts2004.pdf

Table 13 provides the fit indices of the various factor models. The baseline model (Model 1: configural invariance) fits sufficiently, as judged by the CFI, although RMSEA is somewhat on the high side. Moreover, it is apparent that the metric invariance model (Model 2) fits worse than the configural invariance model does. All fit measures, except the CAIC, show deteriorating fit. Therefore, factor loadings cannot be considered cohort invariant (i.e., Λ1≠Λ2). Note that this is in stark contrast with the high congruence coefficient of the first principal component found by Must et al. (2003). This is due to the different natures of principal component analysis (PCA) and confirmatory factor analysis. PCA is an exploratory analysis that does not involve explicit hypothesis testing, as is the case with MGCFA. In addition, the congruence coefficient has been criticized for sometimes giving unjustifiably high values (Davenport, 1990).


Thus, if you wish, you can recommend MGCFA testing of measurement invariance at the factor loading level. It's the best alternative to the CC that I am aware of.
Admin
I am in Leipzig for the time being visiting my girlfriend who lives there. I have only brought my laptop with me. I have access to my files and have R installed; however, I don't have LaTeX installed, so I cannot update the PDF file while I am here. I will get home on the 8th.

In the meanwhile, we can discuss problems with the paper. All quotes are from Dalliard.

So the criterion used was to extract just one factor. You should state that explicitly in the paper and justify the decision. At the limit, given that your variance explained is <50%, it is possible (though extremely unlikely) that there's a second factor that explains almost as much. In that case, denoting one of them as a general factor would be arbitrary. At the very least, you should tell how many factors with eigenvalues>1 there are.


I answered this partly before. I was only looking for a general socioeconomic factor and did not have any specific model in mind. No hypotheses were made about any non-general factors, nor was any specific model of the data advanced beforehand. This is why I did not use CFA either.

Still badly worded. Who are the 'they' referenced? Clark does not write about a "general socioeconomic factor." He writes about "social competence." Perhaps the Social Competence Factor would be a better label for your factor, given that some of its indicators are not usually thought of as indicating socioeconomic status.


I will remove the "they".

'Well-doing' is archaic-sounding and does not mean what you think it does:

well-doing (uncountable)
1.The practice of doing good; virtuousness, good conduct.

'Well-being' implies material prosperity, too, and is quite sufficient for your purposes:

well-being (uncountable)
1.a state of health, happiness and/or prosperity


Fine. I will use that then and drop the other.

If you are going to include an itemized description of SPI, you should tell more about DP, too.


If you think it is necessary. Note that whatever structure of the index the authors decided on, I am not using it; I am only using their indicators. I presented the SPI structure just to show that there is some elaborate structure decided upon by the authors, which may not be supported by the actual intercorrelations in the data.

Most factor analyses will produce first factors that are substantially larger than the subsequent ones. What is the standard for "a very large" factor? The g factor is a general factor because all cognitive abilities do load positively on it.

More than a third of the loadings on the SPI factor are negative, so it's not a general factor. Similarly, if you had a cognitive test battery where some subtests loaded positively on the first unrotated factor and other subtests loaded negatively on it, there'd be no general factor, no matter how much variance the first factor explained.

Of course, in the case of the SPI factor the negative loadings are mostly an artefact of your failing to reverse code the negatively valenced variables. It's possible that this decision has some effect on all loadings.


Say, >30% of variance I'd consider a large factor. I would not use the word "fail", as that implies some attempt was made (which didn't succeed). No such attempt was made, nor is it necessary IMO. The S factor is general in both datasets because: 1) it is very large (40-47% of variance), and 2) it is general in that almost all socially valued indicators load so that the desirability pole is toward the same end. I discuss the exceptions to this in a section of the paper too. Think about how the results could have looked. Imagine a two-factor result instead, with about half of the indicators loading on one factor but not the other, each factor perhaps accounting for 20% of the variance. Clearly, that would be a most interesting result and a clear disproof of any general country well-being. However, this is not what was found.

I cannot access that paper, but whatever bias the CC has is minuscule compared to the bias that Pearson's r can produce when it is used to compare factor loadings. Look at the example on p. 100 in The g Factor by Jensen. The CC is the standard method for comparing factor loadings in EFA, and you should use it.


I will report both in the paper. And in any case, the results were virtually identical.

State explicitly in the paper what you are doing. Computing correlations between factors can be done in many different ways (e.g., factor scores, congruence coefficient, CFA latent factor correlations).


I will update it to be more clear.

If you interpret a factor in a realist manner, as g is usually interpreted, then such studies are relevant, but your S factor seems completely artefactual.


'Completely artefactual'?

Ignore the replacement issue; with 1000 iterations you of course have to reuse the variables.


If you look in the code, you can see that it does exactly this for the subset x subset analyses:
1. Pick 10 random numbers between 1 and the number of variables without replacement (no overlap).
2. Divide that (unordered) list into two.
3. Get the first factor from each set of variables.
4. Correlate the scores from each first factor.
5. Save this result.
6. Repeat steps 1-5 1000 times.
7. Average the results.
8. Output the results.
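The steps above can be sketched in base R. For brevity this sketch uses the first principal component rather than fa(), and `mtcars` as a hypothetical stand-in for the actual data:

```r
# A base-R sketch of steps 1-8 above, using the first principal
# component in place of fa(). 'mtcars' is placeholder data only.
set.seed(1)
dat <- scale(mtcars)
rs <- replicate(1000, {
  vars <- sample(ncol(dat), 10)     # 1: ten variables, no overlap
  a <- vars[1:5]; b <- vars[6:10]   # 2: split the list into two subsets
  sa <- prcomp(dat[, a])$x[, 1]     # 3: first component, subset A
  sb <- prcomp(dat[, b])$x[, 1]     #    first component, subset B
  cor(sa, sb)                       # 4-5: correlate the scores and save
})                                  # 6: repeated 1000 times
mean(abs(rs))                       # 7-8: average and output (abs, since
                                    # component sign is arbitrary)
```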

In retrospect, I wrote the code in a dumb way that made it both slower (because it used loops) and didn't save all the information generated, only the final results.

OK, it's very linear. I'd include at least one of the scatter plots in the paper.

The other question refers to the fact that the variables with negative loadings may have an outsized influence on the MCV correlations because they somewhat artificially increase the range of values analyzed.


One could view the negative codings (not my doing) as inflating the variance, which inflates the correlation. But one might as well argue that reverse coding them on purpose to produce only positive loadings artificially decreases the variance and thus lowers the correlations.

I ran the correlations again but with absolute values. The r drops a bit, to around .95, which may still be somewhat inflated: any variable with a positive correlation with IQ and a negative loading (or vice versa) would get 'fixed' to both positive, which would create a spuriously high correlation. I have attached the plot of the MCV on the SPI with national IQ and absolute values; r=.95.
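For readers following along, the MCV computation with and without absolute values can be sketched in base R. The data and criterion below are hypothetical placeholders (`mpg` stands in for national IQ):

```r
# MCV sketch: correlate each indicator's first-PC loading with its
# correlation with a criterion variable. Placeholder data throughout;
# 'mpg' stands in for the criterion (national IQ in the paper).
dat  <- scale(mtcars[, -1])            # indicators (placeholder)
crit <- mtcars$mpg                     # criterion (placeholder)
load <- prcomp(dat)$rotation[, 1]      # first-PC loadings
rvec <- apply(dat, 2, cor, y = crit)   # r of each indicator with criterion
cor(load, rvec)                        # Jensen's MCV coefficient
cor(abs(load), abs(rvec))              # absolute-value variant
```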

The MISTRA IQ correlation matrices have been published: http://www.newtreedesign2.com/isironline...RAData.pdf


I am aware. But it is not enough for my analyses. I need the scores too. I can repeat the loadings analyses but not the score x score ones.

I agree that the MCV results are perhaps of some interest.

It is well known that nations that are richer also have better health care, less malnutrition, higher life expectancy, better educational systems, better infrastructure, etc., so the S factor is unsurprising. The g factor is surprising because many assume that various domains of intelligence are strongly differentiated. Moreover, the realist interpretation of g does not rely just on the positive manifold, but on many completely independent lines of evidence (e.g., from multivariate behavioral genetics). So my criticisms are quite unlike Gould's.

What is this general socioeconomic factor? Why is it worth studying? Is it a formative or reflective factor?


Unsurprising, maybe, but no one has shown it to be there before. Remember that just because GDP ("richer") correlates with variables X1, ..., Xn, it does not follow that these also intercorrelate strongly enough to create a large general factor.

The S factor provides a framework for thinking about how national g proxies relate to other variables of interest. Right now, whenever such a correlation is found, people usually propose some specific theory of why that particular variable correlates with national g proxies (e.g. institutions, wealth, freedom of the press, atheism).

I don't know what you mean regarding formative vs. reflective factor. Perhaps you can link to some material that covers these concepts or explain them briefly.

If it concerns causality, I think S is primarily caused by G, but that there is some backwards causation in poorer countries due to nutrition (vitamin deficiency, protein deficiency), health care (certain illnesses may lower g) and perhaps pollution (e.g. heavy metal poisoning). Political structure also has an influence on S; totalitarian regimes of the communist variety in particular seem to make things worse (China, North Korea, Cuba, Venezuela). However, as I said earlier, I don't want to push this or that interpretation in the paper as this would 1) make it much longer, and 2) make it take forever to get through peer review. I think such discussion is better left for another paper (or a book).

There are all kinds of stuff in your supplemental materials, but if someone wants to replicate or extend your analysis, correlation matrices are what they need. Although I think it's unfortunate that you have not reversed the scoring of the negative variables.


Well, since I published all the data files, they can easily generate the correlation matrices if they want those. They are not optimal because one cannot do score x score analyses with them. I don't see any reason to specifically attach correlation matrices when all the data files are there already.

Methodologists are adamant about the fact that PCA and FA are quite different animals. But I'm not going to insist on this.


I can insert a note saying that they are treated as the same even though some people think they should not be. Sounds good?
Admin
I managed to install LaTeX and get it working.

Changes:
- Got rid of “well-doing”, replaced either with “well-being” or nothing when appropriate.
- Added to introduction:
Since I was only concerned with the question of a general factor, all analyses used only the first factor.
- Removed “they” from the sentence in Introduction.
- Added more clarity as to when correlations were between factor scores and loadings, and added results from congruence factor analyses.
- Added a plot with MCV results.
- Added a note about PCA as a factor analytic method.
Admin
I found a mistake in my paper. When I calculated the mean correlation, I used mean() on the correlation matrix. However, the diagonal entries are always 1, so this inflates the result upwards. I think the differences are slight, but I will fix this in my next revision.
1) "I answered this partly before. I was only looking for a general socioeconomic factor and did not have any specific model in mind. No hypotheses were made about any non-general factors, nor was any specific model of the data advanced beforehand. This is why I did not use CFA either."

You should mention the size of the subsequent factors, i.e., variance explained.

2) 'General factor' is still not defined in the paper. The fact that lots of the variables have negative loadings on the first factor and the implications of this fact should be discussed straight away in section 3. How many loadings have the correct sign? It would make sense to merge sections 3 and 8, because you should discuss what the loadings look like before rather than after running all those analyses using the loadings.

3) "I also ran an oblimin rotation. With nfactors=5, all factors extracted correlate with each other in the right direction (note ML2 is reversed). This indicates a general, higher-order factor, yes?"

Yes, this is evidence in favor of a general factor unlike the analyses in the paper. However, the number of factors extracted should be decided based on accepted methods (Kaiser's rule, scree plot, parallel analysis...) rather than forced on the data.
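Of the retention rules mentioned, Kaiser's eigenvalue-greater-than-one rule is the simplest to illustrate. A Python sketch with a toy one-factor correlation matrix (the function name and values are hypothetical):

```python
import numpy as np

def kaiser_count(R):
    """Kaiser's rule: number of eigenvalues of the correlation matrix
    greater than 1 (each retained factor must explain more variance
    than a single standardized variable does)."""
    eig = np.linalg.eigvalsh(np.asarray(R, dtype=float))
    return int((eig > 1.0).sum())

# Toy 4-variable matrix with a uniform correlation of .5:
# its eigenvalues are 2.5, 0.5, 0.5, 0.5, so one factor is retained
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
print(kaiser_count(R))  # 1
```

Parallel analysis and scree inspection would typically be preferred in practice, but they rest on the same eigenvalue decomposition shown here.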

4) "Similarly, the congruence factor was 1.0."

It's the congruence coefficient.
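For readers unfamiliar with the term: Tucker's congruence coefficient between two loading vectors a and b is sum(a_i * b_i) / sqrt(sum(a_i^2) * sum(b_i^2)). Unlike Pearson's r, it does not center the vectors, so it reflects the loadings' absolute level as well as their pattern. A Python sketch (names and values are illustrative):

```python
import numpy as np

def congruence(a, b):
    """Tucker's congruence coefficient between two loading vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

# Proportional loading vectors have congruence exactly 1.0,
# even though their absolute sizes differ
x = np.array([0.8, 0.6, 0.4])
print(congruence(x, 2 * x))  # 1.0
```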

5) In sections 4-6 the method of calculating the factor/component scores should be mentioned, and it should be clarified that when different extraction methods are compared across different numbers of variables, PCA is always compared to PCA, ML to ML, etc.

6) "These very high correlations resulting from the use of method of correlated vectors in these two datasets give indirect support for researchers who have been arguing that the heterogeneous and lower than unity results are due to statistical artifacts, especially sampling error and restriction of range of subtests"

This sentence is unclear because it is not specified what the researchers have argued. Mention g loadings, IQ, or something.

7) "There is a question concerning whether it is proper to do the MCV analyses without first reversing the variables that have negative loadings on the S factor. Using the non-reversed variables means the variance is higher, which increases the correlation; reversing them would decrease the correlations. I decided to use the data as they were given by the authors, i.e. with no reversing of variables. As one can see from the plots, reversing them would not substantially change the results."

Say what the MCV results are with reversed variables.

8) "The analyses carried out in this paper suggest that the S factor is not quite like g. Correlations between the first factor from different subsets did not reach unity, even when extracted from 10 non-overlapping randomly picked tests (mean r’s = .874 and .902)."

You compared factor scores, Johnson et al. compared latent factors in CFA. Different methods, so no reason to expect similar results. It is unsurprising that factor scores based on 10 variables (many with measurement scale issues to boot) are not perfectly correlated with the underlying factor because such scores, especially PCA ones, have plenty of specific and error variance in them. In contrast, CFA factors based on, say, 10 or more variables are highly stable and contain no specific/error variance. When you increase the number of variables to 40 or 50 in EFA, factor scores will be much more highly correlated with the putative underlying factor than with 10 variables. The almost-unity correlation between the SPI and DP factors is an indicator of the S factor's stability while the lower correlations using 10 variables are what you would expect regardless of the nature of the underlying factor--for example, the correlation between g scores from two IQ batteries with 10 subtests each will not be 1.00.

9) "I don't know what you mean regarding formative vs. reflective factor. Perhaps you can link to some material that covers these concepts or explain them briefly."

In reflective factor models the factor causes differences in its indicators, while in formative models the factor is non-causal and the indicators cause variance in it. The g factor is typically seen as a reflective factor (general intellectual capacity causes performance differences in cognitive tests), while SES is typically seen as a formative factor (differences in income, occupation, etc. cause differences in SES).

Is your S factor just a stalking horse for the G factor? Then you could argue that different indicators of international well-being are just "items" in an international IQ test, cf. Gordon's discussion of life events as IQ items in his "Everyday life as an intelligence test." However, I think that this analogy is awkward when the cases are countries rather than individuals, and the psychometric quality of the international IQ data is as bad as it is.

---

Your other answers/alterations are OK.
[hr]
There are other sources that are more sanguine about the CC, e.g., https://media.psy.utexas.edu/sandbox/groups/goslinglab/wiki/05604/attachments/549fc/Factor%20Congruence.pdf In any case, the problems with using Pearson's r in the analysis of factor loadings are even greater.


I can't access your link; it says:

gateway incorrect
error 502


The CC paper is attached.
Admin
Good criticism. I will work on a new draft tomorrow.

ETA. I didn't have time to do the reverse coding today. So I will reply later.
Admin
I am working on the reversing, but it is unclear how this should be done. For instance, in the DR dataset, there is a variable about electricity use per capita. Is that considered good or bad? On the one hand, it is a sign of a rich country that we are able to use so many resources per person. In the environmentalist view, it is bad since we are... using so many resources per person. There is no obvious way to deal with this.

Reversing the obvious cases in the SPI dataset (there are no obvious cases in DR) results in an MCV r = .98. Very similar.

I have reversed the following vars: 1:6,8,13:18,26,28:33,35,42,48,53
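For standardized variables, reversing amounts to flipping the sign of the chosen columns, which flips the sign but not the magnitude of their correlations with everything else. A Python sketch with simulated data (the actual analysis was presumably in R with the 1-based indices listed above; here the indices are 0-based and the data are random):

```python
import numpy as np

def reverse_columns(X, cols):
    """Return a copy of the z-scored data matrix X with the given
    (0-based) columns sign-flipped, i.e. reverse-coded."""
    X = np.asarray(X, dtype=float).copy()
    X[:, cols] *= -1.0
    return X

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))   # 200 cases, 3 variables
Y = reverse_columns(X, [1])         # reverse-code the middle variable

r_x = np.corrcoef(X, rowvar=False)
r_y = np.corrcoef(Y, rowvar=False)
# Correlation magnitudes are unchanged; only signs involving column 1 flip
print(np.allclose(np.abs(r_x), np.abs(r_y)))  # True
print(np.isclose(r_y[0, 1], -r_x[0, 1]))      # True
```

This is also why reversing changes the MCV correlations only through the range of the loading vector, not through any individual variable's relationship with the criterion.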
Admin
This one has a new section of oblimin rotated factor analysis. Results are much the same as using the first unrotated factor: r's = .97.

I have added more discussion of reversing variables, and results of MCV using those. Almost the same results too.

Rewrote some of the discussion concerning comparisons with the Johnson et al. studies.