Hi Dalliard. Thank you for a thorough review.
1) I'd like to see more details of the factor/PC analyses. Were the sub-component intercorrelations explained by a single factor by the usual standards (e.g., only one factor with eigenvalue>1)? If not, how many other factors were there, can you give a substantive interpretation to them, and are they correlated with national IQ? KMO and Bartlett's test are pretty superfluous and aren't usually reported; I would mention them in a footnote only.
You can view all the details in the R code file.
The number of factors to extract was in all cases set to 1 ("nfactors=1" in the code), so no criterion for determining the number of factors to keep was used.
I report KMO and Bartlett's because a reviewer (Piffer) had previously requested that I include them. Clearly I cannot satisfy both of you. It seems best just to include them. They don't take up much space.
2) "individuals have a general socioeconomic factor"
This language is confusing. 'Factor' refers to a source of variance among individuals, so individuals cannot have factors. Individuals are located at different points on a factor, or have factor scores.
What I meant is that if one analyses the data at the individual level, one will also find a general socioeconomic factor (i.e. s factor).
Changed text to:
Gregory Clark argued that there is a general socioeconomic factor which underlies their socioeconomic performance at the individual-level.\cite{clark2014}
3) "national measures of country well-doing or well-being"
excise "well-doing"
I prefer to keep both because these variables measure all kinds of things, some related to well-being (e.g. health, longevity), others to well-doing (e.g. income, number of universities).
4) "(see review national in [5]."
reword that
Changed text to:
Previous studies have correlated some of these with national IQs but not in a systematic manner (see review in \cite{lynn2012intelligence}).
5) Figure 1 describes the structure of the SPI, why is there no corresponding figure describing the DP?
It did not seem necessary. It is less complicated and the reader can find it in the manual referenced. Do you want me to include an overview of it?
http://democracyranking.org/?page_id=590
As far as I can tell, they have 6 'dimensions', which have the weights 50, 10, 10, 10, 10, 10. I'm not sure what they do within each 'dimension'; probably they average the indexes. As with the DR and SPI, the HDI also has an idiosyncratic way of combining its variables into a higher construct. For reference, the HDI is based on a geometric mean (a what?) of three indicators.
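For readers unfamiliar with the geometric mean, here is a minimal sketch in base R. The three index values and their names are made up for illustration and are not taken from any HDI release.

```r
# Geometric mean: the n-th root of the product of n positive values.
geometric_mean <- function(x) exp(mean(log(x)))

# Hypothetical dimension indices on a 0-1 scale (illustrative values only).
indices <- c(health = 0.90, education = 0.60, income = 0.80)

hdi_like   <- geometric_mean(indices)  # same as prod(indices)^(1/3)
arithmetic <- mean(indices)            # ordinary average, for comparison

print(c(geometric = hdi_like, arithmetic = arithmetic))
```

Unlike the arithmetic mean, the geometric mean is pulled down more by a low value on one indicator, so a weak dimension cannot be fully offset by a strong one.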
6) "principle components analyses", "principle axis factoring"
principal, not principle
Fixed.
7) "In some cases PCA can show a general factor where none exists (Jensen and Weng, 1994 [13]). For this reason, I compared the first factor extracted via PCA to the first factors using minimum residuals, weighted least squares, generalized least squares, principle axis factoring and maximum likelihood estimation"
I don't see how the similarity of factor loadings based on different extraction methods can tell us anything about the existence of a general factor. You don't say what a 'general factor' is, but I assume it means a factor with all-positive indicator loadings regardless of extraction method. There is no such factor in your data, as indicated by the many negative loadings listed in the Appendix. Even if you reverse coded the variables so that higher values on all variables would have positive valence (e.g, "Adequate nourishment" instead of "Undernourishment"), which I think would be a good thing to do, there'd still be negative loadings on the first factor/PC (e.g., suicide rate).
A general factor need not be a perfectly general factor, just a very large one. As you mention, there are a few variables that load in the 'wrong' direction, which is not found in the analysis of cognitive data.
A general factor would be disproved if there were many variables that didn't load on the first factor (i.e. with very low loadings, say <.10). This isn't the case with these national data: the mean absolute loading was high (.6-.65).
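The two checks described above can be sketched in base R as follows. The loadings vector is invented for illustration; the real loadings are in the paper's Appendix and code file.

```r
# Hypothetical first-factor loadings (made up; not the paper's values).
loadings <- c(0.85, 0.78, -0.72, 0.66, 0.91, -0.15, 0.59, 0.74)

# Mean absolute loading: how strongly variables load on average,
# ignoring the direction of the loading.
mean_abs <- mean(abs(loadings))

# Share of variables that barely load at all (|loading| < .10).
share_low <- mean(abs(loadings) < 0.10)

print(c(mean_abs = mean_abs, share_low = share_low))
```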
8) How did you compute the correlations between factor loadings? The congruence coefficient rather than Pearson's r should be used: http://en.wikipedia.org/wiki/Congruence_coefficient (Or did you use factor scores?)
The congruence coefficient has some bias (e.g.). I used Pearson's r.
I found an error. I had forgotten to add PCA to the comparison with the 5 other methods. It made little difference.
One cannot (easily) compare loadings when using subset x whole/subset analysis; for those I used scores.
I compared scores in the full datasets before. They have a mean around .99 for both datasets using Pearson. I have now written more code to compare the loadings too, also with Pearson's. They are also .99 in both datasets.
Clearly, if the score correlations are very high, the loading correlations will also be, and vice versa. So in that sense doing both analyses is unnecessary.
I doubt using Spearman's or CC instead will change these results much. They are almost identical for every method in the full datasets.
However, I used the CC as requested. Results are rounded to three digits.
I used this function.
Results:
> #for SPI
> factor.congruence(list(y_all.loadings),digits=3)
       PC1   MR1  WLS1  GLS1   PA1   ML1
PC1  1.000 0.997 1.000 1.000 1.000 0.997
MR1  0.997 1.000 0.997 0.997 0.997 1.000
WLS1 1.000 0.997 1.000 1.000 1.000 0.997
GLS1 1.000 0.997 1.000 1.000 1.000 0.997
PA1  1.000 0.997 1.000 1.000 1.000 0.997
ML1  0.997 1.000 0.997 0.997 0.997 1.000
> #for DR
> factor.congruence(list(z_all.loadings),digits=3)
       PC1   MR1  WLS1  GLS1   PA1   ML1
PC1  1.000 0.997 1.000 1.000 1.000 1.000
MR1  0.997 1.000 0.997 0.997 0.997 0.998
WLS1 1.000 0.997 1.000 1.000 1.000 1.000
GLS1 1.000 0.997 1.000 1.000 1.000 1.000
PA1  1.000 0.997 1.000 1.000 1.000 1.000
ML1  1.000 0.998 1.000 1.000 1.000 1.000
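To make the difference between the two coefficients concrete, here is a minimal base R sketch. The two loading vectors are made up, and the helper congruence() is a hand-written stand-in for illustration, not the psych function used above.

```r
# Tucker's congruence coefficient: like Pearson's r, but computed on the
# raw values without subtracting the means first.
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Two hypothetical sets of loadings for the same five variables.
a <- c(0.80, 0.70, 0.75, 0.85, 0.78)
b <- c(0.78, 0.72, 0.74, 0.86, 0.77)

cc <- congruence(a, b)  # uncentered
r  <- cor(a, b)         # centered (Pearson)

print(c(congruence = cc, pearson = r))
```

Because the congruence coefficient is not centered, two sets of uniformly high positive loadings come out close to 1 even when their rank ordering differs somewhat, which is one commonly noted property of the measure.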
9) Sections 4-6 have nice graphs, but I don't see their purpose. The fact that the correlation between two linear combinations of correlated elements gets higher the more there are shared elements is self-evident. The graphs might be of use if there was a practical need to estimate the S factor using only a limited number of components, but I don't see why anyone would want to do that.
I assume that the results in sections 4-6 are based on correlations of factor/component scores, but it's nowhere specified. Are the factors extracted with replacement?
Are the results from all the 54 components based on PCA? If so, the higher correlations with PCA components could be artefactual, due to common method variance.
As you can see, there have been some methodological studies concerning the interpretation of loadings from small subsets of variables, as well as correlations between factors from different methods. These results are clearly relevant to these matters.
I changed the text in section 4 to:
Since I found that regardless of method and dataset used, the first factor was a general factor accounting for about 40-47\% of the variance, it was interesting to know how many components one needed to measure it well. To find out, I sampled subsets of components at random from the datasets, extracted the first factor, and then correlated the scores from it with the scores of the first factor using all the components. I repeated the sampling 1000 times to reduce sampling error to almost zero. Since recently there was interest in comparing g factors from different factor extraction methods, I used the 6 different methods mentioned before.
I don't know what you mean by "with replacement". I used the regression method to get scores. I used the fa() function from the psych package.
http://www.inside-r.org/packages/cran/psych/docs/fa
The result from all the 54 variables is based on the same method used to extract from the subset. So, when using maximum likelihood (ML) to extract S from the subset of vars, the comparison S is also extracted via ML.
The subset x subset analyses are only based on PCA, however. I can replicate these with the other 5 methods if necessary. I think the differences will be slight.
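The subset x whole procedure can be sketched roughly as below. This is a simplified illustration, not the code from the attached file: it uses simulated data, base R's prcomp() (first principal component only) instead of fa(), 200 repetitions instead of 1000, and all object names are hypothetical.

```r
set.seed(1)

# Simulated stand-in data: 100 "countries", 20 indicators driven by one factor.
n_countries <- 100
n_vars      <- 20
s_true <- rnorm(n_countries)
lambda <- runif(n_vars, 0.5, 0.9)  # assumed true loadings
X <- sapply(lambda, function(l) l * s_true + sqrt(1 - l^2) * rnorm(n_countries))

# Scores on the first principal component of a set of variables.
first_pc_scores <- function(m) prcomp(m, scale. = TRUE)$x[, 1]

full_scores <- first_pc_scores(X)

# Draw k variables (sample() draws without replacement by default), extract
# the first component, and correlate its scores with the full-data scores.
# abs() is used because the sign of a principal component is arbitrary.
subset_cor <- function(k) {
  cols <- sample(ncol(X), k)
  abs(cor(first_pc_scores(X[, cols, drop = FALSE]), full_scores))
}

mean_r <- sapply(c(k5 = 5, k10 = 10),
                 function(k) mean(replicate(200, subset_cor(k))))
print(round(mean_r, 3))
```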
10) MCV correlations of 0.99 are unusual in my experience. I'd like to see the scatter plots to ascertain that there's a linear association across the range. It's possible that the high correlations are due to outliers. What happens to the MCV correlations if you reverse score the variables with negative valence?
The first thing I did when working this out was to plot them. You can do this yourself with the plot() command (the code is around lines 293-327). They are indeed very linear. I have attached one plot for each with national IQs. They are almost the same with Altinok's.
I don't know what you mean by the other question.
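For anyone who wants to see what goes into such a plot, here is a minimal sketch of the method of correlated vectors on simulated data. It is not the code from lines 293-327 of the code file: the data, the true loadings, and the 0.8 relation between the factor and IQ are all assumptions made for illustration.

```r
set.seed(2)

# Simulated stand-in data: one latent factor s, an "IQ" variable related
# to it, and 20 indicators with varying true loadings on s.
n  <- 500
p  <- 20
s  <- rnorm(n)
iq <- 0.8 * s + sqrt(1 - 0.8^2) * rnorm(n)
lambda <- seq(0.3, 0.9, length.out = p)
X <- sapply(lambda, function(l) l * s + sqrt(1 - l^2) * rnorm(n))

# Vector 1: each indicator's loading on the first principal component.
pca <- prcomp(X, scale. = TRUE)
loadings <- pca$rotation[, 1] * pca$sdev[1]
if (mean(loadings) < 0) loadings <- -loadings  # component sign is arbitrary

# Vector 2: each indicator's correlation with IQ.
r_iq <- as.vector(cor(X, iq))

# MCV: correlate the two vectors across indicators.
mcv_r <- cor(loadings, r_iq)
print(round(mcv_r, 3))

# plot(loadings, r_iq)  # scatter plot to check linearity and outliers
```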
11) "The analyses carried out in this paper suggest that the S factor is not quite like g. Correlations between the first factor from different subsets did not reach unity, even when extracted from 10 non-overlapping randomly picked tests (mean r’s = .874 and .902)."
Your analysis is based on different sets of observed variables, while the g studies that found perfect or nearly perfect correlations were based on analyses of latent factors which contain no error or specific variance, or on analyses of the same data set with different methods. So the results aren't comparable.
I am referring to the two Johnson studies which found that g factors extracted from different IQ batteries without overlapping tests did reach near-unity (1 or .99). This wasn't the case for these data. One problem in interpretation is that IQ batteries are deliberately put together so as to sample a broad spectrum of ability variance. Randomly chosen subsets of such tests are not. The best way to proceed is to obtain the MISTRA data and repeat my analyses on them. So it comes down to whether they want to share the data or not.
12) The biggest problem in the paper is that it seems to be pretty pointless. Yes, most indicators of national well-being are highly correlated with each other and with national IQ, which means that the first PC from national well-being data must be highly correlated with national IQ, but so what? That was obvious from the outset.
What is the S factor? Do you believe it is a unitary causal factor influencing socioeconomic variables? That interpretation would give some meaning to the paper, but I think it's not a very promising idea. Socioeconomic status is normally thought of as a non-causal index reflecting the influence of various factors. Clark speaks of a "social competence" that is inherited across generations, but I don't think he views it as a unitary causal force but rather as a composite of different influences (such as IQ and personality).
I think national IQs and national socioeconomic indices are so thoroughly causally intermingled that attempting to say anything about causes and effects would require longitudinal data.
Strangely, this was the criticism that Gould also offered of the g factor (cf. Davis review). I disagree; the results are NOT obvious. Neither g, G, s nor S is obvious. Especially not the MCV results.
I have no particular opinions to offer on the causal interpretations. I don't want my paper to get stuck in review due to speculative discussion of causation. I think the descriptive data are very interesting by themselves.
13) "It is worth noting that group-level correlations need not be the same or even in the same direction as individual-level correlations. In the case of suicide, there does appear to be a negative correlation at the individual level as well."
This means that the s and S factors aren't the same. Why is suicide an indicator of socioeconomic status anyway?
Not socioeconomic status. General socioeconomic factor. I'm not sure what the specific complaint is.
14) I couldn't find a correlation matrix of all the variables in the supplementary material. It would be useful (as an Excel file).
They would be very large (54x54 and 42x42). Anyone curious can easily obtain it in R by typing:
> write.csv(cor(y),file="y_matrix.csv")
> write.csv(cor(z),file="z_matrix.csv")
For your ease, I have attached both files.
15) PCA and factor analysis are really different methods, and PCA shouldn't be called factor analysis.
Semantics. Some authors use "factor analysis" as a general term as I do in this paper. Others prefer "dimensional reduction method" or "latent trait analysis" or some other term and then limit "factor analysis" to non-PCA methods.
--
Attached is also a new PDF with the above fixes and a new code file with the changes I made.