
[ODP] Crime, income and employment among immigrant groups in Norway and Finland
Admin
Dear reviewers. A final revision.

https://osf.io/g2fsr/

I have added MCV analyses that I also used in the previous paper with international data. The results were similar (near unity). It is in a new section above the discussion. Please indicate whether this section is OK with you.
Emil,

I'm having trouble understanding the new section, especially the first few sentences. It might be better to rewrite it as follows, but I'm not really sure:

Arthur Jensen invented the method of correlated vectors (MCV) to find out whether a variable correlates with the g factor (general cognitive ability) or with the remaining non-g variance.[22] The same method can be applied to other latent variables. I have previously used it to test whether various predictors (e.g. national IQ) correlate with the latent factor of international rankings (international S factor) or with the remaining variance.
Admin
The thing is that it is not exactly checking for a correlation, but whether the latent trait is responsible for the observed correlation. For instance, when it is used on IQ gains over time (Flynn-Lynn effect), the effect is not found to be g-loaded (the subtests with the lowest g-loadings rise the most).

Here's a new version. The only change is in the MCV section (8). https://osf.io/g2fsr/ revision #6.
O.K. I would just change the last sentence of that paragraph to "The results strongly indicated that the S factor is responsible"


This sentence is still confusing:

"Arthur Jensen invented the method of correlated vectors (MCV) in 1983 to find out whether the g factor (general cognitive ability) is responsible for correlations of cognitive test scores with a variable of interest (e.g. head size) or whether it is due to other factors.[22, 23, 24]"

How about:

"Arthur Jensen invented the method of correlated vectors (MCV) in 1983 to find out if the general factor of intelligence is responsible for mean differences in measures of intelligence [22, 23, 24]. Today, the method is mostly used with g (e.g. [25, 26, 27, 28]) and in the context of mean differences, but it can also be used with any latent variable and in the context of correlations between a predictor and a criterion. To apply the method, one correlates the indicator variables' loadings on the latent variable of interest (the vector of latent variable loadings) with either the between-group mean differences in the indicators or the relevant correlations between the indicators and the criterion (the vector of differences). If the latent variable is ’driving’ the association, then the correlation will be positive and strong."
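As a minimal sketch of what that procedure amounts to (all numbers below are hypothetical, not from the paper), MCV reduces to a single Pearson correlation between the vector of loadings and the vector of effect sizes:

```python
import numpy as np

# Hypothetical numbers for illustration only:
# each indicator variable's loading on the latent factor...
loadings = np.array([0.85, 0.72, 0.60, 0.91, 0.55])
# ...and each indicator's correlation with the criterion variable.
effects = np.array([0.40, 0.33, 0.25, 0.45, 0.22])

# MCV: correlate the vector of loadings with the vector of effect sizes.
# A strong positive r suggests the latent factor drives the associations;
# a negative r points to the non-factor variance instead.
r = np.corrcoef(loadings, effects)[0, 1]
print(f"MCV correlation: {r:.2f}")
```

With these made-up vectors the correlation comes out near unity; in practice the estimate is noisy when the number of indicators is small, which is a known weakness of the method.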

"The standard deviations of loadings in the Norwegian and Danish datasets are .83 and .75, respectively, so range restriction does not appear to be a problem."

"In every case, the result is close to unity in the expected direction (Islam prevalence is negatively related to S factor scores, while the others are positively related)."

Note: Could you just reverse the Islam sign as it's standard practice to report results such that a positive Jensen effect indicates that differences are greater on more general-factor loaded variables.

Thanks. On condition that you make the necessary alterations, I approve.
Here's a new draft. #7, https://osf.io/g2fsr/


This sentence still has trouble: "If the latent variable is ’driving’ the association and is positively correlated to the criteria variable, then the correlation will be positive, if it is the remaining variance, it will be negative, and if it is both or neither it will be near zero."

(It's both unclear and grammatically challenged.)

Maybe:

"If the general factor is ’driving’ the association and is positively correlated with the criterion variable, then the correlation between factor loadings and the effect sizes of the predictor-criterion associations will be positive. However, if the association is driven by the variance not attributable to the general factor, the correlation will generally be negative. And it will generally be somewhere in between if the association is driven by a mix of general and non-general factors."
Admin
I have replaced my sentence with Chuck's version above. Revision #8. https://osf.io/g2fsr/


I approve this version. Who else does? For this latest version, we have:

Peter Frost
Chuck
I looked at version 8, and it's OK for me. However, though it's just my opinion, I think you should probably add "method of correlated vectors" to the keywords.

EDIT :

Concerning this sentence: "Since the method relies on the indicator variables of the latent variable, it is susceptible to sampling error when the number of indicator variables is small." I suppose you're referring to psychometric sampling error? Jensen used this term to describe the situation where you have an unrepresentative sampling of test contents (e.g., a battery made up mainly of verbal tests, or with too few tests for one or more of the constructs). In that case you should probably write it as "psychometric sampling error". That will avoid confusion over terms.
Admin
There are a few ways it can go wrong. E.g. if one uses only a specific type of subtest (e.g. 10 types of arithmetic tests), or there is not much variance between subtest factor loadings (restriction of variance, common in standard IQ batteries because the subtests were deliberately selected to be useful), or just ordinary sampling error due to a low number of subtests (also often the case, since N_subtest is often only 5-10). For these reasons, MCV is not a very robust metric (it is susceptible to artifacts). The term you propose seems to be an inclusive one, covering errors in estimating the true correlation between factor loadings and the vector of interest.

Certainly, like any other statistic, a g factor based on a limited number of mental tests and a limited number of subjects will contain error. There are three main sources of such error: (1) subject sampling error, because a sample does not perfectly represent the population; (2) psychometric sampling error, because a limited number of diverse mental tests does not perfectly represent the total population of mental tests, actual or conceivable; and (3) all factor scores in a common factor analysis, including g factor scores, by any method of derivation, are only estimates of the true factor scores, which remain unknown, in the same sense that obtained scores are estimates of true scores, with some determinable margin of probable error, in classical measurement theory. It has been determined mathematically that the average minimum correlation between estimated factor scores and their corresponding hypothetical true factor scores rapidly increases as a function of the ratio of the number of tests to the number of first-order factors (Gorsuch, 1983, p. 259). With 11 tests and two first-order factors, as in this psychometric battery, the minimum correlation between estimated and true factor scores would be + .84, and the actual correlation could be well above this value.


Kranzler, J. H., and Jensen, A. R. (1991). Unitary g: Unquestioned postulate or empirical fact? Intelligence, 15, 437—448.

Just as we can think statistically in terms of the sampling error of a statistic, when we randomly select a limited group of subjects from a population, or of measurement error, when we obtain a limited number of measurements of a particular variable, so too we can think in terms of a psychometric sampling error. In making up any collection of cognitive tests, we do not have a perfectly representative sample of the entire population of cognitive tests or of all possible cognitive tests, and so any one limited sample of tests will not yield exactly the same g as another limited sample. The sample values of g are affected by subject sampling error, measurement error, and psychometric sampling error. But the fact that g is very substantially correlated across different test batteries means that the variable values of g can all be interpreted as estimates of some true (but unknown) g, in the same sense that, in classical test theory, an obtained score is viewed as an estimate of a true score.


Jensen, A. R. (1993). Psychometric g and achievement. In B. R. Gifford (Ed.), Policy perspectives on educational testing. Norwell, MA: Kluwer Academic Publishers. Pp. 117-227.

The deviation from perfect construct validity in g attenuates the values of r(g × d). In making up any collection of cognitive tests, we do not have a perfectly representative sample of the entire universe of all possible cognitive tests. Therefore any one limited sample of tests will not yield exactly the same g as another such sample. The sample values of g are affected by psychometric sampling error, but the fact that g is very substantially correlated across different test batteries implies that the differing obtained values of g can all be interpreted as estimates of a “true” g. The values of r(g × d) are attenuated by psychometric sampling error in each of the batteries from which a g factor has been extracted. We carried out a separate study to empirically estimate the values for this correction.


Dragt, J. (2010). Causes of group differences studied with the method of correlated vectors: A psychometric meta-analysis of Spearman’s hypothesis.

Just the first three hits I got from googling "psychometric sampling error". Note, however, that the S factor is not a psychological construct, so calling it "psychometric" does not make sense. What subject-neutral term do you propose? It is sampling error involving the indicator variables of a latent variable/factor.
You don't necessarily need to make this change. If "psychometric" is a misleading term for your S factor, just leave it as it is (i.e., "sampling error"). I understood what you mean anyway.
Admin
There is another thing that has been bugging me and creating confusion. I use "N" to refer both to the number of individuals in a sample and to the number of indicator variables (IV) or subtests. I really ought to use two different terms. Perhaps just use a subscript when talking about indicator variables.

"IV sampling error" is a pretty neutral term.

---

I have added a new revision, #9. The only changes are in the section concerning MCV, plus an extra paragraph in the abstract summarizing the MCV results. I have replaced "indicator variable" with "IV". I have added "method of correlated vectors" to the list of keywords as MH suggested.


Looks good. We're not all going to re-approve.

Publish.
Admin
It kinda is a waste of time to keep getting re-approval for small language changes like this. There really is no journal policy about it yet. Maybe we should make one.


Yes, it is a waste of our time. The policy should be: "Approval is not needed for minor alterations, especially language related ones". Given that policy, you have your approvals. You can publish.
Admin
Published.
http://openpsych.net/ODP/2014/10/crime-income-educational-attainment-and-employment-among-immigrant-groups-in-norway-and-finland/

OSF has all the supplementary files. I will update the Wiki there with a description of the files in the folder.
https://osf.io/emfag/