
[ODP] The Scandinavian WAIS IV Matrices as a Test of Dutton, te Nijenhuis and Rovaine
The Scandinavian WAIS IV Matrices as a Test of Dutton, te Nijenhuis and Rovainen's Theory of Finnish High Intelligence

Abstract

The Scandinavian (Sweden, Norway, Denmark) standardization of the WAIS IV on the matrices is presented in English for the first time. The score of Scandinavia on the WAIS IV matrices is shown to be higher than Finland. However, sampling differences would seem to explain why Scandinavia's WAIS IV matrices result is higher than Finland's yet its PISA score on a similar test to the matrices, based on significantly larger samples, is lower.

Key Words: Intelligence, WAIS IV matrices, PISA.

The newest version is on Google Docs.

Fixed the title. Remember to indicate which journal it is submitted to. -Emil
The average OECD score was 500.
Britain's IQ should be used to give a Greenwich IQ. England's PISA CPS is 517. This is why your IQs for Scandinavia and Finland are inflated. Here, as you know, I converted PISA CPS into Greenwich IQ: https://docs.google.com/spreadsheets/d/16vdyElN4nwnfk0J6IKXq6Zf6RMdqbUUXKxYKgkxTW9A/edit?usp=sharing
Setting the Greenwich IQ as the OECD average gives unlikely results, for example a fluid IQ of 105 for Northern Italy, around 104 for Finland and 108 for Japan! 101 is a better estimate of the Finnish fluid IQ.
The formula used for converting PISA CPS to IQ should be specified. I used Y = ((X - 517) / 96) * 15 + 100, where Y is the country IQ and X is the PISA CPS score.
I do not know which SD you used. I used 96 because the OECD average SD is 96. It is better to use the OECD average SD than the SD of each country; otherwise countries with a lower SD would get an overestimated score. I adopted this procedure in the paper I recently published in Intelligence (Piffer & Lynn, 2014).
Also the matrices and PISA CPS test fluid intelligence, not full scale IQ.
Also this statement should be amended accordingly: "With a raw score of 526 and an SD of 100 the Canadians have an IQ score of 103.9 on the matrices-like subtest". As you can see from my table above, Canadian fluid IQ is 101.4
The PISA data also reports scores for Finns living in Sweden (or Swedes living in Finland, I cannot remember) and this is a useful comparison, so should probably be included.
Ninja Edit: Also as the WAIS includes scores for native only, it's probably better to compare Scandinavian and Finnish Natives. In my table, column F, I converted native PISA CPS into IQ. These should be preferably used instead of those in column C, which report total (immigrants + native) IQ.
Edit 2: The Finns seem to have higher PISA School than PISA CPS scores, and this is likely due to better schooling in Finland. Finns' fluid intelligence is not much higher than Britain's: only 0.4 points according to PISA CPS and 1 point according to WAIS. There is evidence that schooling has a bigger impact on scholastic intelligence than on fluid intelligence, as expected: http://www.unz.com/isteve/none-dare-call-it-iq/
This fact should also be mentioned in the discussion.
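The conversion described in this post can be sketched in R. This is only an illustration of the stated formula, not the spreadsheet's actual cells; the function name `pisa_to_iq` and its argument names are made up for the example. The second call assumes that the paper's 103.9 was obtained with the OECD average of 500 as the reference and an SD of 100.

```r
# Convert a PISA CPS score to an IQ metric (mean 100, SD 15).
# ref_mean: score of the reference population (517 = England, i.e. Greenwich norms)
# ref_sd:   SD used for scaling (96 = OECD average SD)
pisa_to_iq <- function(x, ref_mean = 517, ref_sd = 96) {
  (x - ref_mean) / ref_sd * 15 + 100
}

canada_greenwich <- pisa_to_iq(526)                            # England as reference
canada_paper <- pisa_to_iq(526, ref_mean = 500, ref_sd = 100)  # assumed 500/100 scaling

print(round(c(greenwich = canada_greenwich, paper = canada_paper), 1))
```

With these inputs the Canadian score of 526 comes out at about 101.4 on Greenwich norms and about 103.9 under the 500/100 scaling, which matches the two figures discussed above.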


Thanks, Duxide. This is really helpful. I will send this to my colleague and submit a new version anon.
Admin
Hi Dutton/Barleymow

Thank you for submitting a paper to our journal.

--

Generally the paper needs some improvement but it has publication potential.

These draw upon very large and representative samples and are strongly (around 0.8) correlated with IQ.
...
Taking an IQ estimate from many years of PISA data is more reliable. PISA score correlates with IQ score at 0.82 (Rindermann, 2008)
...
PISA only correlates at 80% with IQ.


Three claims about the same number, with three different values in two different formats. Expressing correlations as percentages is nonsense. Worse, it could be confused with the coefficient of determination (R^2): https://en.wikipedia.org/wiki/Coefficient_of_determination
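To make the possible confusion concrete, here is a tiny R illustration (the variable names are arbitrary): a correlation of 0.8 corresponds to 64% of variance explained, not 80%.

```r
r <- 0.8            # a correlation coefficient
r_squared <- r^2    # coefficient of determination: share of variance explained
print(r_squared)
```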

Further, I could not find the actual .82 number in the source cited. Some care must be taken here because Lynn and Vanhanen's 2012 numbers (“FINAL-IQ”) are based on combined scholastic + IQ data. They cannot be used to measure the correlation between IQ tests and PISA scores.

One has to use the “Measured IQ” column from their Table 2.1 (page 19ff). These are found in my megadataset.

In fact, I ran the analysis for you.

R code is:

library(Hmisc) # provides rcorr()

mega = read.csv("Megadataset_v1.3.csv")

# measured IQ plus the PISA variables (native-only and full-sample)
vars = c("LV2012measuredIQ",
"PISA09READ_Native",
"PISA12MATH_Native",
"PISA12CPS_Native",
"PISA00",
"PISA03",
"PISA06",
"PISA09",
"PISA12")
DF = mega[, vars]

# correlation matrix; missing data are deleted pairwise
DF.cor = rcorr(as.matrix(DF))
round(DF.cor$r, 2) # print the correlations


I ran the correlations both for natives only and for everybody. Results:

                   LV2012measuredIQ  PISA09READ_Native  PISA12MATH_Native  PISA12CPS_Native  PISA00  PISA03  PISA06  PISA09  PISA12
LV2012measuredIQ               1.00               0.93               0.94              0.89    0.94    0.96    0.95    0.95    0.92
PISA09READ_Native              0.93               1.00               0.93              0.90    0.93    0.94    0.96    0.97    0.94
PISA12MATH_Native              0.94               0.93               1.00              0.92    0.90    0.94    0.96    0.96    0.98
PISA12CPS_Native               0.89               0.90               0.92              1.00    0.86    0.86    0.89    0.92    0.91
PISA00                         0.94               0.93               0.90              0.86    1.00    0.95    0.94    0.95    0.94
PISA03                         0.96               0.94               0.94              0.86    0.95    1.00    0.99    0.98    0.96
PISA06                         0.95               0.96               0.96              0.89    0.94    0.99    1.00    0.98    0.97
PISA09                         0.95               0.97               0.96              0.92    0.95    0.98    0.98    1.00    0.98
PISA12                         0.92               0.94               0.98              0.91    0.94    0.96    0.97    0.98    1.00


So the PISA x IQ correlations are around .944 (natives + immigrants) or .92 (natives only).
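The two summary figures appear to be simple averages of the LV2012measuredIQ row of the matrix. That reading is an assumption, but the arithmetic agrees, as this short R check with the values copied from the matrix shows (the vector names are arbitrary).

```r
# Correlations of LV2012measuredIQ with each PISA variable, copied from the matrix
r_all <- c(PISA00 = 0.94, PISA03 = 0.96, PISA06 = 0.95, PISA09 = 0.95, PISA12 = 0.92)
r_native <- c(PISA09READ_Native = 0.93, PISA12MATH_Native = 0.94, PISA12CPS_Native = 0.89)

mean_all <- mean(r_all)        # full samples (natives + immigrants)
mean_native <- mean(r_native)  # native-only variables

print(round(c(all = mean_all, native = mean_native), 3))
```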

-

We compared Scandinavia, Finland, and the USA (Wechsler, 2008) on the matrices sub-test. We used this sub-test because it is highly g-loaded and thus gives us the best picture of the population's general intelligence.


Matrix tests are highly g-loaded (or at least used to be), but they are not better measurements of g than is the combined results of multiple subtests. See: Johnson, W., Nijenhuis, J. T., & Bouchard Jr, T. J. (2008). Still just 1 g: Consistent results from five test batteries. Intelligence, 36(1), 81-95.

You should use all of them if possible. The best method is to use an extracted g factor from all available subtests based on their loadings.

-

I note from the tables that in the Scand. data there is a clear correlation between mean age of the subsample and the IQ score.

I have put the datasets here. The correlation in the Scand. sample is .73. This indicates one of the following: 1) selective recruiting, 2) old people in Scand. are relatively smarter, 3) norms are off, 4) a late acting developmental effect, 5) something else.

The same correlation in Finnish sample is .01.

-

I note that you didn't use the weighted mean, but the unweighted mean. They give similar results, but you should probably use the weighted mean. It can be found in my calculations.

The two means are the same for FI sample (103.1), but slightly different for SC sample (105.1 vs. 105.3, respectively weighted and unweighted).
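The distinction between the two means can be illustrated with made-up numbers. The subgroup means and sizes below are hypothetical and are not taken from the WAIS IV tables; only `mean()` and `weighted.mean()` from base R are used.

```r
# Hypothetical age-group IQ means and group sizes (illustration only)
iq <- c(100, 104, 108)
n  <- c(100, 60, 40)

unweighted <- mean(iq)            # every age group counts equally
weighted <- weighted.mean(iq, n)  # every person counts equally

print(c(unweighted = unweighted, weighted = weighted))
```

When the groups are of equal size the two means coincide; they drift apart only to the extent that group size is related to the group mean.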

-

From the tables: I don't understand why -1 SD is apparently 7 standard points, but +1 SD is 13 standard points. Are the raw data super skewed?

-

In this article, we further test Dutton et al.'s theory with...
...
We compared Scandinavia
...
However, we do not regard these results as in conflict with Dutton et al. (2014) for two related reasons.


There is only one author, yet you write "we". That sounds silly to me.

-

Firstly, it can be seen the samples in each age group are relatively small and accordingly there is room for sampling error. Indeed, precisely this kind of error may explain the oddly low IQ score of Scandinavian 35-39 year-olds (98 on Greenwich norms).


You are arguing in words what should be argued with math. The sample sizes of individual age groups are not so important when we are interested in the total group vs. total group comparisons, which we are.

Use a t test to compare the means. I like R so I want to use R to test this. However, apparently no one wrote a t test that takes in summary statistics instead of raw data. How very silly.

So I found someone who wrote a function.

R code:
# t test from summary stats
# m1, m2: the sample means
# s1, s2: the sample standard deviations
# n1, n2: the sample sizes
# m0: the null value for the difference in means to be tested for. Default is 0.
# equal.variance: whether or not to assume equal variance. Default is FALSE.
t.test2 <- function(m1,m2,s1,s2,n1,n2,m0=0,equal.variance=FALSE)
{
  if( equal.variance==FALSE )
  {
    se <- sqrt( (s1^2/n1) + (s2^2/n2) )
    # welch-satterthwaite df
    df <- ( (s1^2/n1 + s2^2/n2)^2 )/( (s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1) )
  } else
  {
    # pooled standard deviation, scaled by the sample sizes
    se <- sqrt( (1/n1 + 1/n2) * ((n1-1)*s1^2 + (n2-1)*s2^2)/(n1+n2-2) )
    df <- n1+n2-2
  }
  t <- (m1-m2-m0)/se
  # two-sided p-value from the t distribution
  dat <- c(m1-m2, se, t, 2*pt(-abs(t),df))
  names(dat) <- c("Difference of means", "Std Error", "t", "p-value")
  return(dat)
}

#input numbers
mean.fi = 17.96583333
sd.fi = 3.8275
n.fi = 600
mean.sc = 18.15192308
sd.sc = 4.320705128
n.sc = 780

t.test2(mean.fi,mean.sc,sd.fi,sd.sc,n.fi,n.sc)


Result:
Difference of means   Std Error        t   p-value
             -0.186       0.220   -0.846     0.398


Difference in raw scores isn't statistically certain. Likely a fluke.

Note that for the above I used the weighted means which reduced the FI sample since the last age group was not used.

Note also that I used the Welch test because the variances seemed unlikely to be identical (3.83 vs. 4.32). The Levene test also does not work for summary statistics so I didn't actually calculate this.
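A variance-ratio F test, unlike the Levene test, needs only the SDs and the sample sizes. It is a different test and is sensitive to non-normality, so the following is only a rough check under that assumption, not a substitute for Levene on the raw data. It reuses the summary statistics given above.

```r
# summary statistics from the post above
sd.fi <- 3.8275;      n.fi <- 600
sd.sc <- 4.320705128; n.sc <- 780

# ratio of the sample variances, larger variance on top
f.ratio <- sd.sc^2 / sd.fi^2
df1 <- n.sc - 1
df2 <- n.fi - 1

# two-sided p-value from the F distribution (assumes normally distributed scores)
p.value <- 2 * min(pf(f.ratio, df1, df2),
                   pf(f.ratio, df1, df2, lower.tail = FALSE))

print(c(F = f.ratio, df1 = df1, df2 = df2, p = p.value))
```

A small p-value here would support the choice of the Welch test over the pooled-variance version.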

In short, the difference between the samples is not statistically certain according to Welch test.

-

we have large, representative samples from each country for 15 year olds and we have 5 such samples between 1998 and 2012.


There are 6, no? PISA00, PISA03, PISA06, PISA09, PISA12, PISA12CPS.

-

The most recent sample, on the subtest closest to WAIS IV matrices, can be seen in Table 3 and Finns score higher than the Scandinavian countries on this.


This is the CPS, right? Unclear.

-

Secondly, the exclusion criteria for the WAIS IV would seem to explain the discrepancy between the WAIS IV and PISA results to a considerable degree. The Scandinavian WAIS IV specifically excludes 'persons with mother tongue other than Swedish (or Danish or Norwegian).' The Finnish WAIS IV likewise excludes those whose mother tongue is not Finnish. However, the PISA exclusion criteria are far less strict. A student who is a non-native speaker can only be excluded if he has been resident in the country for less than one year and even in this case a maximum of 5% of students may be excluded (OECD, 2013). Despite this, Sweden (5.4%), Denmark (6.18%), and Norway (6.11%) could not reach this standard and excluded over 5%, because so many of its school children had been born abroad and had lived in the country for less than a year. Finland was able to reach this standard.
Apart from this 5%, all other children whose native-language is not the language of instruction must be included in PISA. The Scandinavian countries have much larger non-European immigrant populations than Finland and these are typically from countries with significantly lower average IQs than Europe, such as those in North Africa and the Middle East (see Lynn & Vanhanen, 2012). In 2010, Norway was 13.1% non-European, Denmark was 11.4% non-European, and Sweden was 14.3% non-European. Finland was 1.8% non-European, as most of its immigrants (4.8% of the population are immigrant) are from Russia or Estonia, countries that typically have a similar average IQ to Europe (see Dutton & Lynn, 2013). PISA does not exclude on these grounds. As such, pronounced differences in the immigrant percentage of the population, in addition to sampling errors, may help to explain the discrepancy between PISA subtest and the WAIS IV matrices.


All this, while correct, is unnecessary as the author can use the Native only scores instead. They should result in the samples looking a lot more like each other with respect to first language / immigrant status.

-

In genetic terms, this minority are more similar


Change to "Genetically, this..."

-

Conscientiousness, in particular, is associated with performing well in tests and as PISA is a low-stakes test if Dutton et al are correct - and Finns are especially high in Conscientiousness for genetic reasons, due to adopting an extreme K-strategy - then this could partly explain superior Finnish performance.


Rewrite. Needs cite for claim about big C and scholastic tests. Furthermore, the CPS is not a scholastic test, so big C has little relevance.

-

5. Conclusion

An examination of the WAIS IV matrices presents us with a Scandinavian Greenwich IQ of 102.8 in contrast to a Finnish one of 101.9. This is inconsistent with Nordic scores on the problem solving subtest in PISA, which give Finland a higher IQ than the Scandinavian countries, and with Nordic scores on PISA overall. However, this discrepancy can be explained by a number of factors. (1) PISA is a larger and more reliable sample. (2) WAIS IV excluded non-native speakers of the test language while PISA does not. Thus, the significantly high percentages of low IQ immigrants in Scandinavia compared to Finland will help to explain the difference between the two sets of results. Accordingly, we conclude that Dutton et al.'s (2014) contention that the Finns are the most intelligent population in Europe stands up against this potential counter-evidence.


Why do you need a conclusion that basically just repeats earlier stuff? The abstract is for summarizing the content of an article.
Matrix tests are highly g-loaded (or at least used to be), but they are not better measurements of g than is the combined results of multiple subtests. See: Johnson, W., Nijenhuis, J. T., & Bouchard Jr, T. J. (2008). Still just 1 g: Consistent results from five test batteries. Intelligence, 36(1), 81-95.

You should use all of them if possible. The best method is to use an extracted g factor from all available subtests based on their loadings.


Conscientiousness, in particular, is associated with performing well in tests and as PISA is a low-stakes test if Dutton et al are correct - and Finns are especially high in Conscientiousness for genetic reasons, due to adopting an extreme K-strategy - then this could partly explain superior Finnish performance.



Higher Finnish C probably explains why their PISA Math, Reading and Science scores are exceptional, close to those of East Asian countries, whereas their PISA CPS is only 0.5-1 IQ points higher than the UK's. This discrepancy should be noted. You can explain the relatively lower fluid intelligence of Finns by their higher K/conscientiousness but also by better schooling, which favorably impacts crystallized intelligence (in particular scholastic skills such as those measured by PISA M,R,S) but not fluid intelligence (see http://www.unz.com/isteve/none-dare-call-it-iq/; Emil and I are also working on a paper with preliminary evidence for this).
The paper should be focused on fluid g. This would avoid the criticism leveled by Emil ("Matrix tests are highly g-loaded (or at least used to be), but they are not better measurements of g than is the combined results of multiple subtests"). However, PISA CPS and Matrix reasoning are an adequate measure of fluid g and I think it'd be better if this paper discussed fluid g, and made it clear they're not measuring full scale IQ or even g.
1) Roivainen's name is misspelled Rovainen in the title.

2) "PISA score correlates with IQ score at 0.82 (Rindermann, 2008)"

It should be specified that the correlation is at the level of national mean scores, not individual scores.

3) "Despite this, Sweden (5.4%), Denmark (6.18%), and Norway (6.11%) could not reach this standard and excluded over 5%, because so many of its school children had been born abroad and had lived in the country for less than a year. Finland was able to reach this standard."

"its school children" should be "their school children" or "the school children of these countries"

4) The PISA data allows for the exclusion of first- and second-generation immigrants. Why didn't the authors reanalyze the PISA data so as to test if differences in the levels of immigration explain the differences between the WAIS and PISA? This analysis could probably be easily done using the International Data Explorer: http://nces.ed.gov/surveys/pisa/
The paper has now been updated and is ready to be re-reviewed at the link in the post above.
The paper has now been updated and is ready to be re-reviewed at the link in the post above.


The version you are talking about, is it the one posted by Emil at "10-13-2014, 01:25 AM"?

I will have to look more closely at the analysis itself, but regarding this :

However, as the education systems are relatively similar (see Kananen, 2014) it is unclear how this might be the case. Even granting that the Finnish educational system is better, this may itself be caused by a higher genotypic g instead of being a cause of it (see Lynn & Vanhanen, 2012).


I probably have to disagree. If it were due to higher IQ, I would expect Finnish people to have a substantial cognitive advantage over countries such as Norway and Sweden, but that's not the case. The IQ advantage of Finland is only about 2-3 points. Also, I don't think it's the mean IQ that matters when it comes to the educational system. It has more to do with politics and the elites' decisions and, by the same token, with the smartest portion of the IQ distribution.
There is a great deal of evidence (e.g. from Lynn and Vanhanen) that national IQ is a very strong predictor of national differences in educational attainment and so this would be our argument. Indeed, the difference between Britain and Japan is only about 3-4 points. I'd be happy to include the possibility you suggest. Is there a meta-analysis of the kind conducted by Lynn and Vanhanen showing that 'politics and the elite's decisions' has a stronger impact on national differences in cognitive test performance than national IQ, i.e. stronger than about 0.7-0.8? If so, what is it?
Admin
A note on terminology. LV's books summarize other studies and report findings, but generally do not meta-analyze studies. I don't know if V has published meta-analyses before, but Lynn has done a number (e.g. sex diffs in RPM).
A note on terminology. LV's books summarize other studies and report findings, but generally do not meta-analyze studies. I don't know if V has published meta-analyses before, but Lynn has done a number (e.g. sex diffs in RPM).


Sure. But the point is that the reviewer is suggesting that a difference of 3 or so points is not down to differences in genetics but to politics or the education system. But the difference between the UK and Japan is around 3 points and educational attainment is strongly underpinned by national IQ differences, so I think our argument is reasonable. However, I am happy to mention an alternative possibility if the data is sufficiently persuasive.
I did not say that politics has an impact on IQ. I said that politics has an impact on the educational system.
You're quite right. But the same study we cited found that IQ has a significant influence on political orientation. That is why I would suggest that sociological arguments can ultimately be reduced to psychology.
I'm extremely busy with my own work (because it's statistically complex) so, given my priorities, my answers are not as fast as I would like.

I don't think I have a lot to add beyond what I said before. Two points, however:

...we have estimated them using the reverse engineering method described in Beaujean and Sheng (2014). This involves finding the raw scores equivalent to one standard deviation above and below the mean, calculating how many raw score points each score was from the mean and squaring it to get two estimates of the variance, averaging the two variances and then taking the square root of the average to get an average standard deviation.


In general, people use the geometric mean rather than the arithmetic mean. I do not know which one is better, but I remember trying it the other day in Excel. When the SDs are very different, the two methods produce dissimilar results. When the SDs are similar, the choice of method has no effect on the result.
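For concreteness, here is a minimal R sketch of the averaging step quoted above next to the two alternatives mentioned. The distances 7 and 13 are illustrative placeholders (they echo the asymmetry queried earlier in the thread) and are not values taken from the paper's raw-score tables; the variable names are arbitrary.

```r
d_below <- 7   # distance from the mean to the -1 SD score (illustrative)
d_above <- 13  # distance from the mean to the +1 SD score (illustrative)

# quoted method: average the two variances, then take the square root
sd_quoted <- sqrt((d_below^2 + d_above^2) / 2)
# alternatives: arithmetic and geometric mean of the two SD estimates
sd_arith <- (d_below + d_above) / 2
sd_geom <- sqrt(d_below * d_above)

print(round(c(quoted = sd_quoted, arithmetic = sd_arith, geometric = sd_geom), 2))
```

With equal distances all three estimates coincide; the more asymmetric the two distances, the further apart they move, with the quoted method giving the largest value and the geometric mean the smallest.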

We used 96 as the SD of CPS scores as done by Piffer and Lynn (2014). 96 is the OCSE average SD. It is better than using country SDs because using individual countries SDs inflates the IQ score of countries with lower SD.


I would like the last sentence to be elaborated further.
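One way to see what the quoted sentence is getting at is to convert the same score with two different SDs. In this R sketch the within-country SD of 80 is hypothetical, and the function name `to_iq` is made up; the reference score of 517 and the common SD of 96 are the values mentioned in the thread.

```r
# IQ-metric conversion relative to a reference score (517 = England)
to_iq <- function(x, ref_mean = 517, sd = 96) (x - ref_mean) / sd * 15 + 100

iq_common_sd <- to_iq(526)        # OECD average SD of 96
iq_own_sd <- to_iq(526, sd = 80)  # hypothetical smaller within-country SD

print(round(c(common_sd = iq_common_sd, own_sd = iq_own_sd), 2))
```

Dividing by a smaller SD stretches the distance from the reference, so a country above the reference gets a higher IQ and, by the same arithmetic, a country below it gets a lower one.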
Some grammatical errors:
"Although Finland has a higher fluid g than Scandinavia but its CPS native score is only slightly higher than the rest of north-western Europe, at 526 points, giving a Greenwich fluid g estimate of 101.4."
"This is possible, however a meta-analysis..." This should read, "possible; however, a meta-analysis...".
"the presence of significant Northeast Asian admixture in the Finnish population" Citation, please.

I prefer the term "fluid intelligence" (or, even better "fluid skills") to "fluid g". This is quibbling, however. (Also, g should be italicized.)
It should be made clear that Finland has a much bigger advantage in PISA M,R,S performance than in fluid g and this suggests that variables associated with educational attainment such as better schooling/higher C may be contributing factors.
Admin
Some grammatical errors:
"Although Finland has a higher fluid g than Scandinavia but its CPS native score is only slightly higher than the rest of north-western Europe, at 526 points, giving a Greenwich fluid g estimate of 101.4."
"This is possible, however a meta-analysis..." This should read, "possible; however, a meta-analysis...".
"the presence of significant Northeast Asian admixture in the Finnish population" Citation, please.

I prefer the term "fluid intelligence" (or, even better "fluid skills") to "fluid g". This is quibbling, however. (Also, g should be italicized.)


E.g. http://www.eupedia.com/europe/autosomal_maps_dodecad.shtml

1) In the Introduction, the meaning of "Greenwich IQ" should be clarified.

2) Looking at tables 1 and 2, the raw scores for Finland and Scandinavia are the same for ages 18-29, but Finnish IQs (US norm) are lower. Why is this?

3) "OCSE average SD"

Shouldn't it be OECD?
A note on terminology. LV's books summarize other studies and report findings, but generally do not meta-analyze studies. I don't know if V has published meta-analyses before, but Lynn has done a number (e.g. sex diffs in RPM).


LV's books average scores from many studies, so you can call it meta-analysis. There is no clear definition of meta-analysis, and not all meta-analyses analyze moderators, etc.