[OQSPS] Inequality across prefectures in Japan is different
Admin
Journal:
Open Quantitative Sociology and Political Science.

Authors:
Emil O. W. Kirkegaard

Title:
Inequality across prefectures in Japan is different

Abstract:
Two datasets of socioeconomic data for Japanese prefectures (N=47) were obtained and merged. After quality control, there were 44 variables for use in a factor analysis. Indicator sampling reliability analysis revealed poor reliability (54% of correlations were |r| > .50). Inspection of the factor loadings revealed no clear S factor, with many indicators loading in the direction opposite to that expected.

A cognitive ability measure was constructed from three scholastic ability measures (all loadings > .90). Cognitive ability was not strongly related to 'S' factor scores, r = -.19 [CI95: -.45 to .19; N=47]. Jensen's method did not support the relationship between latent 'S' and cognitive ability (r = -.15; N=44). Cognitive ability was nevertheless related to some socioeconomic indicators in expected ways.

Results from the Japanese prefectures stand in strong contrast to all previous studies. There does not seem to be an S factor at this level of analysis in Japan.

Key words:
general socioeconomic factor, S factor, Japan, prefectures, inequality, intelligence, IQ, cognitive ability, cognitive sociology

Length:
14 pages, 2963 words, excluding references.

Files:
https://osf.io/4bw8u/files/

External reviewers:
I will get Kura's official comments. He has been discussing the paper with me during research and writing. As for external reviewers, perhaps Gerhard Meisenberg.
Very interesting.

1. In terms of presentation, the patterns in Figure 3 are not obvious because -- to assess whether a particular variable reflects the expected pattern -- a reader must read the variable label and then consider whether the expected loading would be positive or negative. It might make the figure easier to interpret to instead plot the inverse of undesirable variables such as divorce so that the expected loadings are positive for all variables; then the reader can simply check whether the point for the variable is to the left or right of zero.

2. It might be worth considering suspicions about the accuracy of Japan's reported abortion rates and suicide rates (e.g., https://www.guttmacher.org/pubs/journals/25s3099.html, http://www.japantimes.co.jp/news/2013/02/03/national/media-national/japans-suicide-statistics-dont-tell-the-real-story/#.VnN3SFJRKDk). I'm not sure that misreporting would vary by prefecture, but combining homicides and suicides into an unnatural death measure might avoid at least part of any measurement problems with the suicide measure.

3. It's not clear that some of the variables are necessarily desirable. Museums and libraries negatively correlate with population, which might reflect big cities having fewer-but-bigger libraries; if so, it's not obvious that having a lot of small libraries is better than having fewer big libraries with more amenities.

4. I did not see this pattern in the Mexico S factor study, but it appears that the S factors for the Japan prefectures in Figure 4 correlate fairly highly with population or at least population density. I think that all but one of the top 15 prefectures by population have an S factor above zero, and the two prefectures of those 15 with the lowest S factors (Hokkaidō, and Niigata-ken) have the lowest population density of those 15. Moreover, all but one of the bottom 15 prefectures by population have an S factor below zero, and the one prefecture with an S factor in that set above zero (Kagawa-ken) has the highest population density of that set. I'm not sure what -- if anything -- to make of that, though.
Admin
Thanks for the comment.

Very interesting.

1. In terms of presentation, the patterns in Figure 3 are not obvious because -- to assess whether a particular variable reflects the expected pattern -- a reader must read the variable label and then consider whether the expected loading would be positive or negative. It might make the figure easier to interpret to instead plot the inverse of undesirable variables such as divorce so that the expected loadings are positive for all variables; then the reader can simply check whether the point for the variable is to the left or right of zero.

3. It's not clear that some of the variables are necessarily desirable. Museums and libraries negatively correlate with population, which might reflect big cities having fewer-but-bigger libraries; if so, it's not obvious that having a lot of small libraries is better than having fewer big libraries with more amenities.


These two points together are the reason why I do not just reverse the variables -- there is often disagreement about whether they reflect good or bad outcomes!

As you note, one can interpret the museums and libraries findings in a positive light by arguing that it is better to have a few large units than many small units per capita. This is of course assuming that the total size is held constant, something which is not given by the data. I had thought of the same interpretation.

But how much of this kind of thinking is hindsight bias or post hoc theorizing? I would rather avoid possible bias by not 'encoding the results' using my own interpretations.

As for divorce, some people are against marriage. For instance, marriage could be seen as unnecessary government interference in civil matters (libertarians), and divorce or lack of marriage as a sign of female empowerment (feminists). According to Wikipedia, there do seem to be some people who hold such views: https://en.wikipedia.org/wiki/Marriage_privatization https://en.wikipedia.org/wiki/Criticism_of_marriage#Feminist_approach


2. It might be worth considering suspicions about the accuracy of Japan's reported abortion rates and suicide rates (e.g., https://www.guttmacher.org/pubs/journals/25s3099.html, http://www.japantimes.co.jp/news/2013/02/03/national/media-national/japans-suicide-statistics-dont-tell-the-real-story/#.VnN3SFJRKDk). I'm not sure that misreporting would vary by prefecture, but combining homicides and suicides into an unnatural death measure might avoid at least part of any measurement problems with the suicide measure.


Yes, but the indicator sampling reliability analysis showed that the results were not dependent on the abortion measure. Results for the datasets without that indicator were also not in line with expectations.

I have also tried leaving out the abortion variable entirely (before you brought it up, when I was trying to figure out what was wrong). Nothing much changes.

4. I did not see this pattern in the Mexico S factor study, but it appears that the S factors for the Japan prefectures in Figure 4 correlate fairly highly with population or at least population density. I think that all but one of the top 15 prefectures by population have an S factor above zero, and the two prefectures of those 15 with the lowest S factors (Hokkaidō, and Niigata-ken) have the lowest population density of those 15. Moreover, all but one of the bottom 15 prefectures by population have an S factor below zero, and the one prefecture with an S factor in that set above zero (Kagawa-ken) has the highest population density of that set. I'm not sure what -- if anything -- to make of that, though.


Population density is tricky. On the one hand, population density presumably increases crime rates (by presenting more opportunities for crime and more human conflict); on the other hand, it is often a sign of urbanicity, which usually has a positive loading. Thus, one could argue either for including population density as an indicator, or for controlling the indicators for it.

Perhaps I could try both approaches: in the main results, include it as an indicator like the others; in a robustness section, rerun the analyses with the indicators corrected for population density (using residualization, as in the French departments study coming out in Mankind Quarterly soon).
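To make the residualization idea concrete, here is a minimal sketch of what such a correction could look like (the data frame and column names are placeholders, and sklearn is used only for illustration; this is not the paper's actual code):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.decomposition import FactorAnalysis

def residualize(indicators: pd.DataFrame, control: pd.Series) -> pd.DataFrame:
    """Regress each indicator on the control and keep only the residuals."""
    X = control.to_numpy().reshape(-1, 1)
    residuals = {}
    for col in indicators.columns:
        model = LinearRegression().fit(X, indicators[col])
        residuals[col] = indicators[col] - model.predict(X)
    return pd.DataFrame(residuals, index=indicators.index)

# Example usage (names are placeholders):
# resid = residualize(indicators, np.log(pop_density))   # control for log pop. density
# s_factor = FactorAnalysis(n_components=1).fit_transform(resid)[:, 0]
```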

Thoughts?
Hi Emil,

Thanks for the comments.

I like your idea of adjusting the variables for population density. Something else might be to conduct the analysis on only the high population or high population density prefectures and then on only the low population or low population density prefectures. Maybe also conduct the analyses on only the northern prefectures and then on only the southern prefectures. These will be only exploratory analyses, but these disaggregated analyses might provide a sense of what it is about the Japan prefectures that makes them different from other countries in terms of the S factor.

It's correct that some of the variables are not clearly desirable or undesirable (e.g., marriage), and some might even have complicated desirabilities (e.g., too little and too much are both undesirable). But I'm wondering whether the finding of the lack of an S factor across the Japan prefectures might be strengthened by a model that included only those factors that are clearly desirable or undesirable, reflecting the idea that -- if there is no S factor across Japan prefectures in that model -- then it's really clear that there is no S factor there.

Along those lines, the Mexico and Brazil S factor studies seemed to have a higher percentage of variables that were more clearly desirable or undesirable, and the Mexico and Brazil studies respectively had only 21 and 32 variables, so maybe an analysis with the 20 to 30 most obviously desirable and undesirable variables from Japan would be more convincing and provide a more even comparison. The analyses that you have already reported suggest that Japan is different than Mexico and Brazil in terms of the S factor, but I think it would strengthen the analyses to rule out the larger number of variables in the Japan analysis as a source of the difference between the Mexico/Brazil and Japan studies.

Something else to consider is whether a relatively low variation in some variables might make measurement error a larger problem in this dataset. For example, income per person might need to be adjusted for cost of living in a prefecture, and divorce rate might need to be adjusted for marriage rates: in the 2013 data, Shiga-ken and Nara-ken have the same divorce rate (1.64) but the marriage rate is 5.27 in Shiga-ken and 4.44 in Nara-ken, so it's possible that a higher percentage of married persons get divorced in Nara-ken and thus that the 1.64 divorce rate in Nara-ken is worse than the 1.64 divorce rate in Shiga-ken. Maybe a divorce-to-marriage ratio might be a better measure than individual marriage and divorce rates. (I'm assuming that the divorce rate is measured per 1,000 persons and not per 1,000 married persons.)
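As a quick worked example of the proposed ratio, using only the two figures quoted above (the variable layout is hypothetical):

```python
# Divorce and marriage rates per 1,000 persons, 2013, as quoted above.
rates = {"Shiga-ken": (1.64, 5.27), "Nara-ken": (1.64, 4.44)}
for prefecture, (divorce, marriage) in rates.items():
    print(prefecture, round(divorce / marriage, 3))
# Shiga-ken 0.311
# Nara-ken 0.369  -> same divorce rate, but a higher divorce-to-marriage ratio
```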

One way to avoid post-hoc coding biases is to identify ahead of time in general terms the obviously-desirable-or-undesirable variables that should be included in an S factor analysis, and then limit the variables in the main analyses to that set of pre-identified variables, such as measures of health, crime, unemployment, education, income, and dependency. Something like percentage farmers in the Brazil study would not fit in one of those categories, so it's not necessary to consider whether percentage farmers is a desirable or undesirable measure. Infrastructure reflects another set of variables in your S factor studies; this would be a good variable for a cross-national study, but I'm not sure that is always a good idea for subnational studies if, for instance, the quality of infrastructure in a subnational region largely reflects decisions made at the national level.

Hope this is helpful. It's a really interesting study.
I am the person who provided the translation of the dataset, which is available only in Japanese, and I would like to endorse this manuscript as very interesting and valuable. There are several points of interest from my point of view.

1. First, the result of the S factor analysis of the Japanese data is fairly different from the previous studies of datasets from the worldwide (country-level) analysis, Norway, Finland, Denmark, the U.S., the U.K., Brazil, and Mexico, which have been reported by the author of this paper with much consistency. However, there is no rule without exception. This seems to be a curious case study to which S factor analysis does not apply as we expected. For example, Figure 2 in this manuscript stands in stark contrast to Figure 1 in the author's analysis of the 32 London boroughs (2015) cited in the references.


2. Although the result shows that there seems to be no clear S factor extracted from the Japanese data, there is some consistency for some important socioeconomic variables. If we look at the variables in Table 1 that have correlation coefficients |r| > 0.5 (which means that they are highly correlated with cognitive ability in the positive or negative direction), they are the Gini coefficient of asset holdings, unemployment rate, dependency on welfare, height, and divorce rate. Homicide and skin color are also very close to this threshold. These relations have become stylized facts in this literature.


3. Apparently, one of the most notable observations in the Japanese data is the author's finding that there is no correlation between infant mortality and cognitive ability. When I reported this unexpected absence of a correlation three years ago, I was not sure whether it was due to some defect in the statistics, or whether there is indeed no relationship across Japanese regions. The author has now decisively resolved this question: the sophisticated consistency analysis showed that the annual infant mortality figures from the 47 prefectures do not even correlate with each other and show more or less random fluctuation. I should probably add some information that came up after my publication of the Japanese data: many of my personal acquaintances with medical expertise have suggested to me that infant mortality in Japan largely depends on the hospital system, with its advanced obstetrical facilities and experienced personnel.


4. In general, the S factor is a very useful and also a substantive analytical tool. Most psychologists, including myself, have routinely reported the inter-correlations among socioeconomic variables to show that they form a positive manifold. But, as has been shown, the simplest way to demonstrate this is to factor analyze them in order to extract a latent variable (the S factor). This meta-variable is supposed to be an indicator of the r-K continuum of human (or primate) behavioral strategy.


To digress somewhat from the description of the manuscript, I have been wondering why no clear S factor is found in Japan. Surely, I can think of some supposedly sociological reasons, such as the massive migration from rural to urban regions after the war and the resulting differences in demographic composition, ceiling effects due to the uniform central governmental system, or differences among the immigrants to Japan that are so far too small and not yet statistically actualized. Also, as ljzigerell and Emil discuss above, there are many factors affecting the sociological phenomena: for example, urbanization seems to have decreased the marriage rate via female workforce participation, and the same factor increased the divorce rate with the spread of a more liberal psyche among urban dwellers. However, these changes have occurred in different phases in time and space, and I am not very sure whether any of them is compelling enough at this moment. I would rather want to see how the result may change twenty or thirty years from now, when more immigrants are expected to settle in and mingle with the native population.


Anyway, I hope readers of this manuscript enjoy these findings as I did.
Admin
Hi Emil,

Thanks for the comments.

I like your idea of adjusting the variables for population density. Something else might be to conduct the analysis on only the high population or high population density prefectures and then on only the low population or low population density prefectures. Maybe also conduct the analyses on only the northern prefectures and then on only the southern prefectures. These will be only exploratory analyses, but these disaggregated analyses might provide a sense of what it is about the Japan prefectures that makes them different from other countries in terms of the S factor.


Re. running the analysis on groups divided by latitude or population (density): this is a typological approach (subgroup analysis), which decreases the sample size quite markedly. As you say, it is also purely exploratory, and conclusions could not be drawn with much certainty.

Note that adjusting for population density is fairly ad hoc. Population density data was available (or readily calculable from area and population) in many prior studies, but no such correction was made. Usually, population density has a fairly strong positive loading because cities tend to be higher in S; urbanicity often has a strong positive loading.

However, I went ahead. I tried 6 controls: population density, log population density, sqrt population density, population, log population, and sqrt population. The log/sqrt versions were used to even out the very large differences between prefectures.

The results are attached. I have sorted the loadings by population density log. As can be seen, this control apparently solves most of the problems!

[attachment=691]

It even fixes the indicator sampling reliability, which increases to 94% of correlations with |r| > .50. The other controls also improved it, but not quite as much:
Control                 Share of |r| > .50
Standard (no control)   0.533
Pop. density            0.577
Pop. density (log)      0.933
Pop. density (sqrt)     0.899
Population              0.780
Population (log)        0.816
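For readers unfamiliar with how such indicator sampling reliability figures can be obtained, here is a rough sketch of one way to compute them: split the indicators into random halves, extract a factor from each half, and record how often the two sets of factor scores correlate above the threshold. The function, the split-half scheme, and the use of sklearn are my own illustrative assumptions, not the paper's code.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

def isr_fraction(df: pd.DataFrame, n_splits: int = 500,
                 threshold: float = 0.5, seed: int = 1) -> float:
    """Fraction of random split-half factor score correlations with |r| > threshold."""
    rng = np.random.default_rng(seed)
    z = (df - df.mean()) / df.std()          # standardize indicators
    cols = np.array(df.columns)
    hits = 0
    for _ in range(n_splits):
        rng.shuffle(cols)
        half = len(cols) // 2
        s1 = FactorAnalysis(n_components=1).fit_transform(z[cols[:half]])[:, 0]
        s2 = FactorAnalysis(n_components=1).fit_transform(z[cols[half:]])[:, 0]
        r = np.corrcoef(s1, s2)[0, 1]
        hits += abs(r) > threshold
    return hits / n_splits
```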

[attachment=688]

What about the criterion analysis?

[attachment=692]

Jensen's method

[attachment=693]

So, with the correction, the results become like those in all other countries, more or less. Apparently, Japan is weird in so far as population density almost entirely makes the S factor indiscernible. Now this makes me wonder how controlling for population density in the other analyses affects results. I guess someone could re-analyze all the prior studies.


It's correct that some of the variables are not clearly desirable or undesirable (e.g., marriage), and some might even have complicated desirabilities (e.g., too little and too much are both undesirable). But I'm wondering whether the finding of the lack of an S factor across the Japan prefectures might be strengthened by a model that included only those factors that are clearly desirable or undesirable, reflecting the idea that -- if there is no S factor across Japan prefectures in that model -- then it's really clear that there is no S factor there.


My problem with this general approach is that it requires me to make decisions about which variables are clearly desirable and which are not.


Along those lines, the Mexico and Brazil S factor studies seemed to have a higher percentage of variables that were more clearly desirable or undesirable, and the Mexico and Brazil studies respectively had only 21 and 32 variables, so maybe an analysis with the 20 to 30 most obviously desirable and undesirable variables from Japan would be more convincing and provide a more even comparison. The analyses that you have already reported suggest that Japan is different than Mexico and Brazil in terms of the S factor, but I think it would strengthen the analyses to rule out the larger number of variables in the Japan analysis as a source of the difference between the Mexico/Brazil and Japan studies.


The general approach is to include whatever suitable variables I can find. Some datasets simply have more suitable variables than others. I would rather leave in-depth comparisons of variable composition across studies until someone does a large meta- or mega-analysis. I do not at present have time for this.


Something else to consider is whether a relatively low variation in some variables might make measurement error a larger problem in this dataset. For example, income per person might need to be adjusted for cost of living in a prefecture, and divorce rate might need to be adjusted for marriage rates: in the 2013 data, Shiga-ken and Nara-ken have the same divorce rate (1.64) but the marriage rate is 5.27 in Shiga-ken and 4.44 in Nara-ken, so it's possible that a higher percentage of married persons get divorced in Nara-ken and thus that the 1.64 divorce rate in Nara-ken is worse than the 1.64 divorce rate in Shiga-ken. Maybe a divorce-to-marriage ratio might be a better measure than individual marriage and divorce rates. (I'm assuming that the divorce rate is measured per 1,000 persons and not per 1,000 married persons.)


Once one gets started on "let's correct variables for this or that" and "make new measures from existing variables", it quickly escalates. Due to time constraints, I would rather forgo this option (for now). In my defense, I have shared all the data and code, so if someone thinks this is worthwhile doing, they are more than welcome. :)


One way to avoid post-hoc coding biases is to identify ahead of time in general terms the obviously-desirable-or-undesirable variables that should be included in an S factor analysis, and then limit the variables in the main analyses to that set of pre-identified variables, such as measures of health, crime, unemployment, education, income, and dependency. Something like percentage farmers in the Brazil study would not fit in one of those categories, so it's not necessary to consider whether percentage farmers is a desirable or undesirable measure. Infrastructure reflects another set of variables in your S factor studies; this would be a good variable for a cross-national study, but I'm not sure that is always a good idea for subnational studies if, for instance, the quality of infrastructure in a subnational region largely reflects decisions made at the national level.


One could do this, but as I mentioned above, usually I take whatever I can find that's suitable. Often this means that there is some overlap across studies. The benefit of including all kinds of variables is that it is more exploratory. Of course, conclusions based on such exploratory research should be somewhat restrained, but I think it is a worthwhile trade-off.


Hope this is helpful. It's a really interesting study.


Well, it looks like I need to add a new section and re-do the abstract and discussion. :)

---

I have rewritten parts of the paper to fit the new results. Have a look. Changes to abstract, robustness section and discussion (+ title). Code and data also updated.
https://osf.io/zfw38/
Thanks for running the analyses controlling for population and population density.

I wish I knew why those controls matter so much for Japan. My initial thought was that there might be a population threshold under which the S factor is difficult to detect; however, [1] small population sizes did not prohibit detecting an S factor in Boston census tracts or among the states in the United States, some of which have relatively small populations, and [2] if there were a threshold under which the S factor is difficult to differentiate from noise, then the low population prefectures would not have been so consistently on one side of zero for the S factor.

Another thought was that it might be relatively easier to detect the S factor in areas in which there is a lot of ethnic or genetic diversity across regions. Based on Wikipedia, the 0-to-1 ethnic fractionalization index for Japan is 0.012, compared to other countries for which the S factor has been detected, such as 0.154 for China, 0.272 for France, 0.491 for the United States, 0.542 for Mexico, 0.656 for Colombia, and 0.811 for India. However, Italy had a detected S factor with a small ethnic fractionalization index (0.04), so maybe ethnic or genetic variation doesn't matter. Or maybe Kenya's idea about migration matters.
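For reference, the fractionalization index cited here is conventionally computed as one minus the Herfindahl concentration of group population shares (a standard definition, not something taken from the manuscript):

$$ F = 1 - \sum_{i=1}^{k} s_i^{2} $$

where $s_i$ is the population share of group $i$; $F$ is the probability that two randomly drawn individuals belong to different groups, so Japan's 0.012 means such a draw almost always yields two members of the same group.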

There's clearly enough theory and empirical evidence for the S factor, so it's an interesting puzzle why the S factor is not detectable in Japan without the population/population density controls.

I recommend publishing the manuscript after a bit of text cleanup, such as "The first line is the variable name given by me, the second is the descriptions are [sic] copied from the website" and "Nurses working at medical establishments (per 100,000 persons)establishments (per 100,000 persons)".
Admin
The ethnic fractionalization indexes are not always good measures of meaningful fractionalization (genetic or otherwise). It has long been known that Italy is a very divided country, with large differences between the north and the south (or the south and the rest). These differences go back a long time, see this post http://emilkirkegaard.dk/en/?p=5391 and Italy itself is a relatively new country (like Germany) https://en.wikipedia.org/wiki/Italian_unification so either 1871 or 1918, depending on one's tastes. However, because they all speak Italian (as far as I know) and don't regard themselves as different ethnic groups, Italy gets a low fractionalization score. At least, that's my half-informed guess. Another plausible correlate is national Gini of income: the more unequal countries should have stronger S factors.

The problem with comparing the strength of S factors across datasets (i.e. to correlate that with fractionalization) is that the strength of the S factor depends strongly on the included variables. Because every study uses a different set of variables (with much overlap), one cannot just compare the factor strengths. It is possible to do a large-scale reanalysis by finding a subset of variables common to all studies and then re-doing all the analyses. This however is something I don't have time for right now.

I will give it an edit for readability and writing errors.
Admin
I have updated the paper.

This version (ID=7) uses Kura's latest values for the infant death variable. The only difference is that they average over more years. This has the effect of increasing the correlation between the two versions of that variable to about .75, from about .55. It is still below the .90 cutoff, though. This slight change in the data meant that all figures had to be updated, as they were very slightly off.

I read over the entire paper and updated the language for readability in many places. I fixed the errors pointed out by Zigerell above.

I'll see if I can get some internet people to read it thru and find language problems.
I was also very surprised by the results after considering the log of population density. Let us look at Figure 2 and the corresponding figure after controlling for population density:

http://openpsych.net/forum/attachment.php?aid=688.


Apparently, they make an eye-opening contrast. The latter is the kind of figure that we repeatedly saw in the previous analyses. Together with the results shown in Figure 6 and Figure 7, all of them seem to make good sense.

I would strongly like the altered figure of indicator sampling reliability (which Emil showed in the discussion section and which I cited above) to be included in the main text.

__

I also still wonder what exactly the log of population density means here. As Emil stated in the comment section, I understand that urban regions are in general higher in their S factor scores, but I am not quite sure about the reason for this phenomenon. Could it be that the more densely people live packed together, the more socially desirable behavior they have to adopt in order to live together?

I might also suggest that, at least over the past 150 years of modernization, the most intelligent people have constantly migrated to the big cities. This tendency could have accumulated to the point that urban prefectures with higher population density show elevated levels of the S factor. However, one can wonder whether this social phenomenon (migration) in Japan could really be so different from the same kind of situation in the United States, Brazil, and other countries.


Table 2 shows that controlling for population density increases the proportion of correlations with |r| > .5 from .53 to .57, while controlling for its log increases it to .93, which obviously fits the dataset better. As Emil pointed out, this implies that Tokyo (the most densely populated region) has a level of the unknown potential variable about 100 times higher than that of Hokkaido (the most sparsely populated region). What factor could be so sensitive to population density?

As L.J. Zigerell suggested, in general I agree that ethnic fractionalization could magnify the S factor, or at least make the extraction of the S factor easier. Japan has a very small foreign-born (and mixed) population to date, and this unique situation may be making the Japanese picture quite different from that of other countries. These questions are left for future research on the S factor. I will be looking forward to seeing it.
Admin
I have added the distribution of ISR correlations to the paper (version 8). The next version also has table borders for the table that is missing them.
Admin
Zigerell sent me some comments on the writing via email, attached.

---

Quotes here but not elsewhere for S factor.


I used the single quotes to talk about the first factor that would otherwise be S if it actually looked like S. They are not only used in the abstract, but also in the text (Sections 8 and 9) and on Figures (e.g. 4).

--

Municipal vs. municipality. Wikipedia uses the first. https://en.wikipedia.org/wiki/Administrative_divisions_of_Japan

--

Ratio of dwellings with flush toilet

Percentage instead of ratio?


The descriptions are copied verbatim from the Japanese website. Sometimes their translations are a bit odd. In this case, they talk about ratios, i.e. fractions (values 0-1), but a look at the data reveals them to be percentages (values 0-100). However, since these are equivalent for use in correlational analysis, there is little point in dwelling on this minor problem.

--

this decimal place is dislocated from the 82


This is because LibreOffice does not understand that dots can be part of words, so it splits them across line breaks. There is no automatic solution for this, but I have fixed the identified instances.

--

should there be a comma between "data" and "analysis"?


I meant to write "data or analysis error". Fixed.

--

can be avoided?


Meant to write "could be avoided".

--

Other minor errors fixed.

Table borders added.


---

Files updated. This is version 9 of the paper.

https://osf.io/4bw8u/files/

I have also written an explanation of the files for the project on OSF: https://osf.io/4bw8u/wiki/home/

This is to make it easier to re-use by other people, including myself, later!
Newbie here. I assume this paper still needs at least one more reviewer. If so, I will download it and have some comments hopefully within a week. If there's another manuscript with more pressing need for review, please let me know.

B
Admin
Newbie here. I assume this paper still needs at least one more reviewer. If so, I will download it and have some comments hopefully within a week. If there's another manuscript with more pressing need for review, please let me know.

B


The paper needs 2 more reviewers because Kenya Kura was too closely related to the project to be an independent reviewer. I will try to get an external reviewer, perhaps another Japanese sociologist (Arikawa?)
This is well-written and rather straightforward. I'm OK with accepting the version I read. Emil might consider any of the changes I suggest below, but these are not required. My comments on sub-domains, I think, would add to the paper, but the time involved to do this would likely not be worth the incremental benefit.

1. I'm uncomfortable adjusting for population size when looking specifically at aggregate-level data -- i.e., geo-political subdivisions such as states, prefectures, nations, etc. The unit of analysis here is uniquely a prefecture, and there are important mean differences across prefectures, including population size. To treat the latter as unique and something that needs to be controlled for (versus something that might be a variable in the S factor) seems arbitrary. I get that the correction allows the data to make sense relative to other data sets, but the correction makes me uneasy in ways that I wish I could express better (and so I could be wrong on this point). Also, as far as I can tell, no explanation is offered for why density mediates the S/IQ relationship.

2. (Minor). Perhaps expand more on Jensen’s method, unless you think all readers here would be familiar with it.

3. Quantitatively, you do a fine job identifying and excluding redundant variables. Qualitatively, you do not. For example, I don't see "In work male" and "unemployment male" as unique enough to include in the same analyses. The sum of both defines "male labor force participation", so I think these should be summed and collapsed into one variable. Labor participation, unemployment, and unemployment benefits seem like examples of this as well.

4. I suggest calculating S factors hierarchically—as we do with g. There are sub-domains of well-being (income, health, crime...) that could be derived. Thereafter, S could be extracted from the sub-domains. In the USA, for example, though a very strong S factor exists, the sub-domains show divergent validity with other variables. For example, health and religiosity correlate more strongly with IQ than does income and education (albeit the education factor for the USA contains relatively few variables—the downside to my suggestion is to make sure each sub-domain is equally construct valid).

5. I’ve considered IQ to be a central node in the S nexus for the USA. You instead use S to predict IQ. I wonder if IQ is just a sub-domain of S, or a cause/effect of S.

6. Right before Section 3, “the data are from Kura and the second that they are from myself.”
Admin
Bryan,

Thanks for reviewing.

Adjusting for population size
As you say, this adjustment is exploratory and there was no strong theoretical justification for doing it. For this reason, it is clearly labelled as such in the paper, e.g. I write "However, in an exploratory (unplanned) analysis ...".

The question of what the unit of analysis is (in this case prefectures) and what one is really interested in is a common conundrum for multi-level data analysis. Should we give more importance to results from schools if they have more students? My preference is to weight units by their importance if possible, while yours seems to be not to do so. Presumably, in a country-level analysis, you would give equal weight to Monaco and to the USA, despite the latter having about 8,500 times as many citizens. Many would probably find that strange. I note that not using weights has been questioned in the cognitive sociology literature (Hunt and Sternberg), so it is (sort of) a case of: damned if I do, damned if I don't.
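To make the weighting point concrete, a population-weighted correlation can be formed from a weighted covariance matrix; a minimal sketch (the variable names are placeholders, not from the paper's code):

```python
import numpy as np

def weighted_corr(x: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:
    """Pearson correlation in which each unit counts in proportion to its weight."""
    cov = np.cov(x, y, aweights=w)
    return float(cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]))

# e.g. weighted_corr(cognitive_ability, s_scores, w=population)
```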

In this case, the control was not made for population size itself, but for (log) population density (in the final analysis). Population density has (I think) been used as an indicator in other S factor analyses and usually has a positive loading (it is closely related to urbanicity, which loads positively). Thus, using it as a control is at the very least somewhat strange.

However, one could make a similar criticism of immigrant % as used in the recently published paper on French departments: http://mankindquarterly.org/archive/paper.php?p=803 (ungated: https://www.researchgate.net/publication/290430496_IQ_and_Socioeconomic_Variables_in_French_Departements_Reanalysis_and_New_Data?ev=prf_pub). Perhaps immigrant % should be seen as another indicator of S, not something to control for. After all, the higher-S areas tend to attract more immigrants if one looks at a fairly zoomed-out level (immigrants live in cities, though in the poorer areas). Here I am talking about European-style, recent immigration (since about 1970).

I don't offer any particular reason why one would need to control for population density in Japan but not in other places, because I can't think of any such reason. It is quite the mystery. In future S factor studies I will try including controls for population density to see what they do. Perhaps one should do a big meta-analysis. Since I have published the data from all my studies, anyone with the time and competence can do this.

Jensen's method
You are right that the paper is light on this method. I have used it so much that it doesn't seem special to me any more (the curse of knowledge).

I have added a footnote with a brief explanation:

Jensen's method (method of correlated vectors, named after the great psychologist Arthur Jensen) consists of correlating the factor loadings of indicators with their relationships to some criterion variable. The reasoning is that if the relationship between the factor scores and the criterion variable is real, then (everything else equal) the indicators that have stronger loadings on the factor (i.e. are better measures of it) should be more strongly related to the criterion variable than those with weaker loadings. For more details, see [1], [19].
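To make the footnote concrete, here is a bare-bones sketch of the computation it describes (the pandas object names are placeholders, not the paper's code):

```python
import numpy as np
import pandas as pd

def jensen_method(indicators: pd.DataFrame, loadings: pd.Series,
                  criterion: pd.Series) -> float:
    """Correlate indicators' factor loadings with their correlations to the criterion."""
    r_with_criterion = indicators.apply(lambda col: col.corr(criterion))
    return float(np.corrcoef(loadings[indicators.columns], r_with_criterion)[0, 1])

# e.g. jensen_method(s_indicators, s_loadings, cognitive_ability)
```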


Data-driven vs. theory-driven research
The difference in methods that you identify concerns one aspect of data-driven vs. theory-driven research. You suggest using researcher judgment to classify variables into groups and aggregate the results. This however requires that choices be made, choices that could be questioned. I prefer the more agnostic approach. Often, these employment variables were not that strongly correlated, owing to complex definitions of who is and isn't included in the categories. For instance, unemployment might only include those that can work, thus excluding the pensioned and the disabled. Or it might include only those receiving benefits (and thus not housewives/husbands). Combining variables risks missing important differences between the variables.

Hierarchical extraction
One could use hierarchical extraction of S. In fact I have been experimenting with this but not published much on it. The topic opens up a large number of methodological questions that require quite a bit of work to answer. I have not had the time yet to seriously examine all of them and thus I prefer not to use this approach. It may change in the future.

See, however: http://openpsych.net/forum/showthread.php?tid=264 This paper is mostly a reply to your comments on this topic in another paper (commentary to our target article in Mankind Quarterly; the paper is currently not publicly available).
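For what it is worth, a schematic of what hierarchical extraction could look like is sketched below; the sub-domain groupings and variable names are invented for illustration and this is not the procedure used in the paper:

```python
import pandas as pd
from sklearn.decomposition import FactorAnalysis

# Hypothetical assignment of indicators to sub-domains (names invented for illustration)
DOMAINS = {
    "health": ["life_expectancy", "infant_mortality"],
    "income": ["median_income", "welfare_dependency"],
    "crime":  ["homicide_rate", "theft_rate"],
}

def hierarchical_s(df: pd.DataFrame) -> pd.Series:
    """Extract one factor per sub-domain, then a second-order S from the domain scores."""
    z = (df - df.mean()) / df.std()
    domain_scores = pd.DataFrame(
        {name: FactorAnalysis(n_components=1).fit_transform(z[cols])[:, 0]
         for name, cols in DOMAINS.items()},
        index=df.index,
    )
    second_order = FactorAnalysis(n_components=1).fit_transform(domain_scores)[:, 0]
    return pd.Series(second_order, index=df.index, name="S_second_order")
```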

Causality
One could make a broader "well-being" factor as you did in the 2010 paper (Pesta et al.). However, I want to integrate this research with that from behavioral genetics and differential psychology, which looks at the causality running from individual differences to educational, economic, criminological and medical outcomes. The reasoning is that the same general causality that holds in that area holds at the aggregate level: countries, states, communes, departments, prefectures, cities, etc. This is my working hypothesis, not something that is definitely established.

For this reason I classify S as an outcome variable and cognitive ability as a causal predictor. It is likely that some backwards causation exists for the poorer regions of the world, but most will be forwards (my guess).

Language
The entire paragraph is "A composite dataset was created by merging the two datasets, yielding 63 variables in total. For identification, “_A” and “_B” were added to the variables names, where the first indicates that the data is from Kura and the second that it is from myself.".

This seems grammatical and understandable to me.

References
Hunt, E., & Sternberg, R. J. (2006). Sorry, wrong numbers: An analysis of a study of a correlation between skin color and IQ. Intelligence, 34(2), 131–137. http://doi.org/10.1016/j.intell.2005.04.004

Pesta, B. J., McDaniel, M. A., & Bertsch, S. (2010). Toward an index of well-being for the fifty US states. Intelligence, 38(1), 160-168.

Update
Version 10 (2016-01-30 10:47 AM) has been uploaded to OSF. It features the footnote mentioned above and no other changes.

https://osf.io/zfw38/
Admin
The paper has already been commented upon by several reviewers, and its analysis seems pretty comprehensive at this point. In addition, the paper reads relatively well; there is no need for any major grammatical or stylistic changes. However, I would ask Emil to address the following points, after which I believe the paper will be ready for publication.

1. Table 1 would be easier for the reader to interpret if the variables were ordered by strength of correlation (from largest negative to largest positive, say).

2. Analogous to Emil's finding that many of the correlations between S variables and cognitive ability became more sensical after conditioning on population density, Lynn (1979) reported a positive correlation between IQ and crime rates across regions of the British Isles, which then fell to zero when conditioning on urbanisation. He noted:

"The positive correlation between crimes rates and mean population IQ (r = + 0.51) is surprising in view of the many findings of a negative relation among individuals... When urbanization is partialled out the correlation between crime rates and mean population IQ drops to zero. Perhaps this is the true relationship between crime and intelligence."

Perhaps Emil would like to mention this.

3. Underneath Figure 7, Emil notes, "Okinawa is an outlier, but it is reasonably close to the regression line". However, eyeballing the graph suggests to me that Okinawa may not only be an outlier, but also an influential point -- at least to some extent. It would be worth reporting the correlation with and without Okinawa included in the sample.
Admin
Noah,

Thank you for reviewing the paper.

1)
Table 1 would be easier for the reader to interpret if the variables were ordered by strength of correlation (from largest negative to largest positive, say).


I have updated the table so that the variables are sorted by correlation (highest first). The rationale for the previous arrangement was to make it easier to see which variable came from which dataset. I guess that is of little interest to the reader.

2)
Analogous to Emil's finding that many of the correlations between S variables and cognitive ability became more sensical after conditioning on population density, Lynn (1979) reported a positive correlation between IQ and crime rates across regions of the British Isles, which then fell to zero when conditioning on urbanisation. He noted:

"The positive correlation between crimes rates and mean population IQ (r = + 0.51) is surprising in view of the many findings of a negative relation among individuals... When urbanization is partialled out the correlation between crime rates and mean population IQ drops to zero. Perhaps this is the true relationship between crime and intelligence."

Perhaps Emil would like to mention this.


I have added a note about this in the Discussion:

Finally, during the review, Noah Carl pointed out that Lynn (1979) employed a similar control and observed that this can have large effects (see also Kirkegaard (2015g) for a reanalysis of that study).

3)
Underneath Figure 7, Emil notes, "Okinawa is an outlier, but it is reasonably close to the regression line". However, eyeballing the graph suggests to me that Okinawa may not only be an outlier, but also an influential point -- at least to some extent. It would be worth reporting the correlation with and without Okinawa included in the sample.


Added:

Okinawa is an outlier, but it is reasonably close to the regression line. If Okinawa is excluded, the correlation decreases to .54 [95CI: .29 to .72].

So, yes, influential, but not solely responsible for the result.

--

John Fuerst went over the paper and suggested changes to the wording. I have followed his advice in most cases. The paper should now read somewhat better.

I have changed the reference system to APA.

New version uploaded: https://osf.io/zfw38/
Version #11.
Admin
I am happy with the revisions Emil has implemented, and approve the paper for publication.