[OQSPS] Inequality across prefectures in Japan is different
2015-Dec-17, 20:39:57, (This post was last modified: 2015-Dec-17, 20:42:48 by Emil.)
#1
[OQSPS] Inequality across prefectures in Japan is different
Journal:
Open Quantitative Sociology and Political Science.

Authors:
Emil O. W. Kirkegaard

Title:
Inequality across prefectures in Japan is different

Abstract:
Two datasets of socioeconomic data for Japanese prefectures (N=47) were obtained and merged. After quality control, there were 44 variables for use in a factor analysis. Indicator sampling reliability analysis revealed poor reliability (54% of correlations were |r| > .50). Inspection of the factor loadings revealed no clear S factor, with many indicators loading in the direction opposite to that expected.

A cognitive ability measure was constructed from three scholastic ability measures (all loadings > .90). Cognitive ability was not strongly related to 'S' factor scores, r = -.19 [CI95: -.45 to .19; N=47]. Jensen's method did not support the relationship between latent 'S' and cognitive ability (r = -.15; N=44). Cognitive ability was nevertheless related to some socioeconomic indicators in expected ways.

Results from Japanese prefectures are strongly in contrast with all previous studies. There does not seem to be an S factor for this level of analysis in Japan.

Key words:
general socioeconomic factor, S factor, Japan, prefectures, inequality, intelligence, IQ, cognitive ability, cognitive sociology

Length:
14 pages, 2963 words, excluding references.

Files:
https://osf.io/4bw8u/files/

External reviewers:
I will get Kura's official comments. He has been discussing the paper with me during research and writing. As for external reviewers, perhaps Gerhard Meisenberg.
Reply
2015-Dec-18, 06:40:56,
#2
RE: [OQSPS] Inequality across prefectures in Japan is different
Very interesting.

1. In terms of presentation, the patterns in Figure 3 are not obvious because -- to assess whether a particular variable reflects the expected pattern -- a reader must read the variable label and then consider whether the expected loading would be positive or negative. It might make the figure easier to interpret to instead plot the inverse of undesirable variables such as divorce so that the expected loadings are positive for all variables; then the reader can simply check whether the point for the variable is to the left or right of zero.

2. It might be worth considering suspicions about the accuracy of Japan's reported abortion rates and suicide rates (e.g., https://www.guttmacher.org/pubs/journals/25s3099.html, http://www.japantimes.co.jp/news/2013/02...nN3SFJRKDk). I'm not sure that misreporting would vary by prefecture, but combining homicides and suicides into an unnatural death measure might avoid at least part of any measurement problems with the suicide measure.

3. It's not clear that some of the variables are necessarily desirable. Museums and libraries negatively correlate with population, which might reflect big cities having fewer-but-bigger libraries; if so, it's not obvious that having a lot of small libraries is better than having fewer big libraries with more amenities.

4. I did not see this pattern in the Mexico S factor study, but it appears that the S factors for the Japan prefectures in Figure 4 correlate fairly highly with population or at least population density. I think that all but one of the top 15 prefectures by population have an S factor above zero, and the two prefectures of those 15 with the lowest S factors (Hokkaidō, and Niigata-ken) have the lowest population density of those 15. Moreover, all but one of the bottom 15 prefectures by population have an S factor below zero, and the one prefecture with an S factor in that set above zero (Kagawa-ken) has the highest population density of that set. I'm not sure what -- if anything -- to make of that, though.
Reply
2015-Dec-18, 07:43:46,
#3
RE: [OQSPS] Inequality across prefectures in Japan is different
Thanks for the comment.

(2015-Dec-18, 06:40:56)ljzigerell Wrote: Very interesting.

1. In terms of presentation, the patterns in Figure 3 are not obvious because -- to assess whether a particular variable reflects the expected pattern -- a reader must read the variable label and then consider whether the expected loading would be positive or negative. It might make the figure easier to interpret to instead plot the inverse of undesirable variables such as divorce so that the expected loadings are positive for all variables; then the reader can simply check whether the point for the variable is to the left or right of zero.

3. It's not clear that some of the variables are necessarily desirable. Museums and libraries negatively correlate with population, which might reflect big cities having fewer-but-bigger libraries; if so, it's not obvious that having a lot of small libraries is better than having fewer big libraries with more amenities.

These two points together are the reason why I do not just reverse the variables -- there is often disagreement about whether they reflect good or bad outcomes!

As you note, one can interpret the museums and libraries findings in a positive light by arguing that it is better to have a few large units than many small units per capita. This is of course assuming that the total size is held constant, something which is not given by the data. I had thought of the same interpretation.

But how much of this kind of thinking is hindsight bias or post hoc theorizing? I would rather avoid possible bias by not 'encoding the results' using my interpretations.

As for divorce, some people are against marriage. For instance, marriage could be seen as unnecessary government interference in civil matters (libertarians), and divorce or lack of marriage as a sign of female empowerment (feminists). According to Wikipedia, there seem to be some people like this: https://en.wikipedia.org/wiki/Marriage_privatization https://en.wikipedia.org/wiki/Criticism_...t_approach


Quote:2. It might be worth considering suspicions about the accuracy of Japan's reported abortion rates and suicide rates (e.g., https://www.guttmacher.org/pubs/journals/25s3099.html, http://www.japantimes.co.jp/news/2013/02...nN3SFJRKDk). I'm not sure that misreporting would vary by prefecture, but combining homicides and suicides into an unnatural death measure might avoid at least part of any measurement problems with the suicide measure.

Yes, but the indicator sampling reliability analysis showed that the results did not depend on the abortion measure. Results for the dataset without that indicator were also not in line with expectations.

I have also tried without the abortion variable at all (before you brought it up, when I was trying to figure out what was wrong). Nothing much changes.

Quote:4. I did not see this pattern in the Mexico S factor study, but it appears that the S factors for the Japan prefectures in Figure 4 correlate fairly highly with population or at least population density. I think that all but one of the top 15 prefectures by population have an S factor above zero, and the two prefectures of those 15 with the lowest S factors (Hokkaidō, and Niigata-ken) have the lowest population density of those 15. Moreover, all but one of the bottom 15 prefectures by population have an S factor below zero, and the one prefecture with an S factor in that set above zero (Kagawa-ken) has the highest population density of that set. I'm not sure what -- if anything -- to make of that, though.

Population density is tricky. On the one hand, population density presumably increases crime rates (by presenting more opportunities for crime and more human conflict), but it is also often a sign of urbanicity, which usually has a positive loading. Thus, one could argue either for inclusion of population density as an indicator, or controlling the indicators for it.

Perhaps I could try both approaches: in the main results, include it as an indicator like the others. For a robustness section, try the analyses corrected for population density (with residualization, as in the French departments study coming out in Mankind Quarterly soon).

Thoughts?
Reply
2015-Dec-18, 18:38:52,
#4
RE: [OQSPS] Inequality across prefectures in Japan is different
Hi Emil,

Thanks for the comments.

I like your idea of adjusting the variables for population density. Something else might be to conduct the analysis on only the high population or high population density prefectures and then on only the low population or low population density prefectures. Maybe also conduct the analyses on only the northern prefectures and then on only the southern prefectures. These will be only exploratory analyses, but these disaggregated analyses might provide a sense of what it is about the Japan prefectures that makes them different from other countries in terms of the S factor.

It's correct that some of the variables are not clearly desirable or undesirable (e.g., marriage), and some might even have complicated desirabilities (e.g., too little and too much are both undesirable). But I'm wondering whether the finding of the lack of an S factor across the Japan prefectures might be strengthened by a model that included only those factors that are clearly desirable or undesirable, reflecting the idea that -- if there is no S factor across Japan prefectures in that model -- then it's really clear that there is no S factor there.

Along those lines, the Mexico and Brazil S factor studies seemed to have a higher percentage of variables that were more clearly desirable or undesirable, and the Mexico and Brazil studies respectively had only 21 and 32 variables, so maybe an analysis with the 20 to 30 most obviously desirable and undesirable variables from Japan would be more convincing and provide a more even comparison. The analyses that you have already reported suggest that Japan is different than Mexico and Brazil in terms of the S factor, but I think it would strengthen the analyses to rule out the larger number of variables in the Japan analysis as a source of the difference between the Mexico/Brazil and Japan studies.

Something else to consider is whether a relatively low variation in some variables might make measurement error a larger problem in this dataset. For example, income per person might need adjusted for cost of living in a prefecture, and divorce rate might need adjusted for marriage rates: in the 2013 data, Shiga-ken and Nara-ken have the same divorce rate (1.64) but the marriage rate is 5.27 in Shiga-ken and 4.44 in Nara-ken, so it's possible that a higher percentage of persons get divorced in Nara-ken and thus that the 1.64 divorce rate in Nara-ken is worse than the 1.64 divorce rate in Shiga-ken. Maybe a divorce-to-marriage ratio might be a better measure than individual marriage and divorce rates. (I'm assuming that the divorce rate is measured per 1,000 persons and not per 1,000 married persons.)
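A minimal sketch of that ratio, using only the two pairs of 2013 rates quoted above and assuming both are measured per 1,000 persons. The function and variable names are illustrative and are not taken from the paper's code.

```python
# Rates quoted above for 2013; assumed to be per 1,000 persons.
rates = {
    "Shiga-ken": {"divorce": 1.64, "marriage": 5.27},
    "Nara-ken": {"divorce": 1.64, "marriage": 4.44},
}


def divorce_to_marriage_ratio(divorce_rate, marriage_rate):
    """Divorces per marriage in the same year (a crude period ratio)."""
    return divorce_rate / marriage_rate


ratios = {
    name: divorce_to_marriage_ratio(r["divorce"], r["marriage"])
    for name, r in rates.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
# Shiga-ken: 0.311
# Nara-ken: 0.369
```

On this measure the two prefectures no longer tie: the same divorce rate corresponds to more divorces per marriage in Nara-ken than in Shiga-ken.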

One way to avoid post-hoc coding biases is to identify ahead of time in general terms the obviously-desirable-or-undesirable variables that should be included in an S factor analysis, and then limit the variables in the main analyses to that set of pre-identified variables, such as measures of health, crime, unemployment, education, income, and dependency. Something like percentage farmers in the Brazil study would not fit in one of those categories, so it's not necessary to consider whether percentage farmers is a desirable or undesirable measure. Infrastructure reflects another set of variables in your S factor studies; this would be a good variable for a cross-national study, but I'm not sure that is always a good idea for subnational studies if, for instance, the quality of infrastructure in a subnational region largely reflects decisions made at the national level.

Hope this is helpful. It's a really interesting study.
Reply
2015-Dec-21, 17:43:59,
#5
RE: [OQSPS] Inequality across prefectures in Japan is different
I am the person who provided the translation of the dataset, which was available only in Japanese, and I would like to endorse this manuscript as very interesting and valuable. There are several points of interest from my point of view.

1. First, the result of the S factor analysis of the Japanese data was fairly different from the previous studies of worldwide (country-level) data and of data from Norway, Finland, Denmark, the U.S., the U.K., Brazil, and Mexico, which the author of this paper has reported with much consistency. However, there is no rule without exception. This seems to be a curious case study to which S factor analysis does not apply as we expected. For example, Figure 2 in this manuscript stands in stark contrast to Figure 1 in the author’s analysis of the 32 London boroughs (2015) cited in the references.


2. Although the result shows that there seems to be no clear S factor extracted from the Japanese data, there is some consistency for some important socioeconomic variables. The variables in Table 1 with |r| > 0.5 (that is, those highly correlated with cognitive ability in either the positive or the negative direction) are the Gini coefficient of asset holdings, unemployment rate, dependency on welfare, height, and divorce rate. Homicide and skin color are also very close to this threshold. These relations have become stylized facts in this literature.


3. Apparently, one of the most notable observations in the Japanese data is the author’s finding that there is no correlation between infant mortality and cognitive ability. When I reported this unexpected absence of a correlation three years ago, I was not sure whether it was due to some error in the statistics or whether there is indeed no relationship in Japanese regions. The author has decisively resolved this question. In other words, the author's sophisticated consistency analysis showed that even the annual infant mortality data from the 47 prefectures do not correlate with each other and show more or less random fluctuation. I should probably add some information obtained after my publication of the Japanese data: many of my personal acquaintances with medical expertise have suggested to me that infant mortality in Japan largely depends on the hospital system, with its advanced obstetrical facilities and experienced personnel.


4. In general, the S factor is a very useful and substantive analytical tool. Most psychologists, including myself, have routinely reported the inter-correlations among socioeconomic variables to show that they form a positive manifold. But, as has been shown, the simplest way to show this is to factor analyze them in order to extract a general variable (the S factor). This meta-variable is supposed to be the indicator for the r-K continuum of human (or primate) behavioral strategy.


To digress somewhat from the description of the manuscript, I have been wondering why no clear S factor is found in Japan. Surely, I can think of some plausible sociological reasons, such as massive migration from rural to urban regions after the war and the resulting differences in demographic composition, ceiling effects due to the uniform central governmental system, or statistical differences among immigrants to Japan that are so far too small to have materialized. Also, as ljzigerell and Emil discuss above, many factors affect these sociological phenomena: for example, urbanization seems to have decreased the marriage rate through female workforce participation, and the same factor has increased the divorce rate through the spread of a more liberal outlook among urban dwellers. However, these changes have occurred in different phases in time and space, and I am not very sure that any of them is compelling enough at this moment. I would rather like to see how the result may change in twenty or thirty years, when more immigrants are expected to have settled in and mingled with the native population.


Anyway, I hope readers of this manuscript enjoy these findings as I did.
Reply
2015-Dec-23, 06:06:58, (This post was last modified: 2016-Jul-05, 02:07:31 by Emil.)
#6
RE: [OQSPS] Inequality across prefectures in Japan is different
(2015-Dec-18, 18:38:52)ljzigerell Wrote: Hi Emil,

Thanks for the comments.

I like your idea of adjusting the variables for population density. Something else might be to conduct the analysis on only the high population or high population density prefectures and then on only the low population or low population density prefectures. Maybe also conduct the analyses on only the northern prefectures and then on only the southern prefectures. These will be only exploratory analyses, but these disaggregated analyses might provide a sense of what it is about the Japan prefectures that makes them different from other countries in terms of the S factor.

Re. running analysis on groups divided by latitude or population (density). This is a typological approach (subgroup analysis), which decreases sample size quite markedly. As you say, it is also purely exploratory and conclusions would not be drawn with much certainty.

Note that adjusting for population density is fairly ad hoc. Population density data was available (or readily calculable from area and population) in many prior studies but no such correction was made. Usually, population density has a fairly strong positive loading because cities tend to be higher in S; urbanicity often has a strong positive loading.

However, I went ahead. I tried 6 controls: population density, log population density, sqrt population density, population, log population, sqrt population. The log/sqrt versions were used to compress the very large differences between prefectures.
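A minimal sketch of what such a control could look like, assuming it is applied by residualization (regressing each indicator on the control and keeping the residuals), as proposed earlier in the thread. The numbers are made up for five hypothetical regions, the helper names are illustrative, and this is not the code shared on OSF.

```python
import math


def residualize(y, x):
    """Residuals of y after a simple OLS regression on x (with intercept)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical values for five made-up regions, NOT the prefecture data.
density = [70.0, 200.0, 650.0, 1500.0, 6000.0]  # persons per km^2
indicator = [3.1, 2.4, 2.9, 1.8, 2.2]           # some socioeconomic indicator

# Three versions of the control: raw, log, and square root.
controls = {
    "density": density,
    "log density": [math.log(d) for d in density],
    "sqrt density": [math.sqrt(d) for d in density],
}

# The residualized indicator is what would enter the factor analysis.
residuals = {
    name: residualize(indicator, control) for name, control in controls.items()
}

for name, res in residuals.items():
    print(name, [round(v, 3) for v in res])
```

By construction, each residualized indicator is uncorrelated with the version of the control it was regressed on, which is why the choice between raw, log and sqrt density can change the downstream loadings.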

The results are attached. I have sorted the loadings by population density log. As can be seen, this control apparently solves most of the problems!

   

It even fixes the indicator sampling reliability which increases to 94% |r|>.50. The others were also improved, but not quite as much.
Proportion of correlations with |r| > .50, by control:
Standard: 0.533
Pop. density: 0.577
Pop. density (log): 0.933
Pop. density (sqrt): 0.899
Population: 0.780
Population (log): 0.816
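For readers unfamiliar with the measure, here is a rough sketch of how a proportion like those above could be computed. It assumes that indicator sampling reliability means: split the indicators at random into two halves, extract a general factor from each half, correlate the two sets of scores, repeat, and report the share of runs with |r| above the threshold. The first principal component stands in for a proper factor analysis, the data are simulated, and all names are illustrative.

```python
import random


def standardize(col):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in col]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)


def first_component_scores(columns, iterations=200):
    """Approximate scores on the first principal component (power iteration)."""
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(z[i][t] * z[j][t] for t in range(n)) / (n - 1)
             for j in range(k)] for i in range(k)]
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]


def sampling_reliability(columns, runs=100, threshold=0.5, seed=1):
    """Share of random split-half runs whose score sets correlate above threshold."""
    rng = random.Random(seed)
    idx = list(range(len(columns)))
    hits = 0
    for _ in range(runs):
        rng.shuffle(idx)
        half = len(idx) // 2
        scores_a = first_component_scores([columns[i] for i in idx[:half]])
        scores_b = first_component_scores([columns[i] for i in idx[half:]])
        if abs(pearson(scores_a, scores_b)) > threshold:
            hits += 1
    return hits / runs


# Simulated data: 47 units, 10 indicators that all share one latent factor.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(47)]
indicators = [[f + 0.5 * rng.gauss(0, 1) for f in latent] for _ in range(10)]

reliability = sampling_reliability(indicators)
print(f"share of runs with |r| > .50: {reliability:.2f}")
```

With a strong shared factor, as simulated here, nearly every split agrees; a value near .53 would mean that about half of the random splits produce factor scores that barely agree with each other.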

   

What about criteria analysis?

   

Jensen's method
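As a reminder of what this check does, here is a small sketch of Jensen's method (the method of correlated vectors) as I understand it being used here: across indicators, correlate each indicator's loading on S with that indicator's correlation with cognitive ability. The six value pairs are hypothetical and the variable names are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical values for six made-up indicators, NOT the paper's results.
# Each position refers to the same indicator in both vectors.
s_loadings = [0.85, 0.70, 0.40, -0.30, -0.65, -0.80]
r_with_cognitive_ability = [0.60, 0.45, 0.30, -0.10, -0.50, -0.55]

# The unit of analysis is the indicator, so N is the number of indicators.
mcv_r = pearson(s_loadings, r_with_cognitive_ability)
print(f"Jensen's method r = {mcv_r:.2f} (N = {len(s_loadings)} indicators)")
```

A strongly positive value means that the indicators that load most strongly on S are also the ones most strongly related to cognitive ability.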

   

So, with the correction, the results become like those in all other countries, more or less. Apparently, Japan is weird in so far as population density almost entirely makes the S factor indiscernible. Now this makes me wonder how controlling for population density in the other analyses affects results. I guess someone could re-analyze all the prior studies.


Quote:It's correct that some of the variables are not clearly desirable or undesirable (e.g., marriage), and some might even have complicated desirabilities (e.g., too little and too much are both undesirable). But I'm wondering whether the finding of the lack of an S factor across the Japan prefectures might be strengthened by a model that included only those factors that are clearly desirable or undesirable, reflecting the idea that -- if there is no S factor across Japan prefectures in that model -- then it's really clear that there is no S factor there.

My problem with this general approach is that it requires me to decide which variables are clearly desirable and which are not.


Quote:Along those lines, the Mexico and Brazil S factor studies seemed to have a higher percentage of variables that were more clearly desirable or undesirable, and the Mexico and Brazil studies respectively had only 21 and 32 variables, so maybe an analysis with the 20 to 30 most obviously desirable and undesirable variables from Japan would be more convincing and provide a more even comparison. The analyses that you have already reported suggest that Japan is different than Mexico and Brazil in terms of the S factor, but I think it would strengthen the analyses to rule out the larger number of variables in the Japan analysis as a source of the difference between the Mexico/Brazil and Japan studies.

The general approach is to include whatever suitable variables I can find. Some datasets simply have more suitable variables than others. I would rather defer in-depth studies of variable composition across studies until someone does a large meta- or mega-analysis. I do not at present have time for this.


Quote:Something else to consider is whether a relatively low variation in some variables might make measurement error a larger problem in this dataset. For example, income per person might need adjusted for cost of living in a prefecture, and divorce rate might need adjusted for marriage rates: in the 2013 data, Shiga-ken and Nara-ken have the same divorce rate (1.64) but the marriage rate is 5.27 in Shiga-ken and 4.44 in Nara-ken, so it's possible that a higher percentage of persons get divorced in Nara-ken and thus that the 1.64 divorce rate in Nara-ken is worse than the 1.64 divorce rate in Shiga-ken. Maybe a divorce-to-marriage ratio might be a better measure than individual marriage and divorce rates. (I'm assuming that the divorce rate is measured per 1,000 persons and not per 1,000 married persons.)

Once one gets started on the "let's correct variables for this or that" and "make new measures from existing variables", it quickly escalates. Due to time constraints, I would rather forgo this option (for now). In my defense, I have shared all the data and code, so if someone thinks this is worthwhile doing, they are more than welcome. :)


Quote:One way to avoid post-hoc coding biases is to identify ahead of time in general terms the obviously-desirable-or-undesirable variables that should be included in an S factor analysis, and then limit the variables in the main analyses to that set of pre-identified variables, such as measures of health, crime, unemployment, education, income, and dependency. Something like percentage farmers in the Brazil study would not fit in one of those categories, so it's not necessary to consider whether percentage farmers is a desirable or undesirable measure. Infrastructure reflects another set of variables in your S factor studies; this would be a good variable for a cross-national study, but I'm not sure that is always a good idea for subnational studies if, for instance, the quality of infrastructure in a subnational region largely reflects decisions made at the national level.

One could do this, but as I mentioned above, usually I take whatever I can find that's suitable. Often this means that there is some overlap across studies. The benefit of including all kinds of variables is that it is more exploratory. Of course, conclusions based on such exploratory research should be somewhat restrained, but I think it is a worthwhile trade-off.


Quote:Hope this is helpful. It's a really interesting study.

Well, it looks like I need to add a new section and redo the abstract and discussion. :)

---

I have rewritten parts of the paper to fit the new results. Have a look. Changes to abstract, robustness section and discussion (+ title). Code and data also updated.
https://osf.io/zfw38/
Reply
2015-Dec-23, 09:31:09,
#7
RE: [OQSPS] Inequality across prefectures in Japan is different
Thanks for running the analyses controlling for population and population density.

I wish I knew why those controls matter so much for Japan. My initial thought was that there might be a population threshold under which the S factor is difficult to detect; however, [1] small population sizes did not prohibit detecting an S factor in Boston census tracts or among the states in the United States, some of which have relatively small populations, and [2] if there were a threshold under which the S factor is difficult to differentiate from noise, then the low population prefectures would not have been so consistently on one side of zero for the S factor.

Another thought was that it might be relatively easier to detect the S factor in areas in which there is a lot of ethnic or genetic diversity across regions. Based on Wikipedia, the 0-to-1 ethnic fractionalization index for Japan is 0.012, compared to other countries for which the S factor has been detected, such as 0.154 for China, 0.272 for France, 0.491 for the United States, 0.542 for Mexico, 0.656 for Colombia, and 0.811 for India. However, Italy had a detected S factor with a small ethnic fractionalization index (0.04), so maybe ethnic or genetic variation doesn't matter. Or maybe Kenya's idea about migration matters.
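For reference, the fractionalization index is usually defined as one minus the sum of squared group shares, which is the probability that two randomly drawn residents belong to different groups. A minimal sketch, assuming that definition underlies the figures above; the group shares below are hypothetical and do not describe any real country.

```python
def fractionalization(shares):
    """1 minus the sum of squared group shares (shares must sum to 1)."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to 1")
    return 1.0 - sum(s * s for s in shares)


# Hypothetical group shares, for illustration only.
homogeneous = fractionalization([0.98, 0.01, 0.01])
diverse = fractionalization([0.40, 0.30, 0.20, 0.10])

print(f"homogeneous: {homogeneous:.4f}")  # 0.0394
print(f"diverse: {diverse:.4f}")          # 0.7000
```

Note that the index only counts the groups that the source data distinguish, so regional differences within a single self-identified ethnic group do not register.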

There's clearly enough theory and empirical evidence for the S factor, so it's an interesting puzzle why the S factor is not detectable in Japan without the population/population density controls.

I recommend publishing the manuscript after a bit of text cleanup, such as "The first line is the variable name given by me, the second is the descriptions are [sic] copied from the website" and "Nurses working at medical establishments (per 100,000 persons)establishments (per 100,000 persons)".
Reply
2015-Dec-23, 11:17:58, (This post was last modified: 2015-Dec-23, 13:16:08 by Emil.)
#8
RE: [OQSPS] Inequality across prefectures in Japan is different
The ethnic fractionalization indexes are not always good measures of meaningful fractionalization (genetic or otherwise). It has long been known that Italy is a very divided country, with large differences between the north and the south (or the south and the rest). These differences go back a long time (see this post: http://emilkirkegaard.dk/en/?p=5391), and Italy itself is a relatively new country, like Germany (https://en.wikipedia.org/wiki/Italian_unification), unified in either 1871 or 1918, depending on one's tastes. However, because they all speak Italian (as far as I know) and don't regard themselves as different ethnic groups, Italy gets a low fractionalization score. At least, that's my half-informed guess. Another plausible correlate is the national Gini coefficient of income. The more unequal countries should have stronger S factors.

The problem with comparing the strength of S factors across datasets (i.e. to correlate that with fractionalization) is that the strength of the S factor depends strongly on the included variables. Because every study uses a different set of variables (with much overlap), one cannot just compare the factor strengths. It is possible to do a large-scale reanalysis by finding a subset of variables common to all studies and then re-doing all the analyses. This however is something I don't have time for right now.

I will give it an edit for readability and writing errors.
Reply
2015-Dec-23, 15:57:40,
#9
RE: [OQSPS] Inequality across prefectures in Japan is different
I have updated the paper.

This version (ID=7) uses Kura's last values for the infant death variable. The only difference is that it averages over more years. This has the effect of increasing the correlations between the two versions of that variable to about .75 from about .55. Still below the .90 cutoff tho. This slight change in data meant that all figures must be updated as they were very slightly off.

I read over the entire paper and updated the language for readability in many places. I fixed the errors pointed out by Zigerell above.

I'll see if I can get some internet people to read it thru and find language problems.
Reply
2015-Dec-23, 19:09:56,
#10
RE: [OQSPS] Inequality across prefectures in Japan is different
I was also very surprised by the results after considering the log of population density. Let us look at Figure 2 and the figure after controlling for population density:

http://openpsych.net/forum/attachment.php?aid=688.


Apparently, they make an eye-opening contrast. The latter is the kind of figure that we repeatedly saw in the previous analyses. Together with the results shown in Figure 6 and Figure 7, all of them seem to make good sense.

I would strongly like the altered figure of indicator sampling reliability (which Emil showed in the discussion thread and I cited above) to be included in the main text.

__

I also still wonder what exactly the log of population density means. As Emil stated in the comment section, I understand that urban regions are in general higher in their S factor scores, but I am not quite sure about the reason for this. Could it be that the more densely people live or are packed together, the more socially desirable behavior they have to adopt in order to live together?

I might also want to suggest that, at least within the history of these 150 years of modernization, the most intelligent people have constantly migrated to big cities. This tendency could have accumulated to the level that urban prefectures with higher population density show elevated levels of the S factor. However, one can wonder if this social phenomenon (migration of population) in Japan could be so different from the same kind of situations of the United States, or Brazil, and other countries.


Table 2 shows that controlling for population density increases the proportion of correlations (more than .5) from .53 to .57, while controlling for its log increases it to .93, which obviously fits the dataset better. As Emil pointed out, this implies that Tokyo (the most densely populated region) has a level of the unknown latent variable 100 times higher than that of Hokkaido (the most sparsely populated region). What factor could be so sensitive to population density?

As L.J. Zigerell suggested, in general, I agree that ethnic fractionalization could magnify the S factor, or at least make the extraction of the S factor easier. Japan has a very small foreign-born (and mixed) population to date, and this unique situation may be making the Japanese picture quite different from that of other countries. These questions are left for future research on the S factor. I will be looking forward to seeing it.
Reply

