Title: Educational attainment, income, use of social benefits, crime rate and the general socioeconomic factor among 71 immigrant groups in Denmark
Abstract:
We obtained data from Denmark for the largest 71 immigrants by country of origin. We show that three important social-economic variables are highly predictable from the Islam rate, IQ, GDP and height of the countries of origin. We further show that there is a general socio-economic factor and this too is very predictable.
Key words: National IQs, group differences, country of origin, Denmark, immigration
PDF attached. Source files attached.
The PISA standard deviation is generally 100; you could use that in the paper. The paper is pretty good and should be published.
The PISA SD was standardized as 100 between countries, not within a population. Within a population it is smaller; the question is how much smaller. The DA immigrant population averages about 90 IQ, so, everything else equal, one might expect a .67 SD difference in the PISA scores. That fits well with a difference of about 60 points if the within-population SD is somewhat smaller (perhaps 80).
No, the PISA SD is 100 within a population. An individual with a PISA score of 700 has an equivalent IQ of 130.
The reporting scales that were developed for each of reading, mathematics and science in PISA 2000 were linear transformations of the natural logit metrics that result from the scaling as described above. The transformations were chosen so that the mean and standard deviation of the PISA 2000 scores was 500 and 100 respectively, for the equally weighted 27 OECD countries that participated in PISA 2000 that had acceptable response rates (Wu and Adams, 2002).
http://www.oecd.org/pisa/pisaproducts/50036771.pdf
You can check the average intranational SD by using PISA data explorer. Use the edit option to select statistics. For comparison groups you can use "by immigrant generations" as this will eliminate between relevant subgroup variance. In this case the SD is about 80.
http://nces.ed.gov/surveys/international/ide/
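To make the disagreement concrete, here is a minimal sketch of the score-to-IQ conversion under the two SDs mentioned in this thread. The function name `pisa_to_iq` is hypothetical, and the PISA mean of 500, IQ mean of 100 and IQ SD of 15 are assumptions chosen for illustration.

```python
def pisa_to_iq(score, pisa_mean=500.0, pisa_sd=100.0, iq_mean=100.0, iq_sd=15.0):
    """Map a PISA score onto an IQ-style scale via its z-score.

    All default parameters are illustrative assumptions, not official values
    for any particular country.
    """
    z = (score - pisa_mean) / pisa_sd
    return iq_mean + iq_sd * z


# SD of 100 (the OECD-wide reporting scale)
iq_sd100 = pisa_to_iq(700)
# SD of 80 (roughly the within-country figure quoted above)
iq_sd80 = pisa_to_iq(700, pisa_sd=80.0)

print(iq_sd100, iq_sd80)  # 130.0 137.5
```

So the "700 equals IQ 130" equivalence only follows if the relevant SD is 100; with a within-population SD near 80, the same score sits further out in the distribution.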
I didn't know that. Good find, Chuck.
I looked up the total Math score for PISA 2012 for Denmark.
Year Country Avg. SD
2012 Denmark 500 82
Danish origin, 508
Second gen. immigrant (western and non-western), 447
Difference, 61.
In SD units, 61/82=0.74
The estimated IQ for total immigrant population for 2013Q2 was 89.9, so about .67 SD under Danish origin.
It's rather close at this aggregate level.
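The arithmetic above can be checked with a few lines. This is only a sketch: the PISA figures are the ones quoted in this post, while the reference IQ of 100 for Danish origin and the IQ SD of 15 are assumptions.

```python
# Figures quoted in the post (PISA 2012 math, Denmark)
danish_origin = 508.0
second_gen = 447.0
total_sd = 82.0

pisa_gap = danish_origin - second_gen  # difference in PISA points
pisa_d = pisa_gap / total_sd           # difference in SD units

# Estimated immigrant IQ from the post; reference values are assumptions
immigrant_iq = 89.9
reference_iq = 100.0  # assumed mean for Danish origin
iq_sd = 15.0          # assumed IQ standard deviation
iq_d = (reference_iq - immigrant_iq) / iq_sd

print(round(pisa_d, 2), round(iq_d, 2))  # 0.74 0.67
```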
The pooled 2012 math SD for "natives" and "migrants" was 79. (With the explorer, you can break PISA scores down by generation, e.g., "native", "second", "first".) Usually when computing between-group differences one uses pooled SDs, not total, since one wants to eliminate variance due to between-group differences.
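A short sketch of the pooled versus total SD distinction follows. The group sizes and per-group SDs are made up for illustration (only the two means come from this thread), and the function names are hypothetical.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled within-group SD: ignores the distance between the group means."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def total_sd(n1, m1, sd1, n2, m2, sd2):
    """SD of the combined sample: within-group plus between-group variation."""
    n = n1 + n2
    grand_mean = (n1 * m1 + n2 * m2) / n
    ss_within = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    return math.sqrt((ss_within + ss_between) / (n - 1))


# Hypothetical sample sizes and SDs; means are the ones quoted in the thread
n_native, mean_native, sd_native = 6000, 508.0, 79.0
n_migrant, mean_migrant, sd_migrant = 1000, 447.0, 79.0

sd_pooled = pooled_sd(n_native, sd_native, n_migrant, sd_migrant)
sd_total = total_sd(n_native, mean_native, sd_native,
                    n_migrant, mean_migrant, sd_migrant)

# d is somewhat larger with the pooled SD because the denominator is smaller
d_pooled = (mean_native - mean_migrant) / sd_pooled
d_total = (mean_native - mean_migrant) / sd_total
print(sd_pooled, sd_total, d_pooled, d_total)
```

Whenever the group means differ, the total SD exceeds the pooled SD, so a d computed on the total SD comes out slightly smaller.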
The dataset for my estimate of the immigrant population is here: https://docs.google.com/spreadsheet/ccc?key=0AoYWmgpqFzdsdFAzeDczeHFXazZuS0hQZFB2d1Z4V2c&usp=sheets_web#gid=0
It is based on LV's NIQ's and my additions as well as census 2013Q2 data. Text is in Danish, but it should be guessable from context.
I used the total SD to calculate the Danish-immigrant d from the army sample. Perhaps we should calculate the pooled SD to see if that makes a difference. Dataset here: https://docs.google.com/spreadsheet/ccc?key=0AoYWmgpqFzdsdGx5eXlySWhmdExUT1VjSUtvSXVOTGc&usp=sheets_web#gid=0
No. The difference is marginal and not worth the time if you are dealing with more than a couple of groups.
I didn't know that you had scores broken down by ethnic groups. Why didn't you compute a correlation coefficient? The problem I saw with your paper is that it didn't indicate if NIQ actually predicted migrant IQ.
I don't have scores broken down by groups. I merely have population sizes by national origin, and I calculated a weighted mean using LV's NIQs. There are no national origin scores unfortunately. However I have contacted the army, and perhaps they can give me the data. I have to call them back to see.
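For readers who want to see what such a population-weighted estimate looks like, here is a minimal sketch. The population counts and national IQ values below are invented placeholders, not the figures from the spreadsheet, and `weighted_mean_iq` is a hypothetical helper name.

```python
def weighted_mean_iq(groups):
    """Population-weighted mean of national IQs.

    `groups` is a list of (population_size, national_iq) pairs.
    """
    total_population = sum(pop for pop, _ in groups)
    return sum(pop * niq for pop, niq in groups) / total_population


# Hypothetical (population, NIQ) pairs for three countries of origin
groups = [(60000, 90.0), (30000, 85.0), (10000, 99.0)]

estimate = weighted_mean_iq(groups)
print(round(estimate, 1))  # 89.4
```

Note that this only propagates the national IQs through the population shares; it says nothing about whether national IQ actually predicts migrant IQ, which is the point raised above.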
A few comments:
The wording in this paper is sometimes abrupt, excessive, or simply wrong. Here are a few examples:
"We obtained data from Denmark for the largest 71 immigrants by country of origin." - replace "immigrants" with "immigrant groups" or "the 71 largest sources of immigrants by ..." You might want to mention that some of these groups are heterogeneous. The term "Ugandan" probably includes not only indigenous Africans but also people of South Asian origin.
"Yugoslavia's IQ" - the state of Yugoslavia has not existed for some time. You should write "the former Yugoslavia"
"they immigrated too late to the country and so have not taken advantage of the tax-funded education system." - Why not simply say: "and so have not attended Danish schools". Schools are tax-funded in most countries. Why insist on that point?
"Perhaps some immigrant groups are criminal" - Wouldn't it be more correct to say that some immigrant groups have a high incidence of criminality?
"Islam", "Islam", "Islam", etc. etc. etc. As I pointed out in a previous comment, most Muslims in Western Europe are only nominally so. This is particularly true among the young men who contribute the most to the high crime rates we hear about. I knew a young Moroccan man who would brag about the people he had beaten up ("You see that blood on my jacket? It's not mine!"). He had no interest at all in religion and was probably more influenced by African American hip hop than anything else. Insofar as "Islam" plays a role in all this, it is simply as an identity tag and not as a belief system.
If we look at immigrants to Denmark, they fall into three groups:
1. Other Europeans.
2. People from North Africa, the Middle East, Afghanistan, and Pakistan, i.e., the Muslim world.
3. East Asia, Southeast Asia, and non-Muslim South Asia.
On the face of it, religion (i.e., Islam) is the main difference between group 2 and groups 1 and 3. I would argue that there is another difference. Groups 1 and 3 have lived in societies where the State has exercised an effective monopoly over the use of violence, with the result that people have much higher thresholds for expression of personal violence. Group 2 corresponds to societies where the state has never monopolized the use of violence and where every adult male is expected to use violence as a legitimate means of resolving personal disputes. Thresholds for the expression of personal violence are thus correspondingly lower.
I don't expect to convince you. All I'm saying is that you shouldn't prejudge things by positing "Islam" as a causal factor when it probably isn't.
Thank you for good suggestions.
Fixed.
Fixed.
Fixed.
We did the regression again but for Europeans only. It still holds, and we added a section about this. It does not hold for Asian countries, though, but the samples are getting awfully small to be doing this type of moderator analysis.
As for Muslims being only nominally Muslims, have you seen this? http://www.wzb.eu/sites/default/files/u6/koopmans_englisch_ed.pdf
If someone is wondering why many of the correlations have changed slightly, it is because we found some errors in the national IQ variable and have corrected them. Also, we have switched to using GDP from 2013 instead of 2012.
New draft attached.
"We did the regression again but for europeans only. It still holds"
You're missing my point. Historically, most Muslims (including those of the former Ottoman lands of southeastern Europe) have lived in societies where the State has only recently imposed a monopoly on violence. It was normal for every adult male to use personal violence as a way to settle disputes or to protect his family from real or imagined threats. The threshold for expression of violence was thus much lower than it is in Western Europe, where only the police and the army are supposed to use violence. This has nothing to do with Islam as a belief system. One might argue that Islamic culture hindered, in various ways, the development of the State, but that would be another debate.
As for those surveys you referred to, you'll find that adherence to fundamentalist Islam tends to be stronger among women than among men and among older age groups than among younger age groups. Yet personal violence is most common among young men (as is the case in all human populations). The correlation between the presumed cause and the presumed effect is negative.
There seems to be an assumption here that personal violence among young men of Muslim origin will decline if Muslim women are forced to remove their head scarves. This is delusional thinking.
Kinda like the difference between southern and northern states in the US? At least according to Nisbett. See: http://vimeo.com/19921232
There seems to be no way to test your hypothesis with these data. But Islam is a really great predictor even controlling for IQ, GDP, height. There has to be something about Islamic countries, it could be the religion and its teachings, it could be a culture of self-delivered justice as you say. I can add this to the discussion. Will that satisfy you? :)
Hi Peter,
What type of specific variable would you ideally like us to control for? We can dig up state antiquity ones, but I'm not sure if that's what you want.
http://www.econ.brown.edu/fac/louis_putterman/antiquity%20index.htm
I want first to react to this comment: "This has nothing to do with Islam as a belief system."
A regression or partial correlation just tells you that when x is controlled, y still predicts z, i.e., there is something in y that is related to z. There's nothing wrong with regression, unless you interpret the output beyond what it shows you.
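As an illustration of what "y still predicts z when x is controlled" means, here is a small sketch using simulated data. Everything in it (the variables, the coefficients, the helper `partial_corr`) is hypothetical and only meant to show the mechanics of a partial correlation via residuals.

```python
import numpy as np


def partial_corr(y, z, x):
    """Correlation between y and z after removing the linear effect of x from both."""
    design = np.column_stack([np.ones_like(x), x])
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    resid_z = z - design @ np.linalg.lstsq(design, z, rcond=None)[0]
    return np.corrcoef(resid_y, resid_z)[0, 1]


rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)       # the control variable
unique = rng.normal(size=n)  # the part of y that is independent of x
y = x + unique
z = x + 0.8 * unique + rng.normal(size=n)

raw_r = np.corrcoef(y, z)[0, 1]
partial_r = partial_corr(y, z, x)
print(raw_r, partial_r)
```

The partial correlation stays clearly above zero here because y carries something beyond x that is related to z. The statistic itself cannot say what that something is, which is the caution in the comment above.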
For the article, now...
I said it before but I like the tables when they look like the one you have in table 13. The numbers appearing in your correlation matrix are just too small and I must zoom in constantly. And worse, they don't fit the text. That is, the numbers in your tables are too small compared to the text.
The first thing I find very shocking is that you have not explained what high values on your educational variables mean. A worse outcome or a better one? Anyone looking at your table 1, with basic school correlating negatively with GDP and IQ, may not interpret it correctly. If that variable measures the % of people who had only basic schooling, it's obvious that the expected correlation will be negative. You must specify this below table 1, or rename your variable to "percentage having only basic school", for example.
For your Islam variable, I looked at your .sav file and, fortunately, it's not a dichotomous variable. Otherwise, I would have questioned the relevance of that data.
"The corollary of this is that the known correlates of g are also retained such as average education levels."
This sentence is not clear to me.
"All four predictors performed similarly for this variable which is surprising given that height seems to have no conceptual relevance to education."
It's curious because I think it has relevance. Height is correlated with education because it correlates with physical health, SES, and perhaps IQ (if I remember well).
Also, when you use PC2EdAtt, you should probably specify that, whatever the correlations may look like, none of them make sense because the PC2 does not make sense, as you admitted. In fact, there is no need to correlate that PC2 if it is not interpretable, i.e., if there is no theoretical basis for it.
"IQs (NIQ) did not predict income after age 60,"
That's wrong. The r was 0.225. Not impressive maybe, but not small either.
For tables 6 & 7, I know that early income is PC2, but perhaps you should specify it in your text, and make sure everyone understands it. For example, the sentence may look like "The second was interpretable as an early income and we label it as such, e.g., in table 7." And for that same reason, you also expect a negative r between use of social benefits and height, and that is exactly what you have in tables 8 and 10.
"Perhaps some immigrant groups high relatively high crime rates but also have relatively high average incomes"
There is surely one word or more that is missing here.
When you jump from table 9 to 10, it's too abrupt. You simply write that the correlations are shown below, but you don't discuss the output. I think it's recommended to do so. For example, height is negatively correlated with Islam and social benefits. Why this pattern? You could say (but it's my interpretation, not necessarily yours) that taller people have higher SES levels, and because social benefits correlate negatively with GDP and IQ, this pattern was also expected. Why Islam has a negative sign for height, well, I don't know.
Concerning table 11, I want to make sure about something. Can you reverse the numbers in "long-tert-edu" and "income", and then redo the PC? If everything is right, the PCs with and without "long-tert-edu" reversed should correlate at 100% (in absolute value, since the sign of a PC is arbitrary).
To be honest, I recommend that you reverse the income and long-tert-edu variables, or the others. Usually, in PC analysis, all the loadings must be positive. In your situation, you have signs in every direction, and it's truly confusing. It's even more complicated to interpret PC2, PC3, and PC4. I'm pretty sure most practitioners will not like your table 11. I, for one, don't like it.
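The check requested for table 11 can be sketched as follows. The data are simulated (71 rows to mirror the number of groups, four made-up variables), the PCA is done via an SVD of the standardized data, and `first_pc_scores` is a hypothetical helper; none of this is the authors' actual procedure.

```python
import numpy as np


def first_pc_scores(data):
    """Scores on the first principal component of the standardized variables."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[0]


rng = np.random.default_rng(0)
common = rng.normal(size=(71, 1))  # a simulated general factor
# Two variables load positively and two negatively, as in a mixed-sign table
data = common * np.array([1.0, 0.8, -0.7, -0.6]) + rng.normal(scale=0.5, size=(71, 4))

# Reverse the two negatively keyed variables
flipped = data.copy()
flipped[:, 2:] *= -1

r = np.corrcoef(first_pc_scores(data), first_pc_scores(flipped))[0, 1]
print(abs(r))
```

Reversing variables changes the signs of the loadings but not the component scores, apart from a possible overall sign flip. The reversal therefore helps readability without altering the results.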
"We tested whether predictors could be combined to improve the prediction of the general socioeconomic factor with multiple regression."
To be honest, I don't like R². The practice of squaring correlations is really so wrong. I don't think it's needed for your analyses. It really adds nothing, just more confusion for people who trust the R². If any predictor adds something, the best way to see it is through its beta coefficient (standardized and/or unstandardized). The R² is difficult to interpret, especially for the individual predictors. It does not even tell you how much they change when more independent variables are added to the model.
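For anyone who wants to compare the two summaries side by side, here is a rough sketch that computes standardized betas and the model R² from the same fit. The data are simulated, and the names `standardized_betas`, `x1` and `x2` are hypothetical; this is one common way to obtain standardized coefficients, not necessarily how the paper's software does it.

```python
import numpy as np


def standardized_betas(X, y):
    """Fit y on X after z-scoring everything; return (betas, R squared)."""
    zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    # No intercept is needed because all variables have mean zero
    betas, *_ = np.linalg.lstsq(zx, zy, rcond=None)
    residuals = zy - zx @ betas
    r_squared = 1.0 - np.sum(residuals**2) / np.sum(zy**2)
    return betas, r_squared


rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)  # correlated with x1
y = 0.6 * x1 + 0.1 * x2 + rng.normal(scale=0.5, size=n)

betas, r_squared = standardized_betas(np.column_stack([x1, x2]), y)
print(betas, r_squared)
```

The betas show how much each predictor contributes with the other held constant, whereas R² only summarizes the fit of the model as a whole.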
"Islam was an very good predictor"
There is an error with that word.
"was due to Islamic countries have low IQs"
Must be "having".
Concerning your series of plots, I'm not sure everyone will understand what you are doing. You have not introduced the method of the regression's predicted values, or what you attempt to show in the different plots.
"If anything Islam is a better predictor within the sample of European countries of origin although this was due to two countries Bosnia Herzegovina and Macedonia."
OK, but that simply means your correlation depends only on these two countries. Remove them, and perhaps the correlation will fall. At first glance I don't see any clear trend for European countries. However, I'm not sure that the removal of the MENAPs is the correct thing to do. As Templer said in "Can't see the forest because of the trees", "To omit the Black-African countries in considering the geographical distribution of HIV/AIDS is like omitting the Northwestern European countries in a study of Nobel laureates." So you'll surely remove some relevant information belonging to these countries, and some variance as well.
"We have shown that how well an immigrant group"
I'm not very good at English but I'm sure this sentence is wrong. The word "that" can be removed.
By the way... the introduction seems a little bit short, no? Perhaps I would suggest a little description of what you attempt to show and what you expect to find in your series of correlations/regressions. I feel the readers may be at a loss otherwise.
"Islam is a really great predictor even controlling for IQ, GDP, height. There has to be something about Islamic countries, it could be the religion and its teachings, it could be a culture of self-delivered justice as you say."
I'm saying that if you exclude immigrants from Europe and East Asia (two culture areas whose social relations have long been pacified by the State), you're left with a pool of immigrants who are mainly Muslim. Hence the "really great" correlation.
I could speculate on the reasons why the right to personal violence has survived to a greater extent in the Muslim world. One reason was the problem of succession: deaths of emperors or princes were typically followed by long periods of civil conflict and anarchy. This was largely because polygamy produced large numbers of rival male heirs. Another reason was the romantic ideal of nomadic pastoralism and a corresponding denigration of civilized urban living. Perhaps there are other reasons.
You mentioned that there is a high correlation between Muslim origin and criminality even among native-born Europeans. This correlation, however, is largely driven by the Muslim minorities of southeastern Europe (Albanians and Bosniaks, for the most part). Most of those "Muslims" are non-observant. The relevant factor is not Islam but the fact of living in a familialist clan-based society where the good of the family takes precedence over the good of society, which is at best an abstract concept of recent origin.
Thanks. Emil will note the points you made. We cannot explore the issue further statistically, since we currently have no good variable with which to separate the effect of "Islam" from that of "region of origin".
I said it before, but I like tables that look like your table 13. The numbers appearing in your correlation matrix are just too small and I must zoom in constantly. And worse, they don't fit the text. That is, the numbers in your tables are too small compared to the text.
You must be looking at the 1st ed. In the 2nd ed., Table 13 has large fonts because it is a native LaTeX table.
The reason why many of the tables have different font sizes is that they are copied as a pic from a spreadsheet where we created them. We can enlarge and rotate the size of the pic of course. I have tried keeping them readable as well as not overly large. If they get above a certain size, I will need to rotate them 90 degrees.
The first thing I find very shocking is that you have not explained what high values in your educational variables mean. Worse outcome or better outcome? Anyone looking at your table 1, with basic school correlating negatively with GDP and IQ, may not interpret it correctly. If that variable measures the % of people who had only basic schooling, it's obvious that the expected correlation will be negative. You must specify this below table 1, or rename your variable to "percentage having only basic school", for example.
The beginning of Section 3 notes exactly what they are: "Correlational analysis of the proportion who have only basic schooling is shown in Table \ref{basic_school}, while Table \ref{tertiary_ed} shows the similar analysis for proportion who have long tertiary educational degrees." These are Tables 1-2.
I have added more explanatory text to the table captions.
For your Islam variable, I looked at your .sav file and, thankfully, it's not a dichotomous variable. Otherwise, I would have questioned the relevance of that data.
It is not. It is based on the Pew Research survey, which you can find on Wikipedia, as mentioned in the paper. https://en.wikipedia.org/wiki/Islam_by_country As you can see, countries are not coded as either 0 or 1. E.g. Benin has 24.5% Muslims while Bangladesh is 90.4%.
"The corollary of this is that the known correlates of g are also retained such as average education levels."
This sentence is not clear to me.
I'm not sure how to make it more clear. It seems clear to me especially in the context:
In our previous paper\cite{fuerstkirkegaard2014} we introduced the spatial transferability hypothesis, which is the proposition that when people migrate to other countries, they retain their traits, whether personality, cognitive or other. The corollary of this is that the known correlates of \textit{g} are also retained such as average education levels. We have previously shown this to be true for fertility and crime rates in Denmark\cite{kirkegaard2014DK}, crime rates in Norway\cite{kirkegaard2014NO}, GMAT, TOEFL, GRE, PISA and GPA in the U.S\cite{JFuerst1,JFuerst2}. In this paper we examine new data from Denmark about educational level, income and use of social benefits.
"All four predictors performed similarly for this variable which is surprising given that height seems to have no conceptual relevance to education."
It's curious because I think it has relevance. Height is correlated with education because it correlates with physical health, SES, and perhaps IQ (if I remember well).
Yes, but we wrote "conceptual relevance", not just relevance as a correlate with something else.
Also, when you use PC2EdAtt, you should probably specify that whatever the correlations may look like, none of them make sense because the PC2 does not make sense, as you admitted. In fact, there is no need to correlate that PC2 if it is not interpretable, if there is no theoretical basis for PC2.
I have written some more in the text about PC2, but kept it in the matrix to show that it is a nonsense factor.
"IQs (NIQ) did not predict income after age 60,"
That's wrong. The r was 0.225. Not impressive maybe, but not small either.
You were looking at the Spearman rho only. The Pearson r is .064. The Spearman rho has p=.072, so perhaps a fluke.
For tables 6 & 7, I know that early income is PC2, but perhaps you should specify it in your text and make sure everyone understands it. For example, the sentence could read "The second was interpretable as an early income and we label it as such, e.g., in table 7." And for that same reason, you also expect a negative r between use of social benefits and height. And that's exactly what you have in tables 8 and 10.
But the variables are named "latent_adult_income" and "latent_early_income" in Table 7, so I'm not sure what you are criticizing.
"Perhaps some immigrant groups high relatively high crime rates but also have relatively high average incomes"
There is surely one word or more that is missing here.
Fixed.
When you jump from table 9 to 10, it's too abrupt. You simply write that the correlations are shown below, but you don't discuss the output. I think it's recommended to do so. For example, height is negatively correlated with Islam and social benefits. Why this pattern? You could say (but it's my interpretation, not necessarily yours) that taller people have higher SES levels, and because social benefits correlate negatively with GDP and IQ, this pattern was also expected. Why Islam has a negative sign for height, well, I don't know.
I have added a bit more text about the results in Table 10.
But these e.g. Islam r height are the country level relationships, which are not the topic of this paper. The predictive ability of height seems to be mostly due to its association with NIQs and GDP at the national levels. If it was not, it should add predictive value in multiple regression, but it did not.
I did a partial correlation just now with height x PC1SocialBenefits. It is -0.09 controlling for NIQ, GDP and Islam, so height has no independent predictive value. It is -.211 with just NIQ controlled, -.086 with just GDP controlled, and -.314 with just Islam controlled. So the relationship is due to GDP. GDP boosts height, it seems.
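For readers who want to reproduce this kind of check outside SPSS, here is a minimal sketch of a partial correlation computed from regression residuals. The data are synthetic and the variable names (`gdp`, `height`, `benefits`) are illustrative assumptions, not the paper's dataset; the numbers it produces say nothing about the real results.

```python
import numpy as np

def partial_corr(x, y, controls):
    """Correlation between x and y after regressing both on the controls."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept plus the control variables.
    Z = np.column_stack([np.ones(len(x)), np.asarray(controls, dtype=float)])
    # Residuals of x and y once the linear effect of the controls is removed.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic example (assumption): height and benefits use are both driven
# by GDP and are otherwise unrelated.
rng = np.random.default_rng(0)
n = 2000
gdp = rng.normal(size=n)
height = 0.8 * gdp + rng.normal(scale=0.6, size=n)
benefits = -0.7 * gdp + rng.normal(scale=0.7, size=n)

zero_order = float(np.corrcoef(height, benefits)[0, 1])
partial = partial_corr(height, benefits, gdp.reshape(-1, 1))
print(f"zero-order r = {zero_order:.3f}, partial r (GDP controlled) = {partial:.3f}")
```

In this toy setup the zero-order correlation is clearly negative while the partial correlation is close to zero, which is the pattern one would expect if a predictor only works through a third variable.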
Concerning table 11, I want to make sure about something. Can you reverse the numbers in "long-tert-edu" and "income" ? And then, re-do the PC? If everything is right, the PCs with and without "long-tert-edu" reversed should be correlated at 100%.
I can but the direction makes no difference.
To be honest, I recommend that you reverse the income and long-tert-edu variables, or else the others. Usually, in PC analysis, all the loadings should be positive. In your situation, you have signs in every direction, and it's truly confusing. It's even more complicated to interpret PC2, PC3, and PC4. I'm pretty sure most practitioners will not like your table 11. And, for example, I don't like it.
It is because some variables measure good things and others negative things (in the context of how well the group is doing in Denmark). We can reverse variables so that positive values are always better and negative always worse, but it makes no difference for the math.
What you are proposing is actually reversing the two positive ones so that more is always worse. I think it is better like it is now with the variables facing the direction that makes sense for them rather than so they all come out negative or positive in the PCA table.
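The claim that direction makes no difference can be checked with a small sketch. It runs a PCA on the correlation matrix of synthetic data (four made-up indicators, two "good" and two "bad"; not the paper's variables), then reverses the two "good" ones and repeats. The helper `first_pc` is a hypothetical name introduced here for illustration.

```python
import numpy as np

def first_pc(data):
    """Loadings and scores of the first principal component (correlation-matrix PCA)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    loadings = v * np.sqrt(eigvals[-1])
    scores = z @ v
    return loadings, scores

# Synthetic data (assumption): one general factor g, two indicators where
# more is better and two where more is worse.
rng = np.random.default_rng(1)
n = 300
g = rng.normal(size=n)
data = np.column_stack([
    g + rng.normal(scale=0.5, size=n),    # e.g. income
    g + rng.normal(scale=0.5, size=n),    # e.g. long tertiary education
    -g + rng.normal(scale=0.5, size=n),   # e.g. use of social benefits
    -g + rng.normal(scale=0.5, size=n),   # e.g. crime rate
])

reversed_data = data.copy()
reversed_data[:, [0, 1]] *= -1            # reverse the two "good" variables

load_a, score_a = first_pc(data)
load_b, score_b = first_pc(reversed_data)
score_corr = abs(float(np.corrcoef(score_a, score_b)[0, 1]))
print("loadings, original:", np.round(load_a, 3))
print("loadings, reversed:", np.round(load_b, 3))
print("abs. correlation between PC1 scores:", round(score_corr, 6))
```

Reversing a variable only flips the sign of its loading; the absolute loadings and the PC1 scores (up to an overall sign) are unchanged, so the choice is purely one of presentation.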
"We tested whether predictors could be combined to improve the prediction of the general socioeconomic factor with multiple regression."
To be honest, I don't like R². The practice of squaring correlations is really wrong. I don't think it's needed for your analyses. It adds nothing, just more confusion for people who trust the R². If any predictor adds something, the best way to look at it is through its beta coefficient (standardized and/or unstandardized). The r² is difficult to interpret, especially for the individual predictors. It does not even tell you how much they change when more independent variables are added to the model.
The reason to use R² in this case is that SPSS calculates the adjusted R² but not an adjusted R. In multiple regression, just adding a variable generally increases the R value, even when it is a nonsense, randomly distributed variable. This is because the regression capitalizes on random fluctuation in the data.
In Table 13 we report R, R² and adjusted R², so researchers who prefer R can just read that column.
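To illustrate the point about nonsense predictors, here is a minimal sketch on synthetic data. It fits an OLS model with one real predictor, then adds a purely random one, and compares R² with adjusted R². The adjustment used is the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1); that SPSS reports exactly this quantity is an assumption here, not something checked against the paper's output.

```python
import numpy as np

def r_squared(X, y):
    """R² of an OLS fit of y on the columns of X (an intercept is added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def adjusted_r_squared(r2, n, p):
    """Standard adjustment for n cases and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Synthetic data (assumption): y depends on x only; noise is unrelated to y.
rng = np.random.default_rng(2)
n = 60
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
noise = rng.normal(size=n)

r2_base = r_squared(x.reshape(-1, 1), y)
r2_noise = r_squared(np.column_stack([x, noise]), y)
adj_base = adjusted_r_squared(r2_base, n, 1)
adj_noise = adjusted_r_squared(r2_noise, n, 2)
print(f"R2: {r2_base:.4f} -> {r2_noise:.4f}")
print(f"adjusted R2: {adj_base:.4f} -> {adj_noise:.4f}")
```

R² can never go down when a predictor is added to an OLS model, whereas the adjusted value charges a penalty for each extra predictor, which is why it is the more informative column when comparing models of different size.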
It takes up too much space if we were to report all the beta coefficients. Anyone curious can easily calculate them as they like with the dataset. We think most people will not care much.
"Islam was an very good predictor"
There is an error with that word.
Fixed.
"was due to Islamic countries have low IQs"
Must be "having".
Fixed.
Concerning your series of plots, I'm not sure everyone will understand what you do. You have not introduced the method of plotting the regression's predicted values, or what you attempt to show in the different plots.
It is mentioned on the plot "PC1GeneralSES63". However, I have added some more text.
"If anything Islam is a better predictor within the sample of European countries of origin although this was due to two countries Bosnia Herzegovina and Macedonia."
Ok, but that simply means your correlation depends only on these two countries. Remove them, and perhaps the correlation will fall. At first glance I don't see any clear trend for European countries. However, I'm not sure that the removal of the MENAPs is the correct thing to do. As Templer said in Can't see the forest because of the trees, "To omit the Black-African countries in considering the geographical distribution of HIV/AIDS is like omitting the Northwestern European countries in a study of Nobel laureates." So you'll surely remove some relevant information belonging to these countries, and some variance as well.
Neither are we, but Peter Frost wanted to see it. There will surely be more discussion of Islam as a predictor in later datasets, so it seems right to include this.
"We have shown that how well an immigrant group"
I'm not very good at English, but I'm sure this sentence is wrong. The word "that" can be removed.
It is not wrong, but it can be removed in most cases. I think it is better to keep it in this case.
By the way, the introduction seems a little bit short, no? Perhaps I would suggest a short description of what you attempt to show and what you expect to find in your series of correlations/regressions. I feel the readers may be at a loss otherwise.
I think introductions are generally too long in scientific papers and for this reason most people tend to skim or skip them. I want to get to the point and not write a half textbook introducing the reader to the subject. The reader can read the references given.
---
I have attached a new version with the changes made based on Meng Hu's commentary.