I used only SSA countries but Islam is still a good predictor. R = .914. Even removing Somalia does not get rid of the correlation. It is .686 without Somalia.
I also tried dividing countries into MENAP and not. The correlation is positive for both groups. R = .593 in non-MENAP and R = .314 in MENAP. This does not seem to fit well with Peter's hypothesis.
The predictive ability of Islam seems quite robust.
I entered countries with dichotomous variables for whether they were European, SSAfrican, or MENAP. Then I ran a multiple regression along with the Islam variable to see whether these variables had any effect on predicting performance in Denmark. SSA and MENAP had no effect in the model, but European had a large effect, though still smaller than Islam's and in the opposite direction.
It also indicates that being MENAP in itself is not associated with explanatory power.
We seem to be talking past each other. I'm not challenging your correlation. I'm saying it's an artefact of factors that are tangentially related to Islam. To be brief, Islamic polities have been less effective at monopolizing the use of violence than polities in Europe and East Asia.
When you limit your correlation to European societies, you should keep in mind that much of southeastern Europe was still part of the Ottoman Empire until just before World War I. This is why Albanians are still oriented toward amoral familialism, clannism, and increased readiness to use personal violence for defence of "honour" and "face." A major criticism of the Ottoman Empire was that it could not maintain security of life and property. Banditry was very common and the State was unable or unwilling to put an end to it. The preferred strategy was to co-opt bandits and warlords.
In the case of sub-Saharan Africa, you should keep in mind that the non-Muslim immigrants come mainly from formerly British East Africa (Uganda, Kenya, Tanzania). Many if not most of these "Africans" are of South Asian origin.
Emil, O.K., I think I will give you my vote, because I don't see any particular flaws in the article, just some points I wouldn't recommend, which I explained in my earlier comment. I replied to your comment below. Let's see whether you disagree, but whatever the case, I don't see any reason to disapprove of publication.
I would never trust the p-value if I were you. It's not significant because the sample is not large enough, yet the sample is not small either. I recommend you not be mistaken about what a significance test is; it seems to me a lot of people don't know what it is. I have seen many people find a correlation of, say, between 0.1 and 0.3 with p larger than 0.05 and conclude "no correlation". That's wrong. A large p means that, whatever your correlation is, your N is not large enough to give much confidence in the result. The reason I dislike the p-value is that it cannot add much new information: the p-value (or χ²) is based on two things, sample size and effect size. You already have both of those, so the p-value adds nothing worthy of consideration.
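The claim that p is just a repackaging of effect size and N can be illustrated numerically. This is a hedged sketch (the function name and the numbers are mine, not from the thread), using the Fisher z-transformation and a stdlib-only normal approximation:

```python
# Illustrative only: the p-value of a correlation is a function of just
# two numbers you already have -- the effect size r and the sample size N.
import math

def p_from_r(r, n):
    """Approximate two-sided p-value for a Pearson r with n observations."""
    z = math.atanh(r) * math.sqrt(n - 3)       # Fisher z scaled by 1/SE
    return math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail

# The same modest correlation flips from "not significant" to
# "significant" purely by increasing N:
print(p_from_r(0.2, 30) > 0.05)    # True: N too small for confidence
print(p_from_r(0.2, 300) < 0.05)   # True: same r, larger N
```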
And yes, you note there's a difference between r and rho. That may mean an outlier is killing your correlation.
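A toy illustration of the r-versus-rho point (data invented, helpers hand-rolled rather than taken from any library used in the paper): one extreme value drags Pearson r well below Spearman rho, which only sees ranks.

```python
# Sketch: a single outlier can open a gap between Pearson r and Spearman rho.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    rank = lambda v: [sorted(v).index(e) + 1 for e in v]  # assumes no ties
    return pearson(rank(x), rank(y))

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 50]      # last point is an extreme outlier
print(pearson(x, y))               # noticeably depressed by the outlier
print(spearman(x, y))              # ranks ignore the extreme value
```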
I know, but that's not what I said. I said that it's not necessary to correlate PC2 with the other variables.
It will make a big difference for the interpretation; I'm sure I'm not the only one who finds your Table 11 difficult to read. Generally, a "g" factor has all positive loadings. Sometimes practitioners even remove the variables that have zero or negative loadings on PC1; they want it to be all positive.
Well, a lot of researchers interpret it like that. They add a new variable to the regression, the variable correlates well with the dependent variable, and yet R² is low, so they conclude the new variable is not important. Given that, I don't see why R² should be trusted. To get the best picture of the effect of any given variable, the best way is to examine the regression coefficient, standardized or not. It's better than R² or R.
P.S. Regarding the small size of the numbers in your tables: I was reading the 2nd version, not the 1st. Even in the 3rd draft, the numbers are all smaller than the letters in your text.
You were looking at the Spearman rho only. The Pearson r is .064. The Spearman rho has p=.072, so perhaps a fluke.
I have written some more in the text about PC2, but kept it in the matrix to show that it is a nonsense factor.
It is because some variables measure good things and others bad things (in the context of how well the group does in Denmark). We can reverse variables so that positive values are always better and negative always worse, but it makes no difference to the math.
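A minimal sketch of the reversal point, with made-up scores (the variable names are mine): negating a "bad" variable flips the sign of its correlations but changes no magnitude, so the math is untouched.

```python
# Sketch: reversing a variable (e.g. coding crime as "law-abidingness")
# flips the sign of its correlations and loadings, nothing more.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

income = [30, 42, 55, 61, 70]
crime = [9, 7, 6, 4, 2]             # "bad" direction: higher is worse
law_abiding = [-c for c in crime]   # reversed so that higher is better

print(pearson(income, crime))        # negative
print(pearson(income, law_abiding))  # same magnitude, positive sign
```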
The reason to use R² in this case is that SPSS calculates the adjusted R² but not the adjusted R. In multiple regression, merely adding a variable generally increases the R value, even when it is a nonsense, randomly distributed variable, because the regression exploits random fluctuations in the data.
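The adjusted-R² rationale can be sketched as follows (a stdlib-only toy, not the paper's data or SPSS's implementation): regressing on an extra pure-noise predictor never lowers R², while the adjusted version penalizes the extra parameter.

```python
# Sketch: R-squared cannot decrease when a noise predictor is added;
# adjusted R-squared charges for the extra parameter and may drop.
import random

def ols_r2(X, y):
    """R^2 of the least-squares fit of y on the columns of X (intercept added)."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Normal equations, solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for i in range(k):
        p = A[i][i]
        for j in range(k):
            A[i][j] /= p
        c[i] /= p
        for m in range(k):
            if m != i:
                f = A[m][i]
                for j in range(k):
                    A[m][j] -= f * A[i][j]
                c[m] -= f * c[i]
    pred = [sum(b * xi for b, xi in zip(c, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - ybar) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def adj(r2, n, p):  # p = number of predictors
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

random.seed(0)
n = 20
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]
noise = [random.gauss(0, 1) for _ in range(n)]

r2_one = ols_r2([[a] for a in x], y)
r2_two = ols_r2([[a, b] for a, b in zip(x, noise)], y)
print(r2_two >= r2_one)                       # True: R^2 never decreases
print(adj(r2_one, n, 1), adj(r2_two, n, 2))   # adjusted R^2 may go down
```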
I am aware of this fact about statistics. :)
I saw a horrible example of it recently in this paper: http://www.jneurosci.org/content/34/13/4567.long
OSU did not differ from CS in ethnicity (χ2(4) = 8.2, p = 0.08), age, education, or verbal IQ, but had more males (61%) than CS (43%, χ2 (1) = 4.2, p = 0.04).
p = 0.08… ergo they did not differ
p = 0.04… ergo they did differ
But those correspond to 92% and 96%! By the p-value logic, neither of these results is likely to be a fluke, yet the authors seem to believe in some magic division between 0.08 and 0.04.
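For what it's worth, the two quoted p-values can be reproduced from the chi-square statistics in a few stdlib lines (closed-form survival functions; a side check, not part of the original exchange):

```python
# Side check of the quoted chi-square p-values: closed-form survival
# function for even df, and erfc for df = 1.
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with even df."""
    k = df // 2
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i) for i in range(k))

def chi2_sf_df1(x):
    """P(X > x) for a chi-square variable with df = 1."""
    return math.erfc(math.sqrt(x / 2))

print(round(chi2_sf_even_df(8.2, 4), 2))  # 0.08, the "did not differ" test
print(round(chi2_sf_df1(4.2), 2))         # 0.04, the "did differ" test
```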
Often, ECTs/RTs (elementary cognitive tasks / reaction times) are coded so that they give rise to negative correlations in the correlation matrix/PCA.
As I wrote, this is because we are embedding them as pictures rather than using LaTeX tables, which saves a lot of conversion time. I generally don't think it is a problem that readers must zoom in a bit to read small numbers. The other option is to make the tables even larger, which interrupts the reading flow for those who do not closely inspect the tables (perhaps most readers).
I have attached a new version, adding the analyses about Islam as well as some general fixes.
I ran some regressions using your attached data. Concerning Table 13, you should specify the number of countries for each model, because they obviously differ (for Islam only, N is 61, but for the other models N is usually 48). Also, optionally, you could add some details about the normality of your residuals. Try this:
REGRESSION
/DESCRIPTIVES MEAN STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA COLLIN TOL CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT g_se_pc1
/METHOD=ENTER Islam
/SCATTERPLOT=(*ZRESID ,*ZPRED)
/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).
With the first model, you see that the P-P plot is more or less normal.
However, if you use this:
REGRESSION
/DESCRIPTIVES MEAN STDDEV CORR SIG N
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA COLLIN TOL CHANGE ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT g_se_pc1
/METHOD=ENTER GDP Height
/SCATTERPLOT=(*ZRESID ,*ZPRED)
/RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).
You'll see that this P-P plot is not normal, though the deviation from normality seems not too alarming. The P-P plot has the same shape if you use Islam + GDP + IQ + height. But the P-P plot for the IQ + height model seems to have a big problem.
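For readers without SPSS, the residual-normality check that NORMPROB(ZRESID) performs can be sketched in stdlib Python (made-up data; only the procedure mirrors the syntax above): fit the regression, standardize the residuals, and compare them with normal quantiles.

```python
# Illustrative stand-in for SPSS's NORMPROB(ZRESID): regress y on x,
# standardize the residuals, and compare with theoretical normal quantiles.
from statistics import NormalDist, mean, stdev

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]

# Simple OLS slope and intercept
mx, my = mean(x), mean(y)
b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
a0 = my - b * mx
resid = [c - (a0 + b * a) for a, c in zip(x, y)]
z = sorted((r - mean(resid)) / stdev(resid) for r in resid)

# Theoretical normal quantiles at plotting positions (i + 0.5)/n
nd = NormalDist()
q = [nd.inv_cdf((i + 0.5) / len(z)) for i in range(len(z))]

# If the residuals are roughly normal, the (q, z) pairs lie near the
# 45-degree line; a P-P plot is the same idea on the probability scale.
for qi, zi in zip(q, z):
    print(round(qi, 2), round(zi, 2))
```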
I have never entered the SPSS commands manually, only used the menus. In hindsight, it is best if papers include this kind of information in the supplementary material, so that others can reproduce the results exactly.
I don't understand the statistical meaning of P-P plots. I need to study more statistics.
It's an assumption of regression, although it is rarely pointed out in academic papers, sadly. Researchers usually don't seem to care about the normality of residuals, even though they should. Here's what Andy Field says in his book (Discovering Statistics Using SPSS, 2009, p. 251):
You need to check some of the assumptions of regression to make sure your model generalizes beyond your sample:
- Look at the graph of *ZRESID plotted against *ZPRED. If it looks like a random array of dots then this is good. If the dots seem to get more or less spread out over the graph (look like a funnel) then this is probably a violation of the assumption of homogeneity of variance. If the dots have a pattern to them (i.e. a curved shape) then this is probably a violation of the assumption of linearity. If the dots seem to have a pattern and are more spread out at some points on the plot than others then this probably reflects violations of both homogeneity of variance and linearity. Any of these scenarios puts the validity of your model into question. Repeat the above for all partial plots too.
- Look at histograms and P–P plots. If the histograms look like normal distributions (and the P–P plot looks like a diagonal line), then all is well. If the histogram looks non-normal and the P–P plot looks like a wiggly snake curving around a diagonal line then things are less good! Be warned, though: distributions can look very non-normal in small samples even when they are!
Some researchers sometimes (but not always) also look at the normality of the data themselves. It's possible to evaluate univariate normality in SPSS with syntax like this:
FREQUENCIES VARIABLES=x1 x2 x3
/FORMAT=NOTABLE
/HISTOGRAM NORMAL
/ORDER=ANALYSIS.
EXAMINE VARIABLES=x1 x2 x3
/PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
/COMPARE GROUPS
/STATISTICS DESCRIPTIVES EXTREME
/CINTERVAL 95
/MISSING PAIRWISE
/NOTOTAL.
But it's more definitive to look at how the P-P plot looks in your regressions. I wrote the following in an earlier blog post of mine:
Also, I would not put much faith in the Kolmogorov-Smirnov and Shapiro-Wilk tests. They are very sensitive to sample size (Field, 2009, pp. 148, 788). This increases the probability of rejecting the null hypothesis (e.g., p less than 0.05) that the distribution is normal, whereas for testing the normality of the residuals we want a p-value higher than 0.05, not lower. In large samples these tests can be significant even when the scores are only slightly different from a normal distribution. Therefore, they should always be interpreted in conjunction with histograms, P–P or Q–Q plots, and the values of skew and kurtosis.
Finally, you can read this here:
http://www.ats.ucla.edu/stat/spss/webbooks/reg/chapter1/spssreg1.htm
Some researchers believe that linear regression requires that the outcome (dependent) and predictor variables be normally distributed. We need to clarify this issue. In actuality, it is the residuals that need to be normally distributed. In fact, the residuals need to be normal only for the t-tests to be valid. The estimation of the regression coefficients do not require normally distributed residuals. As we are interested in having valid t-tests, we will investigate issues concerning normality. … A common cause of non-normally distributed residuals is non-normally distributed outcome and/or predictor variables.
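The sample-size sensitivity of the Kolmogorov-Smirnov test mentioned above can be sketched like this (illustrative only: invented, mildly skewed data and the asymptotic 5% critical value; a Lilliefors correction for the fitted parameters would be stricter still):

```python
# Sketch: the K-S critical value shrinks like 1.36/sqrt(N), so a fixed,
# trivial deviation from normality becomes "significant" once N is large.
import math
import random
from statistics import NormalDist, mean, stdev

def ks_stat_vs_normal(sample):
    """Max gap between the empirical CDF and a fitted normal CDF."""
    nd = NormalDist(mean(sample), stdev(sample))
    s = sorted(sample)
    n = len(s)
    return max(max(abs((i + 1) / n - nd.cdf(v)), abs(i / n - nd.cdf(v)))
               for i, v in enumerate(s))

random.seed(1)
for n in (50, 5000):
    # mildly skewed data: normal plus a small squared term
    data = [g + 0.15 * g * g for g in (random.gauss(0, 1) for _ in range(n))]
    d = ks_stat_vs_normal(data)
    crit = 1.36 / math.sqrt(n)    # asymptotic 5% critical value
    # the same mild skew is typically flagged only at the larger N
    print(n, round(d, 3), round(crit, 3), d > crit)
```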
We were not using t-tests, though.
I know. But when it says "the t-tests to be valid", it probably refers to the part of them that measures the effect size. And effectively, non-normal residuals lower the effect size(s) in your regressions.
http://pareonline.net/getvn.asp?n=2&v=8
(I have one excellent reference with details on this matter, but I can't remember the webpage, so the Osborne (2004) above is the only one I can show you for now.)
Emil,
Could you add a paragraph to the discussion in which you note the issues -- concerning correlation, causation, and national rates of Islamic belief -- that Peter has raised? Note that we are unable to test competing causal hypotheses at this time.
I identified the models that seem to be associated with non-normal residuals.
IQ
IQ+HEIGHT
IQ+GDP+HEIGHT
If you run exactly this with your data:
FREQUENCIES VARIABLES=IQ GDP ln_GDP Islam Height g_se_pc1
/FORMAT=NOTABLE
/HISTOGRAM NORMAL
/ORDER=ANALYSIS.
You see that Islam is not normal. (GDP is not normal either, but ln_GDP looks better.) And yet that is not what is causing the non-normality in the distribution of the residuals: given the list above, it seems to be IQ, but that variable is itself normally distributed.
I don't think it's too dangerous for your regressions, so I would like to approve the publication. (As I said, I believe it's preferable to say a little about the distribution of the variables and of the regression residuals.)
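The ln_GDP remark can be illustrated with a quick sketch (a made-up lognormal "GDP", not the paper's data): taking logs removes most of the right skew.

```python
# Sketch: a strongly right-skewed variable looks far more normal in logs.
import math
import random

def skewness(xs):
    """Sample skewness (population formula, fine for illustration)."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((v - m) ** 2 for v in xs) / n
    return sum((v - m) ** 3 for v in xs) / (n * s2 ** 1.5)

random.seed(2)
gdp = [math.exp(random.gauss(9, 1)) for _ in range(500)]  # right-skewed
log_gdp = [math.log(v) for v in gdp]                      # back to normal

print(round(skewness(gdp), 2))       # large positive skew
print(round(skewness(log_gdp), 2))   # near zero
```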
I made some edits and will have to look over them tomorrow. Regarding your points, in the conclusion I added:
"One of our findings was that national rates of belief in Islam robustly predicted poor migrant socioeconomic outcomes. One interpretation of this association is simply that Islamic belief is directly causally related to poorer outcomes. A reviewer suggested, as an alternative, that the Islamic national rates might covary with unobserved macro-regional characteristics, such as a historic lack of state control and a history of pastoralism; these macro-regional characteristics could have selected for, through cultural or gene-cultural co-evolution, traits which underlie the tendency for under-performance by migrants from Islamic prevalent regions of origin. We were unable to test this and other interesting hypotheses and so remain agnostic as to the cause of the associations."
Are you OK with this addition? We are not in a position to test your hypotheses. If we can find the right variables we can test them, but we should start at the global level first anyway.
I would replace the second sentence with:
A reviewer suggested another explanation. In most Muslim societies, the state has less effectively monopolized the use of violence, with the result that every adult male is expected to use violence as a legitimate way to resolve personal disputes and to defend "honor" or "face." Muslim immigrants thus tend to be more willing to commit violent acts that are criminalized in Western societies, particularly if these acts are targeted against non-kin.
The section now reads:
One of our findings was that national rates of belief in Islam robustly predicted poor migrant socioeconomic outcomes. One interpretation of this association is simply that Islamic belief is directly causally related to poorer outcomes. This need not be the case, of course. Regarding crime, for example, a reviewer suggested another explanation. In most Muslim societies, the state has less effectively monopolized the use of violence, with the result that every adult male is expected to use violence as a legitimate way to resolve personal disputes and to defend "honor" or "face." Muslim immigrants thus tend to be more willing to commit violent acts that are criminalized in Western societies, particularly if these acts are targeted against non-kin. We were unable to test this and other interesting hypotheses and so remain agnostic as to the cause of the associations.
Ok, do we have your approval?
Yes, I recommend approval.
I've just made Peter a reviewer also for ODP, as he was officially only reviewer for OBG.
Since "Islam" here is the most important and controversial variable, I think it should be analyzed in detail. A) Please clarify how Islam is measured. B) Does it follow a normal/gaussian distribution at the aggregate level? Perhaps a histogram showing the distribution of this variable in your 71 countries sample would be useful. This would give us an idea of how the data are scattered throughout the sample and also indicate which regression method fits best your dataset.
Overall it's a good paper and I recommend publication after these 2 points are addressed.
We already have three approvals: EA, MH, and PF. I'm fine with Emil making additional changes, though -- or I can. The Islam rates are percents and they are not close to being normally distributed. But for regression, you don't need this. (http://stats.stackexchange.com/questions/12262/what-if-residuals-are-normally-distributed-but-y-is-not .) You need more or less normally distributed errors. Pic1 shows the histogram for Islam rates; pic2, the PPlot for the residuals (tolerable); and pic3, the histogram for residuals.
Emil, would you care to add a note? Otherwise, publish this and let's move on.
True, the independent variables do not need to be normally distributed. However, I'd include the Islam histogram in the paper, as it's useful to know the distribution of this variable and how it's expressed (frequency). Everything else is fine, so publish.