Journal:
Open Quantitative Sociology and Political Science
Authors:
Noah Carl
Title:
Explaining Terrorism Threat Level Across Western Countries on the Day of the Brussels Terrorist Attacks
Abstract:
The Foreign and Commonwealth Office of the British government reports a terrorism threat level for every country. This paper analyses variation in terrorism threat level across Western countries on 22nd March 2016, the day of the airport and metro station bombings in Brussels. It finds that percentage of Muslims in the population and military intervention in the Middle East are independently associated with terrorism threat level. In other words, Western countries which have a higher percentage of Muslims, and which have intervened militarily in the Middle East, tended to exhibit higher terrorism threat levels in March 2016. Despite a small sample size, these results are fairly robust across different specifications.
Key words:
Terrorism; Western countries; percentage Muslim; military intervention; Brussels
Length:
11 pages; 2,824 words
The fact that many Islamist terrorist attacks have been perpetrated in Muslim countries (such as Turkey, Syria, Iraq, Tunisia and Egypt) constitutes rather strong evidence that at least some Islamist terrorists are motivated by grievances other than Western military intervention
A further option is that these countries, despite being inhabited mostly by Muslims, have been allied with Western governments in attacks or perceived attacks on Muslims. Turkey is a member of NATO and has taken part in aggression against Muslim states: it was part of the interventions in Libya and Afghanistan, as well as the current conflict in Syria. https://en.wikipedia.org/wiki/List_of_wars_involving_Turkey
Terrorism threat level is treated as a linear scale, running from 1 (low) to 4 (high).
I think you mean an ordinal or perhaps an interval scale. Multiple regression/correlation assumes an interval scale.
You may also want to model it using latent correlations, although I'm not sure how one does this for multiple regression. For the bivariate case, see http://johnuebersax.com/stat/tetra.htm.
One major caveat concerning this measure is that it was not possible to discern how the FCO actually puts it together. In particular, it was not possible to rule out that the measure is partly based on information such as percentage of Muslims in the population or military intervention in the Middle East. If it is partly based on such information, then the analyses in Section 3 are somewhat tautological. In an attempt to discern how the measure is in fact constructed, two emails were sent to the FCO (see Appendix A). However, in both cases, the reply received was wholly uninformative: each one simply provided a link to the FCO’s travel advice page, namely FCO (2016). The analyses in Section 3 are predicated on the assumption that terrorism threat level is based on information such as secret intelligence reports, rather than demographic or foreign policy statistics.
I am happy you mentioned this problem. It is unfortunate that they did not provide more useful answers.
Do other countries or agencies publish similar terrorism threat levels? If they do, it would be nice to replicate the analyses with their measures, and perhaps also use factor analysis if one could find a number of sources. I seem to recall that EUROPOL also publishes some similar data.
Finally, does the FCO still have their older terrorism threat estimates? It may be interesting to look at the threat level longitudinally. It is possible to estimate the proportion of Muslims in countries longitudinally as well (using country of origin information + a simple compositional model à la http://openpsych.net/ODP/2015/03/increasinginequalityingeneralintelligenceandsocioeconomicstatusasaresultofimmigrationindenmark19802014/).
Three measures of military intervention in the Middle East were utilised: first, whether a country sustained any military deaths in the Iraq (Operation Iraqi Freedom) or Afghanistan (Operation Enduring Freedom) wars, as reported by iCasualties.org (2016a,b); second, whether a country sustained at least 50 military deaths in the Iraq or Afghanistan wars, as reported by iCasualties.org (2016a,b); and third, whether a country is part of the anti-ISIS military coalition, as reported by Wikipedia (2016). 21 countries (75%) in the sample sustained at least one military death in Iraq or Afghanistan; 8 (29%) sustained at least 50 military deaths in Iraq or Afghanistan; and 7 (25%) are part of the anti-ISIS military coalition.
Is there a reason that these threshold variables are used instead of a (log transformed perhaps) death count? The number 50 seems arbitrary and may as well have been e.g. 25. One might also argue that one should use per capita death counts. The datafile only has the dummy coded variables, not the actual counts, unfortunately, so others cannot easily try a continuous approach.
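The continuous alternative suggested here is simple to implement; a minimal sketch (the death counts shown are made up for illustration, not the paper's data):

```python
import math

def log1p_deaths(counts):
    """Continuous alternative to the 50-death threshold dummy:
    log of 1 + military deaths, so zero-death countries stay defined."""
    return [math.log1p(c) for c in counts]

# Hypothetical death counts for illustration only
print(log1p_deaths([0, 11, 179, 4486]))
```

Using log(1 + x) rather than log(x) avoids dropping the countries with no military deaths.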
There is also the question of why the countries only include Iraq and Afghanistan and not, say, Syria or Libya. ISIS is based in Syria and Iraq, so there would seem to be prima facie reason to include these two given that ISIS is the most active Islamic terrorist organization currently operating (as far as I know).
Three control variables were utilised: first, GDP per capita at PPP for 2014, taken from OECD (2016a); second, harmonised unemployment rate for 2014, taken from OECD (2016b); and third, post-tax post-transfer Gini coefficient (a measure of income inequality), taken from OECD (2016c). Because there was no recent year in which the Gini coefficient was available for all countries in the sample, the maximum value observed between 2009 and 2011 was utilised. In order to reduce skewness, the logarithmic transformation was applied to GDP per capita.
Why were these particular variables chosen and not others? The paper does not mention any reason one might want to control for these variables.
The betas are not shown for the control variables. Why is this?
(p < 0.001)
I am not a big fan of p values. I would be very happy if you instead used confidence intervals for the reasons given in e.g. http://pss.sagepub.com/content/25/1/7.
However, if you really want to use p values, perhaps it would be a good idea to supplement them with confidence intervals in important cases. Since there are relatively few data points, it would perhaps be best to use bootstrapped CIs, because these do not involve parametric assumptions (are your variables normally distributed?).
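A percentile bootstrap along these lines makes no normality assumption; the sketch below resamples country pairs with replacement and takes empirical quantiles of the resampled correlations (helper names and details are mine, not code from the paper):

```python
import random
from statistics import mean, pstdev

def pearson_r(x, y):
    """Plain Pearson correlation."""
    n, mx, my = len(x), mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n * pstdev(x) * pstdev(y))

def bootstrap_ci(x, y, n_boot=10_000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for r."""
    rng, n, rs = random.Random(seed), len(x), []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if pstdev(xs) == 0 or pstdev(ys) == 0:  # skip degenerate resamples
            continue
        rs.append(pearson_r(xs, ys))
    rs.sort()
    return rs[int(alpha / 2 * len(rs))], rs[int((1 - alpha / 2) * len(rs)) - 1]
```

With only 28 countries, bootstrapped intervals will be noticeably wider than analytic ones, which is part of the point.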
although 50+ military deaths in Iraq or Afghanistan is only significant at the 10% level
I would prefer that you drop any mention of 10% 'significance'. That level of alpha is too high, in my opinion.

What happens if one includes more than one of the military intervention against Muslim states predictors? I imagine they show appreciable levels of collinearity, so that may not yield anything useful.
Note that percentage Muslim was Winsorized at its second largest value (namely 19%, for Israel), because its largest value (namely 99%, for Turkey) skewed the variable so substantially.
An alternative choice is log transformation. Was this also tried?
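The Winsorizing described in the quoted sentence amounts to capping the top value at the second-largest observed value; a sketch with illustrative percentages (not the paper's dataset):

```python
def winsorize_top(values):
    """Replace the largest value with the second-largest, as done for
    Turkey's 99% Muslim share (capped at Israel's 19%)."""
    s = sorted(values)
    top, second = s[-1], s[-2]
    return [second if v == top else v for v in values]

# Illustrative values only
print(winsorize_top([4.8, 19.0, 99.0, 7.5]))  # [4.8, 19.0, 19.0, 7.5]
```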

The datafile is attached. Could you instead place it on OSF? This is a better way to keep files, and it has built-in versioning.
The variables are mentioned in the first sheet, but there are no links to the sources. Presumably these are web sources, so links would be very helpful.

Finally, how were the data analyzed? There is no code file attached so that others may review the analysis code.

There is some evidence that the countries with more Muslims also have Muslims that are on average more extreme. Muslims in Western countries are less extreme in their beliefs than Muslims in their home countries.
It is hard to find data about this. I analyzed Pew Research's Muslim dataset and found a clear general religious extremism factor that varied by country. Unfortunately, there are no Western countries in the dataset, so one cannot easily compare with the extremism of Muslims in e.g. Germany. This means that there is substantial restriction of range decreasing the observed correlation. Still, there is a small positive correlation between mean extremism and the proportion of the population that is Muslim. See the attached plot.
Blogpost: http://emilkirkegaard.dk/en/?p=5485
If this correlation is real and causal, it is a confound for your models.

I will do a full analytic replication of the analyses in the paper at some later point to confirm all the results. From the looks of it, the analyses conducted are fairly simple, so this should not take long.
Many thanks for commenting on the manuscript. I attach a revised paper (with corrections highlighted), a revised pdf (without corrections highlighted), a datafile, and a Stata do file (in Word format).
I have changed the relevant sentence.
I have changed the relevant sentence.
I am not personally familiar with any other terrorism threat measure similar in nature to the one provided by the FCO.
That would be a very interesting analysis to conduct. However, I do not believe it would be feasible to obtain the relevant data from the FCO in any reasonable time frame, especially given how unresponsive they were to my emails.
I have utilised log of 1 + military deaths instead of at least 50 military deaths in all the multiple regression models.
Including these interventions doesn't really make any difference––unless I'm mistaken––since the countries that took part are simply a subset of those that participated in Iraq & Afghanistan. In addition, very few military deaths have been sustained by Western forces in Syria and Libya, and there doesn't seem to be any standardised database (equivalent to iCasualties) for those conflicts.
I have included a paragraph explaining why each control variable was chosen.
Given that the focus of the paper is the effects of percentage Muslim and military intervention, I would prefer not to unnecessarily clutter the regression tables with more coefficients. This practice is quite common in economics and sociology.
I have reported 95% confidence intervals for the raw estimates.
I have eliminated mentions of 10% significance.
Yes––unless I'm mistaken––the collinearity is close to perfect.
The paper now reports that the log transformation was also tried, and it yielded highly similar results.
I'm not sure how to do this. I haven't used OSF before.
I have provided links to the sources.
I have provided the Stata do file (in Word format) used to analyse the data.
Interesting point. I would prefer not to deal with this issue in the present paper, especially given the fact that there were no Western countries in your sample. Perhaps it could be looked at in a future analysis.
Files have now been uploaded to OSF, as requested: https://osf.io/5tv3a/
This post contains my analytic replication of Noah's analyses. Quotes are from the paper unless otherwise stated. My code is here: https://gist.github.com/Deleetdk/a5913f7d296cb05f255d160adb9a1456
"The mean terrorism threat level in the sample is 2.4, while the median is 2.5. "
Replicated.
"This variable ranges from 0 (Czech Republic) to 8.2 (France), with a mean of 3.1, and a median of 2.6."
Replicated, except I get a median of 2.52. Typo perhaps.
"21 countries (75%) in the sample sustained at least one military death in Iraq or Afghanistan; the mean number of military deaths sustained in Iraq or Afghanistan is 294, while the median is 11; 7 countries (25%) are part of the anti-ISIS military coalition."
Replicated.
"The correlation between terrorism threat level and percentage Muslim is r = .64 (p < 0.001; 95% CI = [.33, .95])"
Correlation replicated, CI differed. I get:
> cor.test(d_main$terror[d_main$west], d_main$muslim15[d_main$west])
Pearson's product-moment correlation
data: d_main$terror[d_main$west] and d_main$muslim15[d_main$west]
t = 4.2844, df = 26, p-value = 0.0002219
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.3555625 0.8196607
sample estimates:
cor
0.6433038
This is an analytic CI. The paper does not say which kind of CI is used, so I assumed it was analytic.
"When percentage Muslim squared was included in a model of terrorism threat level alongside percentage Muslim it was not significant (p > 0.1), indicating minimal nonlinearity."
Replicated. R output:
Call:
lm(formula = "terror ~ poly(muslim15, 1)", data = d_main, subset = d_main$west)
Residuals:
Min 1Q Median 3Q Max
-2.33148 -0.57178 0.03772 0.43508 1.85348
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.258 0.256 12.726 1.13e-12 ***
poly(muslim15, 1) 27.366 6.387 4.284 0.000222 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8865 on 26 degrees of freedom
Multiple R-squared: 0.4138, Adjusted R-squared: 0.3913
F-statistic: 18.36 on 1 and 26 DF, p-value: 0.0002219
Call:
lm(formula = "terror ~ poly(muslim15, 2)", data = d_main, subset = d_main$west)
Residuals:
Min 1Q Median 3Q Max
-2.2881 -0.3663 0.1067 0.6465 1.6725
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.519 8.858 0.849 0.404
poly(muslim15, 2)1 315.886 282.091 1.120 0.273
poly(muslim15, 2)2 72.623 59.668 1.217 0.235
Residual standard error: 0.8784 on 25 degrees of freedom
Multiple R-squared: 0.4466, Adjusted R-squared: 0.4024
F-statistic: 10.09 on 2 and 25 DF, p-value: 0.0006133
"The standardised difference in terrorism threat level by any military deaths in Iraq or Afghanistan is d = 0.67 (p > 0.05; 95% CI = [–0.20, 1.54])."
I get d = .69.
R output:
d estimate: -0.6888248 (medium)
95 percent confidence interval:
inf sup
-1.6418342 0.2641846
It was difficult to calculate a p value in R for the SMD. However, I think I managed to do it, and got .13.
"The correlation between terrorism threat level and log of 1 + military deaths is r = .40 (p = 0.037; 95% CI = [.03, .77])"
Correlation and lower CI replicated, upper CI did not. R output:
> cor.test(d_west$terror, d_west$deaths2)
Pearson's product-moment correlation
data: d_west$terror and d_west$deaths2
t = 2.1992, df = 26, p-value = 0.03696
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.02693856 0.67010334
sample estimates:
cor
0.3960353
"When log of 1 + military deaths squared was included in a model of terrorism threat level alongside log of 1 + military deaths it was not significant (p > 0.1), indicating minimal nonlinearity."
Replicated.
"The standardised difference in terrorism threat level by part of anti-ISIS military coalition is d = 1.34 (p < 0.001; 95% CI = [0.60, 2.08])."
I get 1.63.
R:
Cohen's d
d estimate: -1.632013 (large)
95 percent confidence interval:
inf sup
-2.6807130 -0.5833124
I did not replicate the p value because the function I used rounded the number down to 0 (not your fault).
"Table 1. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among Western countries."
Replicated.
"Table 2. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among all OECD countries."
Replicated.
"Table 3. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among OECD countries located in Europe."
Replicated.

I'd like to resolve these slight discrepancies but note that the results generally replicated.
I'd like to request that you add the main numerical data points of interest to the abstract: effect sizes and the sample size. If there are too many effect sizes, I usually give ranges or the mean/median.
I'd like to request that you provide exact p values instead of inequalities. This allows for better estimation of the strength of the evidence from reading, and also allows for automatic checking with data mining tools (see e.g. https://peerj.com/preprints/1642/). When the numbers are very small, it is better to use scientific notation, e.g. 1.4 * 10^-5.
Come to think of it, it is better to use a proper method than to assume that a clearly non-normal, non-continuous variable is both normal and continuous, at least as a robustness check. I used ordered logistic regression (http://www.ats.ucla.edu/stat/r/dae/ologit.htm) and tried the 3 main models with the western subsample. Results were similar to your OLS regressions.
It is a dangerous argument to make that because a practice is common, it is okay. I'd like you to add the full results to the supplementary materials (either in the appendix or as output files in the OSF repository).
I checked, the correlations are not that strong: r's .33, .50, .69.
In two places you use the fact that adding a second-order term did not result in p < alpha to argue that there is no nonlinearity. This conclusion is too strong. There are many kinds of nonlinearity, many of which are not detected by this crude method. In my experience, detecting nonlinearity requires a somewhat large sample size (or very strong associations), larger than this study has. So, I think that if you tone down the language, what you did is alright.
Thanks to Emil for undertaking a replication of my analyses. I have uploaded a new Stata do file to the OSF page for this paper: https://osf.io/5tv3a/
This was indeed a typo.
I had not applied the Fisher transformation to the raw confidence intervals included in the Stata OLS output. Reported confidence intervals have been changed accordingly, and are the same as Emil's. (They were calculated using the cii2 command, as shown in the new Stata do file.)
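For readers unfamiliar with it, the Fisher transformation maps r to a roughly normal scale, builds the interval there, and maps back, which keeps the CI inside [-1, 1]. A sketch (the fisher_ci helper is hypothetical, not the cii2 command):

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)              # r -> z (variance-stabilising transform)
    se = 1 / math.sqrt(n - 3)      # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r = .64 with n = 28 Western countries, as in the paper
print(fisher_ci(0.64, 28))
```

Note how the back-transformed interval is asymmetric around r, unlike a raw OLS interval, which can spill past 1 for strong correlations in small samples.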
I calculated d values by simply taking the regression coefficients from OLS models in which the dependent variable (namely, terrorism threat level) had been standardised. In other words, B1 in the following model was used as an estimate of d:
terrorism_threat_level_zscore = alpha + B1(any_deaths_Iraq_Afghanistan)
When computing Cohen's d directly (using the esize command in Stata), I get the same results as Emil. I presume the discrepancies arise from different methods of calculating the pooled standard deviation. I chose to use the OLS method, rather than calculating Cohen's d directly, because the results are more comparable with the conditional standardised differences obtained from the multivariate models (Tables 1-3). I will report the exact d values in the paper, if preferred.
I will do this once the discussion regarding d values has been resolved.
I will include exact p values for raw estimates in the next version of the paper.
In my experience, logistic and probit regression (binary and ordered) nearly always produce highly similar point estimates (average effects) to OLS, so I prefer to use the latter, given its greater simplicity and ease of interpretation. But I can note in the paper that results were similar when using ordered logistic regression.
I will include an Appendix or file of supplementary analyses with the next version of the paper.
My mistake. Would you suggest that I try utilising an additional variable, namely number of major military interventions in the Middle East, ranging from 0 to 4?
In reply to Noah's post.
Yes, it is because your method uses the total sample SD, whereas it is customary to use the pooled SD (i.e. the average of the SD within each group weighted by the group sizes). I replicated your number using the total sample SD.
I'd like you to report the normal d value (based on the pooled SD). The whole point of using standardized mean differences is to get measures that are comparable across analyses and studies that do not use the same units. As far as I know, it is most common to use the pooled SD, and this is the default method unless otherwise specified in statistical programs/languages. Thus, to make your result most comparable with other studies, you should use the pooled SD.
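The two denominators at issue can be made concrete; both helpers below are hypothetical sketches (not Stata's esize), contrasting the pooled within-group SD with the total-sample SD:

```python
from statistics import mean, stdev

def d_pooled(g0, g1):
    """Cohen's d with the pooled within-group SD (the conventional choice)."""
    n0, n1 = len(g0), len(g1)
    pooled_var = ((n0 - 1) * stdev(g0) ** 2 + (n1 - 1) * stdev(g1) ** 2) / (n0 + n1 - 2)
    return (mean(g1) - mean(g0)) / pooled_var ** 0.5

def d_total(g0, g1):
    """Mean difference over the total-sample SD -- what a standardised OLS
    dummy coefficient effectively gives; typically smaller in magnitude."""
    return (mean(g1) - mean(g0)) / stdev(g0 + g1)
```

Because the total-sample SD includes the between-group spread, d_total is generally closer to zero than d_pooled, which matches the direction of the discrepancies above.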
Please include a paragraph/sentence mentioning that you tried using ordered logistic modeling (or whatever you deem appropriate) and that the results were similar. I agree with your experiences.
I tried all the possible OLS models with your predictors. I always do this unless it is impossible because there are too many predictors, e.g. >15; the number of models to try is 2^p - 1, so with 7 predictors, it's only 127 models to try. This takes a few seconds even without using parallel computing.
The best model according to BIC was:
Muslim + any death + part of anti-ISIS + gdp + unemploy
which had an adjusted R2 of .77. The best model with only one of the intervention predictors has an adjusted R2 of .67, so there is a bit of evidence that using multiple intervention predictors is superior.
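The 2^p - 1 count comes from enumerating every non-empty subset of predictors; a sketch (the predictor names are illustrative stand-ins for the paper's variables):

```python
from itertools import combinations

predictors = ["muslim", "any_death", "log_deaths", "anti_isis",
              "gdp", "unemploy", "gini"]  # 7 candidate predictors

# Every non-empty subset of predictors defines one candidate OLS model
subsets = [c for r in range(1, len(predictors) + 1)
           for c in combinations(predictors, r)]
print(len(subsets))  # 2**7 - 1 = 127
```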
In general, however, this approach tends to overfit models by using too many predictors. One could use lasso regression with cross-validation to get more robust results. I did this for you. Cross-validation has a random component, so I ran it 500 times and summarized the results. The results indicated that all the predictors were useful.
The last row is the proportion of runs that produced a zero coefficient for that predictor, i.e. found it to be useless. As can be seen, the worst predictor was found to be useless only 4.6% of the time. Based of this, I'd tentatively (because of the sample size) conclude that it is best to use all the predictors together.
I calculated dvalues by simply taking the regression coefficients from OLS models in which the dependent variable (namely, terrorism threat level) had been standardised. In other words, B1 in the following model was used as an estimate of d:
terrorism_threat_level_zscore = alpha + B1(any_deaths_Iraq_Afghanistan)
When computing Cohen's d directly (using the esize command in Stata), I get the same results as Emil. I presume the discrepancies arise from different methods of calculating the pooled standard deviation. I chose to use the OLS method, rather than calculating Cohen's d directly, because the results are more comparable with the conditional standardised differences obtained from the multivariate models (Tables 13). I will report the exact dvalues in the paper, if preferred.
Yes, it is because your method uses the total sample SD, whereas it is customary to use the pooled SD (i.e. the average of the SD within each group weighted by the group sizes). I replicated your number using the total sample SD.
I'd like that you report the normal d value (pooled SDbased). The whole point of using standardized mean differences is to get measures that are comparable across analyses and studies that do not use the same units. As far as I know, it is most common to use the pooled SD and this is the default method unless otherwise specified in statistical programs/languages. Thus, to make your result most comparable with other studies, you should use the pooled SD.
In my experience, logistic and probit regression (binary and ordered) nearly always produce highly similar point estimates (average effects) to OLS, so I prefer to use the latter, given its greater simplicity and ease of interpretation. But I can note in the paper that results were similar when using ordered logistic regression.
Please include a paragraph/sentence mentioning that you tried using ordered logistic modeling (or whatever you deem appropriate) and that the results were similar. I agree with your experiences.
My mistake. Would you suggest that I tried utilising an additional variable, namely number of major military interventions in the Middle East, ranging from 04?
I tried all the possible OLS models with your predictors. I always do this unless it is impossible because there are too many predictors, e.g. >15; the number of models to try is 2^p1, so with 6 predictors, it's only 127 models to try. This takes a few seconds even without using parallel computing.
The best model according to BIC was:
Muslim + any death + part of antiISIS + gdp + unemploy
which had adjusted R2 of .77. The best model with only one of the intervention predictors has adjusted R2 of .67, so there is a bit of evidence that using multiple intervention predictors is superior.
In general however, this approach tends to overfit models by using too many predictors. One could use lasso regression with crossvalidation to get more robust results. I did this for you. Crossvalidation has a random component, so I ran it 500 times and summarized the results. The results indicated that all the predictors were useful predictors.
muslim15 any_1 deaths2 part_1 gdp_log unemp14 ineq0911
mean 0.431 0.266 0.107 0.507 0.105 0.228 0.072
median 0.432 0.262 0.108 0.508 0.101 0.225 0.072
sd 0.010 0.072 0.004 0.008 0.041 0.043 0.005
mad 0.004 0.057 0.002 0.003 0.036 0.034 0.002
fraction_zeroNA 0.000 0.014 0.000 0.000 0.046 0.000 0.000
The last row is the proportion of runs that produced a zero coefficient for that predictor, i.e. found it to be useless. As can be seen, the worst predictor was found to be useless only 4.6% of the time. Based of this, I'd tentatively (because of the sample size) conclude that it is best to use all the predictors together.
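The repeated cross-validated lasso that produced the table above can be sketched as follows. This is a hand-rolled coordinate-descent version on synthetic data (the original analysis presumably used a package such as glmnet in R; the data, lambda grid, and number of runs here are illustrative assumptions): each run picks lambda by random k-fold cross-validation, refits on the full data, and records which coefficients were shrunk exactly to zero.

```python
# Repeated cross-validated lasso, minimal sketch (synthetic data; the
# lambda grid, fold count and run count are illustrative, not the original's).
import numpy as np

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate-descent lasso; X and y are assumed centred (no intercept)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ resid / n
            z = (X[:, j] ** 2).sum() / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z  # soft-threshold
    return beta

def cv_lasso(X, y, lams, k, rng):
    """Pick lambda by k-fold CV with random folds, then refit on all data."""
    n = len(y)
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    errs = []
    for lam in lams:
        mse = 0.0
        for f in folds:
            train = np.setdiff1d(idx, f)
            b = lasso_cd(X[train], y[train], lam)
            mse += ((y[f] - X[f] @ b) ** 2).mean()
        errs.append(mse / k)
    return lasso_cd(X, y, lams[int(np.argmin(errs))])

rng = np.random.default_rng(1)
n, p = 40, 5
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)
Xc, yc = X - X.mean(axis=0), y - y.mean()

# The folds are random, so repeat the whole procedure and tally how often
# each coefficient comes out exactly zero (the "fraction zero" row above).
runs = 20
zeros = np.zeros(p)
for _ in range(runs):
    beta = cv_lasso(Xc, yc, lams=[0.01, 0.05, 0.1, 0.2, 0.5], k=5, rng=rng)
    zeros += (beta == 0.0)
print("fraction of runs each predictor was zeroed:", zeros / runs)
```

A predictor that is frequently zeroed across runs contributes little beyond the others; one that is never zeroed is worth keeping even at this sample size.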
The latest versions of the pdf and the do-file have been uploaded to the OSF page (the Word version of the paper has been deleted, to avoid redundancy): https://osf.io/5tv3a/
I'd like that you report the normal d value (pooled SDbased).
I have done so.
Please include a paragraph/sentence mentioning that you tried using ordered logistic modeling (or whatever you deem appropriate) and that the results were similar.
I have done so.
I tried all the possible OLS models with your predictors... The results indicated that all the predictors were useful predictors.
Since the purpose of my paper is simply to show that percentage Muslim and military intervention in the Middle East are independently associated with terrorism threat level, I would prefer not to include additional discussion pertaining to model fit.
I note that you added effect sizes to the abstract. However, you left out the confidence intervals, which are necessary for the interpretation.
The sample size is also not noted. I would add this e.g. thusly:
"This paper analyses variation in terrorism threat level across Western countries (N=28) on 22nd March 2016,"
I would add the standardized betas. E.g.
"It finds that percentage of Muslims in the population and military intervention in the Middle East are independently associated with terrorism threat level (standardized betas .42 to .56 and .47 to 1.00 for percentage Muslim and military intervention, respectively)."
Otherwise, I have no further comment and approve this paper when the matter with the abstract has been settled.
The sample size is also not noted... I would add the standardized betas.
The sample size has been added to the abstract. All coefficients in Tables 1–3 have been changed to standardised betas (where they were not previously given as such), and these have also been added to the abstract.
Latest versions: https://osf.io/5tv3a/
I approve of publication.
One major caveat concerning this measure is that it was not possible to discern how the FCO actually puts it together. In particular, it was not possible to rule out that the measure is partly based on information such as percentage of Muslims in the population or military intervention in the Middle East. If it is partly based on such information, then the analyses in Section 3 are somewhat tautological. In an attempt to discern how the measure is in fact constructed, two emails were sent to the FCO (see Appendix A). However, in both cases, the reply received was wholly uninformative: each one simply provided a link to the FCO’s travel advice page, namely FCO (2016).
Noah,
It's an interesting paper.
That said, were I an OQSPS reviewer, I wouldn't approve, as is, on account of the caveat above. Could you not validate with a more transparent dataset, for example the Global Terrorism Index (GTI) or the political terror scale of the Global Peace Index? In theory, you could compute an Islamic terror index based on global datasets of specific cases. This would remove a good deal of confounding, and narrow the empirical issues to: (1) does % Muslim/wars in the Middle East explain Islamic terror, (2) does Islamic terror explain overall terror; but this is probably not worth the time.
Thanks for reading the manuscript, Chuck. I downloaded the Global Terrorism Index (GTI) values for the countries in my sample, and ran a few preliminary analyses. The GTI correlates at about r = .60 (p < 0.001) with the FCO measure. It is correlated at about r = .40 (p < 0.05) with percentage Muslim, but does not seem to be significantly related to military intervention in the Middle East.
I would propose to reframe the paper as a test of the 'percentage Muslim hypothesis' and 'military intervention in the Middle East' hypothesis using two separate indexes of terrorism threat level: the FCO measure, and the GTI measure. How does this sound?
That would be an improvement. Ideally, though, you would also have a measure of Islamic terror, weighted in the GTI manner. I will look around and see if I can find anything.
Noah,
Apparently the EU has data on the number of Islamic terror attacks (with no weighting for extent) by member nation for 2006 to 2014. I imagine that the numbers are intentionally deflated. The Department of Justice does something similar here, when it comes to racial "hate crimes". Thus, an African American screaming "kill Whitey" just before punching an elderly White man will not be convicted of a hate crime as readily as a White American doing the equivalent to an elderly Black man will. Regardless, the rates per country might correspond to actual relative rates.
https://www.europol.europa.eu/search/apachesolr_search/EU%20Terrorism%20Situation%20and%20Trend%20Report
Could you run the analysis also using these numbers, aggregated across all years, and then synthesize those results with what you have? If so, I would have no objection to your paper.
Good idea, many thanks. Will do.
Completely revised (and renamed) paper: https://osf.io/5tv3a/
Edited.
The author has addressed the major theoretical deficit present in the original manuscript by adding additional measures of terrorism and showing that these intercorrelate, thus providing support for the construct validity of all measures. In the latest version, I fail to see any problems with the concept and analyses. The write-up is very clear and only suffers from a couple of minor language issues.* (The only remaining concern I would have is with the reputability of the "Religion of Peace" data source. When I get a chance I will pick a couple of random countries and compare RofP data to the Global Terrorism Database to see if the numbers match.)
*e.g., "In view of the preceding limitations, it could be argued that number of casualties from Islamist terrorism per capita the most valid measure..." (A verb is missing.)
I read the latest version.
You mention Nigeria as a Muslim country, but that country is split roughly 50/50 between Muslims and Christians.
You now have three measures of roughly the same outcome, Islamic terrorism. You report their intercorrelations early on and I was surprised that you did not factor analyze them to produce a single measure that should have less idiosyncratic variance.
I will have to redo my replication once you have settled on your final analyses.
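The factor-analytic suggestion above can be sketched like so: standardize the measures, extract the first principal component of their correlation matrix, and use the component scores as a single composite. The data below are synthetic stand-ins for three correlated terrorism measures; this illustrates the idea, not the paper's actual analysis.

```python
# First-principal-component composite of three correlated measures
# (synthetic stand-ins for the FCO, GTI and attack-count indices).
import numpy as np

rng = np.random.default_rng(2)
n = 28
latent = rng.normal(size=n)                      # shared "terrorism" factor
measures = np.column_stack([
    latent + rng.normal(scale=0.5, size=n)       # three noisy indicators
    for _ in range(3)
])

Z = (measures - measures.mean(axis=0)) / measures.std(axis=0)
R = np.corrcoef(Z, rowvar=False)                 # 3x3 correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
loadings = eigvecs[:, -1]                        # first principal component
scores = Z @ loadings                            # composite with pooled shared variance

print("PC1 share of variance:", eigvals[-1] / eigvals.sum())
```

Because the idiosyncratic noise of each indicator is largely orthogonal to the shared factor, the composite scores track the common dimension more closely than any single measure would.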