Journal:
Open Quantitative Sociology & Political Science
Authors:
Emil O. W. Kirkegaard
David Becker
Title:
Immigrant crime in Germany 2012-2015
Abstract:
The number of suspects per capita was estimated for immigrants in Germany grouped by citizenship. This was correlated with national IQs (r=-.53) and Islam prevalence in the home countries (r=.49), congruent with other studies. Multivariate analyses revealed that there was a confound with the mean age and sex distribution of the groups in Germany.
The German data lacked age and sex information for the crime data, so it was not possible to adjust for age and sex using subgroup analyses. For this reason, an alternative adjustment method was developed. This method was tested on the detailed Danish data, which do have the necessary information to carry out subgroup analyses. The new method was found to give highly congruent results with the standard subgrouping method.
The German crime data were then adjusted for age and sex using the new method and these were analyzed with respect to the predictors. They were moderately to strongly correlated with national IQs (.46) and Islam prevalence in the home country (.35). Combining national IQ, Islam% and distance to Germany resulted in a model with a cross-validated r2 of 20%, equivalent to a correlation of .45.
Key words:
crime, immigrants, country of origin, Germany, Muslim, Islam, cognitive ability, intelligence, IQ
Length:
~5500 words, 17 pages.
Files:
https://osf.io/rswyv/
External reviewers:
We will try to recruit a German reviewer. David Becker is the student of Heiner Rindermann, so Rindermann would not be a suitably impartial reviewer.
Rindermann suggested a number of reviewers. However, many had a non-statistical approach and focused on Islam. Islam is used as a predictor, but is not treated in detail in this paper.
Instead, I looked for a person reasonably knowledgeable about German affairs and HBD in general. Erwin Schmidt (Twitter, blog) seems like a good candidate. He's a German science blogger who has written extensively on HBD matters from a generally empirical, numbers-focused angle. I approached him and he agreed to do a review. He will not have time until early December, around the 7th-8th.
This paper analyses data on the crime rates of different nationalities in Germany. Its main finding is that crime rates are predictable from country-level characteristics, notably national IQ and percentage Muslim, although only the former appears to be a robust predictor. Overall, the paper is written clearly, and the analysis is well-conducted. I offer the following comments/suggestions to the authors:
1. Typo in Section 2.4:
"Second, because the population data only includes persons officially living in Germany while the crime data includes persons not living in Germany."
Delete "because".
2. Typo in Section 5.4:
"As can be seen, the correlations were strongly published towards 1."
Presumably you mean "strongly pushed"?
3. Typo in Section 7:
"To some extend, the differences in crime rates"
Change to "some extent".
4. Clearly, Georgia and Algeria are influential data points, having much higher crime rates than the other nationalities. I would be interested to know what happens to the correlations with national IQ and percentage Muslim when these two data points are excluded, including what happens to the correlations with the age-sex adjusted crime rates. Additionally, what happens when you use log(1 + crime rate) instead?
5. Consider adding horizontal lines at the top and bottom of each table.
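The checks requested in point 4 could be run along the following lines. This is only a minimal sketch, assuming the data are available as parallel arrays; the function names, the country labels and all numbers are hypothetical placeholders, not values from the paper ("Country E" and "Country F" merely stand in for the two high-crime outliers).

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])


def robustness_checks(names, crime_rate, predictor, exclude):
    """Correlate a predictor with crime rate three ways:
    all cases, with the named outliers dropped, and with log(1 + rate)."""
    names = np.asarray(names)
    crime_rate = np.asarray(crime_rate, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    keep = ~np.isin(names, list(exclude))  # True for rows that are not outliers
    return {
        "all": pearson_r(predictor, crime_rate),
        "excluding_outliers": pearson_r(predictor[keep], crime_rate[keep]),
        "log1p": pearson_r(predictor, np.log1p(crime_rate)),
    }


# Invented toy data, for illustration only (NOT the paper's figures).
names = ["Country A", "Country B", "Country C", "Country D", "Country E", "Country F"]
crime_rate = [1.0, 1.5, 2.0, 0.8, 9.0, 8.0]
national_iq = [100, 95, 90, 98, 85, 83]

results = robustness_checks(
    names, crime_rate, national_iq, exclude=("Country E", "Country F")
)
for label, r in results.items():
    print(f"{label}: r = {r:.2f}")
```

Comparing the three correlations side by side shows how much of the overall association depends on the two extreme cases and on the skew of the raw rates.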
Thanks for the review, Noah.
I will wait with updating until we can get the review from Schmidt.
The crime rate of Turkey:
„The crime rate of Turkey (1.87) is not particularly high...“
„Turks, the largest immigration population in Germany (1.5 million in these data) had an adjusted crime rate of 1.30)“
Not just overall crime rates but also crime profiles vary from country to country. For example, about 1.5 million people with Turkish citizenship live within the German borders, yet roughly 8% of murder suspects are Turks. Roughly twice as many people with Turkish roots live in Germany (3 million). So if the suspect rate for the other 1.5 million people were of similar size, about 16% of murder suspects would be members of an ethnic group that represents only about 3.6% of the population living within the German borders. (Another example: for the years 2012-2014, 7.6-8.5% of suspects accused of sexual assault and rape were Turkish citizens.)
So in certain cases it could be interesting to take a deeper look at the data and to supplement the “overall crime rate” with a “specific crime rate”. It could be that members of certain ethnic groups (e.g. Turks) have a highly increased risk of being involved in a certain set of severe crimes, but that these risk values flatten out in an overall crime rate. So perhaps Sarrazin's claim that “20% of crime in Berlin is committed by a group of about 1000 Turkish and Arab youths” relates to a specific set of crimes rather than to an overall crime rate.
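The share arithmetic in the murder example can be spelled out as below. This is a rough sketch using only the round figures quoted in this review; the total population is not stated anywhere in the thread and is back-calculated here from "3 million is about 3.6%", so it is an assumption, and the variable names are illustrative.

```python
def representation_ratio(suspect_share, population_share):
    """How many times larger a group's share of suspects is than its
    share of the population (1.0 means proportional representation)."""
    return suspect_share / population_share


# Round figures quoted in the review.
turkish_citizens = 1.5e6         # people with Turkish citizenship
turkish_roots = 3.0e6            # people with Turkish roots (citizens included)
suspect_share_citizens = 0.08    # share of murder suspects who are Turkish citizens
population_share_roots = 0.036   # share of the population with Turkish roots

# Assumption: total population implied by "3 million is about 3.6%".
implied_population = turkish_roots / population_share_roots

# If the other 1.5 million had the same suspect rate, the suspect share doubles.
suspect_share_roots = suspect_share_citizens * (turkish_roots / turkish_citizens)

ratio = representation_ratio(suspect_share_roots, population_share_roots)
print(f"implied population: {implied_population / 1e6:.1f} million")
print(f"hypothetical suspect share: {suspect_share_roots:.0%}")
print(f"representation ratio: {ratio:.1f}")
```

Under these figures the hypothetical group would account for a suspect share a bit more than four times its population share, which is the kind of crime-specific ratio that an overall crime rate could mask.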
Style:
Your paper is very data-focused and has a very clear writing style. Personally, I would have enjoyed it if, in addition to the analysis of data, your paper had contained more theory. But you mentioned in an earlier discussion that you prefer to write your papers in a data-focused way, so this is probably just a matter of personal taste.
IQ & Muslim %age:
“IQ remains a strong predictor in multivariate analysis, while Muslim % does not.”
This is a very interesting statement. In recent years Sarrazin has usually stated that Islam is responsible for the integration problems of many immigrants. Your analysis shows that, in the case of crime, national IQ is the more robust predictor.
Noah,
All typos were fixed. Pesky extend/extent, just like advise/advice.
Regarding point 4: I will attempt to do this. However, the Danish statistics agency has just updated its website and deleted the crime data table! It introduced several new tables that do not have the same age data I used before (far less detailed, thus introducing more age bias, and incompatible with my analysis code!). I have emailed them about this urgent problem. Normally they leave an archive version of the table, but apparently they forgot this time. This is very bad because it means that this research is not reproducible. I cannot even run the old analyses again because I did not save a local copy of the data (my mistake).
If they cannot be persuaded to add an archive of the table, I will request a copy of the old data so that we can keep the data open to reanalysis (by us or others).
Regarding point 5: formatting is something we do as the last step.
--
Erwin,
I will try to address the crime profile idea at a later point. Right now my R analysis code does not work because half the Danish data are missing and the code for Germany is intertwined with that for Denmark.
I have added a paragraph in the discussion with background theory.
Due to the limited number of countries (n=83) and somewhat questionable data quality (citizenship data), I advise against drawing strong conclusions based on a single dataset. Rather, it is best to view the overall results of multiple studies. The Nordic countries produced large effect sizes for Muslim%, while e.g. the old Dutch data did not. As mentioned in the Dutch paper, this seems to be mostly due to highly crime-prone immigrants from a non-Muslim country, i.e. Suriname (a past member of the Kingdom of the Netherlands, which therefore had easy immigration access).
The countries included in the sample matter quite a lot for the results, I think. Unfortunately, the datasets are limited by the available data, which means that they will not in general be the same across countries. This introduces a kind of weird non-random sampling variance between the studies. A future review/meta-analysis/integrative analysis will have to examine this pesky issue. It is outside the scope of the present paper, which is already long.
--
I slightly edited various parts, including adding the sample size to the abstract (should always be there!). Files are updated.
Just an update. I am talking with DST (Danish Stats Agency). They depublished the table due to concerns about privacy. It was possible to obtain very detailed data if one used all their breakdowns which might enable identification of persons. However, they have agreed to provide me with some data from the table. I am trying to get as detailed data as possible, at least as detailed as those used in this publication. Hopefully, they will also be able to provide me with updated data in the future.
I was able to secure data necessary to recompute the results in this study. They also provided me with the 2015 data, so the numbers for Denmark are very slightly different in some places.
I have updated the paper with the latest figures and numbers. In addition, some text was rewritten for clarity. I also added cross-national crime correlations with data from the Netherlands, Norway and Finland, such as they are. These were fairly high as expected, despite the small samples.
Erwin,
We respectfully decline to undertake more detailed analyses of the crime subtypes. This study already took a very long time to prepare and thus we would like to postpone any further studies of this question to another time. However, as the data are public, anyone is free to examine this question themselves.
The files have been updated on OSF.
I am satisfied with the re-drafted version, and I therefore approve the paper for publication.
I also approve this paper for publication.
The paper is interesting and adds usefully to what we know already from the Scandinavian countries. The methods seem to be OK, although there are of course serious limitations of the data. It would be nice if the Statistics Office would report where people were born and where their parents and grandparents were born. But that's something we can do nothing about. Especially interesting is the correction for sex and age for the country-level analysis based on individual-level data. I don't remember having seen that kind of procedure before. I only marked some minor typos and stylistic details in the paper. Otherwise it's ready to go, I would think. I approve.
We fixed the minor issues pointed out. Mostly grammar and missing/duplicate words (i.e. typical Emil errors).
Not sure what you mean by individual-level data. We had no such data. But we did have those age x sex data from Denmark which allowed for the validation of the new adjustment method.
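For readers unfamiliar with subgroup-based adjustment, the general idea can be illustrated with textbook indirect standardization on age x sex strata. This is a generic sketch and not the method or code used in the paper; the stratum labels, function names and all numbers below are invented for illustration.

```python
def expected_suspects(group_pop, reference_rates):
    """Suspects the group would have if each age x sex stratum
    offended at the reference population's rate for that stratum."""
    return sum(n * reference_rates[stratum] for stratum, n in group_pop.items())


def standardized_ratio(observed, group_pop, reference_rates):
    """Observed / expected suspects: the age-sex adjusted relative rate."""
    return observed / expected_suspects(group_pop, reference_rates)


def crude_ratio(observed, group_pop, reference_pop, reference_rates):
    """Unadjusted relative rate: group rate / overall reference rate."""
    group_rate = observed / sum(group_pop.values())
    ref_suspects = sum(n * reference_rates[s] for s, n in reference_pop.items())
    return group_rate / (ref_suspects / sum(reference_pop.values()))


# Invented toy numbers (NOT from the Danish or German data).
reference_rates = {"m_young": 0.10, "m_old": 0.02, "f_young": 0.02, "f_old": 0.005}
reference_pop = {"m_young": 2500, "m_old": 2500, "f_young": 2500, "f_old": 2500}
group_pop = {"m_young": 600, "m_old": 200, "f_young": 150, "f_old": 50}
observed = 80  # total suspects in the immigrant group

crude = crude_ratio(observed, group_pop, reference_pop, reference_rates)
adjusted = standardized_ratio(observed, group_pop, reference_rates)
print(f"crude relative rate:    {crude:.2f}")
print(f"adjusted relative rate: {adjusted:.2f}")
```

In this toy case the group is mostly young men, so much of its elevated crude rate disappears once its age and sex composition is taken into account.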
Published: https://openpsych.net/paper/50