Title:
New crime data for Norway, Finland and Italy
Journal:
ODP
Abstract:
I present new data for immigrant crime in Norway, Finland and Italy. For Norway and Finland it is found that crime rates are very predictable from the following country of origin variables: Islam rate, national IQ, GDP. Crime was not very predictable in Italy, but this seems to be due to bad data.
Authors:
Emil O. W. Kirkegaard
Source files are attached. Additional source files here:
Norwegian data
https://docs.google.com/spreadsheet/ccc?key=0AoYWmgpqFzdsdEJVbGZoVWwyT25GYUVuZlVoemsxQUE&usp=drive_web#gid=0
Danish data
https://docs.google.com/spreadsheet/ccc?key=0AoYWmgpqFzdsdDZhZHlVYm8tbjhkZVZzNzBCR2t6R3c&usp=drive_web&pli=1#gid=0
Italian data
https://docs.google.com/spreadsheets/d/1tMrSDx2MGtEFmOIqg2Vp8T1apG_6vPCrftfp3Dtvmxk/edit#gid=1063999899
"It does seem that the max-min ratio is a useful criteria for identifying the problematic datasets since every dataset in which predictors did not work had unrealistically high ratios." This conclusion is based only on 2 datasets (Italian and Norwegian). N = 2 is very small, so this is very poor evidence. Can you find previously published studies to confirm this speculative hypothesis?
"Both the Italian and Norwegian datasets are based on citizenship, rather than country of origin". This is not true. As the ISTAT website clearly says, the data are based on country of birth (http://dati.istat.it/Index.aspx?DataSetCode=DCCV_CONDGEO1&Lang=): "Territorio di provenienza geografica" means country of origin. This was further confirmed by an email from ISTAT in reply to my inquiry, which specified that their dataset refers to country of birth and not to nationality. As they explained, this accounts for the value of 0 for Kosovo: Kosovo became an independent country in 2008, so there cannot be any criminals born there after 2008, unless we assume that 0-3 year old kids are as criminal as adults.
"The most criminal group, Stateless, had a per capita crime rate of 0.850, while the least criminal group, Kosovo, had one of 0 as mentioned before. Removing Kosovo does not make things much better". All these errors should be amended before the paper is published.
Could you show a scatter plot for the Italian data so we can see what's happening?
Also, could you check whether the relation between crime and national IQ holds for these Australian data:
http://aic.gov.au/documents/E/1/E/%7BE1E2943C-1FB7-40D6-B85E-DFB354BE751A%7Dethnic.pdf
There is a plot of the crime rates in the linked Italian datafile.
Here's the male crime with IQ as indep. var.
Please use the quote feature. Reading quotes without that is hard.
"It does seem that the max-min ratio is a useful criteria for identifying the problematic datasets since every dataset in which predictors did not work had unrealistically high ratios."
"This conclusion is based only on 2 datasets (Italian and Norwegian). N = 2 is very small, so this is very poor evidence. Can you find previously published studies to confirm this speculative hypothesis?"
No. It is based on noting the regularity between 2 factors in all 5 available datasets.
In all datasets with spuriously high ratios, the predictors don't work well. In all datasets with realistic but high ratios, predictors work well. In all datasets based on citizenship proxy, predictors don't work well and there are spuriously high ratios. This is the crux of the reasoning.
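As a minimal sketch of the check described here, the max-min ratio of per capita crime rates could be computed as follows. The function name, the placeholder groups, and the choice to return infinity when the lowest rate is 0 are assumptions made for illustration. Only the Stateless (0.850) and Kosovo (0) figures come from this thread, and no cutoff for an "unrealistic" ratio is specified anywhere in it.

```python
import math


def max_min_ratio(rates):
    """Return max(rate) / min(rate) for a mapping of group -> per capita rate.

    Returns math.inf when the smallest rate is 0, because the ratio is
    undefined in that case (an assumption made for this sketch).
    """
    values = list(rates.values())
    if not values:
        raise ValueError("no rates given")
    lowest, highest = min(values), max(values)
    if lowest == 0:
        return math.inf
    return highest / lowest


# GroupA and GroupB are hypothetical placeholders. Stateless and Kosovo use
# the two figures quoted in the thread.
rates = {"Stateless": 0.850, "Kosovo": 0.0, "GroupA": 0.010, "GroupB": 0.002}

print(max_min_ratio(rates))  # ratio with the zero-rate group included

# Dropping the zero-rate group still leaves a very large ratio here.
without_zero = {group: rate for group, rate in rates.items() if rate > 0}
print(max_min_ratio(without_zero))
```

A ratio computed this way only flags a dataset for closer inspection. It does not show why the rates are extreme.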
"Both the Italian and Norwegian datasets are based on citizenship, rather than country of origin."
"This is not true. As the ISTAT website clearly says, the data are based on country of birth (http://dati.istat.it/Index.aspx?DataSetCode=DCCV_CONDGEO1&Lang=): 'Territorio di provenienza geografica' means country of origin. This was further confirmed by an email from ISTAT in reply to my inquiry, which specified that their dataset refers to country of birth and not to nationality."
The crime datafile has "Territorio di provenienza geografica" as you say, "Territory of geographical origin" (Google Translate).
The population datafile has "Dati: Popolazione straniera residente al 1° gennaio - focus sulla cittadinanza" - "Data: Foreign resident population on 1st January - focus on citizenship"
Calculating per capita crime rates from these two files assumes a close correspondence between country of origin and the matching citizenship. This assumption may not hold. Can you ask them for citizenship population size?
However, note that the Norwegian dataset 1 used citizenship for both the crime variable and the population size, and still the ratios were spuriously high. The Norwegian datasets 2-3 were based on country of origin data for the population and crime variables, which are not freely and publicly available but are available to researchers working for the Norwegian stats agency (SSB), whose publications I have relied on.
"The most criminal group, Stateless, had a per capita crime rate of 0.850, while the least criminal group, Kosovo, had one of 0 as mentioned before. Removing Kosovo does not make things much better."
"As they explained, this accounts for the value of 0 for Kosovo: Kosovo became an independent country in 2008, so there cannot be any criminals born there after 2008, unless we assume that 0-3 year old kids are as criminal as adults."
I see. I will add that to the paper if we cannot get better population count data.
"All these errors should be amended before the paper is published."
As you said, the data are not matched because immigrant population size is calculated on citizenship, whilst crime data are calculated on country of origin. So this may be at the root of the problem. Instead of saying that the data is broken, perhaps explain that you had to calculate the ratio from 2 separate datasets and that these didn't match (i.e. one citizenship and the other country of birth) and that this could account for the weird results. Also add the explanation for Kosovo provided by ISTAT.
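To make the mismatch concrete, here is a minimal sketch of how per capita rates come out when the numerator and denominator are keyed differently. All names and counts are hypothetical and are not taken from the ISTAT files. The sketch only assumes, as described above, that crime counts are recorded by country of birth and population counts by citizenship.

```python
def per_capita_rates(crime_counts, population):
    """Divide crime counts by population for groups present in both tables.

    Returns (rates, unmatched). `unmatched` lists groups that appear in only
    one table or whose population is 0, so no rate can be computed for them.
    """
    rates = {}
    unmatched = []
    for group in sorted(set(crime_counts) | set(population)):
        crimes = crime_counts.get(group)
        pop = population.get(group)
        if crimes is None or not pop:
            unmatched.append(group)
            continue
        rates[group] = crimes / pop
    return rates, unmatched


# Hypothetical counts: crimes keyed by country of birth, population keyed by
# citizenship. CountryB stands for a group in which many people born there
# hold another citizenship, so the citizenship denominator is too small.
crimes_by_birth_country = {"CountryA": 50, "CountryB": 10, "Kosovo": 0}
population_by_citizenship = {"CountryA": 1000, "CountryB": 20, "CountryC": 500}

rates, unmatched = per_capita_rates(crimes_by_birth_country, population_by_citizenship)
print(rates)      # CountryB's rate is inflated by its small denominator
print(unmatched)  # groups with no usable pair of counts
```

In this toy case the inflated rate for CountryB comes entirely from pairing a birth-country numerator with a citizenship denominator, which is the kind of artifact that matching data would rule out.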
I thought they matched before you pointed it out. :(
But I will rewrite based on this. Can you ask them for matching data? The best solution would be to get matching data based on both citizenship and country of origin.
I've emailed ISTAT asking for matching data.
If it is impossible to obtain matching datasets, then there are two options regarding the Italian data: 1) leave it out of the paper entirely, 2) keep it in but note its deficiency.
Which do reviewers prefer?
Imagine you found that the Italian results agreed with your theoretical perspective and then after that you found the mismatch. What would you do? Do that. Generally, you shouldn't assume that a transferability hypothesis is correct, or that it will hold for all countries:
(a) It might not hold at all, meta-analytically
(b) It might hold under some select conditions (for certain groups of destination countries)
(c) It might hold on average, but with some major moderators (e.g., Mediterranean countries)
(d) It might hold more or less consistently
At this point we are trying to collect data which we can later aggregate. We should be open to the possibility that our hypothesis is false and we should be constructively critical of the position. I would only approve this paper if you at least entertain possibilities (b) and (c) and conjecture why a transferability hypothesis might only hold for Scandinavian countries. Maybe we should try to look at non-European data. I came across some Japanese crime rates by nation of origin a while back; I'll see if I can locate the file.
I would include it, as I am wary about publication bias which also results from authors not reporting tests that did not come out with the preferred p-value.
I have asked Ken Kura (who I met in London at the Intelligence conference) to locate the Japanese data. I'll send him a new email.
I am withdrawing this paper because I am using the Norwegian data for another paper and waiting for the Italian statistics people to get back to us regarding useful Italian data.