
# [ODP] Increasing inequality in general intelligence and socioeconomic status as a res

You are again arguing for some narrow definition of model. This is not the only way the word is used in science.

One does not need actual comparison data for modeling. In many cases, such data are not actually available... which is also why one is doing the modeling in the first place. In this case, there are population data available and one data point from the army study, which the model results can be, and are, compared to in the study. So, even by your narrow definition, it is still modeling.

The model* also makes predictions for data not yet publicly available, i.e. what the mean IQ should be in the immigrant population.

* ... or models, depending whether you want to call it 1 model with 4 parameters, or 4 models. We talk about them as 4 models in the paper, but it is perhaps better to call it one model with 4 parameters.
I thought I had insisted enough on this. I did not say model. I said statistical model. There's a huge difference between these terms. Your Figure 6 is said to be modeled. As I told you before, and as illustrated by the quote from Field (2009), a statistical model, by definition, implies several assumptions (e.g., constraints). But if what you do in Figure 6 is a simple computation of means, there is no assumption made here (at least, I don't see any). There are no parameters constrained to be zero, or equal to another parameter, etc.

Quote:or models, depending whether you want to call it 1 model with 4 parameters, or 4 models. We talk about them as 4 models in the paper, but it is perhaps better to call it one model with 4 parameters.

In models with weak/medium/strong gains, perhaps I can accept the use of statistical modeling, because you're predicting the IQ of the immigrants given some assumed values of IQ gains. You made an assumption here. But in the no g gain model (Figure 6) there is no such thing. It's only descriptive because it's your observed data. The models with weak/medium/strong gains are not your observed data.
It is not a statistical model in the sense that is used for e.g. latent variable modeling/structural equation modeling.

The four parameters make assumptions, namely, that each parameter is what it says it is. The assumption of the no-gains model being that... there are no gains due to environment.
1) As I said earlier, there should be some data on the countries of origin of the immigrant population. Most readers have no idea who actually moves to Denmark. At the very least, there should be basic information like "x% of the immigrant population is of non-European origin and y% of European origin as of 2014." Generally, non-European immigration would be expected to increase inequality more, given that IQ levels are relatively uniform across Europe.

2) Sample sizes should be indicated in Table 1, or the caption should at least mention that Ns range from x to y countries.

3) "Then, for each year, we calculated the composite population using the population data and their IQs"

This could be expressed more clearly, e.g., "for each year, we estimated the composite IQ distribution by modeling the effect on Danish IQ of changes in the composition of the population, based on the national IQs of the immigrants' countries of origin."

4) "one where there are large gains, one with medium gains and one with small gains. Concretely, we modeled these as the immigrants closing the g gap to the IQ of Denmark by 25%, 50% and 75% respectively"

The percentages should be presented from large to small as in the preceding sentence.

5) Language should be improved -- for example:

a) "We think the immigration to western countries leads to a policy conundrum"

We think that immigration to Western countries...

b) 'western' should be capitalized throughout

c) "Immigration will lead to higher socioeconomic inequality in the countries"

Immigration will cause higher socioeconomic inequality in Western countries

d) "There are two parts of the spatial transferability hypothesis."

... two parts TO the spatial...

e) "Comparing with cognitive data from the military draft"

Comparison with cognitive data...

6) If these issues are dealt with, I approve publication.
Dalliard,

Thanks for the review.

Quote:1) As I said earlier, there should be some data on the countries of origin of the immigrant population. Most readers have no idea who actually moves to Denmark. At the very least, there should be basic information like "x% of the immigrant population is of non-European origin and y% of European origin as of 2014." Generally, non-European immigration would be expected to increase inequality more, given that IQ levels are relatively uniform across Europe.

We have added a table to a new subsection in 1.1, that gives the top 10 countries by 10 year intervals as well as their relative percentages.

Quote: 2) Sample sizes should be indicated in Table 1, at least mention in the caption that Ns range from x to y countries.

Added "Sample sizes range from 119 to 154 with a mean of 130" to the caption.

Quote: 3) "Then, for each year, we calculated the composite population using the population data and their IQs"

This could be expressed more clearly, e.g., "for each year, we estimated the composite IQ distribution by modeling the effect on Danish IQ of changes in the composition of the population, based on the national IQs of the immigrants' countries of origin."

Quote:Then, for each year, we estimated the composite IQ distribution by modeling the effect on Danish IQ of changes in the composition of the population, based on the national IQs of the immigrants' countries of origin (using the same national IQ data as previously). The plot of the results is shown in Figure 6.
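The composite calculation described in that sentence can be sketched as follows. This is only a minimal illustration, not the paper's actual code: the function name, the group counts, the group means and the within-group SD of 15 are all hypothetical assumptions. It treats the population as a mixture of origin groups and computes the mixture mean and SD.

```python
from math import sqrt

def composite_mean_sd(groups):
    """Mean and SD of a population made up of several origin groups.

    groups: list of (population_count, mean_iq, sd_iq) tuples.
    All inputs used below are hypothetical, not the paper's data.
    """
    total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total
    # Second moment of the mixture: weighted average of (variance + mean^2)
    second_moment = sum(n * (s ** 2 + m ** 2) for n, m, s in groups) / total
    # Mixture variance = second moment minus squared mixture mean
    return mean, sqrt(second_moment - mean ** 2)

# Hypothetical example: 90% of the population at mean 100, 10% at mean 85,
# with an assumed within-group SD of 15 for both groups.
groups = [(9000, 100.0, 15.0), (1000, 85.0, 15.0)]
mean, sd = composite_mean_sd(groups)
```

The SD of the composite exceeds the within-group SD whenever the group means differ, because the between-group variance is added on top of the within-group variance.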

Quote: 4) "one where there are large gains, one with medium gains and one with small gains. Concretely, we modeled these as the immigrants closing the g gap to the IQ of Denmark by 25%, 50% and 75% respectively"

The percentages should be presented from large to small as in the preceding sentence.

Fixed.
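For readers who want the gain scenarios spelled out, here is a minimal sketch assuming that "closing the gap by X%" means moving the origin-country IQ X% of the way towards the host-country IQ. The function name, the dictionary and the example IQ values are hypothetical and are not taken from the paper.

```python
def adjusted_iq(origin_iq, host_iq, gap_closed):
    """IQ after closing a fraction `gap_closed` (0 to 1) of the gap to the host country."""
    return origin_iq + gap_closed * (host_iq - origin_iq)

# Fraction of the gap closed in each scenario, ordered from large to small,
# plus the no-gains scenario.
SCENARIOS = {
    "large gains": 0.75,
    "medium gains": 0.50,
    "small gains": 0.25,
    "no gains": 0.0,
}

# Hypothetical example: origin-country IQ of 85, host-country IQ of 100.
results = {name: adjusted_iq(85.0, 100.0, frac) for name, frac in SCENARIOS.items()}
```

With these made-up inputs the four scenarios give 96.25, 92.5, 88.75 and 85.0 respectively.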

Quote:5) Language should be improved -- for example:

a) "We think the immigration to western countries leads to a policy conundrum"

We think that immigration to Western countries...

Changed to:
Quote:We think that immigration as it is happening right now to Western countries leads to a policy conundrum for some policy makers. Our argument is as follows:

Quote:b) 'western' should be capitalized throughout

Done.

Quote:c) "Immigration will lead to higher socioeconomic inequality in the countries"

Immigration will cause higher socioeconomic inequality in Western countries

Fixed.

Quote:d) "There are two parts of the spatial transferability hypothesis."

... two parts TO the spatial...

Changed to: The spatial transferability hypothesis has two parts.

Quote:e) "Comparing with cognitive data from the military draft"

Comparison with cognitive data...

Fixed.

---

I will have a native-speaker friend of mine read it through and see if she can find more instances where the language can be improved.

A new draft is available, version 10.
(2015-Jan-15, 04:06:08)Emil Wrote: The assumption of the no-gains model being that... there are no gains due to environment.

I said that the calculation made in your Figure 6 is just your observed data. And I have strongly insisted on observed data. The other models (those with IQ gains) are not your actual data. This is what is usually meant by assumption. Modeling is not descriptive statistics and is not your observed data. So, what you should have in the description under Figure 6 is "Figure 6: Change in mean IQ and SD over time in Denmark calculated from population data by country of origin and national IQs." Or you can use "computed". At least, if you don't say "modeled", it's fine.
Again, you are using an idiosyncratic narrow definition of what a model is. It does not make sense to alter the wording to fit that usage.

We do have a language rewrite on the way, so hopefully it will be somewhat better. It is hard for non-natives to write completely fluently, even for someone who speaks a closely related language (Danish and English both being Germanic languages, and there also being flow between them both recently and in Norse times).
(2015-Jan-21, 12:29:59)Emil Wrote: Again, you are using an idiosyncratic narrow definition of what a model is. It does not make sense to alter the wording to fit that usage.

That's a handwaving argument. In fact, it's not even an argument at all. For this to be an argument, such an affirmation should have been accompanied by some citations, which are absent from your post. By saying that my definition is an "idiosyncratic narrow definition" you're showing that you obviously don't know what a model is. Because the definition I have given you is actually not "my" definition. It's a logical conclusion anyone can derive from what statisticians are writing. So, affirming that I'm distorting the definition of a statistical model proves that you don't understand the manner in which statisticians use the term "statistical model".

Anyone who reads enough papers on this matter will certainly notice that statisticians (and even non-statisticians) quite often use a sentence like this: "the models are fitted against the data". That's the perfect occasion for asking you this question: why do you think they say "models are fitted against the data"? The answer is obvious. They make a distinction between the statistical models (unobservables) and the observed data (observables).

In your paper, what you have is:

model0 = observed data
model1 ≠ observed data
model2 ≠ observed data
model3 ≠ observed data

Thus, while models 1-3 (IQ gains) can be statistically tested against each other with respect to the data, this is not the case for model0 (no IQ gains). Models 1-3 can be said to be approximations of the observed data, but not model0. Thus, model0 violates the definition of a statistical model. By definition, a statistical model can be "statistically tested" with respect to the data. That is the purpose of a statistical model, i.e., to know how well a given model approximates the data. And a model (e.g., model0) which is equivalent to the observed data cannot be "statistically tested" because model0 = data. No one can say, for example, that model0 has better model fit than models 1-3, even if it's the most accurate description of your data (which is not difficult because model0 = data). Every model can fail the statistical test when it is inconsistent with the data; and the possibility of failure applies to models 1-3 but not to model0, because, once again, model0 = data.
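The point can be illustrated with a minimal sketch. All numbers are hypothetical and the function name is made up for this example; it takes the premise stated above (model0 = data) as given and is not a reproduction of anything in the paper. Predictions that differ from the observed value can be ranked by their error, whereas a "model" that is the observed value has zero error by construction, so the comparison tells us nothing about it.

```python
def rank_by_error(predictions, observed):
    """Return (name, absolute error) pairs sorted from smallest to largest error."""
    errors = {name: abs(pred - observed) for name, pred in predictions.items()}
    return sorted(errors.items(), key=lambda item: item[1])

# Hypothetical observed value and hypothetical model predictions.
observed = 90.0
predictions = {
    "model0": observed,  # identical to the data, as argued above
    "model1": 88.75,
    "model2": 92.5,
    "model3": 96.25,
}
ranking = rank_by_error(predictions, observed)
```

Here model0 necessarily comes first with an error of exactly 0, while models 1-3 could each have come out better or worse depending on the observed value.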

Have you ever heard the following saying, from the statistician George E. P. Box?

Quote:Essentially, all models are wrong, but some are useful.

And I have seen several economists quoting him, in order to make clear what a model is. What this sentence reveals is that a model necessarily incorporates a degree of inexactness. That's what I meant earlier by approximations. It is only when models are approximations that they can be compared and tested against each other. As others have asked, how can we not compare models?

If you don't trust my words, perhaps you will trust the words of others. Models are expressed as equations, and understood as approximations with regard to the data. For instance:

Nachtigall et al. 2003 p. 4
(Why) Should We Use SEM? Pros and Cons of Structural Equation Modeling

Jeffrey M. Wooldridge 2012 pp. 3-5
Introductory Econometrics: A Modern Approach

Konishi & Kitagawa 2008 p. 4
Information Criteria and Statistical Modeling (Springer Series in Statistics)

Rex B. Kline 2011 pp. 8, 16
Principles and Practice of Structural Equation Modeling

Sheldon M. Ross 2010 p. 540
Introductory Statistics (3rd edition)

Marloes Maathuis 2012
1. Role of statistical models

Quote:Model is by definition a simplification of (a complex) reality.

Anu Maria 1997
Introduction to Modeling and Simulation

Quote:Modeling is the process of producing a model; a model is a representation of the construction and working of some system of interest. A model is similar to but simpler than the system it represents. One purpose of a model is to enable the analyst to predict the effect of changes to the system. On the one hand, a model should be a close approximation to the real system and incorporate most of its salient features. On the other hand, it should not be so complex that it is impossible to understand and experiment with it

Galit Shmueli 2010
To Explain or to Predict?

Quote:Exploratory data analysis (EDA) is a key initial step in both explanatory and predictive modeling. It consists of summarizing the data numerically and graphically, reducing their dimension, and “preparing” for the more formal modeling step.

...

2.6.1 Validation. In explanatory modeling, validation consists of two parts: model validation validates that f adequately represents F, and model fit validates that f̂ fits the data {X, Y}. In contrast, validation in predictive modeling is focused on generalization, which is the ability of f̂ to predict new data {Xnew, Ynew}.

...

The top priority in terms of model performance in explanatory modeling is assessing explanatory power ... In contrast, in predictive modeling, the focus is on predictive accuracy or predictive power, which refer to the performance of f̂ on new data.

Cosma Shalizi 2011
Evaluating Statistical Models

Quote:Using a model to summarize old data, or to predict new data, doesn't commit us to assuming that the model describes the process which generates the data. But we often want to do that, because we want to interpret parts of the model as aspects of the real world. We think that in neighborhoods where people have more money, they spend more on houses - perhaps each extra \$1000 in income translates into an extra \$4020 in house prices. Used this way, statistical models become stories about how the data were generated. If they are accurate, we should be able to use them to simulate that process, to step through it and produce something that looks, probabilistically, just like the actual data. This is often what people have in mind when they talk about scientific models, rather than just statistical ones.

An example: if you want to predict where in the night sky the planets will be, you can actually do very well with a model where the Earth is at the center of the universe, and the Sun and everything else revolve around it. You can even estimate, from data, how fast Mars (for example) goes around the Earth, or where, in this model, it should be tonight. But, since the Earth is not at the center of the solar system, those parameters don't actually refer to anything in reality. They are just mathematical fictions. On the other hand, we can also predict where the planets will appear in the sky using models where all the planets orbit the Sun, and the parameters of the orbit of Mars in that model do refer to reality.

SAS/STAT® 9.2 User's Guide, Second Edition

Quote:Obviously, the model must be "correct" to the extent that it sufficiently describes the data-generating mechanism

Topics in Statistical Data Analysis: Revealing Facts From Data

Quote:The following figure illustrates the statistical thinking process based on data in constructing statistical models for decision making under uncertainties.

Mueller & Hancock 2007
Best Practices in Structural Equation Modeling

Quote:A central issue addressed by SEM is how to assess the fit between observed data and the hypothesized model, ideally operationalized as an evaluation of the degree of discrepancy between the true population covariance matrix and that implied by the model's structural and nonstructural parameters. As the population parameter values are seldom known, the difference between an observed, sample-based covariance matrix and that implied by parameter estimates must serve to approximate the population discrepancy.
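To make "degree of discrepancy" concrete: one widely used choice in SEM is the maximum likelihood fit function. The quote does not name a specific function, so treat this as an illustrative assumption. Here $S$ is the observed sample covariance matrix, $\Sigma(\hat\theta)$ is the covariance matrix implied by the parameter estimates, and $p$ is the number of observed variables:

$$F_{ML} = \ln\left|\Sigma(\hat\theta)\right| + \operatorname{tr}\!\left(S\,\Sigma(\hat\theta)^{-1}\right) - \ln\left|S\right| - p$$

$F_{ML}$ is zero when the model-implied matrix reproduces $S$ exactly and grows as the two matrices diverge.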

Kenneth A. Bollen 1989 pp. 68, 72
Structural Equations with Latent Variables

Quote:Model-reality consistency is a more "slippery" issue. Here the question is whether the model mirrors real-world processes. For instance, does an econometric model of the U.S. economy really correspond to the behavior of the economy? Fully assessing model-reality consistency is not possible since it presupposes perfect knowledge of the "real" world with which to evaluate the model. In practice, we imperfectly evaluate model-reality consistency in several ways. One is comparing the predictions implied by a model to those observed in a context different from the data that supply the model parameter estimates. For instance, we might check the realism of an econometric model by contrasting its predictions of inflation rates to those observed in the future. If we are fortunate enough to be able to manipulate variables in the model, we can do so and see if the model correctly predicts the consequences. Or, we can examine the assumptions and relations embedded in a model and debate their validity based on other experiences or insights.

It is tempting to use model-data consistency as proof of model-reality consistency, but we could be misled by so doing. The problem lies in the asymmetric link between these two consistency checks. If a model is consistent with reality, then the data should be consistent with the model. But, if the data are consistent with a model, this does not imply that the model corresponds to reality.

[...]

In sum, structural equation models face the same restrictions as other empirical methodologies. We can only reject a model - we can never prove a model to be valid. A good model-to-data fit does not mean that we have the true model.

The last paragraph helps to better understand why models are not actual data. Since all models are "wrong", so to speak, the best-fitting model is not proof that it is the true model, as all models are approximations.

And finally, the best one is this blog article:

The True Meaning Of Statistical Models

Briggs (2014) has nicely summarized the essence of a typical statistical model: "Why substitute perfectly good reality with a model?", "Because a statistical model is only interested in quantifying the uncertainty in some observable, given clearly stated evidence", "Every model (causal or statistical or combination) implies (logically implies) a prediction". Nothing could better illustrate all I have said earlier. A statistical model is an approximation, and thus is different from descriptive statistics. Unfortunately, your so-called statistical model of no gain has no uncertainty in it.

I repeat, the description in your figure 6 definitely needs to be rewritten.
I don't think we will reach agreement on this semantic issue.

I have asked Ken Kura to review the paper (as author-chosen reviewer). He has some criticism as well. I asked him to post it here on the forum for others to see (he sent it to my email). We will be revising the paper according to his criticism.

If we can get Dalliard, Kura, Meisenberg, and Fuerst or Piffer, then we will have 4 approvals.
Here's the new version, #11. It now has a proper introduction (about ½ page), another paragraph for results for immigrants only over time (requested by Kura), another paragraph in the discussion, a table with information about the largest countries of origin requested by Dalliard, and a lot of language edits thanks to Laird Shaw.

https://osf.io/dei73/
