Many thanks for the review, Emil. Replies below.

The data are not available. I take it that this is due to them being secret/apply-only.

The BES data should be available for download. Please check this link.

Subject?

'Subjected to' means 'received' whereas 'subject to' means 'susceptible to', so the former is correct here.

I note that this uses the economist tradition of stating the findings in the introduction. Seems redundant as they were already mentioned in the abstract above, meaning they get repeated 3 times: abstract, intro, conclusion.

I have altered the relevant sentence so that it now says:

"the present study explores whether the political attitudes of British academics are indeed both more left-wing and more liberal than those of the general population."

Use ≈

This has been changed.

It would be more proper to call this "non-academics".

I have altered the relevant sentence so that it now says:

"Insofar as academics comprise such a small share of the sample (0.3%), the reference category for this variable can be considered to be the general population, although strictly speaking it represents all nonacademics (99.7%)."

Careful. Now you are contrasting Guardian readers vs. newspaper readers who are not Guardian readers.

I have added the following sentence:

"The reference category for this variable is therefore the population of individuals who read some other daily newspaper."

Was O a single item, or based on multiple OCEAN/Big five items?

I have added the following sentence:

"The latter measure is based on the Ten-Item Personality Inventory (TIPI; Gosling et al., 2003), and is included in the dataset as a single variable scaled from 0–10."

Judging from the table data, it seems likely that the p values are just calculated incorrectly.

The p-values for weighted estimates were computed by Stata, and I believe they are correct. Weighting affects the standard errors, as well as the point estimates.
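
To illustrate the general point that weighting changes standard errors as well as point estimates, here is a minimal Python sketch. It uses Kish's effective sample size as a rough approximation, which is an assumption made for illustration and not necessarily the calculation Stata performs. The outcome values, the weights, and the function name are all invented.

```python
from math import sqrt

def weighted_proportion(y, w):
    """Weighted proportion, approximate SE, and Kish effective sample size."""
    sw = sum(w)
    p = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Kish approximation: unequal weights shrink the effective sample size.
    n_eff = sw ** 2 / sum(wi ** 2 for wi in w)
    se = sqrt(p * (1 - p) / n_eff)
    return p, se, n_eff

# Hypothetical binary outcome and two hypothetical sets of weights.
y = [1, 1, 0, 0, 1, 0, 0, 0]
w_equal = [1, 1, 1, 1, 1, 1, 1, 1]
w_unequal = [4, 4, 1, 1, 1, 1, 1, 1]

p_eq, se_eq, n_eq = weighted_proportion(y, w_equal)
p_w, se_w, n_w = weighted_proportion(y, w_unequal)
print(p_eq, se_eq, n_eq)
print(p_w, se_w, n_w)
```

In this toy example the unequal weights move the point estimate and also reduce the effective sample size, so the standard error, and hence the p-value, changes too.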

Yes. It has to result from non-random sampling with some variable not among these or strongly related to them. Candidates?

Not sure, unfortunately.

Please quantify this, e.g. Pearson correlation.

I have added a footnote on p. 5, which states the following:

"The correlation between the unweighted distribution from the BES and the average of the two distributions from Understanding Society is r = .93 for both the broad and narrow definitions of party identity. By contrast, the correlation between the weighted distribution from the BES and the average of the two distributions from Understanding Society is r = .65 for the broad definition of party identity and r = .64 for the narrow definition."

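The calculation described in the footnote can be sketched in Python as follows. This is only an illustration: the party shares are placeholder values, not the BES or Understanding Society figures, and the function name `pearson_r` is my own.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical party-identity shares (percent); placeholders only.
survey_a = [30.0, 35.0, 10.0, 5.0, 20.0]
survey_b1 = [28.0, 33.0, 9.0, 6.0, 24.0]
survey_b2 = [32.0, 31.0, 11.0, 4.0, 22.0]

# Average the two comparison distributions, then correlate with the first.
survey_b_avg = [(p + q) / 2 for p, q in zip(survey_b1, survey_b2)]
r = pearson_r(survey_a, survey_b_avg)
print(round(r, 2))
```
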
Please report correlation matrix of all primary variables. Remember that some readers may be more interested in other predictors, e.g. for meta-analysis.

I would prefer not to report this. If readers want to find out the bivariate correlations, they can download the data, and run my Stata code.

Age^2 is not a good way to control for non-linear effects.

I have now included dummies for age quintiles in the models instead, but it made essentially no difference.
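
For readers unfamiliar with this specification, here is a minimal Python sketch of how quintile dummies can be built. It is not my Stata code: the ages are invented, the function name is hypothetical, and the cut points use the default method of Python's `statistics.quantiles`, which may differ slightly from Stata's.

```python
from bisect import bisect_right
from statistics import quantiles

def quintile_dummies(ages):
    """Return the four quintile cut points and one row of dummies per person.

    The lowest quintile is the omitted reference category, so each row
    holds four 0/1 indicators for quintiles 2 to 5.
    """
    cuts = quantiles(ages, n=5)  # four cut points splitting into five groups
    rows = []
    for age in ages:
        q = bisect_right(cuts, age)  # 0 = lowest quintile, 4 = highest
        rows.append([1 if q == j else 0 for j in range(1, 5)])
    return cuts, rows

# Hypothetical ages, one per respondent.
ages = list(range(20, 70))
cuts, rows = quintile_dummies(ages)
print(cuts)
print(rows[0], rows[-1])
```

Unlike a quadratic term, the dummies impose no functional form on the age profile; each quintile gets its own intercept shift relative to the reference group.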

Note that including education as a co-predictor is problematic... What about the overlap of measurement criticism? Some O items concern political stuff very similar or identical to your social items.

I have added the following statements on p. 4:

"Note that the reason for utilizing education and openness to experience is that each has been posited to at least partially account for the left-liberal skew of academia (see Gross, 2013; Duarte et al., 2014; Carl, 2017). I.e., it has been asserted that academics tend to have more left-liberal attitudes due to their higher education and greater openness to experience. Including these variables as covariates in a multiple regression analysis allows one to estimate how much of the skew they do in fact account for."

Are these R2 adjusted or not?

Non-adjusted, but it makes very little difference.
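
To show why the adjustment matters little in a large sample, here is a minimal Python sketch of the standard adjusted R2 formula. The values of R2, n, and k are hypothetical and are not taken from my models.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (plus intercept)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: a modest R2, a large sample, a handful of predictors.
r2 = 0.10
n = 20000
k = 10

adj = adjusted_r2(r2, n, k)
print(r2, adj, r2 - adj)
```

With n this large relative to k, the ratio (n - 1) / (n - k - 1) is very close to 1, so the two statistics are nearly identical.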

You cannot use OLS for a binary outcome! Please use a logistic model.

I got this criticism from another reviewer recently, and I disagree. So I will repeat what I said to that reviewer:

The linear probability model (LPM; i.e., OLS with a binary dependent variable) is widely used in the economics literature, and is now preferred to logit and probit by many econometricians. The two main reasons are greater interpretability and the absence of the small-sample bias that afflicts maximum likelihood estimation when fixed effects are specified.

The conventional criticism of the LPM, namely that predicted probabilities may fall outside the interval 0–1, is not relevant if one’s purpose is simply to estimate the marginal effect of an independent variable. As Wooldridge (2002) notes in his seminal textbook on econometrics (Econometric Analysis of Cross Section and Panel Data):

“If the main purpose is to estimate the partial effect of [the independent variable] on the response probability, averaged across the distribution of [the independent variable], then the fact that some predicted values are outside the unit interval may not be very important.”

Similarly, in the blog for their own econometrics textbook (Mostly Harmless Econometrics), Angrist and Pischke (2012) write:

“If the conditional expectation function (CEF) is linear, as it is for a saturated model, regression gives the CEF – even for LPM. If the CEF is non-linear, regression approximates the CEF. Usually it does it pretty well. Obviously, the LPM won’t give the true marginal effects from the right nonlinear model. But then, the same is true for the “wrong” nonlinear model! The fact that we have a probit, a logit, and the LPM is just a statement to the fact that we don’t know what the “right” model is. Hence, there is a lot to be said for sticking to a linear regression function as compared to a fairly arbitrary choice of a non-linear one! Nonlinearity per se is a red herring.”
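
The point about saturated models in the quotation above can be checked numerically. Below is a minimal Python sketch, using invented data and a hypothetical helper function, showing that with a single binary predictor the OLS intercept and slope reproduce the two group proportions exactly.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope for one predictor."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data: x = 1 for group members, y = 1 if the outcome occurs.
x = [1] * 10 + [0] * 40
y = [1] * 7 + [0] * 3 + [1] * 12 + [0] * 28

# Group proportions, i.e. the conditional expectation of y given x.
p1 = sum(b for a, b in zip(x, y) if a == 1) / x.count(1)
p0 = sum(b for a, b in zip(x, y) if a == 0) / x.count(0)

intercept, slope = ols_simple(x, y)
print(intercept, p0)
print(slope, p1 - p0)
```

Here the LPM slope is simply the difference in proportions between the two groups, which is the marginal effect of interest. With additional continuous covariates the model is no longer saturated, and the LPM becomes an approximation, as the quotation notes.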

Moreover, as the economist Marc Bellemare (2013) notes on his blog (please see also Allison, 2012; Smart, 2013): “The probit and logit are not well-suited to the use of fixed effects because of the incidental parameters problem.”

Allison, P. (2012). Logistic Regression for Rare Events. Statistical Horizons, available online.

Smart, F. (2013). Incidental Parameters Problem with Binary Response Data and Unobserved Individual Effects. Econometrics By Simulation, available online.

It's probably wise to cite something other than your own research.

I have cited Duarte et al. (2014).

The last question is arguably misincluded. It concerns the environment, not economics.

I have replaced this item with an item pertaining to deficit reduction.

Some questions are hard to interpret. You should include full texts.

Full wording for all items has been included in Appendix A.

Were the distributions of politics normal? Please include distribution plots.

The distributions have been included in the new Appendix B.

Latest files available here.