Hi Meng,

Thanks for reviewing. The quotes below are from you unless otherwise specified.

Perhaps I missed something but this paper seems to me to be a continuity of your actual research, i.e., correlations between g and social variables. Thus I'm somewhat disconcerted about the title of the paper and the section 1. Introduction, which is only about data sharing. Why nothing about your previous research on g and social variables and your S-factor ? And even in the discussion section, still no word about your previous research. So, what is the purpose of the paper ? To illustrate the importance of data sharing ? Or that this data has some use for psychological research (which is debatable considering its limitations) ?

The paper is about presenting a new dataset. This is why the introduction and the title are about data sharing, and why we don't cite any of the research related to the S factor: no S factor analysis was carried out, nor were any of the typical socioeconomic variables analyzed (such as education, income or criminality).

The analyses in the paper are included only to showcase what kind of analyses one can do with the dataset and to show that one finds known results when doing so (successful calibration).

I don't understand how this was not clear to you. Let me know if you have any suggestions for how to make this more clear if you think it should be.

I'm skeptical. I wouldn't qualify as a test, one test composed of only 2-4 items. I think you should write about it in the limitation section.

There are 14 useable questions that can be used as items in a test. The matrix shows the intercorrelations between using tests with different numbers of these items. The trade-off is that using more of the questions results in more missing data but also more precise measurement. IRT is able to estimate scores for persons with missing data, so the trade-off is less grave than it would have been if one had used a method that required full data (such as ordinary factor analysis). If you are interested in the items, you can find them in the supplementary materials (*data/test_items.csv*).
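To illustrate the point about missing data, here is a minimal sketch of how an IRT model can score a person who answered only some of the items. It is not the analysis code used for the paper: it assumes a Rasch (1PL) model with a standard normal prior on ability, and the item difficulties and the function name `estimate_ability` are made up for the example.

```python
import math

def estimate_ability(responses, difficulties, iterations=50):
    """MAP ability estimate under a Rasch (1PL) model with a N(0, 1) prior.

    responses: list of 1 (correct), 0 (incorrect) or None (not answered).
    difficulties: hypothetical item difficulties, one per item.
    """
    # Only the items the person actually answered enter the likelihood.
    answered = [(x, b) for x, b in zip(responses, difficulties) if x is not None]
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for _, b in answered]
        # Gradient and (negated) curvature of the log posterior.
        gradient = sum(x - p for (x, _), p in zip(answered, probs)) - theta
        curvature = sum(p * (1.0 - p) for p in probs) + 1.0
        theta += gradient / curvature  # Newton-Raphson step
    return theta

difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]  # made-up values
full_data = estimate_ability([1, 1, 1, 0, 0], difficulties)
with_missing = estimate_ability([1, None, 1, None, 0], difficulties)
print(round(full_data, 2), round(with_missing, 2))
```

Both persons receive a score. The one with missing answers is simply estimated from fewer items, so the estimate is less precise and is pulled more strongly toward the prior mean of 0.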

I added a paragraph in the limitations section about the items:

*The cognitive ability data is limited to about 14 items with sufficient amount of data. This necessarily limits the reliability of the measurement. Furthermore, as far as we know, these items have not been validated against known test batteries or used in any other studies.*

Let me know if this is satisfactory to you.

That would be much clearer if you write "between the most and least religious groups".

I have added *groups*.

Even if the graph (and some of the following ones) suggests this conclusion, I won't use such wording "linear relationship" when the variable is not nominal. If you have a 3-category variable, 1 "no", 2 "neither", 3 "yes", a line that looks linear shouldn't be qualified as a linear relationship in my opinion.

I think you meant to say *continuous*. I think it's alright to say it's linear if the scale is a Likert or similar which is plausibly interpreted as being close to interval. I think this is the case for the analyses we present. For instance, I think the 4 point scale in Figure 6 is pretty plausibly interpreted as being interval scale or close to:

- Extremely important
- Somewhat important
- Not very important
- Not at all important

Note that a violation of interval scale would be unlikely to result in a linear relationship as seen. It's easier to make a relationship non-linear than linear.

Furthermore, note that the analysis in Figure 8 does not display a linear relationship despite using the same answer options. Thus, it's possible to get both linear and non-linear looking results with these answer options.
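A small sketch of the interval-scale argument, under the simplifying assumption that the outcome is exactly linear in an underlying trait. The spacings and function names below are made up for illustration and are not taken from the dataset.

```python
def adjacent_differences(values):
    """Differences between consecutive values; all equal means a straight line."""
    return [round(b - a, 6) for a, b in zip(values, values[1:])]

def group_means(latent_positions, intercept=0.0, slope=1.0):
    """Mean outcome per answer option if the outcome is linear in the trait."""
    return [intercept + slope * position for position in latent_positions]

# The four answer options are plotted at equally spaced positions 1, 2, 3, 4.
equal_spacing = [0.0, 1.0, 2.0, 3.0]    # options are truly interval
unequal_spacing = [0.0, 0.2, 0.5, 3.0]  # options are ordinal only (made up)

print(adjacent_differences(group_means(equal_spacing)))
print(adjacent_differences(group_means(unequal_spacing)))
```

With equal spacing the steps between adjacent group means are identical, so the plot is a straight line. With unequal spacing the same underlying linear relationship shows up as a bent line.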

What do you mean ?

To calculate a correlation, one must be able to rank the possible values. However, how should one rank the answers "I would donate time" and "I would donate money"? It's not clear which one is the greatest sacrifice.
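To make this concrete, here is a minimal sketch with made-up numbers showing that the correlation depends entirely on which arbitrary order one assigns to the unordered answers. The answer labels, scores and codings are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

answers = ["none", "none", "time", "time", "money", "money"]
scores = [1, 2, 3, 4, 5, 6]  # made-up scores on some other variable

# Two equally defensible rankings of "time" versus "money".
coding_a = {"none": 0, "time": 1, "money": 2}
coding_b = {"none": 0, "money": 1, "time": 2}

r_a = pearson([coding_a[a] for a in answers], scores)
r_b = pearson([coding_b[a] for a in answers], scores)
print(round(r_a, 3), round(r_b, 3))
```

The two codings give clearly different correlations for the same data, which is why no single correlation can be reported for this question.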

I note that one should probably reorder the groups on the plot so that the None-answer is on the left. This was already done for the plots found in the supplementary materials, but the figure in the paper was not updated. This has been done now.

"time of birth in the year" ?

Effects of when time of birth falls within a year, e.g. January vs. February. The last clause is necessary because otherwise one might think it includes the difference between being born in 1962 vs. 1970, a cohort or age effect.

I will leave another comment later, I think, because I don't understand something about section 5.3. Which is, the use of p-values...

By the way, can you explain what the null hypothesis is about ? NH of what ? Specify it in your paper, also. It helps to clarify things.

What do you not understand about it?

The null hypothesis for a chi square test of this kind is always that the samples come from populations with the same distribution of answers, so it seems redundant to specify it explicitly. However, because you requested it, I have done it. The text now reads:

*It is possible to do a large-scale test of astrology using the OKCupid dataset by examining whether Zodiac sign is related to every question in the dataset. Zodiac sign is arguably a nominal variable and the questions are either ordinal (possibly interval-like) or nominal. Thus, to use all the questions, a test that can handle nominal x nominal variables was needed. We settled on using the standard chi square test because the goal was to look for any signal at all, not estimate effect sizes. This is a strong test because it is possible that there are effects of time of birth within a given year which are unrelated to Zodiac sign. For instance, being born in summer may be related to which kind of activities one takes part in at age 3 due to limitations of the weather, and the experiences from these activities may have a causal impact on one’s later personality.*

To clarify, the null hypothesis tested by the chi square test here is that the answers have the same frequency for all the 12 Zodiac populations. Figure 11 shows a density-histogram of the p-values.

Let me know if this is satisfactory.
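For reference, the statistic behind this test can be sketched as follows. This is a minimal standard-library illustration, not the code used for the paper. The counts are made up and the function name `chi_square_statistic` is hypothetical. Only three groups are shown for brevity, where the real table would have 12 rows, one per Zodiac sign.

```python
def chi_square_statistic(table):
    """Pearson chi square statistic and degrees of freedom for a contingency table.

    table: list of rows (one per group, e.g. Zodiac sign), each row holding
    the counts for every answer option of one question.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if answers have the same frequency in every group.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Made-up counts: every group answers "yes" one third of the time.
same_frequencies = [[10, 20], [20, 40], [30, 60]]
print(chi_square_statistic(same_frequencies))
```

When every group has the same answer frequencies, the statistic is zero. The p-value is then obtained by comparing the statistic with a chi square distribution with `df` degrees of freedom. If the null hypothesis holds for every question, the p-values across questions should be roughly uniformly distributed.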

---

I noted that there was some odd whitespace on page 9. I have fixed this.

I have added page numbers.

---

A new version will be uploaded shortly.