I was rude enough to contact Harpending about our work.
He had many interesting things to say:
Hi Endre, thanks for the preprint. Your approach seems exactly right to me but I am no expert on statistics and data analysis. I do have some complaints that I will lay out below.
There is a large consortium that is pursuing GWAS for cognitive SNPs that you cite. I expect that you are their ongoing nightmare because you and Piffer are not afraid to finish their sentence, so to speak. You are wise to keep your head down but things may be getting better. I was at a meeting at the Chicago economics department a few months ago that included many of the GWAS bunch. It is probably worth a look at this link: there are videos of some of the talks. There was no self-righteous hostility, to my surprise, but everyone in the talks stepped gingerly, careful not to mention group differences. The discussions in the bar in the evening were much more open and explicit (and honest).
Here are some not-well-organized reactions to your draft:
I am likely being an old fogey, but "correlated to" just sounds wrong to me, but I do see that it is growing in popularity. I would say "correlated with".
You are not very clear about what you did. For example you performed PCAs but you did not lay out what data matrix you factored. I think of PCA starting with a matrix of rows that correspond to genes and columns that correspond to objects. What are your objects? Populations? What populations? I presume, early on, there are 3 rows corresponding to the GWAS hits?
What are the entries of the data matrix? We see two in the literature. If x_i is the frequency of the i'th allele in a population (or individual), are the entries of your data matrix x_i or x_i/(xbar_i(1-xbar_i))? There are arguments to be made for either one; the second is customary in some circles because the "speed" of frequency change under either drift or selection ought to be proportional to xbar_i(1-xbar_i). The Visscher lot uses the second normalization. I don't know whether or not this would make any difference at all with respect to the patterns that you find.
You should not report significance levels for cross-national findings since the data are autocorrelated. Are Denmark, the UK, Sweden, and Nigeria four nations or 2 nations? This error plagues cross-cultural research: anthropologists used to call it Galton's problem.
You speak of "selection" a lot but I can't see why. Of course the differences must reflect selection but the details are open to our imaginations. What selection do you see?
There are two, maybe 3, pockets of high IQ in the world. One is in NW Europe, one in NE Asia, and maybe one in south India. The European and Asian pockets seem to be different. It would be nice if you had, say, Steve Hsu's and James Lee's 23andMe scans to see if they have the same configuration as Venter and Watson. May be hard to get since Hsu and Lee both have ponies in this race.
My sense of what is going on in cognitive genetics and more generally is that the old old fight between the biometricians and the Mendelians is coming back full steam and that we will see a period of Galton coming back as he certainly ought to.
Best, Henry
From this it seems that partnering with Lee/Hsu might not be such a terrible idea; the question is whether they would want to.
Ps. I have systematically been calling this your idea, but our work. Does that sound right to you? It is not like the stuff I have to do is self-explanatory (indeed, some of it isn't explained anywhere), so I am doing some creative work too, even though I admit to not having had anything to do with developing the idea or the theory.
Psps. I contacted him some time ago, before you said you wanted to wait. He likes the idea and the work, so perhaps he could put in a good word for us with Greg? I'll post my reply to Harpending here, before sending it to him so that you can include stuff you want to ask about or similar.
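For the reply, it may help to have the data-matrix question pinned down. Below is a minimal sketch of the two choices Harpending describes: rows are SNPs, columns are populations, and the entries are either the raw frequencies x_i or x_i divided by xbar_i(1-xbar_i), taken literally from his email. The frequencies are invented for illustration, the function names are hypothetical, and running the PCA through an SVD of the row-centered matrix is my assumption, not necessarily what we did in the draft.

```python
import numpy as np

# Hypothetical allele frequencies: rows are SNPs, columns are populations.
# These numbers are invented purely for illustration.
freqs = np.array([
    [0.10, 0.20, 0.40, 0.50],
    [0.30, 0.35, 0.60, 0.70],
    [0.80, 0.75, 0.55, 0.50],
])

def scale_by_heterozygosity(x):
    """Divide each SNP row by xbar * (1 - xbar), where xbar is the row mean.

    This follows the formula as written in the email; it is an assumption
    that this is the exact form intended.
    """
    xbar = x.mean(axis=1, keepdims=True)
    return x / (xbar * (1.0 - xbar))

def pca_scores(x):
    """PCA of the populations (columns) via SVD of the row-centered matrix."""
    centered = x - x.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)   # share of variance per component
    scores = vt.T * s                 # one row per population
    return scores, explained

raw_scores, raw_var = pca_scores(freqs)
scaled_scores, scaled_var = pca_scores(scale_by_heterozygosity(freqs))

print("variance explained (raw):   ", np.round(raw_var, 3))
print("variance explained (scaled):", np.round(scaled_var, 3))
```

Comparing the first column of raw_scores and scaled_scores on the real data would show directly whether the normalization changes the pattern across populations.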