
Piffer's Freewas
Admin
I think the method is just insensitive for individuals with only 4-10 SNPs. Waiting for more SNPs to be discovered seems wise.
OK, so far we have excellent results for Watson and Venter, and negative results for Emily and the Italian lawyer. I believe the average odds ratio is still higher than 1, given that Emily's and the Italian lawyer's are around 1 and W+V's is around 1.3, so we're still hopeful. What matters is the average odds ratio across genomes, not individual flukes.
There are also explanations that could account for the bad results with 23andMe: 1) different genome versions are causing a problem; 2) only top scientists such as Watson and Venter are smart enough to give positive results, and Emily and the Italian lawyer are not as smart as they claim (by a simple statistical law of measurement error, anyone with a decent cognitive level can score high on an IQ test provided that they attempt it many times).
Can you try with my own genome?
Another possibility: 23andMe is crap, given its absurdly low cost.
If all the high IQ genomes we get from 23andMe give negative results, we will have to try a better-quality personal genomics service. I cannot really believe that Watson and Venter's results are just a fluke.
I can. :)


I don't care if you can. Your beliefs are not gonna affect my research agenda. Only the analysis of more genomes will reveal the truth. It's even possible that only Watson and Venter are genetically smart and those without good odds ratios are pumped up genetic dummies. IQ is not 100% genetic. Full stop.
I doubt the differing genome versions are causing a problem. And if 23andMe is lower quality, it is only going to be off less than 1% of the time. But neither of the two problems above should be lowering the odds ratios: since there are only two alternatives, the mistakes in the right and wrong direction should average out.

Another possibility: let Hsu in on this and let him evaluate the code/method to see if I am off somewhere. This isn't a trivial job (it will probably take weeks) since there are many hundreds of lines of code, but I will be available for support. And if he finds something that salvages the method he should get credit. But I do not think there is a problem anywhere.

The method did not work for Emil, the lawyer woman, Emily or Razib, all of whom should be much genetically smarter than their reference populations...

We could try making a PC with those 69 SNPs. Or wait for the hundred thousand genomes project (it is on its way).
Ps. I do not know anything about using full genomes and I do not have space for them anywhere without someone buying me a new computer so I do not think genome sequencing is the way to go. Using full genomes is both an enormous technical and logistic problem, neither of which I have the resources (including time) to solve.


I have already said many times that the individual results do not matter. What matters are the overall results... are the odds ratios positive after pooling all the genomes together? I would say so, thus the odds are still in my method's favor. You also have not tried my own genome yet.
After you've also done my genome, we will have to pool all of them together and calculate the overall odds ratios. This is the way to go... science is not concerned with individuals but with groups.
Saying the method does not work simply because it is not working on some people is like saying that if we find someone with high IQ who drops out of high school, IQ does not predict academic achievement. Pretty dodgy.
We need to compute the average odds ratios across genomes. As I said, SNPs are only a small portion of the genetic variation, thus it is perfectly likely that there will be many people who don't beat their reference population because of epigenetics, environment and other genetic variants that are not SNPs. Even Meisenberg agrees with me on this point.
I can also put it this way: say there is a correlation between odds ratios and IQ. This correlation will be somewhere in the 0.3-0.5 range, as the SNPs we look at account for 10-25% of the variance, to be optimistic. You need a decently powered study to detect such a correlation, at least 50 individuals. What you are imagining is a fantasy that has nothing to do with reality. You cannot get a significant correlation just by looking at 5 individuals. That'd be possible only if odds ratios had a 0.9 correlation with IQ, but that is unlikely. We'll just have to wait for more genomes to be collected.
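The sample-size point above can be checked roughly with the Fisher z approximation. This is only a sketch, not part of the Freewas pipeline: the function names and the two-sided 0.05 threshold are assumptions made for illustration, and the approximation is crude for very small n.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def power_for_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n people.

    Uses the Fisher z transform: atanh(sample r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)
    return (1 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)

def min_significant_r(n, alpha=0.05):
    """Smallest sample correlation that reaches significance with n people."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))

if __name__ == "__main__":
    for n in (5, 50):
        print("n =", n, "smallest significant r:", round(min_significant_r(n), 2))
        for r in (0.3, 0.4, 0.5):
            print("   true r =", r, "power:", round(power_for_correlation(r, n), 2))
```

Under this approximation, 5 individuals need a sample correlation close to 0.9 to reach significance, while 50 individuals give reasonable power only toward the upper end of the 0.3-0.5 range.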
Also, I know that Emily has scored 150 on gigi but she admitted to having scored lower on other IQ tests. I think the smartest individuals in our sample are Venter and Watson, whatever their IQ, so the results are actually promising.
Admin
Tegan's genome.
IQ: >145 (multiple certified tests, perfect SAT score)
height: 151 cm
Origin: American, Scottish last name.

http://emilkirkegaard.dk/en/wp-content/uploads/genome_Tegan_McCaslin_Full_20141004033745.zip
Current ideas: rerun pipeline when I have a batch of genomes to test, including the ones we are waiting for.
Also try just using the top ten or top hundred SNPs.

If this does not work, then I will go through the SNPs with an R2 higher than 0.9 and see which of these differ significantly between our brainiacs and the reference populations. These SNPs, if we find any, should be tested for significance in an unrelated batch of smarties.


Good ideas. I am currently waiting for only 1 genome, from a 150 IQ guy, though... I won't find any other genomes until my Indiegogo campaign has started.
It could also turn out that the only portion of IQ that has real-life impact is that which is due to genetics, and this is why Watson and Venter are so much smarter genetically than the others in the sample, who may instead be genetically normal and have "reprogrammed" their IQ through effort or environment/epigenetics. Just a speculation though... it will need a big sample to be tested.
Once we've got a decent sample of genomes it should be easy to run a single FE for all of them.
This is what I get on 23andMe:
This is an advanced view of all the uninterpreted SNP data from your chip. The data from 23andMe's Browse Raw Data feature is for research and informational use only. This data has undergone chip-wide quality review, and a subset of SNPs have been individually validated for accuracy. However, the majority of SNPs have not undergone this rigorous individual validation, and any SNP result obtained from this raw data should be independently verified.
Edit 1: On the other hand, the error rate is not very high: http://liorpachter.wordpress.com/2013/11/30/23andme-genotypes-are-all-wrong/
It's time to re-run the height study.

http://infoproc.blogspot.dk/2014/10/common-variants-and-and-biological.html

697 SNPs found.


I don't have access to this paper. Can you download it and post it here?
Admin
http://www.openpsych.net/forum/showthread.php?tid=146&pid=1899#pid1899

The supp. material table 1 has all the data you need. It has beta sizes, SNP ID and effect allele. Basically, just write some code to add up all the beta values for each genome. Tegan's genome is useful for this because she is 1.51 m tall, so she should have a low polygenic score.
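A minimal sketch of the "add up all the beta values" step, assuming the genome is a 23andMe-style raw file (tab-separated rsid, chromosome, position, genotype, with # comment lines) and the supplementary table has been reduced to (SNP ID, effect allele, beta) rows. The function names and the demo rsIDs and betas are invented for illustration and are not from the paper. It also assumes both files report alleles on the same strand, which would need checking on real data.

```python
def parse_23andme(lines):
    """Map rsid -> genotype string (e.g. 'AG') from 23andMe-style raw data lines."""
    genotypes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        genotypes[fields[0]] = fields[3]
    return genotypes

def polygenic_score(genotypes, effects):
    """Sum beta * (number of effect alleles carried) over SNPs present in the genome.

    Returns the score and how many SNPs actually contributed, because
    SNPs missing from the chip or reported as no-calls are skipped.
    """
    score = 0.0
    used = 0
    for rsid, effect_allele, beta in effects:
        genotype = genotypes.get(rsid)
        if genotype is None or "-" in genotype:  # absent from chip, or no-call ('--')
            continue
        score += beta * genotype.count(effect_allele)
        used += 1
    return score, used

# Invented demo data, not real SNPs or effect sizes.
demo_raw = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs_demo1\t1\t1000\tAA",
    "rs_demo2\t2\t2000\tAG",
    "rs_demo3\t3\t3000\t--",
]
demo_effects = [
    ("rs_demo1", "A", 0.03),
    ("rs_demo2", "G", -0.02),
    ("rs_demo3", "C", 0.05),
    ("rs_demo4", "T", 0.01),
]

demo_score, demo_used = polygenic_score(parse_23andme(demo_raw), demo_effects)
print(demo_score, demo_used)
```

Reporting the number of SNPs used alongside the score matters because a chip will not cover every SNP in the table, so scores from different chips are only comparable over the shared SNPs.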
Since rerunning the pipeline takes about as much (of my) time whether I use 1 or 10 people I'll run it when we have all the genomes (we are still waiting for one IIRC). You might not be able to reach me this week, but I'll be back after that.

Ps. why don't we try the freewas on height? If it works on that phenotype we will have a proof of concept and an easier time having our ideas accepted since height is much less controversial.

Who is Tegan?


There is not much genetic evidence that height was subject to different selection pressures across populations. All we know is that East Asians tend to be shorter but what about the other groups? Blacks living in the US are as tall as US Whites...
But yes, we can try the Freewas on height too. We'll have to compute the average frequencies for the 1KG populations for the list of SNPs attached in Emil's file. Then use that as the vector for the individual genomes.
Admin
Tegan is my girlfriend. I attached her genome earlier in the thread.

Another idea with the height data is to redo the analysis with many different numbers of SNPs, at least: 1:15. We want to know how sampling affects the results. If sampling is not found to be a problem for the height data with e.g. 10 SNPs where we have 700 SNPs now, then that is evidence that relying on 10 SNPs for g/edu. att. will not give spurious results either, thus countering the sampling error objection.
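The subsampling idea can be sketched as below. This is an illustration under assumptions, not the Freewas code: it uses a simple summed-beta score as a stand-in for the real analysis, all names are hypothetical, and the 700 SNPs and the genotypes are randomly generated rather than taken from the height paper.

```python
import random
from statistics import mean, pstdev

def score(genotypes, effects):
    """Sum of beta * effect-allele count over the given SNPs."""
    return sum(beta * genotypes[rsid].count(allele)
               for rsid, allele, beta in effects if rsid in genotypes)

def subsample_scores(genotypes, effects, k, draws=200, seed=0):
    """Per-SNP mean score for `draws` random subsets of k SNPs each."""
    rng = random.Random(seed)
    return [score(genotypes, rng.sample(effects, k)) / k for _ in range(draws)]

# Synthetic stand-in data: 700 made-up SNPs with random betas and genotypes.
data_rng = random.Random(1)
effects = [(f"snp{i}", "A", data_rng.gauss(0, 0.03)) for i in range(700)]
genotypes = {f"snp{i}": data_rng.choice(["AA", "AG", "GG"]) for i in range(700)}

spread = {}
for k in (10, 100, 700):
    scores = subsample_scores(genotypes, effects, k)
    spread[k] = pstdev(scores)
    print(k, "SNPs: mean per-SNP score", round(mean(scores), 5),
          "spread across draws", round(spread[k], 5))
```

Dividing by k keeps the scores comparable across subset sizes. If the spread at 10 SNPs is large relative to the differences between individuals, the sampling error objection stands; if it is small, that supports relying on few SNPs.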