
Proposal: Piffer method as software
http://database.oxfordjournals.org/content/2013/bat063.full

IQdb: an intelligence quotient score-associated gene resource for human intelligence
Abstract

Intelligence quotient (IQ) is the most widely used phenotype to characterize human cognitive abilities. Recent advances in studies on human intelligence have identified many new susceptibility genes. However, the genetic mechanisms involved in IQ score and the relationship between IQ score and the risk of mental disorders have won little attention. To address the genetic complexity of IQ score, we have developed IQdb (http://IQdb.cbi.pku.edu.cn), a publicly available database for exploring IQ-associated human genes. In total, we collected 158 experimental verified genes from literature as a core dataset in IQdb. In addition, 46 genomic regions related to IQ score have been curated from literature. Based on the core dataset and 46 confirmed linked genomic regions, more than 6932 potential IQ-related genes are expanded using data of protein–protein interactions. A systematic gene ranking approach was applied to all the collected and expanded genes to represent the relative importance of all the 7090 genes in IQdb. Our further systematic pathway analysis reveals that IQ-associated genes are significantly enriched in multiple signal events, especially related to cognitive systems. Of the 158 genes in the core dataset, 81 are involved in various psychotic and mental disorders. This comprehensive gene resource illustrates the importance of IQdb to our understanding on human intelligence, and highlights the utility of IQdb for elucidating the functions of IQ-associated genes and the cross-talk mechanisms among cognition-related pathways in some mental disorders for community.


I emailed the authors of that website a while ago, but they couldn't provide the IQ-increasing alleles for the SNPs they listed (because they didn't know them!). I guess their principal aim was to do pathway analysis.
2 Qs:

(1) Is Bartlett's necessary?

I was reading the vignette for the "psych" R package and it claimed that Bartlett's is "[m]ore useful for pedagogical purposes than actual applications."

(2) Is it desirable to mix 1KG and ALFRED populations in the same analysis?*

Remember that for ALFRED we use many proxy SNPs. Also, the sample/population sizes might be different in 1KG and ALFRED. Ninja edit: Lastly, the groups might be chosen at different levels of granularity (1KG mixes Japanese/Chinese IIRC, while ALFRED has several types of Chinese). Dunno if any of this matters.

---
* This is simple to do since both ALFRED and 1KG populations are represented as population objects in the script, but I do not know if it is a good idea for statistical reasons. (Populations are the groups like YRI, CEU, Ami, Hakka.) Each population also has a "Continent group" property, where I just use the 1KG classifications like AFR, EUR etc.
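For anyone following along, here is a rough sketch of what such population objects could look like, and how the two sources could be kept apart if mixing turns out to be a bad idea. This is not the actual script: the class name, field names, sample sizes and continent labels below are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Population:
    """Hypothetical stand-in for the script's population objects."""
    name: str             # e.g. "YRI", "CEU", "Ami", "Hakka"
    source: str           # "1KG" or "ALFRED"
    continent_group: str  # 1KG-style label; values below are illustrative
    sample_size: int      # placeholder numbers, not real sample sizes
    allele_freqs: dict = field(default_factory=dict)  # SNP id -> frequency


def split_by_source(populations):
    """Group populations by database so each source can be analysed alone."""
    groups = {}
    for pop in populations:
        groups.setdefault(pop.source, []).append(pop)
    return groups


populations = [
    Population("YRI", "1KG", "AFR", 100),
    Population("CEU", "1KG", "EUR", 100),
    Population("Ami", "ALFRED", "EAS", 40),
    Population("Hakka", "ALFRED", "EAS", 40),
]

by_source = split_by_source(populations)
for source, pops in by_source.items():
    print(source, [p.name for p in pops])
```

Keeping a `source` field on each object means the same code can run the analysis either pooled or per database, so the statistical question can be settled later without restructuring anything.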


No, it's not desirable to mix 1KG and ALFRED in the same analysis.
One of the uses of my method is as a tool in gene discovery. The factor score that I extracted from the 4 SNPs should be correlated with SNP allele frequencies throughout the genome. SNPs that are highly correlated with the factor score can be prioritized as candidates for future genome-wide association studies. Testing only these candidates greatly decreases the number of SNPs in a GWAS, which eases the multiple-testing problem (no massive Bonferroni correction) and so reduces the required sample size. As a further check, we could look up these alleles in Craig Venter's and James Watson's genomes (which are published online) and carry out a chi-square test to see whether they carry them at higher frequencies than the normal population.
It'd be helpful if someone could build software that does this automatically.
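A minimal sketch of the screening step, assuming the factor score and the allele frequencies are already available per population in the same order. All SNP ids, scores, frequencies and the 0.8 cutoff are invented for illustration; the function names are hypothetical, and this is not the method's actual code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def screen_snps(factor_scores, snp_freqs, min_abs_r=0.8):
    """Keep SNPs whose frequencies track the factor score, strongest first."""
    kept = []
    for snp_id, freqs in snp_freqs.items():
        r = pearson(factor_scores, freqs)
        if abs(r) >= min_abs_r:  # arbitrary cutoff, assumed for this sketch
            kept.append((snp_id, r))
    kept.sort(key=lambda item: abs(item[1]), reverse=True)
    return kept


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Made-up factor scores for four populations, and made-up frequencies.
factor_scores = [-1.2, -0.3, 0.4, 1.1]
snp_freqs = {
    "snp_a": [0.10, 0.30, 0.45, 0.60],
    "snp_b": [0.50, 0.48, 0.52, 0.49],
    "snp_c": [0.80, 0.55, 0.40, 0.20],
}

candidates = screen_snps(factor_scores, snp_freqs)
print(candidates)
print(bonferroni_alpha(0.05, len(snp_freqs)), bonferroni_alpha(0.05, len(candidates)))
```

A strongly negative r just means the other allele at that SNP tracks the factor score. The last line shows where the saving comes from: the per-test threshold is alpha divided by the number of SNPs tested, so fewer candidates means a less severe correction.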
Admin
This assumes that g alleles are correlated. That held in the studies you have carried out so far, but it may not hold for some, or even most, g SNPs.

Good idea tho.
Well, if there has been selection on intelligence (which seems to be the case), the alleles must be correlated. Moreover, we can use Venter's and Watson's genomes (or those of any other high-IQ people willing to share their genome) to check whether the alleles found with this method have higher frequencies in their genomes compared to the reference population. This would constitute further evidence that these genes are genuinely involved in raising IQ.
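One simple way the individual-genome check could be set up is as a pooled chi-square goodness-of-fit test: count how many copies of the candidate alleles a person carries and compare that with the count expected from the reference frequencies. This is only a sketch under strong assumptions (independent SNPs, made-up frequencies and counts, hypothetical function names), and with one or two genomes the power will be low.

```python
import math


def chi_square_1df_pvalue(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def allele_count_test(observed_count, reference_freqs):
    """Compare the candidate-allele copies carried with the expected count.

    observed_count: copies of the candidate alleles in one diploid genome
                    (0, 1 or 2 per SNP, summed over SNPs).
    reference_freqs: reference-population frequency of each candidate allele.
    """
    n_slots = 2 * len(reference_freqs)   # two allele slots per SNP
    expected = 2 * sum(reference_freqs)  # expected copies under the reference
    other_observed = n_slots - observed_count
    other_expected = n_slots - expected
    statistic = ((observed_count - expected) ** 2 / expected
                 + (other_observed - other_expected) ** 2 / other_expected)
    return expected, statistic, chi_square_1df_pvalue(statistic)


# Invented numbers: five candidate SNPs, and a genome carrying 7 of 10 copies.
reference_freqs = [0.3, 0.4, 0.5, 0.2, 0.6]
expected, statistic, p_value = allele_count_test(7, reference_freqs)
print(expected, statistic, p_value)
```

The test is two-sided, so the direction (observed above expected) has to be checked separately before reading a small p-value as "higher frequency".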
OSS.

But that is also the default, since with scripting languages (like R or Python) you run human-readable code, not machine code.

I'm starting another project for Piffer now so this is on hold