Statistical Methods in Systems Biology - Genetics of Gene Expression
Kruglyak's paper showed that genetic mapping of gene expression (GE) was possible and introduced procedures for analyzing it.
Cheung's paper really started genetics of GE in humans.
Kruglyak's complexity paper took a shot at the rationale for the Human Genome Project. He argues that we can't find most of the loci that contribute to a phenotype such as gene expression.
1. What is the relation of his formula at the bottom of p. 1573 to the usual formula for heritability? Is his estimate reliable?
2. What is he trying to do with the elaborate simulation of various numbers of additive effects? The aim is to separate the true r-squared from the estimated r-squared.
3. Look up transgressive segregation and epistasis. Could these explain some of his results?
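To get a feel for question 2, here is a minimal sketch of that kind of simulation. It is not Kruglyak's actual procedure: the function names, the equal effect sizes, the unlinked haploid loci, and the detection threshold are all assumptions made for illustration. It simulates a trait controlled by many additive loci, "detects" loci by their single-marker r-squared, and compares the estimates for detected loci with the true per-locus value.

```python
import random
import statistics


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def simulate(n_segregants=100, n_loci=40, h2=0.8, threshold=0.1, seed=1):
    """Return (true per-locus r-squared, estimated r-squared of detected loci).

    Assumptions (hypothetical, for illustration only): haploid segregants,
    unlinked loci, allele frequency 0.5, and an equal additive effect of 1
    at every locus.
    """
    rng = random.Random(seed)
    genotypes = [[rng.randint(0, 1) for _ in range(n_segregants)]
                 for _ in range(n_loci)]
    # A 0/1 locus at frequency 0.5 with effect 1 contributes variance 0.25.
    genetic_var = 0.25 * n_loci
    # Choose the noise variance so that genetic_var / total_var equals h2.
    noise_sd = (genetic_var * (1 - h2) / h2) ** 0.5
    trait = [sum(g[i] for g in genotypes) + rng.gauss(0, noise_sd)
             for i in range(n_segregants)]
    true_r2 = h2 / n_loci  # each locus explains an equal share of h2
    estimates = [r_squared(g, trait) for g in genotypes]
    detected = [e for e in estimates if e >= threshold]
    return true_r2, detected


true_r2, detected = simulate()
print("true per-locus r-squared:", true_r2)
print("loci detected:", len(detected))
if detected:
    print("mean estimated r-squared among detected loci:",
          statistics.fmean(detected))
```

Because a locus is only reported when its estimate clears the threshold, the estimates for detected loci sit above the true value whenever the threshold itself exceeds the true per-locus r-squared. Varying `n_loci` and `n_segregants` shows how the number of detected loci and the size of this gap change.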
Dixon took Cheung's approach further using a bigger array, more individuals, and better analytics.
1. What is 'narrow sense' heritability? How does that differ from Kruglyak's definition?
2. How does the distribution of heritabilities compare with that of Kruglyak?
3. How do the results compare with Cheung's 2005 paper?
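For reference when working through question 1, the textbook definitions can be written as follows. The symbols are standard quantitative-genetics notation and are not taken from any of the papers: V_P is total phenotypic variance, V_G genetic variance, V_E environmental variance, and V_A, V_D, V_I the additive, dominance, and interaction (epistatic) components of V_G.

```latex
\begin{align}
V_P &= V_G + V_E, \qquad V_G = V_A + V_D + V_I \\
H^2 &= \frac{V_G}{V_P} \quad \text{(broad sense)} \\
h^2 &= \frac{V_A}{V_P} \quad \text{(narrow sense)}
\end{align}
```

Since V_A is only one part of V_G, narrow-sense heritability can never exceed broad-sense heritability for the same trait and population.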
Goring actually typed real lymphocytes, getting closer to reality but introducing other sources of variation.
1. Be clear on the difference in the genotype data between Dixon and Goring.
2. How comparable are the Dixon and Goring heritability results? What might explain the differences?
Lee, Pe'er et al. used the Brem & Kruglyak data to try to define 'modules' of genes with related function.
1. Go to the methods section at the end. What are they doing differently than Brem & Kruglyak? How does their approach differ from the eQTL approach we've seen so far?
2. Look up 'decision trees'. Why would they pick this form of non-linear statistical representation? What consequences would their choice have for what they can detect?
Their Supplementary Information contains some useful details.
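As background for the decision-tree question, the sketch below builds a tiny regression tree on made-up genotype data. It is a generic tree, not the method of Lee, Pe'er et al.; the function names, the toy data, and the three hypothetical markers A, B, C are assumptions for illustration. Expression is elevated only when markers A and B both carry allele 1, so the effect of B is conditional on A.

```python
from itertools import product


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, y, depth):
    """Greedy regression tree on 0/1 genotypes.

    rows: list of genotype tuples, one 0/1 entry per marker.
    y: expression value for each row.
    """
    mean = sum(y) / len(y)
    if depth == 0 or sse(y) == 0:
        return {"leaf": mean}
    best = None
    for m in range(len(rows[0])):
        left = [v for r, v in zip(rows, y) if r[m] == 0]
        right = [v for r, v in zip(rows, y) if r[m] == 1]
        if not left or not right:
            continue  # marker does not vary within this node
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, m)
    if best is None:
        return {"leaf": mean}
    m = best[1]
    lo = [(r, v) for r, v in zip(rows, y) if r[m] == 0]
    hi = [(r, v) for r, v in zip(rows, y) if r[m] == 1]
    return {
        "marker": m,
        "allele0": build_tree([r for r, _ in lo], [v for _, v in lo], depth - 1),
        "allele1": build_tree([r for r, _ in hi], [v for _, v in hi], depth - 1),
    }


def predict(tree, row):
    """Follow the splits for one genotype until a leaf is reached."""
    while "leaf" not in tree:
        tree = tree["allele1"] if row[tree["marker"]] == 1 else tree["allele0"]
    return tree["leaf"]


# Toy data: every combination of three markers (A, B, C); C is irrelevant.
rows = list(product([0, 1], repeat=3))
y = [2.0 if a == 1 and b == 1 else 0.0 for a, b, c in rows]
tree = build_tree(rows, y, depth=2)
print(tree)
```

On this toy data the tree splits on one marker and then, within one branch only, on the other. That nesting is how a tree represents an effect that depends on genetic context, which a one-marker-at-a-time scan would see only as two weaker marginal effects. The flip side is that a greedy tree picks each split by its immediate benefit, so a pair of markers with no marginal effect at all can be missed.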