Heritability 2.0. Genome-wide association studies (GWAS) have acquired a slightly negative connotation in the last two years because the results of these enormous efforts have been modest at best. Even though several hundred variants have been identified as susceptibility factors for various diseases, the identified genetic risk factors explain only a tiny fraction of the risk for these diseases. Much of what causes common and rare diseases is still unknown – there is a vast discrepancy between population estimates of the genetic contribution and the contribution explained through identified genetic risk factors. This phenomenon has been labeled the “missing heritability”. Now, a recent study using novel statistical tools for GWAS data finds that there is not that much missing after all…
The liability to schizophrenia – nature or nurture. Schizophrenia is a psychiatric disorder characterized by impairment of thought processes and lack of emotional responsiveness. The cause of schizophrenia is unknown, but twin studies and family aggregation studies point clearly towards a genetic contribution. In contrast to other neurodevelopmental disorders, a genetic contribution has always been a matter of debate in schizophrenia. For example, the concordance in twins has been used as an argument both for and against a genetic impact in schizophrenia. As of 2012, a modest genetic contribution to schizophrenia is beyond doubt and several GWAS have identified some genetic risk factors. However, as with most other disorders, the identified SNPs explain only a very small fraction of the liability, i.e. the tendency to develop schizophrenia. In the era of exome sequencing, this calls for a large-scale hunt for rare variants. However, we might be tricked by the polygenic nature of the disease.
Heritability and Francis Galton. As already mentioned in a previous post, heritability is a tricky subject that should best be avoided in modern molecular genetics. Heritability refers to the proportion of the variance of the liability to a given disease that is explained by genetic factors in a population. Traditionally, heritability is estimated through studies of twins or families, and the history of estimating heritability is tightly interconnected with the development of modern statistical methods. In fact, the name regression analysis is taken from Francis Galton’s observation that the height of sons of tall fathers regresses (i.e. goes back) towards the mean height in the population. He basically observed that tall fathers have tall sons, but that on average, the sons are shorter than their fathers. Galton’s observations were also the first attempt to assess the connection between the height of the parents and the height of the children. The concepts developed by Galton eventually resulted in our modern understanding of heritability. By the way, Galton also conceived the first twin studies.
Heritability through SNP data. Lee and colleagues now apply a novel concept to GWAS datasets on schizophrenia, trying to estimate heritability through SNP data. Basically, SNP data provide information about relatedness between individuals, which can be used to estimate the proportion of variation in susceptibility (liability) that is due to SNPs, i.e. a heritability estimate – a twin-free approach for estimating heritability. The authors pioneered this method on human height and found that 40% of the variation in human height is due to common SNPs. The statistical methodology behind this approach is not easy to understand, and the authors subsequently published a separate paper trying to explain the methodology of their study in simpler terms using a question-and-answer format. Applying this technique to schizophrenia, they now estimate the contribution of common SNPs to be 30% – a stark contrast to the <5% explained by the significant SNPs from GWAS.
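To make the idea of "relatedness from SNPs" a bit more concrete, here is a minimal sketch on simulated data. It is not the method used by Lee and colleagues: it swaps in Haseman-Elston regression, a simpler moment-based estimator, and it uses a quantitative trait rather than disease liability. All function names, the sample size, the number of SNPs and the true heritability of 0.5 are assumptions made up for illustration.

```python
import numpy as np


def standardize(genotypes):
    """Scale each SNP column to mean 0 and variance ~1 using sample allele frequencies."""
    freqs = genotypes.mean(axis=0) / 2.0
    return (genotypes - 2.0 * freqs) / np.sqrt(2.0 * freqs * (1.0 - freqs))


def genomic_relationship_matrix(genotypes):
    """SNP-based relatedness: mean product of standardized genotypes for each pair."""
    z = standardize(genotypes)
    return z @ z.T / genotypes.shape[1]


def simulate(n_individuals, n_snps, true_h2, rng):
    """Hypothetical data: unlinked SNPs, each with a small additive effect."""
    freqs = rng.uniform(0.1, 0.5, size=n_snps)
    genotypes = rng.binomial(2, freqs, size=(n_individuals, n_snps)).astype(float)
    genetic = standardize(genotypes) @ rng.normal(size=n_snps)
    genetic *= np.sqrt(true_h2) / genetic.std()  # genetic variance = true_h2
    noise = rng.normal(scale=np.sqrt(1.0 - true_h2), size=n_individuals)
    return genotypes, genetic + noise


def haseman_elston(grm, phenotype):
    """Regress phenotype products of all pairs on their SNP relatedness.

    The slope estimates the share of trait variance tagged by the SNPs.
    """
    y = (phenotype - phenotype.mean()) / phenotype.std()
    upper = np.triu_indices_from(grm, k=1)  # each pair once, no self-pairs
    slope, _intercept = np.polyfit(grm[upper], np.outer(y, y)[upper], 1)
    return float(slope)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes, phenotype = simulate(800, 400, true_h2=0.5, rng=rng)
    grm = genomic_relationship_matrix(genotypes)
    print(f"estimated SNP-heritability: {haseman_elston(grm, phenotype):.2f}")
```

The point of the sketch is that no single SNP is tested: pairs of individuals who happen to share slightly more alleles also tend to have slightly more similar phenotypes, and the strength of that trend is the estimate.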
30% vs. 5% – the true polygenic nature of human disease. The difference is perplexing. The known SNPs from GWAS explain little of the heritability, but looking at the global contribution of all SNPs, a large proportion of the tendency for the disease is explained. What exactly is happening here? GWAS genotype several thousand individuals for several hundred thousand markers per individual, but only a few markers are followed up, as stringent significance levels are required to distinguish real associations from genomic noise. This means that >99.99% of the data from GWAS is thrown away. Lee and colleagues simply look at the additive effects of all SNPs (contribution 30%) in contrast to the few SNPs that are significant (contribution 5%), and this makes the difference. Basically, much of the heritability is hidden in many individual SNPs, but each SNP has only a small, non-significant effect that is impossible to pinpoint in isolation. This finding sheds an interesting light on the genetic basis of disease. Much of the disease risk is in fact hidden in common SNPs, but the precise SNP is difficult if not impossible to identify.
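A rough back-of-the-envelope calculation shows why small effects stay invisible one SNP at a time. The sketch below assumes a quantitative trait, a normal approximation for the single-SNP test, and the conventional genome-wide significance level of 5e-8. The 10,000 contributing SNPs and the 20,000 individuals are hypothetical numbers chosen for illustration, not figures from Lee and colleagues, and `detection_power` is a made-up helper name.

```python
import math
from statistics import NormalDist


def detection_power(variance_explained, n_individuals, alpha=5e-8):
    """Approximate power of a two-sided single-SNP test (normal approximation)."""
    std_normal = NormalDist()
    z_threshold = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected size of the test statistic for a SNP explaining this share of variance.
    shift = math.sqrt(n_individuals * variance_explained)
    upper_tail = 1.0 - std_normal.cdf(z_threshold - shift)
    lower_tail = std_normal.cdf(-z_threshold - shift)
    return upper_tail + lower_tail


if __name__ == "__main__":
    # Assumption: 30% of the variance spread evenly over 10,000 common SNPs.
    per_snp = 0.30 / 10_000
    print(f"variance explained per SNP: {per_snp:.6f}")
    print(f"power with 20,000 individuals: {detection_power(per_snp, 20_000):.2e}")
    # For contrast, a hypothetical single SNP explaining 1% of the variance.
    print(f"power for a 1% SNP: {detection_power(0.01, 20_000):.3f}")
```

Under these assumptions almost none of the contributing SNPs would pass the threshold individually, even though together they account for a large share of the variance.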
Common versus rare variant – again. The study by Lee and colleagues brings up an old question again: should efforts be focused on applying novel deep sequencing technologies (exome or genome sequencing) or on larger GWAS? The fact that 90% of the common variants contributing to schizophrenia remain to be detected argues for larger GWAS rather than deep sequencing. Why look “deeper” when the answer is already in plain sight, but short of sufficient sample sizes? However, there is no guarantee that the contributing common variants will ever be detected. Maybe their contribution is too small and would require unreasonably large sample sizes. Maybe we will have to get used to the idea that there is in fact missing heritability – this time due to common variants that we see but cannot grasp.
Application to EuroEPINOMICS. It is tempting to apply the same methodology to existing epilepsy GWAS data on focal epilepsy and IGE/GGE to estimate the contribution of common variants to IGE/GGE or focal epilepsy. Also, the family of ion channel genes might be looked at separately to corroborate the channelopathy hypothesis for common epilepsies. While EuroEPINOMICS mainly focuses on next-generation sequencing technologies, common genetic variants and GWAS 2.0 may become interesting again in the future.
A brief disclaimer. I have used the term “heritability” quite loosely in this post. In fact, Lee and colleagues don’t estimate the heritability, but the proportion of the liability to schizophrenia that is due to common SNPs captured on the array that was used. This measure of “SNP-heritability” does not include the effect of SNPs that are not on the array or of rare variants. Also, their study looks at heritability in the narrow sense, not including non-additive effects, i.e. the effect of gene-gene interaction and gene-environment interaction. In short, much else out there can be causative as well.