Heterogeneity. Family-based exome sequencing or trio exome sequencing for de novo mutations is currently the method of choice to identify genetic risk factors in neurodevelopmental disorders. However, given the increasingly recognized variability in the human genome, the hunt for causative de novo mutations is sometimes an uphill battle – it is impossible to distinguish causal mutations from random events unless genes are affected repeatedly. In a recent publication in Nature, Fromer and colleagues present the most comprehensive search for de novo mutations in schizophrenia to date. They observe an incredible genetic heterogeneity that reflects the genetic architecture of neurodevelopmental disorders.
Exome history lesson. Let’s start with a brief history of trio exome sequencing. When trio sequencing was first established for patients with unexplained neurodevelopmental disorders including intellectual disability or autism, there was great enthusiasm. This new method systematically queried the entire coding sequence of the genome in parents and the affected child. It identified a de novo mutation in the vast majority of patients, and at that time (2009/2010), any de novo mutation was still considered pathogenic. Only a few years later, we know that this is not true. Most individuals, affected by disease or unaffected, carry a new mutation with functional consequences in one of their genes that was not present in their parents. This is the new normal. Furthermore, the additional disease-related de novo events in diseases such as autism or intellectual disability are so rare that they do not change the overall frequency of de novo mutations when comparing affected versus unaffected siblings in cohorts of 100-200 families. Accordingly, it has become extremely difficult for any novel gene to be called causal. Either the gene is an already known disease gene, or it is found to be hit by de novo mutations in more patients than you would expect by chance. Neither of these two conditions is met in the genes that Fromer and collaborators identify in schizophrenia.
18 recurrent genes. In total, Fromer and collaborators performed trio exome sequencing on 623 parent-offspring trios with schizophrenia. They identified 637 de novo mutations in 617 probands and queried their data with respect to four hypotheses:
- Hypothesis 1: Is there an increase of de novo mutations?
- Answer: No.
- Hypothesis 2: Is there genic recurrence, i.e. are there genes affected more than would be expected by chance?
- Answer: Barely, see below.
- Hypothesis 3: Is there enrichment for particular gene sets?
- Answer: Yes, for synaptic proteins.
- Hypothesis 4: Is there enrichment for autism/ID genes?
- Answer: Barely.
In summary, the negative answer to hypothesis 1 and the barely positive answer to hypothesis 2 are particularly unexpected. In such a large cohort, we would have expected true candidate genes to recur on a frequent basis: even a gene mutated in 1% of patients should have been affected ~6 times. However, only 18 genes are mutated in two patients each. There are no triple hits, quadruple hits or more frequently mutated genes, not even genes such as Titin (TTN). And 14/18 genes have never been seen in any study of any neurodevelopmental disease before. I have compared the genes affected by de novo mutations and the 18 genes with double hits to all previous studies on trio exome sequencing in neurodevelopmental disorders. Click these links [double hits, all de novos] to download the tables. The information on previous studies was collated by the authors in the Supplementary Tables. I used this data for an R script and added the publicly available Epi4K data, RVIS and Path Score.
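The "~6 times" estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch and not part of the Fromer analysis: it assumes that every trio independently has the same 1% chance of carrying a de novo mutation in the hypothetical gene, so that the number of hits follows a binomial distribution. The names `N_TRIOS` and `CARRIER_FRACTION` are chosen here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent trios."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Probability of at most k hits in n independent trios."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

N_TRIOS = 623            # trios sequenced in the study
CARRIER_FRACTION = 0.01  # assumption: gene mutated de novo in 1% of patients

# Expected number of probands with a hit in this hypothetical gene
expected_hits = N_TRIOS * CARRIER_FRACTION

# Chance that such a gene would still show two hits or fewer
p_at_most_two = binom_cdf(2, N_TRIOS, CARRIER_FRACTION)

print(f"expected hits: {expected_hits:.2f}")
print(f"P(at most 2 hits): {p_at_most_two:.3f}")
```

Under these assumptions, the expected count is 6.23, and a gene truly mutated in 1% of patients would show up with two hits or fewer only about 5% of the time. In other words, the absence of any gene with three or more hits argues against single genes explaining even 1% of cases.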
Context. Let's put the gene findings into the context of what is known in epilepsy. None of the 18 recurrent genes has ever been seen in epilepsy. One gene (KIAA1244) was previously found to carry a de novo mutation (DNM) in a control trio. Of all genes found by Fromer et al., 17 were also found to carry a DNM in the published Epi4K data set, three of which were also found with DNMs in controls. Only two genes, ALMS1 and SCN2A, were previously found in other neurodevelopmental phenotypes. ALMS1 is a gene with a relatively poor genic intolerance score (RVIS). Recessive mutations in ALMS1 cause Alström syndrome, a disease characterized by blindness, endocrinological and cardiac problems, but not epilepsy. Within the context of the Fromer study, DNMs in this gene are probably genomic noise. SCN2A is probably the gene found to be affected by DNMs in the largest number of neurodevelopmental phenotypes, which is an interesting and unexpected twist on the established gene for Benign Familial Neonatal-Infantile Seizures (BFNIS). Some of the other genes are also interesting candidates, including GRIN2A, which carries a de novo mutation.
Conclusion. Even though most genes do not make sense by themselves, there is an overrepresentation of genes involved in synaptic function and, specifically, activity-regulated cytoskeletal neuronal proteins and NMDA receptors. This observation demonstrates that trio exome data sets can be paired with network analyses to identify possibly causal networks. The current study also shows that cohorts for trio exome sequencing are slowly approaching four-digit numbers. It will be interesting to see what else this data yields once the sample sizes reach the next level.