2D. I am writing this post during our EuroEPINOMICS meeting in Tübingen, listening to presentations from CoGIE, the EuroEPINOMICS project working on IGE/GGE and Rolandic Epilepsies, and RES, the project on rare epilepsies. At some point during the afternoon, I made my selection for the best graph of today's presentations: an overview of the conservation space of epilepsy genes.
Pathogenicity and mutability. The one thing that we have learned about exomes in the last two years is that they produce a large amount of data. There are a lot of variants that might confuse you once you move past the clear candidates. For example, some genes simply fool you by looking interesting. While we know about the amazing variability of the Mucin genes, other highly variable genes might be less obvious. Last year, we analyzed a RES quartet for homozygous variants and ended up with an indel in PDE4DIP. Now we know that this gene is highly variable. There are two dimensions along which we can classify the genes in the human genome: their proximity to known epilepsy genes and their variability in the human genome.
Pathogenicity. Campbell and collaborators previously published a thorough investigation of available datasets to develop a pathogenicity score for epilepsy genes. This “Campbell score” tells you how closely a gene of interest is related to genes known to cause epilepsy. In effect, every gene in the human genome gets a score or rank that we can integrate into our assessment.
Mutability. The “Petrovski score”, or RVIS, classifies genes by looking at their variation in the Exome Variant Server. Most genes causally involved in epilepsy are among the 25% most mutation-intolerant genes. This method was used in the Epi4K paper on Infantile Spasms.
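To make the two dimensions concrete, here is a minimal Python sketch of how a gene could be placed into one of four quadrants. Everything in it is an assumption for illustration: the gene names are placeholders, the numbers are made up, the `proximity_percentile` field is a hypothetical stand-in for a rank derived from the Campbell score, and both 25% cutoffs are simply borrowed from the "top 25%" figure mentioned above. It also assumes that a low RVIS percentile means a mutation-intolerant gene.

```python
from dataclasses import dataclass


@dataclass
class GeneScores:
    name: str
    # Hypothetical rank: 0 = most closely related to known epilepsy genes.
    proximity_percentile: float
    # Assumed convention: low percentile = mutation-intolerant gene.
    rvis_percentile: float


def quadrant(gene, proximity_cutoff=25.0, rvis_cutoff=25.0):
    """Place a gene in one of four quadrants of the two-dimensional space."""
    close = gene.proximity_percentile <= proximity_cutoff
    intolerant = gene.rvis_percentile <= rvis_cutoff
    if close and intolerant:
        return "close to known genes, mutation-intolerant"
    if close:
        return "close to known genes, variable"
    if intolerant:
        return "distant from known genes, mutation-intolerant"
    return "distant from known genes, variable"


# Placeholder genes with invented values, for illustration only.
genes = [
    GeneScores("GENE_A", proximity_percentile=5.0, rvis_percentile=10.0),
    GeneScores("GENE_B", proximity_percentile=8.0, rvis_percentile=73.0),
    GeneScores("GENE_C", proximity_percentile=60.0, rvis_percentile=15.0),
    GeneScores("GENE_D", proximity_percentile=80.0, rvis_percentile=90.0),
]

for gene in genes:
    print(f"{gene.name}: {quadrant(gene)}")
```

A gene like the placeholder `GENE_B` is the tricky case: it looks interesting because it sits close to known epilepsy genes, but its variability means that rare variants in it deserve extra caution.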
A final remark on SCN9A. In two earlier posts, I wondered about the role of SCN9A as a possible modifier gene. While SCN9A is a promising candidate gene with respect to function, it is highly variable. It is at the 73rd percentile in RVIS. For highly variable genes, rare variants can appear to cluster in cases compared to controls purely by chance.
Us, this week. We are having an intense and productive meeting in Tübingen. There are many challenges ahead in interpreting the large amount of data that the sequencers produce every day. When I called our Luxembourg meeting the “meeting of the 1000 exomes” last year, I felt that I was slightly exaggerating. Nowadays, we don’t even count anymore. Interpreting data will be the major task for the future, and we could see at this meeting that we are getting better and better at developing tools to do so.