How a pathogenic de novo mutation in SCN1A ended up in the Exome Variant Server

The omics flood. Large amounts of sequence data are produced every day, and we can now use the genetic information of several thousand individuals as controls for any present-day genetic study. However, much of the research on “traditional” epilepsy genes was performed prior to the genomic era and often included only limited control cohorts. This raises the question of whether a closer look at the currently available data might provide additional information. A recent paper in the Journal of Neurogenetics investigates the presence of reported epilepsy mutations in large, publicly available datasets. And the results are surprising.

EVS and 1000 Genomes. We can currently query exome data of ~13,000 individuals through the 1000 Genomes Project and the Exome Variant Server (EVS). The EVS includes samples sequenced in studies of heart, lung, and blood disorders; individuals are usually not screened for epilepsy or other neurological disorders. However, it can be assumed that individuals with severe epileptic encephalopathies would usually not be included in these studies. Cherepanova and colleagues have now queried both data sources for variants previously reported to be associated with epilepsy and compiled a map of genetic variation in known epilepsy-associated genes.

Candidates. Cherepanova and colleagues included a list of 19 genes and 280 reported variants in their study. They excluded indels and splice site mutations and queried EVS and 1000 Genomes for the remaining 208 variants. Of those, 7 variants were present in the control datasets, including variants in SCN1A, SCN1B and EFHC1. In addition, they found a plethora of variants in these datasets predicted to be pathogenic by computational methods. Two findings of this study are particularly perplexing: (1) the existence of a pathogenic SCN1A de novo mutation in controls and (2) the amount of variation found in EFHC1.
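The filtering and lookup described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the authors' actual pipeline: the `Variant` record, the `kind` labels, the placeholder genes `GENE_X` and `GENE_Y` and the tiny control index are all assumptions made for the example. Only the variant names SCN1A R1596C and EFHC1 F229L are taken from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str  # protein- or cDNA-level description, e.g. "R1596C"
    kind: str    # assumed labels: "missense", "nonsense", "indel", "splice"


# Variant classes that were left out of the database query.
EXCLUDED_KINDS = {"indel", "splice"}


def queryable(variants):
    """Keep only variants that are not indels or splice site mutations."""
    return [v for v in variants if v.kind not in EXCLUDED_KINDS]


def found_in_controls(variants, control_index):
    """Return the queryable variants that also appear in the control index."""
    return [v for v in queryable(variants) if (v.gene, v.change) in control_index]


# Toy input: two variants named in the post plus two hypothetical placeholders.
reported = [
    Variant("SCN1A", "R1596C", "missense"),
    Variant("EFHC1", "F229L", "missense"),
    Variant("GENE_X", "c.100delA", "indel"),  # hypothetical, excluded as an indel
    Variant("GENE_Y", "A10V", "missense"),    # hypothetical, absent from controls
]

# Toy stand-in for a lookup table built from EVS and 1000 Genomes.
control_index = {
    ("SCN1A", "R1596C"),
    ("EFHC1", "F229L"),
    ("GENE_X", "c.100delA"),
}

hits = found_in_controls(reported, control_index)
for v in hits:
    print(f"{v.gene} {v.change} is present in the control data")
```

Note that the hypothetical indel is present in the toy control index but is never reported as a hit, because it is filtered out before the lookup. The same applies to the study itself: the excluded indels and splice site mutations were simply not assessed.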

Study by Cherepanova et al. on the presence of epilepsy-related genetic variants in large exome datasets. The authors queried the Exome Variant Server and the 1000 Genomes Project for reported pathogenic variants in 19 genes known to cause epilepsy. They identified 7 variants present in these databases, including one SCN1A mutation previously reported as a de novo mutation in a patient with Severe Infantile Multifocal Epilepsy. In addition, three EFHC1 mutations appear at low frequency in these databases, where they are reported in 40-100 individuals.

SCN1A R1596C. This variant has previously been reported as a de novo mutation in a patient with SIMFE (Severe Infantile Multifocal Epilepsy). The de novo status of this variant and the fact that it affects a highly conserved region led to the conclusion that it was pathogenic. Now the same variant has been found in a single European American individual in the Exome Variant Server. There are several possible explanations for this observation. First, the data in the Exome Variant Server might be unreliable and the call a false positive, i.e. a mutation was called even though it is not there. While this argument was valid for high-throughput sequence data in the past, exome data and calling algorithms have improved considerably. Therefore, a purely technical artifact is possible, but not very likely. Second, an individual with severe epilepsy was included, either on purpose or by accident. Because the data is anonymous, this is impossible to trace back. Third, the mutation is not pathogenic and only represents a low-frequency mutation hotspot. The original variant reported by Harkin and collaborators was de novo, which eliminates the possibility that we are dealing with a low-frequency inherited population variant in that patient. The segregation of the EVS variant cannot be traced back. In addition, the reported variant was found in an atypical phenotype (SIMFE) rather than Dravet Syndrome. Whereas an SCN1A mutation can be found in >80% of patients with Dravet Syndrome, and novel technologies identify mutations even in some patients who previously tested negative, the probability of a different cause of the disease might be higher in atypical phenotypes.

EFHC1 – the chameleon. The EFHC1 gene is the big loser of the study by Cherepanova and colleagues. EFHC1 mutations were initially reported in families with Juvenile Myoclonic Epilepsy and subsequently in different epilepsies, including recessive mutations in epileptic encephalopathies. Three reported EFHC1 mutations were found in EVS at very low frequency. However, these variants were found in 40-100 individuals, making a purely technical artifact unlikely. These three variants (F229L, P77T, R221H) were reported in both affected and unaffected individuals in the initial study and might represent susceptibility variants. In addition, EFHC1 is the only gene with multiple truncation mutations identified in EVS, which were found in four individuals in total. In summary, some reported EFHC1 epilepsy-associated variants turned out to be low-frequency population variants, and the spectrum of variants in the population also includes truncation mutations. These findings make the interpretation of identified EFHC1 mutations challenging, and the difficulties in understanding the biology of this gene add to this.
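To put "very low frequency" into numbers, the carrier counts can be turned into a rough carrier fraction. The sketch below assumes that the 40-100 carriers are counted against the combined ~13,000 individuals mentioned earlier in this post. The exact denominator for each variant is not given here, so the results are order-of-magnitude estimates only, and the function name is purely illustrative.

```python
def carrier_fraction(carriers: int, cohort_size: int) -> float:
    """Fraction of individuals in a cohort who carry a given variant."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return carriers / cohort_size


# Assumption: approximate combined size of EVS and 1000 Genomes from the post.
COHORT_SIZE = 13_000

# 40 and 100 bracket the EFHC1 variants; 1 is the single SCN1A R1596C carrier.
for carriers in (1, 40, 100):
    frac = carrier_fraction(carriers, COHORT_SIZE)
    print(f"{carriers:>3} carriers: {frac:.3%} of individuals, "
          f"about 1 in {round(1 / frac)}")
```

Under this assumption, the three EFHC1 variants are each carried by roughly one in a few hundred individuals, whereas the single SCN1A R1596C carrier corresponds to about one in 13,000.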

Conclusions. There are two main conclusions from the study by Cherepanova and colleagues. First, in known epilepsy genes such as SCN1A, some of the reported variants may need to be revisited. Assessing the de novo status of a causative variant is virtually mandatory. SCN1A is the epilepsy gene with the highest number of reported mutations. Therefore, it will naturally be the first gene where the additional complexity added by the available omics data will manifest. The situation is even more complex for the p.Arg1912X variant, which is found in 1000 Genomes and was previously published in four patients with Dravet Syndrome. Of these four patients, the variant was de novo in two and paternally inherited in one, while segregation was unknown in the fourth. Accordingly, while this variant may have variable penetrance, it might also represent a low-frequency mutation hotspot. The second main conclusion relates to the spectrum of variants in epilepsy genes whose role is more uncertain. EFHC1 is a prime example of a gene in which distinguishing pathogenic from normal variation is difficult and for which no meaningful biological assay of epilepsy-related functional defects is available. Accordingly, novel concepts are necessary to tell pathogenic variants from genomic noise in these genes.
