Papers of the week – Encephalitis-antibodies, FAN1, Art and Parent-of-Origin Effects

Dennis' paper of the week

Biggest surprise this week: imprinted genes frequently interact with non-imprinted genes. But first, sequencing reports, statistical frameworks for rare variant analyses, and an impressive translational result.

A novel encephalitis with seizures and an analysis of the effects of the antibodies involved. In their study published in Lancet Neurology, Petit-Pedrol and coworkers characterized serum and CSF samples from 140 patients with encephalitis, seizures, or status epilepticus and antibodies to unknown neuropil antigens. High titres of serum and CSF GABAA receptor antibodies were associated with a severe form of encephalitis with seizures, refractory status epilepticus, or both, an association that could be exploited for immunotherapy, as shown in 15 patients.

Continue reading

An inconvenient truth – segregation of monogenic variants in small families

Climate change. In the era of exome and genome sequencing, it might be worthwhile revisiting the merit of family studies in epilepsy research. Seizure disorders are known to have a highly diverse genetic architecture. When singleton studies identify a single, unique gene finding, this discovery usually does not provide much information about the potential causal role of the variant, given the high degree of genomic noise. In contrast, family studies are usually considered more robust, as segregation of variants can be traced. Here is the inconvenient truth: unless the family is very large, segregation of possibly monogenic variants adds little information, given the vast number of variants present in our genomes. Continue reading
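To put a rough number on this inconvenient truth, here is a back-of-the-envelope sketch in R; the per-genome variant count and the number of informative meioses are illustrative assumptions, not figures from the post.

```r
# Back-of-the-envelope sketch: how many rare variants co-segregate by chance
# in a small family? All numbers below are illustrative assumptions.
rare_variants       <- 300   # assumed rare, protein-altering heterozygous variants per genome
informative_meioses <- 3     # e.g. a small family contributing only a few informative meioses

p_chance_segregation <- 0.5 ^ informative_meioses   # each informative meiosis halves the candidate set
expected_candidates  <- rare_variants * p_chance_segregation
expected_candidates                                  # ~37 variants still segregate perfectly by chance
```

With only a handful of informative meioses, dozens of variants are expected to track with the phenotype purely by chance, which is why segregation in small families adds so little evidence.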

Dealing with the genetic incidentaloma – the ACMG recommendations on incidental findings in clinical exome and genome sequencing

Clinical genome sequencing. While exome and genome sequencing are widely used as research tools, these technologies are also routinely applied in a clinical setting. As with many other data-rich diagnostic tests in medicine, there is an ongoing debate about how to deal with potentially relevant findings that turn up incidentally. Now the American College of Medical Genetics and Genomics (ACMG) has released its long-awaited recommendations on the return of incidental findings in clinical exome and genome sequencing. These recommendations provide an interesting basis for discussion on what to do with genetic findings that are found by chance. Continue reading

The sequester and biomedical research – lessons for Europe

Transatlantic. The so-called sequester, automatic spending cuts across the board, has gone into effect in the US and also affects the level of public funding for biomedical research. In a recent commentary in JAMA, Ezekiel Emanuel comments on the decline of support for the NIH, which he believes goes far beyond the consequences of the spending cuts and can be traced back to four main factors. In this post, we would like to discuss to what extent his four main arguments also apply to the European scientific community. Continue reading

The return of TBC1D24

First of its kind. In 2010, a virtually unknown gene became the first epilepsy gene to be discovered through massively parallel sequencing. This gene, TBC1D24, was found in two families with recessive inheritance and different types of epilepsy. Afterwards, things went quiet around this gene, with no further findings. Now, a recent paper reports a third family with a mutation in this gene and a complex phenotype of epileptic encephalopathy and movement disorders. As the mutation is located in an alternative exon of this gene, this raises important issues about how we identify and interpret mutations. Continue reading

Big data now, scientific revolutions later

Sequence databases are not the only repositories that see exponential growth. The internet helps companies collect information on an unprecedented scale, which has spurred the development of new software solutions. “Big data” is the term that stuck, and it has breathed new life into data analysis. Widespread coverage ensued, including a series of blog posts published by the New York Times. Data produced by sequencing are big: current hard drives are too slow for raw data acquisition in modern sequencers, and we have to ship the disks because we lack the bandwidth to transmit the data over the internet. But we process these data only once, and a couple of years from now they can be reproduced with ease.

Large-scale data collection is once again hailed as the next big thing and spiced with calls for a revolution in science. In 2008, Wired even announced the end of theory. Last time I checked, though, experimental scientists still make good use of hypotheses and targeted experiments under the scientific method. A TEDMED12 presentation by Atul Butte, a bioinformatician at Stanford, is symptomatic in its revolutionary language and caused concern with Florian Markowetz, a bioinformatician at the Cancer Center in Cambridge, UK (and a Facebook friend of mine). Florian complains, and explains, that the quantitative increase in data does not lead to a new quality of science, and he calls for better theories and model development. He is right, although the issue of data acquisition and source material would have deserved more attention (what can you expect from a mathematician).

Big data

The part of the data we care about in biology is quite moderate, but note that the computing resources of the BGI are in the same league as those of the Large Hadron Collider.

We don’t know what to expect from, say, exome sequencing for a particular disease, and the only way to find out is to do the experiment, look at the data, come up with guesstimates, and confirm the findings in the next round. Current data gathering and analysis projects in the life sciences won’t be classified as big data by the next wave of scientists anyway. They are mere community technology exploration projects using ad hoc solutions.

Be literate when the exome goes clinical

Exomes on Twitter. Two different trains of thought eventually prompted me to write this post. First, a report of a father identifying the mutation responsible for his son’s disease pretty much dominated the exome-related twittersphere. In Hunting down my son’s killer, Matt Might describes his family’s journey that finally led to the identification of the gene coding for N-Glycanase 1 as the cause of his son’s disease, West Syndrome with associated features such as liver problems. The exome sequencing that finally led to the discovery was part of a larger program on identifying the genetic basis of unknown, putatively genetic disorders, reported in a paper by Anna Need and colleagues that is available through open access. This paper is an interesting proof-of-principle study showing that exome sequencing is ready for prime time. Need and colleagues suggest that exome sequencing can find causal mutations in up to 50% of patients. By the way, a gene that also turned up again was SCN2A, in a patient with severe intellectual disability, developmental delay, infantile spasms, hypotonia, and minor dysmorphisms. This represents a novel SCN2A-related phenotype, expanding the spectrum to severe epileptic encephalopathies.

The exome consult. My second experience last week was my first “exome consult”. A colleague asked me to look at a gene list from a patient to see whether any of the genes identified (there were 300+ genes) might be related to the patient’s epilepsy phenotype. Since I wasn’t sure how best to handle this, I tried to run an automated PubMed search for combinations of 20 search terms with a small R script I wrote. Nothing really convincing came up, except the realisation that this is an issue we will increasingly be faced with in the future: working our way through an exome dataset after the first “flush” of data analysis did not reveal convincing results. Two terms that came to my mind were bioinformatic literacy, something we need to improve, and Program or be Programmed, a book by Douglas Rushkoff on the “Ten Commands for a Digital Age”. In his book, he basically points out that in the future, understanding rather than simply using IT will be crucial.
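As an illustration of what such an automated search could look like, here is a minimal sketch assuming the rentrez package; the gene symbols and phenotype terms are placeholders, not the actual 300+ gene list from the consult.

```r
# Minimal sketch of an automated PubMed screen for a clinical exome gene list,
# assuming the 'rentrez' package; genes and terms below are placeholders.
library(rentrez)

genes <- c("SCN2A", "TBC1D24", "NGLY1")
phenotype_terms <- c("epilepsy", "seizure", "epileptic encephalopathy")

hits <- data.frame()
for (gene in genes) {
  for (term in phenotype_terms) {
    query <- sprintf('%s[Title/Abstract] AND "%s"[Title/Abstract]', gene, term)
    res <- entrez_search(db = "pubmed", term = query, retmax = 0)
    hits <- rbind(hits,
                  data.frame(gene = gene, term = term,
                             n_papers = as.integer(res$count)))
    Sys.sleep(0.4)  # stay well within NCBI rate limits
  }
}

# Rank gene/phenotype pairs by the number of co-citing papers as a crude first pass
hits[order(-hits$n_papers), ]
```

A co-citation count is of course only a crude filter: it favours well-published genes and misses anything reported under an alias or too recently to be indexed, which is exactly why bioinformatic literacy matters when interpreting such output.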

The cost of interpretation is rising. The Genome Center in Nijmegen suggests on its homepage that by the year 2020, whole-genome sequencing will be a standard tool in medical research. What the webpage does not say is that by 2020, 95% of the effort will not go into the technical aspects of data generation, but into data interpretation. For the biotechnology sector, interpretation will be the largest market.

By 2020, probably more than 10 million genomes will have been sequenced. Data interpretation rather than data generation will represent the most pressing issue.

So, what about epilepsy? “50% of cases to be identified” sounds good for any grant proposal that I would write, but it is probably a clear overestimate. Need and colleagues used a highly selected patient population, and even for the variants they identified, causality is sometimes difficult to assess. We may be much further away from clinical exome sequencing in the epilepsies than we would like to admit. The only reference points we have for seizure disorders to date are the large datasets for patients with autism and intellectual disability. While some genes with overlapping phenotypes can be identified there, we would be virtually drowning in exome data without being able to make sense of it.

10,000 exomes now. I would like to predict that after the low-hanging fruit of monogenic disorders has been picked, 10,000 or more “epilepsy exomes” will have to be collected before we make significant progress. It is therefore crucial not to be tempted by wishful thinking that particular epilepsy subtypes, such as the epileptic encephalopathies or other severe epilepsies, necessarily have to be monogenic. Much of the genetic architecture of the epilepsies might be more complex than anticipated, requiring larger cohorts and unanticipated perseverance.