Sequence first. Few genetic studies are larger than this one. In a recent study in Nature Genetics, roughly 150,000 individuals were genotyped to assess the importance of rare, disruptive variants in SLC30A8 in type 2 diabetes. This genomic tour de force was made possible by available, curated databases that could be tapped to extract the necessary genetic information. The study also highlights some of the surprises that we can expect when mining the human genome for disease-related information: rare, disruptive variants in SLC30A8 protect against type 2 diabetes. Let’s review why these rare, protective genetic factors might be particularly important for biomedical research and what kind of studies we need to identify them.
What to find under GWAS peaks. The current study by Flannick and collaborators started as a gene hunting exercise to identify causative rare variants in 115 genes that were under or close to GWAS peaks for type 2 diabetes. In brief, GWAS peaks are regions of the human genome with common variants significantly associated with a given disease. It is often impossible to identify a single variant and attribute causality, as the identified variant may simply be a marker of associated variants. In order to obtain more information, large sequencing efforts are underway to find rare and possibly disruptive variants that might have a stronger effect. When Flannick and collaborators assessed rare variants in the SLC30A8 gene, coding for a pancreatic zinc transporter, they observed something strange: a single nonsense variant in the SLC30A8 gene was more frequent in controls than in cases. This particular variant, which truncates the SLC30A8 protein, was found in 7 of ~6400 cases with type 2 diabetes and 21 of ~7500 controls. This difference is barely statistically significant, but it piqued the authors’ interest. Adding further cases and controls for a total of 30,000 cases and 120,000 controls confirmed the initial suspicion. The authors found that loss-of-function mutations in SLC30A8 are more than twice as frequent in controls as in cases. To put it differently: carrying a loss-of-function mutation in SLC30A8 cuts the risk for type 2 diabetes by more than half (~65% reduction).
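To get a feeling for why the initial observation was only borderline, here is a minimal sketch of the arithmetic for the first nonsense variant. It assumes the rounded counts quoted above (7 carriers among 6400 cases, 21 carriers among 7500 controls) and uses a two-sided Fisher's exact test, which is my choice for illustration and not necessarily the test used in the paper. The function names are made up for this example.

```python
from math import comb

# Rounded counts taken from the text above; the exact cohort sizes are an assumption.
CASE_CARRIERS, CASE_TOTAL = 7, 6400
CONTROL_CARRIERS, CONTROL_TOTAL = 21, 7500


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds of carrying the variant in cases divided by the odds in controls."""
    case_odds = case_carriers / (case_total - case_carriers)
    control_odds = control_carriers / (control_total - control_carriers)
    return case_odds / control_odds


def fisher_two_sided(case_carriers, case_total, control_carriers, control_total):
    """Two-sided Fisher's exact test based on the hypergeometric distribution."""
    carriers = case_carriers + control_carriers
    everyone = case_total + control_total
    denominator = comb(everyone, carriers)

    def prob(k):
        # Probability that exactly k of all carriers fall among the cases
        # if carrier status were unrelated to disease status.
        return comb(case_total, k) * comb(control_total, carriers - k) / denominator

    observed = prob(case_carriers)
    lowest = max(0, carriers - control_total)
    highest = min(carriers, case_total)
    # Sum all outcomes that are at most as likely as the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


OR = odds_ratio(CASE_CARRIERS, CASE_TOTAL, CONTROL_CARRIERS, CONTROL_TOTAL)
p_value = fisher_two_sided(CASE_CARRIERS, CASE_TOTAL, CONTROL_CARRIERS, CONTROL_TOTAL)

print(f"odds ratio: {OR:.2f}")
print(f"two-sided Fisher p-value: {p_value:.3f}")
```

With these rounded numbers the odds ratio comes out at roughly 0.39, and the p-value sits only slightly below the conventional 0.05 threshold. That is suggestive but far from the genome-wide certainty that the extended cohort of 150,000 individuals was needed to provide.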
It took us five minutes. The New York Times recently featured this study and looked behind the scenes. While the genotyping efforts within this study appear to be enormous, some of its results could be generated virtually on the spot. The NYT quotes a deCODE scientist from Iceland who suggested that the actual database mining of existing genetic and phenotypic data was only a minor effort. In addition to contributing important samples for this study, the deCODE data also provided important information on the lower than average blood glucose levels in individuals carrying particular loss-of-function variants. Generating such data from scratch for a large-scale association study would be prohibitive, given the enormous effort required. Accordingly, existing databases that allow such information to be extracted are essential.
Lesson learnt. Rare protective variants may be particularly relevant as they point towards genes that are good targets for future drug development. Basically, the function of a given protein is easier to inhibit than to substitute, which makes SLC30A8 a good target. Also, the numbers in this study are impressive. With an overall cohort of roughly 150,000 individuals, the authors generated a significant finding that was impossible to achieve in smaller cohorts. In fact, in some subcohorts of 500-1000 individuals, a single rare SLC30A8 variant was only observed in patients. While such a finding is completely in line with random fluctuation, it is easy to understand how one might be misled by it. It has long been suggested that rare variant association studies need very large sample sizes to arrive at solid conclusions. The current study nicely demonstrates this point. Sample sizes in neurodevelopmental disorders are much smaller, and a comparable study in the field of neurogenetics is currently difficult to envision as most epilepsies are rare diseases. Nevertheless, the current study by Flannick and collaborators reminds us that joint genetic and phenotypic databases will be one of the key instruments for future genetic studies.