Genomics meets linkage. This blog post is about family studies in epilepsy genetics. One of my tasks for the next two months is to write the “Trilateral Grant” – we were invited to submit a full proposal for a German-Israeli-Palestinian grant by the German Research Foundation (DFG) on the genetics of familial epilepsies. As keeping up our blogging schedule will be my other big task for the coming months, I thought that I could combine both and explore some topics regarding family studies on this blog. Let’s start with a sobering fact – small dominant families remain difficult to solve, not because of too little but rather too much genetic data.
When gene identification works. Linkage analysis identifies genetic markers that travel together with affected individuals in a family. If a given family is sufficiently large (more than 9-10 affected individuals), a genetic interval can usually be identified that significantly stands out. If combined with exome and genome sequencing, there is a good chance that the causative gene can be identified. So far, so good. Linkage plus exome analysis works beautifully in large families – but what can we learn from smaller dominant families? To what extent can we zero in on the culprit gene if there are only 2-3 affected individuals? The disappointing answer: these families won’t help us much.
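To get a rough feeling for why family size matters so much, here is a minimal sketch. It rests on assumptions that are not part of the argument above: it uses the conventional LOD score threshold of 3 for significant linkage, and it treats every meiosis as fully informative and phase-known with no recombination between marker and disease gene. Under these idealized conditions, each meiosis adds log10(2), or about 0.3, to the LOD score. The function name `max_lod` is made up for illustration.

```python
import math

def max_lod(informative_meioses: int) -> float:
    """Best-case LOD score at zero recombination.

    Assumes every meiosis is fully informative and phase-known,
    which real pedigrees rarely achieve.
    """
    return informative_meioses * math.log10(2)

# Compare small families with a large one (idealized upper bounds).
for n in (2, 3, 9, 10):
    print(f"{n} informative meioses -> max LOD {max_lod(n):.2f}")
```

In this best case, ten informative meioses just clear a LOD score of 3, while two or three stay far below it. Real families do worse, as not every affected individual contributes a fully informative meiosis.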
River of variants. The problem with identifying the causative gene in small families is best explained by looking at the transmission of rare variants. Depending on the variant calling and filtering algorithm, 200-300 rare variants are transmitted from either parent to a child, independent of disease. In an affected parent-child pair, the causative variant could be any of these. As each rare variant is transmitted with a probability of 50%, the affected father or mother adds little genetic information. If there is another affected sibling, the number of candidate variants will be cut approximately in half, which still leaves us with 100-150 variants to choose from. If the gene is a known candidate, the story is straightforward – for novel genes, the genetic evidence is not sufficient.
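The halving argument can be written down as a small back-of-the-envelope calculation. This is only a sketch under simplifying assumptions: every rare variant is passed on independently with a probability of 50%, and each additional affected first-degree relative therefore removes about half of the remaining candidates. In reality, variants travel together in blocks, so the numbers are rough expectations. The function name `expected_candidates` is hypothetical.

```python
def expected_candidates(transmitted_variants: float,
                        extra_affected_relatives: int) -> float:
    """Expected number of rare variants shared by all affected relatives.

    Assumes each variant is transmitted independently with probability 0.5,
    so every additional affected relative halves the candidate list.
    """
    return transmitted_variants * 0.5 ** extra_affected_relatives

# 200-300 rare variants in a parent-child pair, then add affected siblings.
for n_rare in (200, 300):
    for extra in range(0, 4):
        remaining = expected_candidates(n_rare, extra)
        print(f"{n_rare} variants, {extra} extra affected: ~{remaining:.0f} left")
```

Even with three additional affected relatives, this simple model still leaves a few dozen candidates, which shows why small families rarely point to a single novel gene on their own.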
“RRB”. I tried to demonstrate this dilemma in a slightly different way – let’s make up a measure called “relative research benefit”, a hypothetical measure of genetic knowledge gained, corrected for recruitment effort. For example, with a predefined recruitment budget, how much more information would we obtain if we hypothetically exchanged 100 recruited single cases for 20 parent-child duos that could be recruited with the same effort? The parent-child duos are harder to identify. There are phone calls to be made, and you probably need to hop into your car and do field trips. Consequently, with a defined research budget, we will recruit fewer duos than we would unrelated patients. We could also spend all our efforts on recruiting a single family with 8-9 affected individuals instead. Which option would be most beneficial for gene discovery?
Flaws. I admit that this model is full of flaws. For example, recruitment effort cannot be compared that easily. For some centers (e.g. our recruitment pipeline in Israel and Palestine), family recruitment might be straightforward, while other centers rarely see familial cases. Also, this model ignores the fact that small dominant families might add significantly to the body of knowledge if known or likely candidates are identified. Nevertheless, I wanted to put the following suggestion out there: for small dominant families, the relative research benefit is lower than for singletons. The additional effort put into recruiting parent-child or sib pairs or small dominant families is outweighed by the larger number of samples obtained through singleton recruitment.
Association beats linkage. Ever since the seminal paper in 1996 by Risch and Merikangas, which set the stage for the era of genome-wide association studies, linkage studies (i.e. family studies) have been only the second-best choice. Association studies are more powerful when it comes to the genetics of complex disease. Even though studies in small families may still assume a monogenic disease model, this inheritance model is impossible to prove on the basis of these families – accordingly, we slip into discussions about complex inheritance, where association studies have proved to be superior. For example, in the field of microdeletions, family studies were confusing rather than helpful. Even variants that confer a strong genetic risk usually do not segregate with the disease.
Ascending from the valley of despair. What can you do if your main area is family studies and recruitment of small families is your focus? How can you escape the difficulties that are inherent to genetic studies in small dominant families? By changing the ground rules! First, small families carry additional genetic information compared to singletons, and if family recruitment is efficient, it may almost reach the efficiency of singleton recruitment, adding the increase in genetic information at virtually no cost. Second, small families may show you the value of non-established candidates (see “Flaws”). There is an entire cosmos of genetic variants that were only found in individual families before. Even a small family can contribute here if it turns out to be the crucial additional family. For this purpose, it is important that data are accessible to other researchers. Third, the field is evolving. The theoretical number of 100-200 variants identified in a small multiplex family is probably an overestimate – in fact, we might get better and better at guessing the causative gene as we learn more about the existing variation in the human genome and the particular phenotype.