Monday, 18 February 2013

Heritability debates

It is time to try to get to grips with what is happening in genetics/epidemiology. I tend to rely on Meena Kumari, the ICLS expert, but really should get a bit more self-sufficient before I start to write my book. For a long time I was happy that a lot of things which "look genetic" could alternatively be seen, as Debbie Lawlor has elegantly done, as evidence of the huge power of the life course. People are not just born into a home environment, they are to a large extent born into a life course. As Debbie and George Davey Smith showed a while ago, someone with a high blood level of Vitamin A might be healthy not because of the vitamin but because having a high level of the vitamin was a 'biological embedding' of a whole string of favorable circumstances stretching over years of their lives. Similarly, Hilary Graham showed that being a smoker was associated in women with a string of unfavorable experiences in a similar way. I always thought these were profound insights and these papers have greatly influenced my thinking.

So I was not too surprised by the revelations from Genome Wide Association Studies around 2009-10 that only a very small amount of the similarity between people in a lot of characteristics seemed to be explicable in terms of "a gene" (or gene variant, all corrections gladly received). I must admit that only 4% of variation in height attributable to genetics did seem rather small. And if someone had done it for red hair or green eyes, that would have surprised me even more. I don't think anyone did this, as hair and eye colour do not carry the same moral/ideological weight as height. But the shock was due to the wide difference between what GWAS studies showed and the estimates of heritability obtained from twin studies, which had been the classic method for assessing the influence of 'genes v. environment' before the genome was sequenced and GWAS became possible.

A GWAS (Genome Wide Association Study) has two phases. First you take a 'training sample'. This has to be very large, and one looks for an association of the characteristic of interest (let's say height) with a Single Nucleotide Polymorphism (SNP), given that it is now possible to include vast numbers of these on a 'SNIP-CHIP'. Because you have no hypothesis about which SNP will be associated with (e.g.) height, you need to get an enormously significant association, as I was once told "p with many many zeroes". The second phase is to take this SNP and see what proportion of the variation in (e.g.) height it explains in a new sample independent of the training sample, the "validation sample". When you do this for height you can explain around 4% of the variation in height. This was quite a shock.
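The two phases can be sketched on made-up data. This is only a toy illustration, not the procedure of any real study: the genotypes are simulated, the effect sizes are invented so that one SNP explains about 4% of the trait and twenty others about 0.2% each, and the helper names (`pearson_r`, `simulate`, `column`) are hypothetical. The cut-off `|z| > 5.45` is used as a rough stand-in for "p with many many zeroes".

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, effects, noise_sd, rng):
    """Simulate n people: genotypes coded 0/1/2 at each SNP, plus a trait value."""
    genotypes = [
        [int(rng.random() < 0.5) + int(rng.random() < 0.5) for _ in effects]
        for _ in range(n)
    ]
    trait = [
        sum(b * g for b, g in zip(effects, row)) + rng.gauss(0.0, noise_sd)
        for row in genotypes
    ]
    return genotypes, trait


def column(genotypes, j):
    """Genotypes of every person at SNP j."""
    return [row[j] for row in genotypes]


rng = random.Random(2013)  # fixed seed so the toy example is repeatable

# Invented effect sizes (an assumption, not real data): SNP 0 explains about
# 4% of trait variance, SNPs 1-20 about 0.2% each, SNPs 21-49 nothing at all.
effects = [0.283] + [0.0632] * 20 + [0.0] * 29
NOISE_SD = 0.959  # chosen so the total trait variance is about 1

train_g, train_y = simulate(4000, effects, NOISE_SD, rng)
valid_g, valid_y = simulate(2000, effects, NOISE_SD, rng)

# Phase 1: scan every SNP in the training sample.
# z = r * sqrt(n) is approximately standard normal when there is no
# association; |z| > 5.45 corresponds roughly to p < 5e-8.
Z_THRESHOLD = 5.45
z_scores = [
    pearson_r(column(train_g, j), train_y) * math.sqrt(len(train_y))
    for j in range(len(effects))
]
hits = [j for j, z in enumerate(z_scores) if abs(z) > Z_THRESHOLD]
best = max(range(len(effects)), key=lambda j: abs(z_scores[j]))

# Phase 2: share of variance the top SNP explains in the independent
# validation sample.
r2_valid = pearson_r(column(valid_g, best), valid_y) ** 2

print("SNPs passing the threshold:", hits)
print("top SNP:", best)
print("share of variance explained in validation sample:", round(r2_valid, 3))
```

In this simulation the twenty weak SNPs really do affect the trait, yet most or all of them fail the stringent threshold, so the validation step only credits the single strong SNP.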

But let's face it, what everyone really cares about is how heritable other personal characteristics with ideological meaning are, such as intelligence or mental health. As intelligence and mental health are generally acknowledged to be complex characteristics, much more so than height, what hope was there?

But wait! Along came Professor Visscher and colleagues. Their paper in Twin Research and Human Genetics (vol 13(6) pp. 517-24) aims to set out their method in a way that is more comprehensible than their original paper in Nature Genetics, which showed that, after all, about 40% of the variation in height is explained by genetics. Never mind, they said, if there is no individual SNP significantly associated with height that explains more than 4% of it. What you need to do is to add up all the SNPs that are associated with height, but so weakly that they do not meet the very stringent requirements for significance used in GWAS studies. This can then be extended to intelligence with results that are summarised as follows:

Data from twin and family studies are consistent with a high heritability of intelligence, but this inference has been controversial. We conducted a genome-wide analysis of 3511 unrelated adults with data on 549692 single nucleotide polymorphisms (SNPs) and detailed phenotypes on cognitive traits. We estimate that 40% of the variation in crystallized-type intelligence and 51% of the variation in fluid-type intelligence between individuals is accounted for by linkage disequilibrium between genotyped common SNP markers and unknown causal variants. These estimates provide lower bounds for the narrow-sense heritability of the traits. We partitioned genetic variation on individual chromosomes and found that, on average, longer chromosomes explain more variation. Finally, using just SNP data we predicted ~1% of the variance of crystallized and fluid cognitive phenotypes in an independent sample (P=0.009 and 0.028, respectively). Our results unequivocally confirm that a substantial proportion of individual differences in human intelligence is due to genetic variation, and are consistent with many genes of small effects underlying the additive genetic influences on intelligence. [my emphasis]

G Davies, A Tenesa, A Payton, J Yang, S E Harris, D Liewald, X Ke, S Le Hellard, A Christoforou, M Luciano, K McGhee, L Lopez, A J Gow, J Corley, P Redmond, H C Fox, P Haggarty, L J Whalley, G McNeill, M E Goddard, T Espeseth, A J Lundervold, I Reinvang, A Pickles, V M Steen, W Ollier, D J Porteous, M Horan, J M Starr, N Pendleton, P M Visscher and I J Deary. "Genome-wide association studies establish that human intelligence is highly heritable and polygenic". Molecular Psychiatry (2011) 16, 996–1005.

However, Makowsky et al., for example, threw some doubt on the enterprise in PLOS Genetics thus:

Recently, a large proportion of the “missing heritability” for human height was statistically explained by modeling thousands of single nucleotide polymorphisms concurrently. However, it is currently unclear how gains in explained genetic variance will translate to the prediction of yet-to-be observed phenotypes. Using data from the Framingham Heart Study, we explore the genomic prediction of human height in training and validation samples while varying the statistical approach used, the number of SNPs included in the model, the validation scheme, and the number of subjects used to train the model. In our training datasets, we are able to explain a large proportion of the variation in height (h² up to 0.83, R² up to 0.96). However, the proportion of variance accounted for in validation samples is much smaller (ranging from 0.15 to 0.36 depending on the degree of familial information used in the training dataset). While such R² values vastly exceed what has been previously reported using a reduced number of pre-selected markers (<0.10), given the heritability of the trait (~0.80), substantial room for improvement remains.

Robert Makowsky, Nicholas M. Pajewski, Yann C. Klimentidis, Ana I. Vazquez, Christine W. Duarte, David B. Allison, Gustavo de los Campos. Beyond Missing Heritability: Prediction of Complex Traits. PLOS Genetics 2011; 7(4): e1002051.

The furthest I can get today is to comment on this phrase "substantial room for improvement". It sounds as if people have already decided what percentage of the variation in the trait is "heritable" and are now trying to find a method that gives the answer they already expect and want to find.
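The gap Makowsky et al. report between training and validation samples can be reproduced in a toy simulation. This is a rough sketch under stated assumptions, not the method of either paper: the data are simulated, the "add up all the SNPs" step is replaced by a simple score that weights each SNP by its own regression slope in the training sample, and the helper names (`marginal_slopes`, `scores`) and all effect sizes are invented for illustration. By construction, genetics accounts for about 20% of the trait variance here.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, effects, noise_sd, rng):
    """Simulate n people: genotypes coded 0/1/2 at each SNP, plus a trait value."""
    genotypes = [
        [int(rng.random() < 0.5) + int(rng.random() < 0.5) for _ in effects]
        for _ in range(n)
    ]
    trait = [
        sum(b * g for b, g in zip(effects, row)) + rng.gauss(0.0, noise_sd)
        for row in genotypes
    ]
    return genotypes, trait


def marginal_slopes(genotypes, trait):
    """Regression slope of the trait on each SNP, fitted one SNP at a time."""
    n = len(trait)
    my = sum(trait) / n
    slopes = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mx = sum(col) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(col, trait))
        sxx = sum((x - mx) ** 2 for x in col)
        slopes.append(sxy / sxx)
    return slopes


def scores(genotypes, weights):
    """Weighted sum of genotypes for each person (a simple genetic score)."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


rng = random.Random(18)  # fixed seed so the toy example is repeatable

# Invented effects (an assumption): 20 SNPs each explain about 1% of the
# trait variance, the other 180 SNPs have no effect at all.
effects = [0.1414] * 20 + [0.0] * 180
NOISE_SD = 0.894  # chosen so the total trait variance is about 1

train_g, train_y = simulate(300, effects, NOISE_SD, rng)
valid_g, valid_y = simulate(300, effects, NOISE_SD, rng)

# Weights are estimated in the training sample only, with no significance
# filter: every SNP contributes, however weak its association.
weights = marginal_slopes(train_g, train_y)

# Variance "explained" where the weights were fitted, and in fresh people.
r2_train = pearson_r(scores(train_g, weights), train_y) ** 2
r2_valid = pearson_r(scores(valid_g, weights), valid_y) ** 2

print("R2 in training sample:  ", round(r2_train, 3))
print("R2 in validation sample:", round(r2_valid, 3))
```

The training figure is inflated because the same people were used to choose the weights and to judge the fit; only the validation figure says anything about prediction in people not yet observed.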



  2. It's interesting to see hard-core genetics research agreeing with the argument put so well a few years ago by Cosma Shalizi, contra the g-merchants, that factor-analytic studies of IQ are entirely consistent with it having many sources; see and

    Brendan Halpin

    1. I didn't know that, many thanks Brendan. Is it not that Visscher et al think IQ will turn out to have many GENETIC sources though?