How synonymous are synonymous SNPs?

Read with caution!

This post was written during early stages of trying to understand a complex scientific problem, and we didn't get everything right. The original author no longer endorses the content of this post. It is being left online for historical reasons, but read at your own risk.

Synonymous SNPs are SNPs in protein-coding exons that don’t change the amino acid, thanks to the redundancy of the genetic code. When researchers analyze sequence data, synonymous SNPs figure into various quality control metrics but are often discarded when people search for the causal mutation behind a disease or a trait. A typical example is Ng 2010’s search for the causal mutation of Kabuki syndrome:

our analyses focused primarily on nonsynonymous (NS) variants, splice acceptor and donor site mutations (SS) and coding indels (I), anticipating that synonymous variants were far less likely to be pathogenic.

This is everyday practice when working with sequence data. Discarding synonymous SNPs seems less common in SNP array studies: an array does not assay every possible genetic variant, so a synonymous SNP on the array might well be tagging something unseen but important via linkage disequilibrium.
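To make the definition of “synonymous” concrete, here is a minimal Python sketch that classifies a single-base change within one codon using the standard genetic code. The function name `classify_substitution` and the example codons are illustrative assumptions, not taken from any of the papers cited here, and real annotation tools also account for transcript structure, strand, and splice sites.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated with the
# bases in T, C, A, G order and the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, pos, alt):
    """Classify a single-base change in one codon (pos is 0, 1, or 2)."""
    mutated = codon[:pos] + alt + codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"


print(classify_substitution("CTG", 2, "A"))  # Leu -> Leu
print(classify_substitution("CTG", 0, "T"))  # Leu -> Leu, first-position change
print(classify_substitution("CTG", 1, "C"))  # Leu -> Pro
print(classify_substitution("TGG", 2, "A"))  # Trp -> stop
```

A filter that keeps only the last two categories is, in effect, what discarding synonymous variants amounts to.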

I started to wonder: are there any reasons why a synonymous SNP might actually be causal for a trait? Yes, there are. Here are a few possible reasons.

1. microRNAs. Brest 2011, studying Crohn’s disease, finds that a synonymous SNP (c.313C>T in IRGM) makes an enormous difference in whether miR-196 binds to the IRGM transcript. Specifically, miR-196 binds well to the C allele, downregulating IRGM, but not to the T allele. The T allele thus leads to higher risk of Crohn’s disease because overexpression of IRGM reduces the cell’s ability to autophagize invading bacteria.
2. Translation efficiency under amino acid-starved conditions. Different codons bind to different tRNAs, and different tRNAs have different affinity for amino acids. Under conditions of amino acid abundance, there’s no difference in translation levels between codons, but a few investigators have shown that when bacteria are starved of a certain amino acid, proteins with the more affine codon continue to be expressed while translation of proteins with the less affine codon is suppressed. As far as I can tell, Dittmar 2005 was the first to show this, and work on the issue continues, e.g. Subramaniam 2012. All of this work seems to have been done in bacteria; I don’t know whether it has been shown in animals, or whether such specific amino acid-starved conditions ever even arise in humans.
3. mRNA secondary structure. A synonymous change can alter how the mRNA folds, which in turn can affect two things. (3a) The efficiency of translation and thus the amount of protein produced. See Wang 2006 and Nackley 2006 for two examples. (3b) The speed at which translation proceeds along the mRNA. Since proteins begin to fold while they are still being translated, this can lead to totally different protein folding. Kimchi-Sarfaty 2007 seems to have discovered one such example in MDR1, where a synonymous SNP leads to identical mRNA and protein levels but different protein folding and function. For a review of both of these mechanisms see Parmley 2007.
4. Alternative splicing. In theory, a synonymous mutation might affect the way an RNA gets spliced. A 2011 news feature in Nature hints at this possibility but does not directly cite any studies that found such a case. When I Googled it I found Faa 2010, who describe a mild form of cystic fibrosis caused by different splice products due to a synonymous mutation. However, I hesitate to include this in the list, because mutations that create a splice site cannot truly be called synonymous to begin with. For instance, if you use snpEff to annotate sequence variants, its classification scheme will classify splice site mutations as ‘high impact’, as opposed to synonymous SNPs, which are classified as ‘low impact’. It seems imaginable that some synonymous SNP might have a subtle effect on splicing without actually creating or destroying a splice site, and if so, I would count that as a fourth mechanism, but so far I haven’t found any examples of it.
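The microRNA mechanism in the list above can be illustrated with a toy Python sketch: a C-to-U change that leaves the encoded peptide untouched but removes a perfect match to a microRNA seed. Every sequence here is invented for illustration (they are not the real IRGM or miR-196 sequences), the helper names `translate` and `seed_site` are hypothetical, and real target prediction considers much more than exact complementarity to nucleotides 2-8 of the microRNA.

```python
from itertools import product

# Standard genetic code in the RNA alphabet (U, C, A, G order,
# third codon position varying fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def translate(mrna):
    """Translate an in-frame mRNA fragment into one-letter amino acids."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))


def seed_site(mirna):
    """Return the mRNA 7-mer perfectly complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


# Invented sequences (assumptions for illustration only).
MIRNA = "UAGGCAGUCAUGAUCCGUAAGC"
REF_MRNA = "AUGGCACUGCCUACCGAA"  # third codon is CUG (Leu)
ALT_MRNA = "AUGGCAUUGCCUACCGAA"  # third codon is UUG (Leu): a synonymous change

site = seed_site(MIRNA)
print(translate(REF_MRNA), translate(ALT_MRNA))  # identical peptides
print(site in REF_MRNA)  # True: the reference allele carries the seed match
print(site in ALT_MRNA)  # False: the synonymous allele has lost it
```

In this toy the two alleles encode the same peptide, yet only one of them could be targeted by the hypothetical microRNA, which is the shape of the effect Brest 2011 reports.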

There may yet be other mechanisms as well. The Nature article points out that some synonymous sites are strongly conserved, and perhaps we don’t yet know all the reasons why this might be.

PS. Thanks to Larry Parnell, Variable Genome, and phys.org for helping point me toward the relevant papers.