How to test for selection (Adaptive Recursion III)

Before my unintended break from blogging, I had started writing about the work by Stolz, Feder, and Velez on bioluminescent color in the Jamaican click beetle, Pyrophorus plagiophthalamus, in two earlier posts. This beetle has two sets of bioluminescent organs – a dorsal pair and a single ventral organ. Not only can the two sets of organs differ in color within an individual, but – and this is what makes the species special – the colors are polymorphic within the species. By that I mean that within a population, one can find green and yellow-green dorsal organs as well as yellow-green, yellow, and orange ventral organs. Such variation in bioluminescent color within a population is apparently unheard of elsewhere, even within the rest of the Pyrophorus genus. The polymorphism provides a simple system for evolutionary and ecological study (as I point out in my first post about the species).

Fig 1: (A) Paired dorsal light organs of P. plagiophthalamus. (B) Allele colors in dorsal organs: green (dGR) and yellow–green (dYG). (C) Ventral light organ of a yellow bioluminescing beetle. (D) Allele colors in ventral organ: yellow–green (vYG), yellow (vYE), and orange (vOR). From Stolz et al. (2003).

Instead of outlining the entire series of studies like I had intended, I want to extract two larger themes out of the papers – how biologists test for selection on DNA sequences and how the different color alleles in the beetles arose (and I promise, this is a really cool system of allele origination!).

I ask the first question because the authors employ several tests to detect selection, and when I wrote about these studies for a mini-review in my evolution course, I stumbled in this area. I resolved to figure it out for my own education, and because I have yet to find a source that explains these tests in a readily understandable way, I decided to blog about it. So if I make any mistakes, please point them out! I am writing about this topic to teach myself something I didn’t learn in any of my classes!

I also want to note that readers probably won’t come out understanding evolution in Jamaican click beetles after reading this post. I look at the selection tests out of order and I don’t discuss in much detail the resulting selective scenario the authors propose. (The post about allele origination will be chock-full of click beetle biology, however!)

The three tests I examine are the QTL sign test, the McDonald-Kreitman test, and substitution rate ratios.

QTL Sign Test

(Apparently) developed by Allen Orr, the QTL sign test helps detect whether selection has acted on a quantitative trait, using the loci that underlie it. QTL means “quantitative trait locus” – basically a locus whose alleles affect the phenotype by a measurable amount rather than switching it on or off. A quantitative trait is also frequently polygenic, meaning it is affected by multiple loci. Weight, for example, is not controlled by a single gene – there is no “gene for weight”; instead, weight is the cumulative result of many genes that each have some effect on it.

Scientists first pick a quantitative trait that differs strongly between two lineages; the size of that phenotypic difference is called R. After QTL mapping, in which the contributing loci/nucleotides are found, each QTL is given a plus or minus sign according to the direction of its effect: a higher (plus) or lower (minus) weight, for example. Once the distribution of plus/minus loci is determined, a statistical test gives the probability of that distribution appearing by chance, or in this case, how likely the difference in phenotypes (R) is to have evolved neutrally (Figure 1; left, shows what a neutral distribution could look like). If that probability is less than 0.05, the null hypothesis (neutral evolution) is rejected at the conventional threshold, and selection probably played some role.

Stolz et al. use a QTL sign test to find whether or not diversifying selection is acting on the bioluminescence of the dorsal and ventral organs. Luciferase in the click beetles is a great example of a QTL: the detected mutations do not turn luciferase on or off, but instead shift the produced light’s wavelength by several nanometers up (plus) or down (minus).

A difference between a typical QTL analysis and the analysis performed on click beetles is that we are looking at point mutations within a single gene, rather than multiple loci. Stolz et al. thus call their analysis a QTN test – a quantitative trait nucleotide test – but the same principles of QTL apply: bioluminescent color is affected by multiple mutations, not just a single one, and they each have quantitative effects.

Stolz et al. looked at the divergence between the dGR and vYE alleles, assuming these two alleles to be the ancestral and least-derived states of the loci (for reasons not explained here). The difference between phenotypes (R), wavelength in this case, is 31 nanometers. Nine fixed non-synonymous substitutions contribute to this difference, and all nine nucleotides in vYE increase wavelength (and are assigned ‘plus’ status) (Figure 1; right). The probability of finding nine plus mutations and zero minus mutations was 0.039 – low enough to reject the null hypothesis of neutral evolution. This finding provides evidence that selection has acted on bioluminescent color.
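For intuition only, here is a minimal sketch of a plain coin-flip sign test. This is not the test Orr (1998) describes: it assumes each substitution is equally likely to be plus or minus and ignores both the effect sizes and the observed difference R. The function name `sign_test_p` is my own. Orr's version conditions on R, which is presumably part of why the reported probability (0.039) is so much larger than this naive figure.

```python
from math import comb

def sign_test_p(n_plus, n_total):
    """One-sided probability of at least n_plus '+' signs out of n_total
    when each sign is '+' or '-' with probability 1/2 (naive null model)."""
    favourable = sum(comb(n_total, k) for k in range(n_plus, n_total + 1))
    return favourable / 2 ** n_total

# Nine plus substitutions out of nine, as in the dGR vs. vYE comparison.
p_naive = sign_test_p(9, 9)
print(p_naive)  # 0.001953125, i.e. 1/512
```

The gap between 1/512 and 0.039 is a reminder that the null model matters: once you require the substitutions to add up to a 31 nm difference, an all-plus pattern is much less surprising.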

Figure 1: On the left is an example of a neutral distribution of plus and minus nucleotides - there is no detectable directional selection. On the right is a recreation of the data from Stolz et al. (2003) with nine plus mutations of varying strengths. The number line only indicates the order of the nucleotides in the gene; it has no implications of genetic distance.

McDonald-Kreitman Test

A well-known way to detect selection at the molecular level is the McDonald-Kreitman (M-K) test. The test compares the ratio of non-synonymous to synonymous differences that are fixed between species with the same ratio for differences that are polymorphic within a species. This may sound a bit complicated at first, but it makes sense – let me explain.

A synonymous (s) site is where a base substitution does not change the encoded amino acid (hence synonymous; same amino acid = same “word”), and a non-synonymous (n) site is where the encoded amino acid does change. A polymorphic (P) site is one which shows variation within the species, whereas a fixed (D) site shows no variation within the species but differs from a related species.

This is how the M-K test works to detect selection: under neutral evolution, selection is not acting, so both polymorphism and fixed differences should reflect only mutation and drift. The ratio of non-synonymous to synonymous differences (n/s) should therefore be equal between the fixed (Dn/Ds) and polymorphic (Pn/Ps) categories. Equivalently, the ratio of fixed to polymorphic (D/P) differences should be equal between the synonymous and non-synonymous categories. In other words, Dn/Ds divided by Pn/Ps should be about 1 (Table 1), and a significant departure from 1 indicates selection may be acting. If Dn/Ds > Pn/Ps, meaning an excess of fixed non-synonymous differences, then directional selection is presumed to be acting upon the sequence.

Table 1: An example of neutrality in a McDonald-Kreitman test; Dn/Ds and Pn/Ps are equal (both are 1 here).
                      Fixed (D)   Polymorphic (P)
Synonymous (s)            13             4
Non-synonymous (n)        13             4
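The comparison behind Table 1 can be sketched in a few lines. The function name `mk_ratios` and its argument names are my own, and the sketch only computes the ratios; it says nothing about statistical significance.

```python
def mk_ratios(ds, ps, dn, pn):
    """Return (Dn/Ds, Pn/Ps, and the first divided by the second)
    for the four counts of a McDonald-Kreitman table."""
    fixed_ratio = dn / ds   # non-synonymous : synonymous among fixed differences
    poly_ratio = pn / ps    # non-synonymous : synonymous among polymorphisms
    return fixed_ratio, poly_ratio, fixed_ratio / poly_ratio

# Counts from Table 1: Ds = 13, Ps = 4, Dn = 13, Pn = 4.
fixed_ratio, poly_ratio, quotient = mk_ratios(ds=13, ps=4, dn=13, pn=4)
print(fixed_ratio, poly_ratio, quotient)  # 1.0 1.0 1.0
```

A quotient near 1 is what neutrality predicts; the inverse of this quotient is sometimes called the neutrality index.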

As with the QTL sign test, the McDonald-Kreitman test used on the beetles is slightly different – instead of testing between-species differences, Stolz et al. tested the differences between the ventral and dorsal loci. (These loci have diverged for over a million years and can presumably be treated as “different species.”)

Let us first look at a region of luciferase that does not affect color, the non-color region (Table 2).

Table 2: McDonald-Kreitman test for the non-color region of luciferase.
                      Fixed (D)
Synonymous (s)            13
Non-synonymous (n)        16
The non-color region of luciferase shows a similar table to Table 1. This 2×2 contingency table has a p-value of 0.845, an indication of neutrality.

The ratios of non-synonymous to synonymous differences in the fixed and polymorphic columns are the same or close to it (Dn/Ds ≈ Pn/Ps). This is consistent with the region being selectively neutral – the non-synonymous sites are being fixed at about the same rate as synonymous sites. Equivalently, Ds/Ps ≈ Dn/Pn.

Now let’s look at the color region of luciferase (Table 3).

Table 3: McDonald-Kreitman test for the color region of luciferase.
                      Fixed (D)   Polymorphic (P)
Synonymous (s)             1             6
Non-synonymous (n)        16             6
There is an excess in Dn and a deficit of Ds in the color region. P-value = 0.011.
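A P-value for a 2×2 table like Table 3 can be computed with a few lines of code. The sketch below assumes a two-sided Fisher's exact test; I don't know which contingency-table test Stolz et al. actually used, and the function name `fisher_exact_two_sided` is my own. Under that assumption, the result for Table 3 rounds to the reported value.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    row and column totals that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance so tables tied with the observed one are counted.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Table 3 (color region): synonymous row = 1 fixed, 6 polymorphic;
# non-synonymous row = 16 fixed, 6 polymorphic.
p_color = fisher_exact_two_sided(1, 6, 16, 6)
print(round(p_color, 3))  # 0.011
```

Running the same function on the perfectly balanced counts of Table 1 gives a P-value of 1, as expected for a table that matches the neutral prediction exactly.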

There is an excess of fixed non-synonymous sites, which indicates the presence of selection. However, Stolz et al. note that Ds is low compared to the rest of the numbers in the table (and in Table 2), which they claim is “atypical of directional selection” (emphasis mine). They exclude codon bias as a possible explanation and also note that this “paucity” of silent fixations is abnormal within the Pyrophorus genus. They conclude that intergenic recombination may have cleared any differences between the two loci (reducing both Ds and Dn) and rapid selection subsequently increased Dn. (Don’t worry; intergenic recombination will make a lot more sense in a later post.)

Thus, much like the QTL sign test, the McDonald-Kreitman test looks for divergence from the neutral model in the distribution of base substitutions, inferring the presence of selection if the divergence is strong enough.

Substitution Rate Ratios

Similar to the M-K test, substitution rate ratios compare synonymous and non-synonymous substitutions between two sequences, but they don’t distinguish fixed from polymorphic differences. In this way, the test is simpler.

The test comes down to two rates: the number of synonymous substitutions per synonymous site (dS) and the number of non-synonymous substitutions per non-synonymous site (dN). If dN = dS, the sequences are consistent with neutral evolution (similar to the reasoning in the M-K test). If dN/dS > 1, positive selection is inferred; if dN/dS < 1, purifying selection. (dN/dS is often denoted as ω.)

In the color region of luciferase, dN = 0.0217 and dS = 0.0062 (errors omitted). Thus, dN/dS = 3.49. In the non-color region, dN = 0.0023 and dS = 0.058; dN/dS = 0.040. The two dN/dS ratios were significantly different (P = 0.0013). Because dN/dS in the color region is much higher than 1, positive selection is inferred to be acting. (Stolz et al. make no mention of why the non-color region has such a low dN/dS ratio, however. The value indicates purifying selection is rather strong here, so while the non-color region may not be important in determining bioluminescent color, I would presume it codes for an essential structural component of luciferase.)
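The arithmetic above is easy to check. This sketch uses only the rounded dN and dS values quoted in this post; the function names `omega` and `classify` are my own, and `classify` applies the rule of thumb without any significance test. With the rounded inputs the color-region ratio comes out at about 3.50 rather than the reported 3.49, presumably because the paper worked from unrounded estimates.

```python
def omega(dn, ds):
    """Ratio of non-synonymous to synonymous substitution rates (dN/dS)."""
    return dn / ds

def classify(w):
    """Label a dN/dS value using the rule of thumb from the text."""
    if w > 1:
        return "positive selection"
    if w < 1:
        return "purifying selection"
    return "neutral"

# Rounded values quoted above (errors omitted).
color = omega(0.0217, 0.0062)
non_color = omega(0.0023, 0.058)

print(round(color, 2), classify(color))
print(round(non_color, 3), classify(non_color))
```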

Other Indirect Tests

The three tests discussed here by no means exhaust the ways one can test for selection. Not only are there other statistical tests one can employ, but there are other indirect ways of detecting selection in a genetic sequence. For example, a reduction of local nucleotide diversity may indicate a selective sweep. As selection drives an allele towards fixation, selection further removes diversity in the surrounding sequence due to hitchhiking. This pattern was found in the ventral orange allele in the Jamaican click beetle: nucleotide diversity in vOR was 0.00046 and in vYE, vOR’s presumed ancestor, diversity was 0.00129. While this isn’t particularly rigorous, it serves as another piece of evidence that selection is acting upon luciferase in the Jamaican click beetles.
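For readers who haven't met nucleotide diversity before, here is a minimal sketch of how it is usually computed: the average number of pairwise differences per site among aligned sequences. The toy alignment is made up for illustration and is not beetle data, and the function name `nucleotide_diversity` is my own. The last line just divides the two diversity values quoted above.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among
    equal-length, aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Made-up toy alignment (NOT real luciferase sequences).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAT"]
print(nucleotide_diversity(toy))

# Diversity in vOR relative to vYE, using the values quoted in the post.
relative_diversity = 0.00046 / 0.00129
print(round(relative_diversity, 2))
```

By this measure vOR carries roughly a third of the diversity found in vYE, which is the kind of local reduction a selective sweep would leave behind.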

This post serves as a (hopefully) basic overview of how molecular biologists can test for selection on DNA sequences. There are many other tests and there are a host of problems associated with each one that I haven’t even begun to explore. I can never stress enough that I am not an expert in this area – I am only providing my understanding of the material in hopes of being corrected by those who know more than me as a way to teach myself evolutionary concepts and, if correct, hopefully teach others in a similar boat as mine.

Orr HA (1998). Testing natural selection vs. genetic drift in phenotypic evolution using quantitative trait locus data. Genetics, 149 (4), 2099-104 PMID: 9691061

Stolz U, Velez S, Wood KV, Wood M, & Feder JL (2003). Darwinian natural selection for orange bioluminescent color in a Jamaican click beetle. Proceedings of the National Academy of Sciences of the United States of America, 100 (25), 14955-9 PMID: 14623957

Source I used to understand selection tests: Genetics of Populations by Philip Hedrick (Google Books)

One thought on “How to test for selection (Adaptive Recursion III)”

  1. Allelic effects at quantitative trait loci (QTL) between lineages are potentially informative for indicating the action of natural selection. The QTL Sign Test uses the number of + and − alleles observed in a QTL study to infer a history of selection. This test has been constructed with a “neutral” model that in practice assumes each QTL corresponds to a single mutation, and that conditions on the phenotypic difference between the two lines in question. Unfortunately, conditioning on the phenotypic difference results in a huge loss of power to reject the neutral hypothesis, and leads to extreme sensitivity to the variance in locus effect magnitude across QTLs. Anyone interested in using the QTL sign test to test for selection should check out

    Really, only the distribution of quantitative trait locus (QTL) effect sizes and the distribution of mutational effects together provide sufficient information regarding the history of selection. Mutational effects can be estimated from mutation accumulation experiments, and from the two a range of selection strengths can be inferred under which such a distribution of mutations could generate the observed QTL:

