Before my unintended break from blogging, I had started writing about the work by Stolz, Feder, and Velez on bioluminescent color in the Jamaican click beetle, Pyrophorus plagiophthalamus (here and here). This beetle has two sets of bioluminescent organs: a dorsal pair and a single ventral organ. Not only can the two sets of organs differ in color within an individual, but – and this is what makes the species special – the colors can be polymorphic within the species. By that I mean that within a population, one can find green and yellow-green dorsal organs in addition to yellow-green, yellow, and orange ventral organs. Variation of bioluminescent color within a population is apparently unheard of, even within the rest of the Pyrophorus genus. The polymorphism of bioluminescent color provides a simple system for evolutionary and ecological study (as I point out in my first post about the species).
Instead of outlining the entire series of studies like I had intended, I want to extract two larger themes out of the papers – how biologists test for selection on DNA sequences and how the different color alleles in the beetles arose (and I promise, this is a really cool system of allele origination!).
I ask the first question because the authors employ several tests to detect selection and when writing about these studies for a mini-review in my evolution course, I stumbled in this area. I resolved to figure this out for personal education purposes and because I have yet to find a good source that explains these tests in a readily understandable way, I decided to blog about it. For this reason, if I make any mistakes, please point them out! I am writing about this topic to teach myself something I didn’t learn in any of my classes!
I also want to note that readers probably won’t come out understanding evolution in Jamaican click beetles after reading this post. I look at the selection tests out of order and I don’t discuss in much detail the resulting selective scenario the authors propose. (The post about allele origination will be chock-full of click beetle biology, however!)
The three tests I examine are the QTL sign test, the McDonald-Kreitman test, and substitution rate ratios.
QTL Sign Test
(Apparently) developed by Allen Orr, the QTL sign test helps detect whether selection has acted on a quantitative trait. QTL means “quantitative trait locus”: basically a gene whose alleles shift the phenotype by a measurable amount rather than switching it on or off. Additionally, a quantitative trait is frequently affected by multiple loci (that is, it is polygenic). A quantitative trait such as weight is not controlled by a single gene, so there is no “gene for weight”; instead, weight is the cumulative result of many genes that each happen to affect it.
Scientists first pick a quantitative trait to examine based on how large the difference (R) is between two phenotypes. After QTL mapping, in which the loci/nucleotides affecting the trait are found, each QTL can be given a plus or minus sign for a positive or negative effect, respectively: a higher (plus) or lower (minus) weight, for example. Once the distribution of plus/minus loci is determined, a statistical test can be performed to infer the likelihood of that distribution appearing by chance, or in this case, how likely the difference in phenotypes (R) is to have evolved neutrally (Figure 1; left, shows what a neutral distribution could look like). If the resulting probability is less than 0.05, the null hypothesis (neutral evolution) is rejected at the conventional significance level, and selection probably played some role.
Stolz et al. use a QTL sign test to find whether or not diversifying selection is acting on the bioluminescence of the dorsal and ventral organs. Luciferase in the click beetles is a great example of a QTL: the detected mutations do not turn luciferase on or off, but instead shift the produced light’s wavelength by several nanometers up (plus) or down (minus).
A difference between a typical QTL analysis and the analysis performed on click beetles is that we are looking at point mutations within a single gene, rather than multiple loci. Stolz et al. thus call their analysis a QTN test – a quantitative trait nucleotide test – but the same principles of QTL apply: bioluminescent color is affected by multiple mutations, not just a single one, and they each have quantitative effects.
Stolz et al. looked at the divergence between the dGR and vYE alleles, assuming these two alleles to be the ancestral and least-derived states of the loci (for reasons not explained here). The difference between phenotypes (R), wavelength in this case, is 31 nanometers. Nine fixed non-synonymous substitutions contribute to this difference and the nine nucleotides in vYE increase wavelength (and are assigned ‘plus’ status) (Figure 1; right). The probability of finding nine plus mutations and zero minus mutations was 0.039 – low enough to reject the null hypothesis of neutral evolution. This finding provides evidence that selection is acting on bioluminescent color.
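To get a feel for the arithmetic behind a sign test, here is a minimal Python sketch of the naive version, in which every mutation is assumed to be plus or minus with equal probability. This is not Orr's test: Orr's version conditions on the observed phenotypic difference R, which is why the 0.039 quoted above is not the number this sketch produces. The function name `naive_sign_test` is my own, purely for illustration.

```python
from math import comb

def naive_sign_test(n_plus: int, n_minus: int) -> float:
    """One-sided probability of at least n_plus 'plus' effects out of
    n_plus + n_minus, assuming each sign is equally likely (p = 0.5).

    Simplifying assumption: this ignores effect sizes and the observed
    phenotypic difference R, both of which Orr's test accounts for.
    """
    n = n_plus + n_minus
    favourable = sum(comb(n, k) for k in range(n_plus, n + 1))
    return favourable / 2 ** n

# Nine plus substitutions, zero minus, as in the dGR vs. vYE comparison.
p_naive = naive_sign_test(9, 0)
print(f"naive sign-test probability: {p_naive:.5f}")
```

The naive probability is far smaller than 0.039. Once you condition on the two alleles already differing by 31 nm, a run of same-direction mutations is less surprising, so the conditioned test is the more conservative one.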
McDonald-Kreitman Test
A well-known way to detect selection at the molecular level is the McDonald-Kreitman (M-K) test. The test compares the ratio of non-synonymous to synonymous differences in two categories: fixed differences between species and polymorphic differences within a species. This may sound a bit complicated at first, but it makes sense – let me explain.
A synonymous (s) site is where a base substitution has no effect on the encoded amino acid (hence synonymous; same amino acid = same “word”), and a non-synonymous (n) site is where the encoded amino acid does change. A polymorphic (P) site is one which shows variation within the species, whereas a fixed (D) site shows no variation within the species but is different compared to a related species.
This is how the M-K test works to detect selection: under neutral evolution, selection is not acting on the differences we observe, so synonymous and non-synonymous differences should both drift toward fixation in the same way. The ratio of non-synonymous to synonymous differences (n/s) should therefore be equal between the fixed (Dn/Ds) and polymorphic (Pn/Ps) categories. Equivalently, the ratio of fixed to polymorphic (D/P) sites should be equal between the synonymous and non-synonymous categories. The individual ratios do not each have to equal 1; what should equal 1 is one ratio divided by the other, (Dn/Ds)/(Pn/Ps). (Table 1 shows the simplest case, where every ratio divides to 1.) A significant departure indicates selection may be acting. In particular, if Dn/Ds > Pn/Ps, there is an excess of fixed non-synonymous differences, and directional selection is presumed to have acted upon the sequence.
|Table 1: An example of neutrality in a McDonald-Kreitman test; all ratios divide to 1.|
|Fixed (D)||Polymorphic (P)|
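The comparison of ratios can be sketched in a few lines of Python. The counts below are hypothetical, not the values from Stolz et al., and I use a two-sided Fisher's exact test on the 2×2 table as an assumption; the paper may well have used a different contingency test (a G-test, for example).

```python
from math import comb

def fisher_exact_two_sided(dn: int, pn: int, ds: int, ps: int) -> float:
    """Two-sided Fisher's exact test on the 2x2 table
           fixed  polymorphic
    n  [    dn        pn     ]
    s  [    ds        ps     ]
    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one.
    """
    row_n, row_s, col_fixed = dn + pn, ds + ps, dn + ds
    total = row_n + row_s

    def prob(x: int) -> float:
        # Hypergeometric probability of x fixed non-synonymous differences.
        return comb(row_n, x) * comb(row_s, col_fixed - x) / comb(total, col_fixed)

    p_obs = prob(dn)
    lo, hi = max(0, col_fixed - row_s), min(row_n, col_fixed)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def mk_odds_ratio(dn: int, pn: int, ds: int, ps: int) -> float:
    """(Dn/Ds) / (Pn/Ps), written as a cross-product; 1 under neutrality."""
    return (dn * ps) / (ds * pn)

# Hypothetical neutral-looking table: every count equal.
p_neutral = fisher_exact_two_sided(10, 10, 10, 10)
# Hypothetical table with an excess of fixed non-synonymous differences.
p_skewed = fisher_exact_two_sided(12, 2, 3, 11)

print(mk_odds_ratio(10, 10, 10, 10), p_neutral)
print(mk_odds_ratio(12, 2, 3, 11), p_skewed)
```

With equal counts the odds ratio is 1 and the test finds nothing; with the skewed counts the odds ratio is well above 1 and the p-value drops below 0.05, which is the pattern read as directional selection.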
As with the QTL sign test, the McDonald-Kreitman test used on the beetles is slightly different: instead of testing differences between species, Stolz et al. tested the differences between the ventral and dorsal loci. (These loci have diverged for over a million years and can presumably be treated as “different species.”)
Let us first look at a region of luciferase that does not affect color (non-color region). (Table 2).
|Table 2: McDonald-Kreitman test for the non-color region of luciferase.|
|The non-color region of luciferase shows a similar table to Table 1. This 2×2 contingency table has a p-value of 0.845, an indication of neutrality.|
The ratios of non-synonymous to synonymous differences in the fixed and polymorphic columns are either the same or close to being the same (Dn/Ds ≈ Pn/Ps). This agreement is what the neutral model predicts: non-synonymous changes are no more common among fixed differences than among polymorphisms. Furthermore, Ds/Ps ≈ Dn/Pn.
Now let’s look at the color region of luciferase (Table 3).
|Table 3: McDonald-Kreitman test for the color region of luciferase.|
|Fixed (D)||Polymorphic (P)|
|There is an excess in Dn and a deficit of Ds in the color region. P-value = 0.011.|
There is an excess of fixed non-synonymous sites, which indicates the presence of selection. However, Stolz et al. note that Ds is low compared to the rest of the numbers in the table (and in Table 2), which they claim is “atypical of directional selection” (emphasis mine). They exclude codon bias as a possible explanation and also note that this “paucity” of silent fixations is abnormal within the Pyrophorus genus. They conclude that intergenic recombination may have cleared any differences between the two loci (reducing both Ds and Dn) and that rapid selection subsequently increased Dn. (Don’t worry; intergenic recombination will make a lot more sense in a later post.)
Thus, much like the QTL sign test, the McDonald-Kreitman test looks for divergence from the neutral model in the distribution of base substitutions, inferring the presence of selection if the divergence is strong enough.
Substitution Rate Ratios
Similar to the M-K test, substitution rate ratios look at the difference between synonymous and non-synonymous substitutions between two sequences, but they don’t distinguish fixed from polymorphic differences. In this way, the test is simpler.
The test comes down to two rates: the number of synonymous substitutions per synonymous site (dS) and the number of non-synonymous substitutions per non-synonymous site (dN). If dN = dS (a ratio of 1), the sequences are consistent with neutral evolution (similar to the reasoning in the M-K test). If dN/dS > 1, positive selection is inferred; if dN/dS < 1, purifying selection. (dN/dS is often denoted as ω.)
In the color region of luciferase, dN = 0.0217 and dS = 0.0062 (errors omitted). Thus, dN/dS = 3.49. In the non-color region, dN = 0.0023 and dS = 0.058; dN/dS = 0.040. The two dN/dS ratios were significantly different (P = 0.0013). Because dN/dS in the color region is much higher than 1, positive selection is inferred to be acting. (Stolz et al. make no mention of why the non-color region has such a low dN/dS ratio, however. The value indicates purifying selection is rather strong here, so while the non-color region may not be important in determining bioluminescent color, I would presume it codes for an essential structural component of luciferase.)
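The ratios are easy to recompute from the quoted rates. The sketch below is only the arithmetic; the `classify` helper and its `tol` cutoff are my own illustrative assumptions, not a statistical test (in practice you would need the standard errors, which are omitted here).

```python
def omega(dn: float, ds: float) -> float:
    """dN/dS ratio for one region."""
    return dn / ds

def classify(w: float, tol: float = 0.05) -> str:
    """Crude label for a dN/dS value.

    `tol` is an arbitrary, hypothetical margin around 1; it is not a
    substitute for a proper significance test.
    """
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "purifying selection"
    return "approximately neutral"

# Rates as quoted in the text (errors omitted).
omega_color = omega(0.0217, 0.0062)
omega_noncolor = omega(0.0023, 0.058)

print(f"color region:     {omega_color:.2f} ({classify(omega_color)})")
print(f"non-color region: {omega_noncolor:.3f} ({classify(omega_noncolor)})")
```

From these rounded inputs the color-region ratio comes out at about 3.5 rather than exactly 3.49, presumably because the published figure was computed from unrounded estimates.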
Other Indirect Tests
The three tests discussed here by no means exhaust the ways one can test for selection. Not only are there other statistical tests one can employ, but there are other indirect ways of detecting selection in a genetic sequence. For example, a reduction of local nucleotide diversity may indicate a selective sweep. As selection drives an allele towards fixation, selection further removes diversity in the surrounding sequence due to hitchhiking. This pattern was found in the ventral orange allele in the Jamaican click beetle: nucleotide diversity in vOR was 0.00046 and in vYE, vOR’s presumed ancestor, diversity was 0.00129. While this isn’t particularly rigorous, it serves as another piece of evidence that selection is acting upon luciferase in the Jamaican click beetles.
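For readers unfamiliar with nucleotide diversity, here is a minimal sketch of how it can be computed: the average number of pairwise differences per site across a sample of aligned sequences. The toy sequences are invented for illustration and have nothing to do with the real vOR and vYE alleles.

```python
from itertools import combinations

def nucleotide_diversity(seqs: list[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)

# Hypothetical sample after a sweep: almost every copy is identical.
swept = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
# Hypothetical sample without a sweep: more variants segregating.
unswept = ["ACGTACGTAC", "ACGTACGAAC", "ACCTACGTAC", "ACGTACGTAT"]

pi_swept = nucleotide_diversity(swept)
pi_unswept = nucleotide_diversity(unswept)
print(pi_swept, pi_unswept)

# Ratio of the diversities quoted in the text (vOR relative to vYE).
print(0.00046 / 0.00129)
```

The swept sample has the lower diversity, which is the same direction as the vOR versus vYE comparison: vOR carries roughly a third of the diversity of its presumed ancestor.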
This post serves as a (hopefully) basic overview of how molecular biologists can test for selection on DNA sequences. There are many other tests and there are a host of problems associated with each one that I haven’t even begun to explore. I can never stress enough that I am not an expert in this area – I am only providing my understanding of the material in hopes of being corrected by those who know more than me as a way to teach myself evolutionary concepts and, if correct, hopefully teach others in a similar boat as mine.
Orr HA (1998). Testing natural selection vs. genetic drift in phenotypic evolution using quantitative trait locus data. Genetics, 149 (4), 2099-104 PMID: 9691061
Stolz U, Velez S, Wood KV, Wood M, & Feder JL (2003). Darwinian natural selection for orange bioluminescent color in a Jamaican click beetle. Proceedings of the National Academy of Sciences of the United States of America, 100 (25), 14955-9 PMID: 14623957
Source I used to understand selection tests: Genetics of Populations by Philip Hedrick (Google Books)