First, a brief review of what the ampliconic class is.
From my Y Chromosome II post:
The final sequence class, the ampliconic, is more complex than the previous two classes as it contains more genes and has stranger architecture. The 10.2Mb class is broken into seven segments and contains the highest density of genes on the MSY. An amplicon is a generic term to group together the highly repetitive MSY-specific units. To identify these amplicons, Skaletsky et al. compared a 50kb sliding window to the rest of the euchromatic sequences in 1kb steps and any window that showed over 50% similarity to another sequence was deemed an amplicon (blue regions in Figure 3). Although this seems arbitrary, 60% of the region shows over 99.9% similarity to something else in the region (Skaletsky et al., 2003).
There is something to this high sequence similarity. What explains it?
Palindromes. Huge, massive palindromes. As you most likely know, a palindrome is “a word, line, verse, number, sentence, etc., reading the same backward as forward, as [in] Madam, I’m Adam” (dictionary.com). The longest palindrome I am aware of is Demetri Martin’s 224-word poem, Dammit, I’m Mad.
That’s in language though – what is a biological palindrome? It’s nearly the same idea. In an e-mail from a member of the Page lab, I was given this example:
The second half is the “reverse-complement” of the first half. The first half itself is repeated on the right side of the other DNA strand. The double-stranded nature of DNA complicates the picture, but hopefully that makes sense. As this free mol bio textbook (Google Books) says, “the sequence reads the same forwards on one strand as it reads backwards on the complementary strand” (86). The figure in the book reflects this (figure 1; just add a spacer to get our palindromes).
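If it helps to see the reverse-complement idea in code, here is a minimal Python sketch. The sequences and function names are made up for illustration (they are not the Page lab's example), and it assumes plain A/C/G/T strings with no ambiguity codes.

```python
# Map each base to its Watson-Crick partner (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement (no spacer)."""
    return seq.upper() == reverse_complement(seq)


# Hypothetical arm; the second arm is simply its reverse complement.
left_arm = "GATTACA"
right_arm = reverse_complement(left_arm)

print(left_arm, right_arm)
print(is_palindrome("GAATTC"))   # a spacer-free palindrome
print(is_palindrome(left_arm))   # a single arm on its own is not one
```

Note the difference from a language palindrome: the second half is not the first half spelled backward, but the first half spelled backward with every base swapped for its partner.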
The three bases in the middle of the example, AAA, are considered a spacer (and each half is an “arm” of the palindrome). If the spacer is smaller than an arm, the Page lab considers that a palindrome; if the spacer is larger than an arm, the Page lab considers that only an inverted repeat. The significance of the spacer will be explained two posts from now.
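The spacer rule can be sketched the same way. This is only an illustration under my own assumptions: the function names and toy sequences are invented, the arms are required to match perfectly (the real arms are merely near-identical), and the tie case where the spacer is exactly as long as an arm, which the rule above does not cover, is arbitrarily treated as an inverted repeat.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def arm_and_spacer(seq: str) -> tuple[int, int]:
    """Find the longest perfectly matching arms at the two ends of seq.

    Returns (arm_length, spacer_length); arm_length is 0 if no arms match.
    """
    seq = seq.upper()
    for k in range(len(seq) // 2, 0, -1):
        if reverse_complement(seq[:k]) == seq[-k:]:
            return k, len(seq) - 2 * k
    return 0, len(seq)


def classify(seq: str) -> str:
    """Apply the spacer-versus-arm rule described above."""
    arm, spacer = arm_and_spacer(seq)
    if arm == 0:
        return "neither"
    # Assumption: a spacer exactly as long as an arm counts as an inverted repeat.
    return "palindrome" if spacer < arm else "inverted repeat"


# Toy sequences: 7-base arms around a 3-base spacer, then 3-base arms
# around a 5-base spacer.
print(classify("GATTACA" + "AAA" + "TGTAATC"))
print(classify("GAT" + "AAAAA" + "ATC"))
```

The first toy sequence comes out as a palindrome and the second as an inverted repeat, because only the ratio of spacer length to arm length changes between them.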
As I said earlier, the Y chromosome’s palindromes are absolutely huge. The longest palindrome, P1, is 2.9 Mb long. That is 2,900,000 letters! The two arms account for ~30% of the ampliconic class and ~13% of the Y’s euchromatin. The other palindromes aren’t slouches either, as indicated in Table 1. The fact that the two arms are over 99.5% identical with each other explains the high sequence similarity found in the Page lab’s scans.
Why palindromes? I will tell you in the next few posts!
Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, Cordum HS, Hillier L, Brown LG, Repping S, Pyntikova T, Ali J, Bieri T, Chinwalla A, Delehaunty A, Delehaunty K, Du H, Fewell G, Fulton L, Fulton R, Graves T, Hou SF, Latrielle P, Leonard S, Mardis E, Maupin R, McPherson J, Miner T, Nash W, Nguyen C, Ozersky P, Pepin K, Rock S, Rohlfing T, Scott K, Schultz B, Strong C, Tin-Wollam A, Yang SP, Waterston RH, Wilson RK, Rozen S, & Page DC (2003). The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature, 423 (6942), 825-37 PMID: 12815422