World Library  
Flag as Inappropriate
Email this Article

McDonald–Kreitman test

Article Id: WHEBN0030761750
Reproduction Date:

Title: McDonald–Kreitman test  
Author: World Heritage Encyclopedia
Language: English
Subject: Molecular evolution, Statistical genetics, List of statistics articles
Publisher: World Heritage Encyclopedia

McDonald–Kreitman test

The McDonald–Kreitman test is a statistical test often used by evolution and population biologists to detect and measure the amount of [2] Typically, silent mutations in protein-coding regions are used as the "control" in the McDonald–Kreitman test.

In 1991, John H. McDonald and Martin Kreitman derived the McDonald–Kreitman test while performing an experiment with Drosophila (fruit flies) and their differences in amino acid sequence of the alcohol dehydrogenase gene. McDonald and Kreitman proposed this method to estimate the proportion of substitutions that are fixed by positive selection rather than by genetic drift.[4]

In order to set up the McDonald–Kreitman test, we must first set up a two-way contingency table of our data on the species being investigated as shown below:

Fixed Polymorphic
Synonymous Ds Ps
Nonsynonymous Dn Pn
  • Ds: the number of synonymous substitutions per gene
  • Dn: the number of non-synonymous substitutions per gene
  • Ps: the number of synonymous polymorphisms per gene
  • Pn: the number of non-synonymous polymorphisms per gene

To quantify the values for Ds, Dn, Ps, and Pn, you count the number of differences in the protein-coding region for each type of variable in the contingency table.

The null hypothesis of the McDonald–Kreitman test is that the ratio of nonsynonymous to synonymous variation within a species is going to equal the ratio of nonsynonymous to synonymous variation between species (i.e. Dn/Ds = Pn/Ps). When positive or negative selection (natural selection) influences nonsynonymous variation, the ratios will no longer equal. The ratio of nonsynonymous to synonymous variation between species is going to be lower than the ratio of nonsynonymous to synonymous variation within species (i.e. Dn/Ds < Pn/Ps) when negative selection is at work, and deleterious mutations strongly affect polymorphism. The ratio of nonsynonymous to synonymous variation within species is lower than the ratio of nonsynonymous to synonymous variation between species (i.e. Dn/Ds > Pn/Ps) when we observe positive selection. Since mutations under positive selection spread through a population rapidly, they don't contribute to polymorphism but do have an effect on divergence.[5]

Using an equation derived by Smith and Eyre-Walker, we can estimate the proportion of base substitutions fixed by natural selection, α,[6] using the following formula:

\alpha = 1 - \frac{D_sP_n}{D_nP_s}

Alpha represents the proportion of substitutions driven by positive selection. Alpha can be equal to any number between -∞ and 1. Negative values of alpha are produced by sampling error or violations of the model, such as the segregation of slightly deleterious amino acid mutations.[7] Similar to above, our null hypothesis here is that α=0, and we expect Dn/Ds to equal Pn/Ps.[4]

The Neutrality Index

The neutrality index (NI) quantifies the direction and degree of departure from neutrality (where Pn/Ps and Dn/Ds ratios equal). When assuming that silent mutations are neutral, a neutrality index greater than 1 (i.e. NI > 1) indicates negative selection is at work, resulting in an excess of amino acid polymorphism. This occurs because natural selection is favoring the purifying selection, and the weeding out of deleterious alleles.[8] Because silent mutations are neutral, a neutrality index lower than 1 (i.e. NI < 1) indicates an excess of nonsilent divergence, which occurs when positive selection is at work in the population. When positive selection is acting on the species, natural selection favors a specific phenotype over other phenotypes, and the favored phenotype begins to go to fixation in the species as the allele frequency for that phenotype increases.[9] To find the neutrality index, we can use the following equation:

See the McDonald-Kreitman Test

Sources of Error with the McDonald-Kreitman Test

One drawback of performing a McDonald–Kreitman test is that the test is vulnerable to error, as with every other statistical test. Many factors can contribute to errors in estimates of the level of adaptive evolution, including presence of slightly deleterious mutations, variation of mutation rates across the genome, variation in coalescent histories across the genome, and changes in the effective population size. All these factors result in α being underestimated.[10] However, according to research done by Charlesworth (2008),[2] Andolfatto(2008),[11] and Eyre-Walker(2006),[7] none of these factors are significant enough to make scientists believe the McDonald–Kreitman test is unreliable, except for the presence of slightly deleterious mutations in species.

In general, the McDonald-Kreitman test is often thought to be unreliable because of how significantly the test tends to underestimate the degree of adaptive evolution in the presence of slightly deleterious mutations. A slightly deleterious mutation can be defined as a mutation that negative selection acts on only very weakly so that its fate is determined by both selection and random genetic drift.[2] If slightly deleterious mutations are segregating in the population, then it becomes difficult to detect positive selection and the degree of positive selection is underestimated. Weakly deleterious mutations have a larger chance of contributing to polymorphism than strongly deleterious mutations, but still have low probabilities of fixation. This creates bias in the McDonald–Kreitman test's estimate of the degree of adaptive evolution, resulting in a dramatically lower estimate of α. By contrast, since strongly deleterious mutations contribute to neither polymorphism nor divergence, strongly deleterious mutations do not bias estimates of α.[12] The presence of slightly deleterious mutations is strongly linked to genes that have experienced the greatest reduction in effective population size.[13] This means that soon after a recent reduction in effective population size in a species has occurred, such as a bottleneck, we observe a larger presence of slightly deleterious mutations in the protein-coding regions.[14] We can make a direct connection with the increase in number of slightly deleterious mutations and the recent decrease in effective population size.[13] For more information on why population size affects the tendency of slightly deleterious mutations to increase in frequency, refer to the article Nearly neutral theory of molecular evolution.

Additionally, as with every statistical test, there is always that chance of having type I error and type II error in the McDonald Kreitman test. With statistical tests, we must strive more try to avoid making type I errors, to avoid rejecting the null hypothesis, when it is in fact, true.[15] However, the McDonald Kreitman test is very vulnerable to type I error, because of the many factors that can lead to the accidental rejection of the true null hypothesis. These such factors include variation in recombination rate, nonequilibrium demography, small sample sizes, and in comparisons involving more recently diverged species.[13] All of these factors have the ability to influence the ability of the mcDonald-Kreitman test to detect positive selection, as well as the level of positive selection acting on a species. This inability to correctly determine the level of positive selection acting on a species often leads to a false positive, and the incorrect rejection of the null hypothesis.

While performing the McDonald–Kreitman test, scientists also have to avoid making too numerous type II errors. Otherwise, a test's results may be too flawed and its results will be termed useless.

Error-correcting Mechanisms of the McDonald-Kreitman Test

There continues to be more experimentation with the McDonald–Kreitman test and how to improve the accuracy of the test. The most important error to correct for is the error that α is severely underestimated in the presence of slightly deleterious mutations, as discussed in the previous section "Sources of Error with the McDonald-Kreitman Test." This possible adjustment of the McDonald–Kreitman test includes removing polymorphisms below a specific value from the data set to improve and increase the number of substitutions that occurred due to adaptive evolution.[2] To minimize the impact of slightly deleterious mutations, it has been proposed to exclude polymorphisms that are below a certain cutoff frequency, such as <8% or <5% (there is still much debate about what the best cutoff value should be). By not including polymorphisms under a certain frequency, you can reduce the bias created by slightly deleterious mutations, since less polymorphisms will be counted. This will drive the estimate of α up. Therefore, the degree of adaptive evolution estimated will not be so severely underestimated, deeming the McDonald–Kreitman test to be more reliable.[12]

One adjustment necessary is to control for the type I error in the McDonald–Kreitman test, refer to the discussion of this in previous section "Sources of Error with the McDonald Kreitman Test." One method to avoid type I errors is to avoid using populations that have undergone a recent bottleneck, meaning they have recently undergone a recent decrease in effective population size.[13] To make the analysis as accurate as possible in the McDonald–Kreitman test, it is best to use large sample sizes, but there is still debate and how large "large" is.[15] Another method of controlling for type I error, Peter Andolfatto(2008) suggests, is to establish significance levels by coalescent simulation with recombination in genomewide scans for selection on noncoding DNA. By doing this, you will be able to improve the accuracy of your statistical test and avoid any false positive tests. <[11] With all these possible ways to avoid making type I errors, scientists should cautiously choose which populations they are analyzing, to avoid analyzing populations that will lead to inaccurate results.

See also


  1. ^ Futuyma, D. J. 2013. Evolution. Sinauer Associates, Inc.: Sunderland.
  2. ^ a b c d e [1], Charlesworth, J. Eyre-Walker, A. 2008. The McDonald–Kreitman Test and Slightly Deleterious Mutations. Molecular Biology and Evolution 25 (6): 1007-1015.
  3. ^ [2], Kimchi-Sarfaty, C. Oh, J. M. Kim, I. Sauna, Z. E. Calcagno, A. M. Ambudkar, S. V. Gottesman, M. M. 2007. A "Silent" Polymorphism in the MDR1 Gene Changes Substrate Specificity. Science 315 (5811): 525-528.
  4. ^ a b [3], Eyre-Walker, A. 2006. The genomic rate of adaptive evolution. Trends in Ecology and Evolution 21 (10): 569-575.
  5. ^ [4], Barbadilla, A. Casillas, S. Egea, R. 2008. Standard and generalized McDonald-Kreitman test: a website to detect selection by comparing different classes of DNA sites. Nucleic Acids Research 36: 157-162.
  6. ^ Smith, N. G. C.; Eyre-Walker, A. (2002). "Adaptive protein evolution in Drosophila". Nature 415 (6875): 1022–1024.  
  7. ^ a b [5], Eyre-Walker, A. 2002. Changing Effective Population Size and the McDonald-Kreitman Test. Genetics Society of America 162: 2017-2024.
  8. ^ [6], Meiklejohns, C. D. Montooth, K. L. Rand, D. M. 2007. Positive and negative selection on the mitochondrial genome. Trends in Genetics 23 (6): 259-263.
  9. ^ [7], Stoletzki, N. Eyre-Walker, A. 2010. Estimation of the Neutrality Index. Molecular Biology and Evolution 28 (1): 63-70.
  10. ^ [8], Baines, J. F. Parsch, J. Zhang, Z. 2008. The Influence of Demography and Weak Selection on the McDonald–Kreitman Test: An Empirical Study in Drosophila. Molecular Biology and Evolution 26: 691-698.
  11. ^ a b [9], Andolfatto, P. 2008. Controlling Type-I Error of the McDonald-Kreitman Test in Genomewide Scans for Selection on Noncoding DNA. Genetics Society of America 180 (3): 1767-1771.
  12. ^ a b [10], Messer, P. W. Petrov, D. A. 2013. Frequent adaptation and the McDonald–Kreitman test. Proceedings of the National Academy of Sciences of the United States of America 110 (21): 8615-8620.
  13. ^ a b c d [11], Parsch, J. Zhang, Z. Baines, J. 2009. The Influence of Demography and Weak Selection on the McDonald–Kreitman Test: An Empirical Study in Drosophila. Molecular Biology and Evolution 26 (3): 691-698.
  14. ^ [12], Ellegren, Hans. 2009. A Selection Model of Molecular Evolution Incorporating the Effective Population Size. Evolution 63(2): 301-305.
  15. ^ a b Rossman, A. J. Chance, B. L. 2012. Workshop Statistics Discovery with Data. John Wiley & Sons: Danvers.
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for and content contributors is made possible from the U.S. Congress, E-Government Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a non-profit organization.

Copyright © World Library Foundation. All rights reserved. eBooks from World Library are sponsored by the World Library Foundation,
a 501c(4) Member's Support Non-Profit Organization, and is NOT affiliated with any governmental agency or department.