Maximum-Likelihood Model Averaging To Profile Clustering of Site Types across Discrete Linear Sequences

DOI: 10.4016/12250.01