Bayesian coestimation of phylogeny and sequence alignment.

Gerton Lunter, István Miklós, Alexei Drummond, Jens Ledet Jensen, Jotun Hein

BMC Bioinformatics 2005

BACKGROUND: Two central problems in computational biology are determining the alignment and the phylogeny of a set of biological sequences. The traditional approach is to first build a multiple alignment of the sequences and then reconstruct a phylogeny from that alignment. However, alignment and phylogenetic inference are fundamentally interdependent, and ignoring this fact leads to biased and overconfident estimates. Whether the main interest is in sequence alignment or phylogeny, a major goal of computational biology is the co-estimation of both.

RESULTS: We developed a fully Bayesian Markov chain Monte Carlo method for coestimating phylogeny and sequence alignment, under the Thorne-Kishino-Felsenstein model of substitution and single-nucleotide insertion-deletion (indel) events. In our earlier work, we introduced a novel and efficient algorithm, termed the "indel peeling algorithm", which includes indels as phylogenetically informative evolutionary events and resembles Felsenstein's peeling algorithm for substitutions on a phylogenetic tree. For a fixed alignment, our extension analytically integrates out both substitution and indel events within a proper statistical model, without the need for data augmentation at internal tree nodes, allowing for efficient sampling of tree topologies and edge lengths. To additionally sample multiple alignments, we here introduce an efficient partial Metropolized independence sampler for alignments, and combine these two algorithms into a fully Bayesian co-estimation procedure for the alignment and phylogeny problem. Our approach results in estimates for the posterior distribution of evolutionary rate parameters, for the maximum a posteriori (MAP) phylogenetic tree, and for the posterior decoding alignment. Estimates for the evolutionary tree and multiple alignment are augmented with confidence estimates for each node height and alignment column. Our results indicate that the patterns in reliability broadly correspond to structural features of the proteins, and thus provide biologically meaningful information that is absent from the usual point estimate of the alignment. Our methods can handle input data of moderate size (10-20 protein sequences, each 100-200 bp), which we analyzed overnight on a standard 2 GHz personal computer.

CONCLUSION: Joint analysis of multiple sequence alignment, evolutionary trees and additional evolutionary parameters can now be done within a single coherent statistical framework.
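
As background for the Results, the classical Felsenstein peeling recursion for substitutions, which the indel peeling algorithm is said to resemble, can be sketched as follows. This is only a minimal illustration and not the authors' indel peeling algorithm: it handles a single alignment column with no gaps, and it assumes a Jukes-Cantor substitution model with uniform base frequencies, which the abstract does not specify. The function names (jc_prob, partial, site_likelihood) and the nested-list tree encoding are hypothetical choices made for this sketch.

```python
import math

STATES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor probability of state a becoming state b over a branch
    of length t (expected substitutions per site). Assumed model."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def partial(node):
    """Return {state: P(leaves observed below node | node is in state)}.

    Hypothetical tree encoding: a leaf is a one-letter string such as "A";
    an internal node is a list of (child, branch_length) pairs.
    """
    if isinstance(node, str):
        # Leaf: the observed nucleotide has probability 1, all others 0.
        return {s: 1.0 if s == node else 0.0 for s in STATES}
    result = {s: 1.0 for s in STATES}
    for child, t in node:
        child_partial = partial(child)
        for s in STATES:
            # Sum over the unobserved state x at the lower end of the branch.
            result[s] *= sum(jc_prob(s, x, t) * child_partial[x] for x in STATES)
    return result

def site_likelihood(root):
    """Likelihood of one alignment column, using uniform root frequencies."""
    return sum(0.25 * p for p in partial(root).values())

# Example: a three-leaf tree ((A:0.1, C:0.2):0.05, G:0.3).
tree = [([("A", 0.1), ("C", 0.2)], 0.05), ("G", 0.3)]
print(site_likelihood(tree))
```

The recursion visits each node once and sums over the states of its children, so internal-node states never need to be sampled explicitly; the abstract describes the analogous property for indels as avoiding data augmentation at internal tree nodes.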
