We have located links that may give you full text access.
A safe and complete algorithm for metagenomic assembly.
Background: Reconstructing the genome of a species from short fragments is one of the oldest bioinformatics problems. Metagenomic assembly is a variant of the problem asking to reconstruct the circular genomes of all bacterial species present in a sequencing sample. This problem can be naturally formulated as finding a collection of circular walks of a directed graph G that together cover all nodes, or edges, of G .
Approach: We address this problem with the "safe and complete" framework of Tomescu and Medvedev (Research in computational Molecular biology-20th annual conference, RECOMB 9649:152-163, 2016). An algorithm is called safe if it returns only those walks (also called safe ) that appear as subwalk in all metagenomic assembly solutions for G . A safe algorithm is called complete if it returns all safe walks of G .
Results: We give graph-theoretic characterizations of the safe walks of G , and a safe and complete algorithm finding all safe walks of G . In the node-covering case, our algorithm runs in time [Formula: see text], and in the edge-covering case it runs in time [Formula: see text]; n and m denote the number of nodes and edges, respectively, of G . This algorithm constitutes the first theoretical tight upper bound on what can be safely assembled from metagenomic reads using this problem formulation.
Approach: We address this problem with the "safe and complete" framework of Tomescu and Medvedev (Research in computational Molecular biology-20th annual conference, RECOMB 9649:152-163, 2016). An algorithm is called safe if it returns only those walks (also called safe ) that appear as subwalk in all metagenomic assembly solutions for G . A safe algorithm is called complete if it returns all safe walks of G .
Results: We give graph-theoretic characterizations of the safe walks of G , and a safe and complete algorithm finding all safe walks of G . In the node-covering case, our algorithm runs in time [Formula: see text], and in the edge-covering case it runs in time [Formula: see text]; n and m denote the number of nodes and edges, respectively, of G . This algorithm constitutes the first theoretical tight upper bound on what can be safely assembled from metagenomic reads using this problem formulation.
Full text links
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app
All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.
By using this service, you agree to our terms of use and privacy policy.
Your Privacy Choices
You can now claim free CME credits for this literature searchClaim now
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app