Read by QxMD

P M VanRaden, D M Bickhart, J R O'Connell
Whole-genome sequencing studies can identify causative mutations for subsequent use in genomic evaluations. Speed and accuracy of sequence alignment can be improved by accounting for known variant locations during alignment instead of calling the variants after alignment as in previous programs. The new programs Findmap and Findvar were compared with alignment using the Burrows-Wheeler Aligner (BWA) or SNAP and variant identification using the Genome Analysis Toolkit (GATK) or SAMtools. Findmap stores the reference map and any known variant locations while aligning reads and counting reference and alternate alleles for each DNA source...
February 13, 2019: Journal of Dairy Science
Charlotte Herzeel, Pascal Costanza, Dries Decap, Jan Fostier, Wilfried Verachtert
We present elPrep 4, a reimplementation from scratch of the elPrep framework for processing sequence alignment map files in the Go programming language. elPrep 4 includes multiple new features allowing us to process all of the preparation steps defined by the GATK Best Practice pipelines for variant calling. This includes new and improved functionality for sorting, (optical) duplicate marking, base quality score recalibration, BED and VCF parsing, and various filtering options. The implementations of these options in elPrep 4 faithfully reproduce the outcomes of their counterparts in GATK 4, SAMtools, and Picard, even though the underlying algorithms are redesigned to take advantage of elPrep's parallel execution framework to vastly improve the runtime and resource use compared to these tools...
2019: PloS One
Wanfei Liu, Shuangyang Wu, Qiang Lin, Shenghan Gao, Feng Ding, Xiaowei Zhang, Hasan Awad Aljohi, Jun Yu, Songnian Hu
The rapid development of high-throughput sequencing technologies has led to a dramatic decrease in the money and time required for de novo genome sequencing or genome resequencing projects, with new genome sequences released every week. Among such projects, the plethora of updated genome assemblies creates a need for version-dependent annotation files and other compatible public datasets for downstream analysis. To handle these tasks in an efficient manner, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update...
December 21, 2018: Genomics, Proteomics & Bioinformatics
Omika Thakur, Gursharn Singh Randhawa
BACKGROUND: Guar [Cyamopsis tetragonoloba, L. Taub.] is an important industrial crop because of the commercial applications of the galactomannan gum contained in its seeds. Plant breeding programmes based on marker-assisted selection require a rich resource of molecular markers. As limited numbers of such markers are available for guar, molecular breeding programmes have not been undertaken for the genetic improvement of this important crop. Hence, the present work was done to enrich the molecular markers resource of guar by identifying high quality SSR, SNP and InDel markers from the RNA-Seq data of the roots of two guar varieties...
December 20, 2018: BMC Genomics
Hitesh Tikariha, Hemant J Purohit
A metagenome from a refinery wastewater treatment plant running under nitrogen stress was analyzed to mine novel aromatic hydrocarbon-degrading bacteria. The sequence data were assembled using metaSPAdes, followed by binning with the MetaBAT tool to assemble genomes; coverage and depth were calculated using Bowtie and SAMtools. The analysis picked a novel genome belonging to the family Bradyrhizobiaceae, identified based on the 16S rDNA gene and supported by CheckM and Kraken analysis. Using RAST, the assembled genome showed the capabilities for nitrogen fixation with the utilization of multiple hydrocarbon substrates with 14 different types of oxygenases as mapped by MinPath...
December 12, 2018: Genomics
Mohamed Salem, Rafet Al-Tobasei, Ali Ali, Daniela Lourenco, Guangtu Gao, Yniv Palti, Brett Kenney, Timothy D Leeds
Detection of coding/functional SNPs that change the biological function of a gene may lead to identification of putative causative alleles within QTL regions and discovery of genetic markers with large effects on phenotypes. This study has two objectives, the first being to develop and validate a 50K transcribed-gene SNP chip using RNA-Seq data. To achieve this objective, two bioinformatics pipelines, GATK and SAMtools, were used to identify ~21K transcribed SNPs with allelic imbalances associated with important aquaculture production traits including body weight, muscle yield, muscle fat content, shear force, and whiteness in addition to resistance/susceptibility to bacterial cold-water disease (BCWD)...
2018: Frontiers in Genetics
Jerome Kelleher, Mike Lin, C H Albach, Ewan Birney, Robert Davies, Marina Gourtovaia, David Glazer, Cristina Y Gonzalez, David K Jackson, Aaron Kemp, John Marshall, Andrew Nowak, Alexander Senf, Jaime M Tovar-Corona, Alexander Vikhorev, Thomas M Keane
Summary: Standardised interfaces for efficiently accessing high-throughput sequencing data are a fundamental requirement for large-scale genomic data sharing. We have developed htsget, a protocol for secure, efficient and reliable access to sequencing read and variation data. We demonstrate four independent client and server implementations, and the results of a comprehensive interoperability demonstration. Availability and implementation: http://samtools.github...
June 19, 2018: Bioinformatics
Guangtu Gao, Torfinn Nome, Devon E Pearse, Thomas Moen, Kerry A Naish, Gary H Thorgaard, Sigbjørn Lien, Yniv Palti
Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery...
2018: Frontiers in Genetics
Zhentang Li, Yi Wang, Fei Wang
BACKGROUND: The rapid development of next-generation sequencing (NGS) technology has continuously increased the throughput of sequencing data. However, due to the lack of a tool that is both fast and accurate, the analysis of NGS data, especially low-coverage data, remains challenging. RESULTS: We proposed a decision-tree based variant calling algorithm. Experiments on a set of real data indicate that our algorithm achieves high accuracy and sensitivity for SNVs and indels and shows good adaptability on low-coverage data...
April 19, 2018: BMC Bioinformatics
Darrell O Ricke, Anna Shcherbina, Adam Michaleas, Philip Fremont-Smith
High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional DNA forensic capabilities not attainable using traditional STR panels. However, the inclusion of sets of loci selected for mixture analysis, extended kinship, phenotype, biogeographic ancestry prediction, etc., can result in large panel sizes that are difficult to analyze in a rapid fashion. GrigoraSNP was developed to address the allele-calling bottleneck that was encountered when analyzing SNP panels with more than 5000 loci using HTS...
November 2018: Journal of Forensic Sciences
Samira Asgharzade, Mohammad Amin Tabatabaiefar, Javad Mohammadi-Asl, Morteza Hashemzadeh Chaleshtori
BACKGROUND: Recent studies have confirmed the utility of targeted next-generation sequencing (NGS), providing a remarkable opportunity to find variants in known disease genes, especially in genetically heterogeneous disorders such as hearing loss (HL). METHODS: After excluding mutations in the most common autosomal recessive non-syndromic HL (ARNSHL) genes via Sanger sequencing and genetic linkage analysis, we performed NGS in the proband of an Iranian family with ARNSHL...
May 2018: International Journal of Pediatric Otorhinolaryngology
Nam S Vo, Vinhthuy Phan
Motivation: The detection of genomic variants has great significance in genomics, bioinformatics, biomedical research and its applications. However, despite considerable effort, indels and structural variants are still under-characterized compared to SNPs. Current approaches based on next-generation sequencing data usually require large numbers of reads (high coverage) to be able to detect such types of variants accurately. Even so, indels, especially those close to each other, are still hard to detect accurately...
September 1, 2018: Bioinformatics
Taemook Kim, Hogyu David Seo, Lothar Hennighausen, Daeyoup Lee, Keunsoo Kang
Octopus-toolkit is a stand-alone application for retrieving and processing large sets of next-generation sequencing (NGS) data in a single step. Octopus-toolkit is an automated set-up-and-analysis pipeline utilizing the Aspera, SRA Toolkit, FastQC, Trimmomatic, HISAT2, STAR, Samtools, and HOMER applications. All the applications are installed on the user's computer when the program starts. After installation, it can automatically retrieve original files of various epigenomic and transcriptomic data sets, including ChIP-seq, ATAC-seq, DNase-seq, MeDIP-seq, MNase-seq and RNA-seq, from the Gene Expression Omnibus data repository...
May 18, 2018: Nucleic Acids Research
K Yu Tsukanov, A Yu Krasnenko, D A Plakhina, D O Korostin, A V Churov, O S Druzhilovskaya, D V Rebrikov, V V Ilinsky
We aimed to develop a pipeline for the bioinformatic analysis and interpretation of NGS data and detection of a wide range of single-nucleotide somatic mutations within tumor DNA. Initially, the NGS reads were submitted to a quality control check by the Cutadapt program. Low-quality 3′-nucleotides were removed. After that the reads were mapped to the reference genome hg19 (GRCh37.p13) by BWA. The SAMtools program was used for exclusion of duplicates. MuTect was used for SNV calling. The functional effect of SNVs was evaluated using an algorithm that included annotation and evaluation of SNV pathogenicity by SnpEff and analysis of databases such as COSMIC, dbNSFP, ClinVar, and OMIM...
October 2017: Biomedit︠s︡inskai︠a︡ Khimii︠a︡
Jie Qiu, Wenwei Zhang, Qingsheng Xia, Fuxue Liu, Shuwei Zhao, Kailing Zhang, Min Chen, Chuanshan Zang, Ruifeng Ge, Dapeng Liang, Yan Sun
As the predominant thyroid cancer, papillary thyroid cancer (PTC) accounts for 75‑85% of thyroid cancer cases. This research aimed to investigate transcriptomic changes and key genes in PTC. Using RNA‑sequencing technology, the transcriptional profiles of 5 thyroid tumor tissues and 5 adjacent normal tissues were obtained. The single nucleotide polymorphisms (SNPs) were identified by SAMtools software and then annotated by ANNOVAR software. After differentially expressed genes (DEGs) were selected by edgeR software, they were further investigated by enrichment analysis, protein domain analysis, and protein‑protein interaction (PPI) network analysis...
November 2017: Molecular Medicine Reports
Boyan Zhou, Shaoqing Wen, Lingxiang Wang, Li Jin, Hui Li, Hong Zhang
Ancient DNA obtained from ancient samples, such as sediments, bones, and teeth, is an important genetic resource that can be used to reconstruct the evolutionary history of humans, animals, and plants. The application of high-throughput sequencing enables research on ancient DNA to be conducted on a whole-genome scale. However, post-mortem DNA damage mainly caused by deamination of cytosine to uracil (or methylated cytosine to thymine) may confound the variant calling and downstream analysis. In this article, we develop a Python program to implement a new variant caller, "AntCaller", which extracts the information on nucleotide substitutions from sequencing data and calculates the probability of each genotype based on a Bayesian rule...
December 2017: Molecular Genetics and Genomics: MGG
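The abstract above mentions computing genotype probabilities with a Bayesian rule. As a minimal sketch of that general idea, and not AntCaller's actual model (which is not described in the truncated abstract and which additionally accounts for post-mortem damage), the following computes posteriors for a diploid site from reference and alternate read counts. The function name `genotype_posteriors`, the single per-read `error_rate`, and the flat default prior are all assumptions made for illustration.

```python
import math


def genotype_posteriors(ref_count, alt_count, error_rate=0.01, priors=None):
    """Posterior probability of genotypes RR, RA, AA given read counts.

    Hypothetical illustration: each read is treated as independent, with a
    single per-read error rate. No deamination damage is modelled.
    """
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    genotypes = ("RR", "RA", "AA")
    if priors is None:
        # Assumed flat prior over the three genotypes.
        priors = {g: 1.0 / len(genotypes) for g in genotypes}

    # Probability that one read shows the alternate allele under each genotype.
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1.0 - error_rate}

    # Work in log space to avoid underflow at high coverage.
    log_post = {
        g: math.log(priors[g])
        + alt_count * math.log(p_alt[g])
        + ref_count * math.log(1.0 - p_alt[g])
        for g in genotypes
    }

    # Normalise so the posteriors sum to one (Bayes' rule denominator).
    top = max(log_post.values())
    unnorm = {g: math.exp(lp - top) for g, lp in log_post.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}
```

For example, `genotype_posteriors(5, 5)` assigns most of the probability to the heterozygous genotype `RA`, while `genotype_posteriors(10, 0)` favours `RR`. A damage-aware caller would replace the fixed `error_rate` with substitution-specific rates.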
Bo-Young Kim, Jung Hoon Park, Hye-Yeong Jo, Soo Kyung Koo, Mi-Hyun Park
Insertion and deletion (INDEL) mutations, the most common type of structural variant, are associated with several human diseases. The detection of INDELs through next-generation sequencing (NGS) is becoming more common due to the decrease in costs, the increase in efficiency, and sensitivity improvements demonstrated by the various sequencing platforms and analytical tools. However, there are still many errors associated with INDEL variant calling, and distinguishing INDELs from errors in NGS remains challenging...
2017: PloS One
Rafet Al-Tobasei, Ali Ali, Timothy D Leeds, Sixin Liu, Yniv Palti, Brett Kenney, Mohamed Salem
BACKGROUND: Coding/functional SNPs change the biological function of a gene and, therefore, could serve as "large-effect" genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait)...
August 7, 2017: BMC Genomics
Peizhou Liao, Glen A Satten, Yi-Juan Hu
A fundamental challenge in analyzing next-generation sequencing (NGS) data is to determine an individual's genotype accurately, as the accuracy of the inferred genotype is essential to downstream analyses. Correctly estimating the base-calling error rate is critical to accurate genotype calls. Phred scores that accompany each call can be used to decide which calls are reliable. Some genotype callers, such as GATK and SAMtools, directly calculate the base-calling error rates from phred scores or recalibrated base quality scores...
July 2017: Genetic Epidemiology
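The abstract above refers to deriving base-calling error rates from Phred scores. As a minimal sketch of the standard Phred relationship, P = 10^(-Q/10), and not the method proposed in that paper or the exact computation inside GATK or SAMtools, the conversion can be written as follows. The function names and the Phred+33 ASCII offset used to decode a quality string are assumptions for illustration.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality_string(qual, offset=33):
    """Decode an ASCII quality string into Phred scores.

    Assumes Phred+33 encoding by default; pass a different offset
    for data encoded another way.
    """
    return [ord(ch) - offset for ch in qual]


def mean_error_prob(qual, offset=33):
    """Average per-base error probability across a quality string."""
    scores = decode_quality_string(qual, offset)
    if not scores:
        raise ValueError("quality string is empty")
    return sum(phred_to_error_prob(q) for q in scores) / len(scores)
```

A score of 20 corresponds to an error probability of 0.01, and a score of 30 to 0.001. Note that the mean is taken over probabilities, not over the scores themselves, because the Phred scale is logarithmic.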
Arpita Konar, Olivia Choudhury, Rebecca Bullis, Lauren Fiedler, Jacqueline M Kruser, Melissa T Stephens, Oliver Gailing, Scott Schlarbaum, Mark V Coggeshall, Margaret E Staton, John E Carlson, Scott Emrich, Jeanne Romero-Severson
BACKGROUND: Restriction site associated DNA sequencing (RADseq) has the potential to be a broadly applicable, low-cost approach for high-quality genetic linkage mapping in forest trees lacking a reference genome. The statistical inference of linear order must be as accurate as possible for the correct ordering of sequence scaffolds and contigs to chromosomal locations. Accurate maps also facilitate the discovery of chromosome segments containing allelic variants conferring resistance to the biotic and abiotic stresses that threaten forest trees worldwide...
May 30, 2017: BMC Genomics