Heng Li, Xiaowen Feng, Chong Chu
The recent advances in sequencing technologies enable the assembly of individual genomes to the quality of the reference genome. How to integrate multiple genomes from the same species and make the integrated representation accessible to biologists remains an open challenge. Here, we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implement our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome...
October 16, 2020: Genome Biology
Zhemin Zhou, Jane Charlesworth, Mark Achtman
Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications, and horizontal gene transfer. To reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus...
October 14, 2020: Genome Research
Roshan Kumar, Karen Register, Jane Christopher-Hennings, Paolo Moroni, Gloria Gioia, Nuria Garcia-Fernandez, Julia Nelson, Murray D Jelinski, Inna Lysnyansky, Darrell Bayles, David Alt, Joy Scaria
Among more than twenty species belonging to the class Mollicutes, Mycoplasma bovis is the most common cause of bovine mycoplasmosis in North America and Europe. Bovine mycoplasmosis causes significant economic loss in the cattle industry. The number of M. bovis-positive herds has recently increased in North America and Europe. Since antibiotic treatment is ineffective and no efficient vaccine is available, M. bovis-induced mycoplasmosis is primarily controlled by herd management measures such as the restriction of moving infected animals out of the herds and culling of infected or shedders of M...
October 10, 2020: Microorganisms
Zara M Summers, Hassiba Belahbib, Nathalie Pradel, Manon Bartoli, Pooja Mishra, Christian Tamburini, Alain Dolla, Bernard Ollivier, Fabrice Armougom
Hot oil reservoirs harbor diverse microbial communities, with many of them inhabiting thermophilic or hyperthermophilic fermentative Thermotogae species. A new Thermotoga sp. strain TFO was isolated from a Californian offshore oil reservoir which is phylogenetically related to thermophilic species T. petrophila RKU-1T and T. naphthophila RKU-10T, isolated from the Kubiki oil reservoir in Japan. The average nucleotide identity and DNA-DNA hybridization measures provide evidence that the novel strain TFO is closely related to T...
September 1, 2020: Systematic and Applied Microbiology
Sarah E Jensen, Jean Rigaud Charles, Kebede Muleta, Peter J Bradbury, Terry Casstevens, Santosh P Deshpande, Michael A Gore, Rajeev Gupta, Daniel C Ilut, Lynn Johnson, Roberto Lozano, Zachary Miller, Punna Ramu, Abhishek Rathore, M Cinta Romay, Hari D Upadhyaya, Rajeev K Varshney, Geoffrey P Morris, Gael Pressoir, Edward S Buckler, Guillaume P Ramstein
Successful management and utilization of increasingly large genomic datasets is essential for breeding programs to accelerate cultivar development. To help with this, we developed a Sorghum bicolor Practical Haplotype Graph (PHG) pangenome database that stores haplotypes and variant information. We developed two PHGs in sorghum that were used to identify genome-wide variants for 24 founders of the Chibas sorghum breeding program from 0.01x sequence coverage. The PHG called single nucleotide polymorphisms (SNPs) with 5...
March 2020: Plant Genome
Yi Yang, Yaozhi Zhang, Natalie L Cápiro, Jun Yan
Dehalococcoidia (Dia) class microorganisms are frequently found in various pristine and contaminated environments. Metagenome-assembled genomes (MAGs) and single-cell amplified genomes (SAGs) studies have substantially improved the understanding of Dia microbial ecology and evolution; however, an updated thorough investigation on the genomic and evolutionary characteristics of Dia microorganisms distributed in geographically distinct environments has not been implemented. In this study, we analyzed available genomic data to unravel Dia evolutionary and metabolic traits...
2020: Frontiers in Microbiology
Maulik Patel, Hiral M Patel, Nasim Vohra, Sanjay Dave
We report the complete genome sequencing of novel Pseudomonas stutzeri strain MP4687 isolated from cattle rumen. Various strains of P. stutzeri have been reported from different environmental samples including oil-contaminated sites, crop roots, air, and human clinical samples, but not from rumen samples, which is being reported here for the first time. The genome of P. stutzeri MP4687 has a single replicon, a 4.75 Mb chromosome, and a G + C content of 63.45%. The genome encodes 4,790 protein-coding genes including 164 CAZymes and 345 carbohydrate processing genes...
December 2020: Biotechnology Reports
Dipesh Kumar Verma, Gunjan Vasudeva, Chandni Sidhu, Anil K Pinnaka, Senthil E Prasad, Krishan Gopal Thakur
Haloarchaea are salt-loving archaea and a potential source of industrially relevant halotolerant enzymes. In the present study, three reddish-pink, extremely halophilic archaeal strains, namely wsp1 (wsp-water sample Pondicherry), wsp3, and wsp4, were isolated from the Indian Solar saltern. The phylogenetic analysis based on 16S rRNA gene sequences suggests that both wsp3 and wsp4 strains belong to Halogeometricum borinquense while wsp1 is closely related to Haloferax volcanii species. The comparative genomics revealed an open pangenome for both genera investigated here...
2020: Frontiers in Microbiology
Yaovi M G Hounmanou, Anders Dalsgaard, Tirzania Frannetta Sopacua, Gazi Md Noor Uddin, Pimlapas Leekitcharoenphon, Rene S Hendriksen, John E Olsen, Marianne Halberg Larsen
Salmonella Weltevreden is increasingly reported from aquatic environments, seafood, and patients in several Southeast Asian countries. Using genome-wide analysis, we characterized S. Weltevreden isolated from cultured shrimp and tilapia from Vietnam and China to study their genetic characteristics and relatedness to clinical isolates of S. Weltevreden ST-365. The phylogenetic analysis revealed up to 312 single-nucleotide polymorphism (SNP) differences between tilapia isolates, whereas isolates from shrimp were genetically more closely related...
2020: Frontiers in Microbiology
Yongming Chen, Wanjun Song, Xiaoming Xie, Zihao Wang, Panfeng Guan, Huiru Peng, Yuannian Jiao, Zhongfu Ni, Qixin Sun, Weilong Guo
Plant genome sequencing has dramatically increased, and some species even have multiple high-quality reference versions. Demands for clade-specific homology inference and analysis have increased in the pangenomic era. We proposed a novel method, GeneTribe, for homology inference among genetically similar genomes that incorporates gene collinearity and shows better performance than traditional sequence-similarity-based methods in terms of accuracy and scalability. The Triticeae tribe is a typical allopolyploid-rich clade with complex species relationships that includes many important crops such as wheat, barley, and rye...
September 23, 2020: Molecular Plant
Yuqing Feng, Xuezheng Fan, Liangquan Zhu, Xinyue Yang, Yan Liu, Shiguang Gao, Xiaolu Jin, Dan Liu, Jiabo Ding, Yuming Guo, Yongfei Hu
Clostridium perfringens is associated with a variety of diseases in both humans and animals. Recent advances in genomic sequencing make it timely to re-visit this important pathogen. Although the genome sequence of C. perfringens was first determined in 2002, large-scale comparative genomics with isolates of different origins is still lacking. In this study, we used whole-genome sequencing of 45 C. perfringens isolates with isolation time spanning an 80-year period and performed comparative analysis of 173 genomes from worldwide strains...
September 25, 2020: Microbial Genomics
Mikko Rautiainen, Tobias Marschall
Genome graphs can represent genetic variation and sequence uncertainty. Aligning sequences to genome graphs is key to many applications, including error correction, genome assembly, and genotyping of variants in a pangenome graph. Yet, so far, this step is often prohibitively slow. We present GraphAligner, a tool for aligning long reads to genome graphs. Compared to the state-of-the-art tools, GraphAligner is 13x faster and uses 3x less memory. When employing GraphAligner for error correction, we find it to be more than twice as accurate and over 12x faster than extant tools...
September 24, 2020: Genome Biology
Miquel Sánchez-Osuna, Pilar Cortés, Montserrat Llagostera, Jordi Barbé, Ivan Erill
Trimethoprim is a synthetic antibacterial agent that targets folate biosynthesis by competitively binding to the di-hydrofolate reductase enzyme (DHFR). Trimethoprim is often administered synergistically with sulfonamide, another chemotherapeutic agent targeting the di-hydropteroate synthase (DHPS) enzyme in the same pathway. Clinical resistance to both drugs is widespread and mediated by enzyme variants capable of performing their biological function without binding to these drugs. These mutant enzymes were assumed to have arisen after the discovery of these synthetic drugs, but recent work has shown that genes conferring resistance to sulfonamide were present in the bacterial pangenome millions of years ago...
September 24, 2020: Microbial Genomics
Daniel Yero, Oscar Conchillo-Solé, Xavier Daura
There is still a lack of vaccines for many bacterial infections for which the best treatment option would be a prophylactic one. On the other hand, effectiveness has been questioned for some existing vaccines, prompting new developments. Therapeutic vaccines are also becoming a treatment option in specific cases where antibiotics tend to fail. In this scenario, refinement and extension of the classical reverse vaccinology approach is allowing scientists to find new and more effective antigens. In this chapter, we describe an in silico methodology that integrates pangenomic, immunoinformatic, structural, and evolutionary approaches for the screening of potential antigens in a given bacterial species...
2021: Methods in Molecular Biology
Jean Pierre González-Gómez, Sonia Soto-Rodriguez, Osvaldo López-Cuevas, Nohelia Castro-Del Campo, Cristóbal Chaidez, Bruno Gomez-Gil
Acute hepatopancreatic necrosis disease (AHPND) is a severe disease affecting recently stocked cultured shrimps. The disease is mainly caused by V. parahaemolyticus that harbors the pVA1 plasmid; this plasmid contains the pirA and pirB genes, which encode a delta-endotoxin. AHPND originated in China in 2009 and has since spread to several other Asian countries and recently to Latin America (2013). Many Asian strains have been sequenced, and their sequences are publicly accessible in scientific databases, but only four strains from Latin America have been reported...
September 21, 2020: Current Microbiology
Fotis E Psomopoulos, Jacques van Helden, Claudine Médigue, Anastasia Chasapi, Christos A Ouzounis
As genome sequencing efforts are unveiling the genetic diversity of the biosphere with an unprecedented speed, there is a need to accurately describe the structural and functional properties of groups of extant species whose genomes have been sequenced, as well as their inferred ancestors, at any given taxonomic level of their phylogeny. Elaborate approaches for the reconstruction of ancestral states at the sequence level have been developed, subsequently augmented by methods based on gene content. While these approaches of sequence or gene-content reconstruction have been successfully deployed, there has been less progress on the explicit inference of functional properties of ancestral genomes, in terms of metabolic pathways and other cellular processes...
September 14, 2020: Microbial Genomics
Ryan W Christian, Seanna L Hewitt, Grant Nelson, Eric H Roalson, Amit Dhingra
Subcellular relocalization of proteins determines an organism's metabolic repertoire and thereby its survival in unique evolutionary niches. In plants, the plastid and its various morphotypes import a large and varied number of nuclear-encoded proteins to orchestrate vital biochemical reactions in a spatiotemporal context. Recent comparative genomics analysis and high-throughput shotgun proteomics data indicate that there are a large number of plastid-targeted proteins that are either semi-conserved or non-conserved across different lineages...
2020: PeerJ
Inge M Ambros, Gian-Paolo Tonini, Ulrike Pötschger, Nicole Gross, Véronique Mosseri, Klaus Beiske, Ana P Berbegall, Jean Bénard, Nick Bown, Huib Caron, Valérie Combaret, Jerome Couturier, Raffaella Defferrari, Olivier Delattre, Marta Jeison, Per Kogner, John Lunec, Barbara Marques, Tommy Martinsson, Katia Mazzocco, Rosa Noguera, Gudrun Schleiermacher, Alexander Valent, Nadine Van Roy, Eva Villamon, Dasa Janousek, Ingrid Pribill, Evgenia Glogova, Edward F Attiyeh, Michael D Hogarty, Tom F Monclair, Keith Holmes, Dominique Valteau-Couanet, Victoria Castel, Deborah A Tweddle, Julie R Park, Sue Cohn, Ruth Ladenstein, Maja Beck-Popovic, Bruno De Bernardi, Jean Michon, Andrew D J Pearson, Peter F Ambros
PURPOSE: For localized, resectable neuroblastoma without MYCN amplification, surgery only is recommended even if incomplete. However, it is not known whether the genomic background of these tumors may influence outcome. PATIENTS AND METHODS: Diagnostic samples were obtained from 317 tumors, International Neuroblastoma Staging System stages 1/2A/2B, from 3 cohorts: Localized Neuroblastoma European Study Group I/II and Children's Oncology Group. Genomic data were analyzed using multi- and pangenomic techniques and fluorescence in-situ hybridization in 2 age groups (cutoff age, 18 months) and were quality controlled by the International Society of Pediatric Oncology European Neuroblastoma (SIOPEN) Biology Group...
September 9, 2020: Journal of Clinical Oncology
Vincenzo Bonnici, Emiliano Maresi, Rosalba Giugno
Given a group of genomes, represented as the sets of genes that belong to them, the discovery of the pangenomic content is based on the search of genetic homology among the genes for clustering them into families. Thus, pangenomic analyses investigate the membership of the families to the given genomes. This approach is referred to as the gene-oriented approach in contrast to other definitions of the problem that take into account different genomic features. In the past years, several tools have been developed to discover and analyse pangenomic contents...
September 7, 2020: Briefings in Bioinformatics
Intikhab Alam, Allan A Kamau, Maxat Kulmanov, Łukasz Jaremko, Stefan T Arold, Arnab Pain, Takashi Gojobori, Carlos M Duarte
The spread of the novel coronavirus (SARS-CoV-2) has triggered a global emergency that demands urgent solutions for detection and therapy to prevent escalating health, social, and economic impacts. The spike protein (S) of this virus enables binding to the human receptor ACE2, and hence presents a prime target for vaccines preventing viral entry into host cells. The S proteins from SARS and SARS-CoV-2 are similar, but structural differences in the receptor binding domain (RBD) preclude the use of SARS-specific neutralizing antibodies to inhibit SARS-CoV-2...
2020: Frontiers in Cellular and Infection Microbiology
