Read by QxMD

BioData Mining

Gregory Ditzler, Nidhal Bouaynaya, Roman Shterenberg, Hassan M Fathallah-Shaykh
Background: Most existing algorithms for modeling and analyzing molecular networks assume a static or time-invariant network topology. Such a view, however, does not render the temporal evolution of the underlying biological process, as molecular networks are typically "re-wired" over time in response to cellular development and environmental changes. In our previous work, we formulated the inference of time-varying or dynamic networks as a tracking problem, where the target state is the ensemble of edges in the network...
2019: BioData Mining
Youngja H Park, Taewoon Kong, James R Roede, Dean P Jones, Kichun Lee
Background: Analytic methods are available to acquire extensive metabolic information in a cost-effective manner for personalized medicine, yet disease risk and diagnosis mostly rely upon individual biomarkers based on statistical principles of false discovery rate and correlation. Due to functional redundancies and multiple layers of regulation in complex biologic systems, individual biomarkers, while useful, are inherently limited in disease characterization. Data reduction and discriminant analysis tools such as principal component analysis (PCA), partial least squares (PLS), or orthogonal PLS (O-PLS) provide approaches to separate the metabolic phenotypes, but do not offer a statistical basis for selection of group-wise metabolites as contributors to metabolic phenotypes...
2019: BioData Mining
Randall J Ellis, Zichen Wang, Nicholas Genes, Avi Ma'ayan
Background: The opioid epidemic in the United States is averaging over 100 deaths per day due to overdose. The effectiveness of opioids as pain treatments, and the drug-seeking behavior of opioid addicts, leads physicians in the United States to issue over 200 million opioid prescriptions every year. To better understand the biomedical profile of opioid-dependent patients, we analyzed information from electronic health records (EHR) including lab tests, vital signs, medical procedures, prescriptions, and other data from millions of patients to predict opioid substance dependence...
2019: BioData Mining
Fleur Jeanquartier, Claire Jean-Quartier, Andreas Holzinger
Background: A plethora of Web resources are available offering information on clinical, pre-clinical, genomic and theoretical aspects of cancer, including not only the comprehensive cancer projects such as ICGC and TCGA, but also less-known and more specialized projects on pediatric diseases such as PCGP. However, in the case of data on childhood cancer there is very little information openly available. Several web-based resources and tools offer general biomedical data that are purpose-built for neither pediatric nor cancer analysis...
2019: BioData Mining
Sebastian Bittrich, Marika Kaden, Christoph Leberecht, Florian Kaiser, Thomas Villmann, Dirk Labudde
Background: Machine learning strategies are prominent tools for data analysis. Especially in life sciences, they have become increasingly important to handle the growing datasets collected by the scientific community. Meanwhile, algorithms improve in performance, but also gain complexity, and tend to neglect interpretability and comprehensiveness of the resulting models. Results: Generalized Matrix Learning Vector Quantization (GMLVQ) is a supervised, prototype-based machine learning method and provides comprehensive visualization capabilities not present in other classifiers which allow for a fine-grained interpretation of the data...
2019: BioData Mining
Seungyeoun Lee, Donghee Son, Yongkang Kim, Wenbao Yu, Taesung Park
Background: One strategy for addressing missing heritability in genome-wide association study is gene-gene interaction analysis, which, unlike a single gene approach, involves high-dimensionality. The multifactor dimensionality reduction method (MDR) has been widely applied to reduce multi-levels of genotypes into high or low risk groups. The Cox-MDR method has been proposed to detect gene-gene interactions associated with the survival phenotype by using the martingale residuals from a Cox model...
2018: BioData Mining
David Bouzaglo, Israel Chasida, Elishai Ezra Tsur
The integration of cloud resources with federated data retrieval has the potential of improving the maintenance, accessibility and performance of specialized databases in the biomedical field. However, such an integrative approach requires technical expertise in cloud computing, usage of a data retrieval engine and development of a unified data-model, which can encapsulate the heterogeneity of biological data. Here, a framework for the development of cloud-based biological specialized databases is proposed...
2018: BioData Mining
Daria Efimova, Alexander Tyakht, Anna Popenko, Anatoly Vasilyev, Ilya Altukhov, Nikita Dovidchenko, Vera Odintsova, Natalya Klimenko, Robert Loshkarev, Maria Pashkova, Anna Elizarova, Viktoriya Voroshilova, Sergei Slavskii, Yury Pekov, Ekaterina Filippova, Tatiana Shashkova, Evgenii Levin, Dmitry Alexeev
Background: Metagenomic surveys of human microbiota are becoming increasingly widespread in academic research as well as in food and pharmaceutical industries and clinical context. Intuitive tools for investigating experimental data are of high interest to researchers. Results: Knomics-Biota is a web-based resource for exploratory analysis of human gut metagenomes. Users can generate and share analytical reports corresponding to common experimental schemes (like case-control study or paired comparison)...
2018: BioData Mining
Weifu Li, Jing Liu, Chi Xiao, Hao Deng, Qiwei Xie, Hua Han
Background: It is becoming increasingly clear that the quantification of mitochondria and synapses is of great significance to understand the function of biological nervous systems. Electron microscopy (EM), with the necessary resolution in three directions, is the only available imaging method to look closely into these issues. Therefore, estimating the number of mitochondria and synapses from the serial EM images is coming into prominence. Since previous studies have achieved preferable 2D segmentation performance, it holds great promise to obtain the 3D connection relationship from the 2D segmentation results...
2018: BioData Mining
M Arabnejad, B A Dawkins, W S Bush, B C White, A R Harkness, B A McKinney
Background: ReliefF is a nearest-neighbor based feature selection algorithm that efficiently detects variants that are important due to statistical interactions or epistasis. For categorical predictors, like genotypes, the standard metric used in ReliefF has been a simple (binary) mismatch difference. In this study, we develop new metrics of varying complexity that incorporate allele sharing, adjustment for allele frequency heterogeneity via the genetic relationship matrix (GRM), and physicochemical differences of variants via a new transition/transversion encoding...
2018: BioData Mining
Eleonora Cappelli, Giovanni Felici, Emanuel Weitschek
Background: In the Next Generation Sequencing (NGS) era a large amount of biological data is being sequenced, analyzed, and stored in many public databases, whose interoperability is often required to allow an enhanced accessibility. The combination of heterogeneous NGS genomic data is an open challenge: the analysis of data from different experiments is a fundamental practice for the study of diseases. In this work, we propose to combine DNA methylation and RNA sequencing NGS experiments at gene level for supervised knowledge extraction in cancer...
2018: BioData Mining
Moshe Sipper, Ryan J Urbanowicz, Jason H Moore
No abstract text is available yet for this article.
2018: BioData Mining
Aida Mrzic, Pieter Meysman, Wout Bittremieux, Pieter Moris, Boris Cule, Bart Goethals, Kris Laukens
Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. The definition of which subgraphs are interesting and which are not is highly dependent on the application. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks...
2018: BioData Mining
Jing Zhang, Dan Wu, Yiqin Dai, Jianjiang Xu
Background: The genetic architecture underlying central cornea thickness (CCT) is far from understood. Most of the CCT-associated variants are located in the non-coding regions, raising the difficulty of following functional characterizations. Thus, integrative functional analyses on CCT-associated loci might benefit in overcoming these issues by prioritizing the hub genes that are located in the center of CCT genetic network. Methods: Integrative analyses including functional annotations, enrichment analysis, and protein-protein interaction analyses were performed on all reported CCT GWAS lead SNPs, together with their proxy variants...
2018: BioData Mining
Christina Brester, Jussi Kauhanen, Tomi-Pekka Tuomainen, Sari Voutilainen, Mauno Rönkkö, Kimmo Ronkainen, Eugene Semenkin, Mikko Kolehmainen
Background: The redundancy of information is becoming a critical issue for epidemiologists. High-dimensional datasets require new effective variable selection methods to be developed. This study implements an advanced evolutionary variable selection method which is applied for cardiovascular predictive modeling. The epidemiological follow-up study KIHD (Kuopio Ischemic Heart Disease Risk Factor Study) was used to compare the designed variable selection method based on an evolutionary search with conventional stepwise selection...
2018: BioData Mining
David Gutiérrez-Avilés, Raúl Giráldez, Francisco Javier Gil-Cumbreras, Cristina Rubio-Escudero
Background: Triclustering has been shown to be a valuable tool for the analysis of microarray data since its appearance as an improvement of classical clustering and biclustering techniques. The standard for validation of triclustering is based on three different measures: correlation, graphic similarity of the patterns and functional annotations for the genes extracted from the Gene Ontology project (GO). Results: We propose TRIQ, a single evaluation measure that combines the three measures previously described: correlation, graphic validation and functional annotation, providing a single value as result of the validation of a tricluster solution and therefore simplifying the steps inherent to research of comparison and selection of solutions...
2018: BioData Mining
Cheng-Hong Yang, Kuo-Chuan Wu, Yu-Shiun Lin, Li-Yeh Chuang, Hsueh-Wei Chang
Background: The function of a protein is determined by its native protein structure. Among many protein prediction methods, the Hydrophobic-Polar (HP) model, an ab initio method, simplifies the protein folding prediction process in order to reduce the prediction complexity. Results: In this study, the ions motion optimization (IMO) algorithm was combined with the greedy algorithm (namely IMOG) and applied to the HP model for protein folding prediction based on the 2D-triangular-lattice model...
2018: BioData Mining
Jorge Parraga-Alava, Marcio Dorn, Mario Inostroza-Ponta
Background: Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. In order to analyse this data, clustering arises as one of the main techniques used, and it aims at finding groups of genes that have some criterion in common, like a similar expression profile. However, the problem of finding groups is normally multidimensional, making it necessary to approach the clustering as a multi-objective problem where various cluster validity indexes are simultaneously optimised...
2018: BioData Mining
Jens Dörpinghaus, Sebastian Schaaf, Marc Jacobs
Background: In text mining, document clustering describes the efforts to assign unstructured documents to clusters, which in turn usually refer to topics. Clustering is widely used in science for data retrieval and organisation. Results: In this paper we present and discuss a novel graph-theoretical approach for document clustering and its application on a real-world data set. We will show that the well-known graph partition to stable sets or cliques can be generalized to pseudostable sets or pseudocliques...
2018: BioData Mining
Amani Al-Ajlan, Achraf El Allali
Background: Computational approaches, specifically machine-learning techniques, play an important role in many metagenomic analysis algorithms, such as gene prediction. Due to the large feature space, current de novo gene prediction algorithms use different combinations of classification algorithms to distinguish between coding and non-coding sequences. Results: In this study, we apply a filter method to select relevant features from a large set of known features instead of combining them using linear classifiers or ignoring their individual coding potential...
2018: BioData Mining