A machine learning-based method for feature reduction of methylation data for the classification of cancer tissue origin.
International Journal of Clinical Oncology 2024 September 18
BACKGROUND: Genomic DNA methylation profiling is a promising but costly method for cancer classification that involves large amounts of data. We developed an ensemble learning model to identify cancer types using methylation profiles from a limited number of CpG sites.
METHODS: Analyzing methylation data from 890 samples across 10 cancer types from the TCGA database, we utilized ANOVA and Gain Ratio to select the most significant CpG sites, then employed Gradient Boosting to reduce these to just 100 sites.
RESULTS: This approach maintained high accuracy across multiple machine learning models, with classification accuracy rates between 87.7% and 93.5% for methods including Extreme Gradient Boosting, CatBoost, and Random Forest. This method effectively minimizes the number of features needed without losing performance, helping to classify primary organs and uncover subgroups within specific cancers like breast and lung.
CONCLUSIONS: Using a gradient boosting feature selector shows potential for streamlining methylation-based cancer classification.
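The filter stage described under METHODS can be illustrated with a minimal sketch, written here with the Python standard library only. It is not the authors' code. Everything in it is an assumption made for illustration: the toy values, the site names, the function names, the 0.5 binarization threshold used for Gain Ratio, and the choice to rank each score separately (the abstract does not say how the ANOVA and Gain Ratio rankings were combined). The later Gradient Boosting reduction to 100 sites is not shown.

```python
import math
from collections import Counter


def anova_f(values, labels):
    """One-way ANOVA F statistic for a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete items."""
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels, threshold=0.5):
    """Gain ratio of the binary split `value >= threshold` (threshold is assumed)."""
    bins = [v >= threshold for v in values]
    split_info = entropy(bins)
    if split_info == 0:  # every sample falls on the same side of the split
        return 0.0
    conditional = 0.0
    for side in (True, False):
        subset = [lab for b, lab in zip(bins, labels) if b == side]
        if subset:
            conditional += len(subset) / len(labels) * entropy(subset)
    return (entropy(labels) - conditional) / split_info


def top_sites(data, labels, score, k):
    """Names of the k highest-scoring sites; `data` maps site name to values."""
    ranked = sorted(data, key=lambda s: score(data[s], labels), reverse=True)
    return ranked[:k]


# Toy stand-in for methylation values (made up, not TCGA data).
labels = ["breast"] * 3 + ["lung"] * 3
toy = {
    "site_a": [0.9, 0.8, 0.85, 0.1, 0.2, 0.15],   # separates the two classes
    "site_b": [0.5, 0.4, 0.6, 0.5, 0.45, 0.55],   # same mean in both classes
    "site_c": [0.7, 0.3, 0.6, 0.4, 0.65, 0.35],   # weak, noisy signal
}

print(top_sites(toy, labels, anova_f, 2))
print(top_sites(toy, labels, gain_ratio, 2))
```

In a real analysis the sites that survive this filter would then be passed to a gradient boosting model and ranked by feature importance to reach the final, smaller set.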