Add like
Add dislike
Add to saved papers

Ensemble machine learning prediction of hyperuricemia based on a prospective health checkup population.

Objectives: An accurate prediction model for hyperuricemia (HUA) in adults remain unavailable. This study aimed to develop a stacking ensemble prediction model for HUA to identify high-risk groups and explore risk factors. Methods: A prospective health checkup cohort of 40899 subjects was examined and randomly divided into the training and validation sets with the ratio of 7:3. LASSO regression was employed to screen out important features and then the ROSE sampling was used to handle the imbalanced classes. An ensemble model using stacking strategy was constructed based on three individual models, including support vector machine, decision tree C5.0, and eXtreme gradient boosting. Model validations were conducted using the area under the receiver operating characteristic curve (AUC) and the calibration curve, as well as metrics including accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. A model agnostic instance level variable attributions technique (iBreakdown) was used to illustrate the black-box nature of our ensemble model, and to identify contributing risk factors. Results: Fifteen important features were screened out of 23 clinical variables. Our stacking ensemble model with an AUC of 0.854, outperformed the other three models, support vector machine, decision tree C5.0, and eXtreme gradient boosting with AUCs of 0.848, 0.851 and 0.849 respectively. Calibration accuracy as well as other metrics including accuracy, specificity, negative predictive value, and F1 score were also proved our ensemble model's superiority. The contributing risk factors were estimated using six randomly selected subjects, which showed that being female and relatively younger, together with having higher baseline uric acid, body mass index, γ-glutamyl transpeptidase, total protein, triglycerides, creatinine, and fasting blood glucose can increase the risk of HUA. To further validate our model's applicability in the health checkup population, we used another cohort of 8559 subjects that also showed our ensemble prediction model had favorable performances with an AUC of 0.846. Conclusion: In this study, the stacking ensemble prediction model for HUA was developed, and it outperformed three individual models that compose it (support vector machine, decision tree C5.0, and eXtreme gradient boosting). The contributing risk factors were identified with insightful ideas.

Full text links

We have located links that may give you full text access.
Can't access the paper?
Try logging in through your university/institutional subscription. For a smoother one-click institutional access experience, please use our mobile app.

Related Resources

For the best experience, use the Read mobile app

Mobile app image

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app

All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.

By using this service, you agree to our terms of use and privacy policy.

Your Privacy Choices Toggle icon

You can now claim free CME credits for this literature searchClaim now

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app