We have located links that may give you full text access.
Evaluating the Sensitivity, Specificity, and Accuracy of ChatGPT-3.5, ChatGPT-4, Bing AI, and Bard Against Conventional Drug-Drug Interactions Clinical Tools.
BACKGROUND: AI platforms are equipped with advanced algorithms that have the potential to offer a wide range of applications in healthcare services. However, information about the accuracy of AI chatbots against conventional drug-drug interaction tools is limited. This study aimed to assess the sensitivity, specificity, and accuracy of ChatGPT-3.5, ChatGPT-4, Bing AI, and Bard in predicting drug-drug interactions.
METHODS: AI-based chatbots (ie, ChatGPT-3.5, ChatGPT-4, Microsoft Bing AI, and Google Bard) were compared for their abilities to detect clinically relevant DDIs for 255 drug pairs. Descriptive statistics, such as specificity, sensitivity, accuracy, negative predictive value (NPV), and positive predictive value (PPV), were calculated for each tool.
RESULTS: When a subscription tool was used as a reference, the specificity ranged from a low of 0.372 (ChatGPT-3.5) to a high of 0.769 (Microsoft Bing AI). Also, Microsoft Bing AI had the highest performance with an accuracy score of 0.788, with ChatGPT-3.5 having the lowest accuracy rate of 0.469. There was an overall improvement in performance for all the programs when the reference tool switched to a free DDI source, but still, ChatGPT-3.5 had the lowest specificity (0.392) and accuracy (0.525), and Microsoft Bing AI demonstrated the highest specificity (0.892) and accuracy (0.890). When assessing the consistency of accuracy across two different drug classes, ChatGPT-3.5 and ChatGPT-4 showed the highest variability in accuracy. In addition, ChatGPT-3.5, ChatGPT-4, and Bard exhibited the highest fluctuations in specificity when analyzing two medications belonging to the same drug class.
CONCLUSION: Bing AI had the highest accuracy and specificity, outperforming Google's Bard, ChatGPT-3.5, and ChatGPT-4. The findings highlight the significant potential these AI tools hold in transforming patient care. While the current AI platforms evaluated are not without limitations, their ability to quickly analyze potentially significant interactions with good sensitivity suggests a promising step towards improved patient safety.
METHODS: AI-based chatbots (ie, ChatGPT-3.5, ChatGPT-4, Microsoft Bing AI, and Google Bard) were compared for their abilities to detect clinically relevant DDIs for 255 drug pairs. Descriptive statistics, such as specificity, sensitivity, accuracy, negative predictive value (NPV), and positive predictive value (PPV), were calculated for each tool.
RESULTS: When a subscription tool was used as a reference, the specificity ranged from a low of 0.372 (ChatGPT-3.5) to a high of 0.769 (Microsoft Bing AI). Also, Microsoft Bing AI had the highest performance with an accuracy score of 0.788, with ChatGPT-3.5 having the lowest accuracy rate of 0.469. There was an overall improvement in performance for all the programs when the reference tool switched to a free DDI source, but still, ChatGPT-3.5 had the lowest specificity (0.392) and accuracy (0.525), and Microsoft Bing AI demonstrated the highest specificity (0.892) and accuracy (0.890). When assessing the consistency of accuracy across two different drug classes, ChatGPT-3.5 and ChatGPT-4 showed the highest variability in accuracy. In addition, ChatGPT-3.5, ChatGPT-4, and Bard exhibited the highest fluctuations in specificity when analyzing two medications belonging to the same drug class.
CONCLUSION: Bing AI had the highest accuracy and specificity, outperforming Google's Bard, ChatGPT-3.5, and ChatGPT-4. The findings highlight the significant potential these AI tools hold in transforming patient care. While the current AI platforms evaluated are not without limitations, their ability to quickly analyze potentially significant interactions with good sensitivity suggests a promising step towards improved patient safety.
Full text links
Related Resources
Trending Papers
2024 AHA/ACC/ACS/ASNC/HRS/SCA/SCCT/SCMR/SVM Guideline for Perioperative Cardiovascular Management for Noncardiac Surgery: A Report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines.Circulation 2024 September 24
Pathophysiology and Treatment of Prediabetes and Type 2 Diabetes in Youth.Diabetes Care 2024 September 9
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app
All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.
By using this service, you agree to our terms of use and privacy policy.
Your Privacy Choices
You can now claim free CME credits for this literature searchClaim now
Get seemless 1-tap access through your institution/university
For the best experience, use the Read mobile app