A comparison of performance of mathematical predictive methods for medical diagnosis: identifying acute cardiac ischemia among emergency department patients

H P Selker, J L Griffith, S Patil, W J Long, R B D'Agostino
Journal of Investigative Medicine: the Official Publication of the American Federation for Clinical Research 1995, 43 (5): 468-76

BACKGROUND: There is increasing interest in mathematical methods for the prediction of medical outcomes. Three methods have attracted particular attention: logistic regression, classification trees (such as ID3 and CART), and neural networks. To compare their relative performance, we used a large clinical database to develop and compare models using these methods.

METHODS: Each modeling method was used to generate predictive instruments for acute cardiac ischemia (which includes acute myocardial infarction and unstable angina pectoris), using prospectively collected clinical data on 5773 patients who presented over a two-year period to six hospitals' emergency departments with chest pain or symptoms suggesting acute ischemia. This data set was then split into training (n = 3453) and test (n = 2320) sets. Of 200 available variables, modeling was restricted to those available within the first 10 minutes of emergency department care (history, physical exam, and electrocardiogram).

RESULTS: When the number of variables was limited to eight, representing a practical number for input in the real-time clinical setting, the logistic regression's receiver-operating characteristic (ROC) curve area, as a measure of diagnostic performance, was 0.887; the classification tree model's ROC curve area was 0.858, and the neural network's ROC curve area was 0.902. When the number of variables used by a model was not limited, the logistic regression's ROC area was 0.905, the classification tree model's 0.861, and the neural network's 0.923. Among these models the neural networks had noticeably poorer calibration. When the outputs from each of these unrestricted models were presented to each of the other methods as an additional independent variable, the ROC areas of the new "hybrid" models were not significantly better than the original unlimited models (ROC areas 0.858 to 0.920).
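The two performance measures named above, ROC curve area (discrimination) and calibration, can be illustrated with a minimal sketch. The abstract does not say how either was computed in the study, so the rank-based (Mann-Whitney) formulation of ROC area and the binned calibration table below are assumptions chosen for clarity. The function names and the toy labels and scores are hypothetical and are not study data.

```python
def roc_area(labels, scores):
    """ROC curve area as the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_table(labels, scores, n_bins=5):
    """Group cases into equal-width bins of predicted probability (assumed to
    lie in [0, 1]) and return (mean predicted, observed rate, count) for each
    non-empty bin. A well-calibrated model has the first two values close."""
    bins = [[] for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((y, s))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(s for _, s in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Hypothetical toy example: 1 = acute cardiac ischemia, 0 = no ischemia.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.30, 0.35, 0.60, 0.80, 0.90]

toy_area = roc_area(toy_labels, toy_scores)
toy_calibration = calibration_table(toy_labels, toy_scores, n_bins=2)
```

The two measures are independent: a model can rank patients well (high ROC area) while its predicted probabilities still sit far from the observed event rates. This is consistent with the finding above that the neural networks had the highest ROC areas but noticeably poorer calibration.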

CONCLUSIONS: Logistic regression, classification tree, and neural network models all can provide excellent predictive performance of medical outcomes for clinical decision aids and policy models. Their ultimate limitations seem due to the availability of the information in data (a "data barrier") rather than their respective intrinsic properties. Choices between these methods would seem to be most appropriately based on the needs of the specific application, rather than on the premise that any one of these methods is intrinsically more powerful.
