The fragility of statistically significant findings from randomized trials in spine surgery: a systematic survey

Nathan Evaniew, Carly Files, Christopher Smith, Mohit Bhandari, Michelle Ghert, Michael Walsh, Philip J Devereaux, Gordon Guyatt
Spine Journal: Official Journal of the North American Spine Society 2015 October 1, 15 (10): 2188-97

BACKGROUND CONTEXT: Randomized controlled trials (RCTs) are the most trustworthy source for evaluating treatment effects, but RCTs of spine surgery interventions often produce discordant results. The Fragility Index is a novel metric of the robustness of statistically significant results.

PURPOSE: The aim was to determine the robustness of statistically significant results from RCTs of spine surgery interventions.

STUDY DESIGN/SETTING: This was a systematic survey.

PATIENT SAMPLE: The sample included RCTs of spine surgery interventions.

OUTCOME MEASURES: The Fragility Index is the minimum number of patients in a trial whose status would have to change from a nonevent to an event to change a statistically significant result to a nonsignificant result. Events refer to the occurrence of any dichotomous outcome, such as successful fusion, incident fracture, adjacent segment degeneration, or achievement of a certain functional score. A small Fragility Index indicates that the statistical significance of a result hinges on only a few events, and a large Fragility Index increases one's confidence in the observed treatment effects.

METHODS: We systematically reviewed a database for evidence-based orthopedics and identified all the RCTs that reported at least one positive outcome (ie, p<.05). Two reviewers independently assessed eligibility and extracted data. We used the Fisher exact test to compute Fragility Index values and multivariable linear regression to evaluate potential associated factors.
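The Fragility Index calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the choice to add events to the arm with fewer events, the two-sided test, and the example counts are all assumptions made for the sketch. Only the Fisher exact test and the p<.05 threshold come from the abstract.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are events and nonevents.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalized hypergeometric probability of x events in arm 1.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / denom


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Count nonevent-to-event changes needed to lose significance.

    Returns None if the result is not significant to begin with.
    Assumption: changes are made in the arm with fewer events.
    """
    def p_value(e1, e2):
        return fisher_exact_two_sided(e1, n_1 - e1, e2, n_2 - e2)

    if p_value(events_1, events_2) >= alpha:
        return None

    modify_first = events_1 <= events_2
    e1, e2, flips = events_1, events_2, 0
    while p_value(e1, e2) < alpha:
        if modify_first and e1 < n_1:
            e1 += 1
        elif not modify_first and e2 < n_2:
            e2 += 1
        else:
            break  # no nonevents left to change in the chosen arm
        flips += 1
    return flips


# Hypothetical trial: 0/10 events in one arm versus 7/10 in the other.
print(fragility_index(0, 10, 7, 10))
```

With these hypothetical counts, two patients in the first arm would have to change from a nonevent to an event before the Fisher exact p value reaches .05 or more.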

RESULTS: We identified 40 eligible RCTs with a median sample size of 132 patients (interquartile range [IQR] 79-208) and a median total number of outcome events for the chosen outcome of 31 (IQR 13-63). The median Fragility Index was two (IQR 1-3), meaning that in the median trial, changing two patients in one treatment arm from nonevents to events would eliminate statistical significance. The Fragility Index was less than or equal to three events in 75% of the trials, and was less than or equal to the number of patients lost to follow-up in 65% of the trials. Fragility Index values correlated positively with total sample size (r=0.35; p<.05). When adjusted for losses to follow-up and risk of bias, increasing Fragility Index values were associated only with smaller reported p values (p<.01).

CONCLUSIONS: Statistically significant results in spine surgery RCTs are frequently fragile. The addition of only a small number of outcome events can completely eliminate significance. Surgeons, researchers, and other evidence users should exercise caution when interpreting the findings from RCTs with low Fragility Index values and applying these results to patient care.

