JOURNAL ARTICLE

The project data sphere initiative: accelerating cancer research by sharing data

Angela K Green, Katherine E Reeder-Hayes, Robert W Corty, Ethan Basch, Mathew I Milowsky, Stacie B Dusetzina, Antonia V Bennett, William A Wood
Oncologist 2015, 20 (5): 464-e20
PMID: 25876994

BACKGROUND: In this paper, we provide background and context regarding the potential for a new data-sharing platform, the Project Data Sphere (PDS) initiative, funded by financial and in-kind contributions from the CEO Roundtable on Cancer, to transform cancer research and improve patient outcomes. Given the relatively modest decline in cancer death rates over the past several years, a new research paradigm is needed to accelerate therapeutic approaches for oncologic diseases. Phase III clinical trials generate large volumes of potentially usable information, often on hundreds of patients, including patients treated with standard of care therapies (i.e., controls). Both nationally and internationally, a variety of stakeholders have pursued data-sharing efforts to make individual patient-level clinical trial data available to the scientific research community.

POTENTIAL BENEFITS AND RISKS OF DATA SHARING: For researchers, shared data have the potential to foster a more collaborative environment, to answer research questions in a shorter time frame than traditional randomized controlled trials, to reduce duplication of effort, and to improve efficiency. For industry participants, use of trial data to answer additional clinical questions could increase research and development efficiency and guide future projects through validation of surrogate end points, development of prognostic or predictive models, selection of patients for phase II trials, stratification in phase III studies, and identification of patient subgroups for development of novel therapies. Data transparency also helps promote a public image of collaboration and altruism among industry participants. For patient participants, data sharing maximizes their contribution to public health and increases access to information that may be used to develop better treatments. Concerns about data-sharing efforts include protection of patient privacy and confidentiality. To alleviate these concerns, data sets are deidentified to maintain anonymity. To address industry concerns about protection of intellectual property and competitiveness, we illustrate several models for data sharing with varying levels of access to the data and varying relationships between trial sponsors and data access sponsors.

THE PROJECT DATA SPHERE INITIATIVE: PDS is an independent initiative of the CEO Roundtable on Cancer Life Sciences Consortium, built to voluntarily share, integrate, and analyze comparator arms of historical cancer clinical trial data sets to advance future cancer research. The aim is to provide a neutral, broad-access platform for industry and academia to share raw, deidentified data from late-phase oncology clinical trials using comparator-arm data sets. These data are likely to be hypothesis generating or hypothesis confirming but, notably, do not take the place of performing a well-designed trial to address a specific hypothesis. Prospective providers of data to PDS complete and sign a data sharing agreement that includes a description of the data they propose to upload, and then they follow easy instructions on the website for uploading their deidentified data. The SAS Institute has also collaborated with the initiative to provide intrinsic analytic tools accessible within the website itself. As of October 2014, the PDS website has available data from 14 cancer clinical trials covering 9,000 subjects, with hopes to further expand the database to include more than 25,000 subject accruals within the next year. PDS differentiates itself from other data-sharing initiatives by its degree of openness, requiring submission of only a brief application with background information of the individual requesting access and agreement to terms of use. Data from several different sponsors may be pooled to develop a comprehensive cohort for analysis. In order to protect patient privacy, data providers in the U.S. are responsible for deidentifying data according to standards set forth by the Privacy Rule of the U.S. Health Insurance Portability and Accountability Act of 1996. 

USING DATA SHARING TO IMPROVE OUTCOMES IN CANCER: THE "PROSTATE CANCER CHALLENGE": Control-arm data of several studies among patients with metastatic castration-resistant prostate cancer (mCRPC) are currently available through PDS. These data sets have multiple potential uses. The "Prostate Cancer Challenge" will ask the cancer research community to use clinical trial data deposited in the PDS website to address key research questions regarding mCRPC. General themes that could be explored by the cancer community are described in this article: prognostic models evaluating the influence of pretreatment factors on survival and patient-reported outcomes; comparative effectiveness research evaluating the efficacy of standard of care therapies, as illustrated in our companion article comparing mitoxantrone plus prednisone with prednisone alone; effects of practice variation in dose, frequency, and duration of therapy; level of patient adherence to elements of trial protocols to inform the design of future clinical trials; and age of subjects, regional differences in health care, and other confounding factors that might affect outcomes.

POTENTIAL LIMITATIONS AND METHODOLOGICAL CHALLENGES: The number of data sets available and the lack of experimental-arm data limit the potential scope of research using the current PDS. The number of trials is expected to grow exponentially over the next year and may include multiple cancer settings, such as breast, colorectal, lung, hematologic malignancy, and bone marrow transplantation. Other potential limitations include the retrospective nature of the data analyses performed using PDS and its generalizability, given that clinical trials are often conducted among younger, healthier, and less racially diverse patient populations. Methodological challenges exist when combining individual patient data from multiple clinical trials; however, advancements in statistical methods for secondary database analysis offer many tools for reanalyzing data arising from disparate trials, such as propensity score matching. Despite these concerns, few if any comparable data sets include this level of detail across multiple clinical trials and populations.

CONCLUSION: Access to large, late-phase, cancer-trial data sets has the potential to transform cancer research by optimizing research efficiency and accelerating progress toward meaningful improvements in cancer care. This type of platform provides opportunities for unique research projects that can examine relatively neglected areas and that can construct models necessitating large amounts of detailed data. The full potential of PDS will be realized only when multiple tumor types and larger numbers of data sets are available through the website.
