- O'Grady, Nicholas;
- Gibbs, David L;
- Abdilleh, Kawther;
- Asare, Adam;
- Asare, Smita;
- Venters, Sara;
- Brown-Swigart, Lamorna;
- Hirst, Gillian L;
- Wolf, Denise;
- Yau, Christina;
- van 't Veer, Laura J;
- Esserman, Laura;
- Basu, Amrita
Objectives
In this paper, we discuss leveraging cloud-based platforms to collect, visualize, analyze, and share data in the context of a clinical trial. Our cloud-based infrastructure, Patient Repository of Biomolecular Entities (PRoBE), has given us the opportunity to establish a uniform data structure, analyze valuable data more efficiently, and increase collaboration between researchers.
Materials and methods
We use a multi-cloud platform to manage and analyze data generated from the clinical Investigation of Serial Studies to Predict Your Therapeutic Response with Imaging And moLecular Analysis 2 (I-SPY 2 TRIAL). A collaboration with the Institute for Systems Biology Cancer Gateway in the Cloud has additionally given us access to public genomic databases. Applications for I-SPY 2 data have been built with R Shiny, using Google's BigQuery tables and SQL commands for data mining.
Results
We highlight the implementation of PRoBE in several unique case studies, including prediction of biomarkers associated with clinical response, access to the Pan-Cancer Atlas, and integration of pathology images within the cloud. Our data integration pipelines, documentation, and full codebase will be placed in a GitHub repository.
Discussion and conclusion
We hope to develop risk stratification diagnostics by integrating additional molecular, magnetic resonance imaging, and pathology markers into PRoBE to better predict drug response. A robust cloud infrastructure and tool set can help integrate these large datasets to make valuable predictions of response to multiple agents. For that reason, we are continuously improving PRoBE to advance the way data are stored, accessed, and analyzed in the I-SPY 2 clinical trial.