International journal of radiation oncology, biology, physics | 2021

Development of a Big Data Radiation Oncology Dashboard.


Abstract


PURPOSE/OBJECTIVE(S)
Healthcare data often exist in silos and in unstructured formats that limit interoperability and require tedious manual extraction. Our institution has adopted a flexible and scalable big data platform built on Hadoop that integrates data from Epic/Clarity as well as Aria and allows users to leverage modern data science tools to facilitate access. We hypothesize that a data analytics and visualization dashboard can be built using open-source tools that will (1) allow non-technical users to explore de-identified clinical data within our institutional big data platform and (2) connect with repositories of molecular data to demonstrate potential methods of integrating clinical and basic science data.

MATERIALS/METHODS
De-identified patient-level radiation oncology data from the institutional big data platform (Hadoop) were extracted with the Python packages pyodbc and pandas. For the purposes of this dashboard, radiation oncology-specific clinical data elements were queried, including the date of first radiation treatment, treatment location, treatment modality (SBRT, external beam, SRS, TBI, LDR/HDR brachytherapy), ICD-10 codes, anatomic treatment site, number of fractions, treatment prescription, and dose per fraction. A Python client connection with the publicly accessible instance of cBioPortal for Cancer Genomics was established using the Bravado library. Data transformation and cleaning were performed in Python using pandas data frames. A web-based dashboard to facilitate user-defined visualizations was implemented using the Dash Python library, and interactive visualizations of subsets of the extracted data were generated in real time using the plotly plotting library.

RESULTS
We developed a web-based dashboard that gives users without extensive programming expertise the ability to explore de-identified clinical data extracted from Hadoop. As proof of principle, the dashboard was used to visualize the clinical impact of the COVID-19 pandemic on radiation oncology patient volumes, revealing a significant decline in new radiation treatments in April and May of 2020 (-54% and -36%, respectively, compared with 2019) during the initial COVID-19 surge. Furthermore, the dashboard allows users to interact with the cBioPortal for Cancer Genomics repository, which currently houses clinical and molecular data from 301 publicly available studies spanning 869 different cancer types. This interface with cBioPortal illustrates the potential for future integration of clinically meaningful sequencing results with clinical outcomes data.

CONCLUSION
We built an interactive web-based dashboard that gives general users easy access to de-identified clinical data stored within the institutional big data platform. Additional data sources, including external molecular data, can be connected to the dashboard, allowing for future integration.
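
The year-over-year comparison of new radiation treatments described in the RESULTS can be illustrated with a minimal sketch. The abstract does not describe how the percentages were computed, so this is only one plausible approach: the function names `monthly_counts` and `year_over_year_change` are hypothetical, the dates are synthetic toy values rather than study data, and the sketch uses only the Python standard library for self-containment, whereas the dashboard itself is described as using pandas data frames.

```python
from collections import Counter
from datetime import date


def monthly_counts(first_treatment_dates):
    """Count first-treatment dates per (year, month) pair."""
    return Counter((d.year, d.month) for d in first_treatment_dates)


def year_over_year_change(counts, year, baseline_year, month):
    """Percent change for `month` of `year` relative to `baseline_year`.

    Returns None when the baseline month has no records, since the
    percent change is undefined in that case.
    """
    baseline = counts.get((baseline_year, month), 0)
    if baseline == 0:
        return None
    current = counts.get((year, month), 0)
    return 100.0 * (current - baseline) / baseline


# Synthetic toy data (NOT the study's records):
# 10 first treatments in April 2019 and 4 in April 2020.
toy_dates = [date(2019, 4, day) for day in range(1, 11)]
toy_dates += [date(2020, 4, day) for day in range(1, 5)]

counts = monthly_counts(toy_dates)
change = year_over_year_change(counts, year=2020, baseline_year=2019, month=4)
print(f"April 2020 vs April 2019: {change:+.0f}%")
```

In a dashboard setting, a function of this kind would typically sit behind an interactive control (for example, a month or treatment-site selector) so that the comparison is recomputed for whichever subset of the data the user chooses.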

Volume 111 3S
Pages e89
DOI 10.1016/j.ijrobp.2021.07.468
Language English
Journal International journal of radiation oncology, biology, physics
