cgmquantify: Python and R packages for comprehensive analysis of interstitial glucose and glycemic variability from continuous glucose monitor data
Brinnae Bent, Maria Henriquez, Jessilyn Dunn

Department of Biomedical Engineering, Duke University, Durham, North Carolina
Department of Statistical Science, Duke University, Durham, North Carolina
Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina

TAGS
Open source, Python, R, bioinformatics, data analysis, glucose, glucose variability, continuous glucose monitoring, type I diabetes, type II diabetes, diabetes, prediabetes, Open APS

SUMMARY
Continuous glucose monitoring (CGM) systems provide real-time, dynamic glucose information by tracking interstitial glucose values throughout the day (values are typically recorded every 5 minutes). CGMs are commonly used in diabetes management by clinicians and patients, and in research to understand how longitudinal glucose and glucose variability relate to disease onset and severity and to the efficacy of interventions. CGM data present unique bioinformatic challenges because the data are longitudinal and temporal and can be summarized and used in nearly infinite ways. There are over 20 metrics of glucose variability, no standardization of metrics, and little validation across studies. Here we present open source Python and R packages called cgmquantify, which contain over 20 functions with over 25 clinically validated metrics of glucose and glucose variability, as well as functions for visualizing longitudinal CGM data. These packages are expected to be useful for researchers and may provide additional insights to patients and clinicians about glucose patterns.

INTRODUCTION
Continuous glucose monitoring (CGM) systems provide real-time, dynamic glucose information by tracking interstitial glucose values throughout the day. CGMs are commonly used in diabetes management, with 1.2 million diabetic patients using a CGM. CGM use has been associated with improved glycemic control in adults with type 1 diabetes.
These devices have been used extensively by the T1D community, including in the Open Artificial Pancreas System Project (OpenAPS), a project developed to create a patient-implemented closed loop system between a CGM and an insulin pump. CGM manufacturers commonly provide CGM data either as raw glucose values (in .csv format) or in summary reports that use proprietary methods to plot and summarize glucose statistics (e.g. Dexcom Clarity currently shows overall mean glucose, standard deviation of glucose, time in range, and hypoglycemia risk, as well as daily minimum, maximum, mean, and standard deviation of glucose). Because these algorithms are proprietary, they cannot be properly validated by clinical researchers. Additionally, the provided glucose summaries are extremely limited and usually contain no information about an important clinical metric, glycemic variability. Glycemic variability, also known as glucose variability, is an established risk factor for hypoglycemia and has been shown to be a risk factor in diabetes complications. Glucose variability appears in over 26,000 publications indexed in PubMed at the time of this publication and is a significant metric in clinical research. Over 20 metrics of glucose variability have been identified (Table 1), which makes it difficult to examine and compare results across the many research studies that analyze and draw conclusions about glucose variability. There is a need for an open source resource with algorithms that are used and validated in clinical research studies. Such a resource would enable standardized glucose variability metrics and make it possible to compare findings from studies that use different metrics of glucose variability. This resource should be available in an open source programming language with a low barrier to entry to encourage researchers, clinicians, and patients alike to explore CGM data. Previous open-source resources have been implemented in Excel and R.
There is currently no comprehensive resource for CGM data in Python, the third most common programming language used globally and the leading language among newcomers. Additionally, previous implementations of open source CGM data analysis offer limited metrics of glucose variability. Further, these methods are typically developed for a specific purpose and are therefore not extensible (e.g. they do not provide simple functions that let users customize their metrics and visualizations). We have developed a package written and published in Python under the MIT license and a package written and published in R under the MIT license. The packages, both named cgmquantify, contain over 20 functions with more than 25 metrics summarizing glucose and glucose variability. Both packages also include code for visualizing CGM data. Both packages are available in the Digital Biomarker Discovery Pipeline (DBDP), the open source software platform for digital biomarker discovery. The Python package is available under the Python Package Index (PyPI) (https://pypi.org/project/cgmquantify/). Source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify. The R package is available in the Comprehensive R Archive Network (CRAN) (https://CRAN.R-project.org/package=cgmquantify) and source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify-R.

METHODS
cgmquantify is a Python package and an R package composed of 20+ functions with over 25 clinically validated metrics of glucose and glucose variability, as shown in Table 1. Customizable visualizations (Figure 1, Figure 2) are also included as easy-to-implement functions. cgmquantify is version controlled through GitHub and distributed through PyPI or CRAN. This allows for single-line installation in either language. Source code and an extensive user guide are available on GitHub to facilitate ease of use and enable customization based on user needs.
Issue tracking on GitHub is monitored closely by the Digital Biomarker Discovery Pipeline to allow for rapid feedback. Tests are available on GitHub under the tests subdirectory to allow for manual testing of all functions.

RESULTS
We have included import functions to format data for use with the cgmquantify package. These functions currently support Dexcom CGM devices, with plans to add import functions for other CGM manufacturers, including Medtronic and Abbott. Our user guide also outlines how to format data so that any data input is compatible with the functions in cgmquantify. Functions are available for all the commonly studied glucose and glucose variability metrics (Table 1). Additionally, functions for data visualization of the longitudinal CGM data are provided. These visualizations are easily customizable. We have also implemented a function that enables LOWESS smoothing over the CGM data (Figure 1).

DISCUSSION
cgmquantify is a package that simplifies the process of calculating metrics and thus allows for easy comparison across research studies that use different metrics summarizing glucose and glucose variability. Functions have been developed using equations from clinically validated research studies so users can compare their results to previous findings. The cgmquantify package is easily implemented with a one-line installation and an extensive user guide in both the Python and R languages. Detailed documentation facilitates modification of existing code for customization of input and visualizations. This package also has the potential to build a community of developers who contribute to the literature in this burgeoning field. This is a much-needed resource for the community of researchers, clinicians, and patients using CGM.
Currently, little is understood about how glucose and glucose variability metrics from CGM data relate to one another and to diseases including but not limited to prediabetes, T2D, and severity of symptoms in T1D. As more researchers and clinicians start looking to CGM data to answer these questions, a standardized resource in a nearly ubiquitous programming language becomes necessary. As we have seen with the OpenAPS community, analysis of CGM data is not limited to researchers and clinicians but includes patients themselves. By providing this as an open source resource, we hope to encourage patients to interact with their own data, determine personalized insights, and make meaningful contributions to the digital health landscape.

FUTURE IMPLEMENTATIONS
Future contributions will include additional import functions customized to all the CGM manufacturers, including but not limited to Medtronic and Abbott. We are exploring methods to incorporate food logs into visualizations of CGM data.

CODE AVAILABILITY
The cgmquantify Python package is available under the Python Package Index (PyPI) (https://pypi.org/project/cgmquantify/). Source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify. The cgmquantify R package is available in the Comprehensive R Archive Network (CRAN) (https://CRAN.R-project.org/package=cgmquantify) and source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify-R. We encourage others to expand on our ideas and contribute their own glucose and glucose variability metrics to cgmquantify. Documentation for contributing is available in our User Guide.

REFERENCES
1. Bent B, Wang K, Grzesiak E, et al. The Digital Biomarker Discovery Pipeline: An open source software platform for the development of digital biomarkers using mHealth and wearables data. J Clin Transl Sci. 2020;(11):1-28. doi:10.1017/cts.2020.511.
2. Cho P, Bent B, Wittmann A, et al.
Expanding the Definition of Intraday Glucose Variability. In: Diabetes. American Diabetes Association; 2020.
3. eAG/A1C Conversion Calculator | American Diabetes Association. https://professional.diabetes.org/diapro/glucose_calc. Accessed February 15, 2020.
4. Goldsack J, Coravos A, Bakker J, Bent B, Dowling AV, Fitzer-Attas C, Godfrey A, Godino JG, Gujar N, Izmailova E, Manta C, Peterson B, Vandendressche BV, Wood WA, Wang KW DJ. Verification, Analytical Validation, and Clinical Validation (V3): The Foundation of Determining Fit-for-Purpose for Biometric Monitoring Technologies (BioMeTs). JMIR Prepr. 2020. https://preprints.jmir.org/preprint/17264.
5. De Groot M, Drangsholt M, Martin-Sanchez F, Wolf G. Single Subject (N-of-1) Research Design, Data Processing, and Personal Science. Methods Inf Med. 2017;56(06):416-418. doi:10.3414/ME17-03-0001.
6. Hill NR, Oliver NS, Choudhary P, Levy JC, Hindmarsh P, Matthews DR. Normal reference range for mean tissue glucose and glycemic variability derived from continuous glucose monitoring for subjects without diabetes in different ethnic groups. Diabetes Technol Ther. 2011;13(9):921-928. doi:10.1089/dia.2010.0247.
7. Kovatchev B. Glycemic Variability: Risk Factors, Assessment, and Control. J Diabetes Sci Technol. 2019;13(4):627-635. doi:10.1177/1932296819826111.
8. Kovatchev BP. Metrics for glycaemic control - from HbA1c to continuous glucose monitoring. Nat Rev Endocrinol. [...] Diabetes Technol Ther. 2011;13(12):1241-1248. doi:10.1089/dia.2011.0099.
12. Service FJ. Glucose variability. Diabetes. 2013;62(5):1398-1404. doi:10.2337/db12-1396.
13. Suh S, Kim JH. Glycemic variability: How do we measure it and why is it important? Diabetes Metab J. 2015;39(4):273-282. doi:10.4093/dmj.2015.39.4.273.
14. Tamborlane WV, Beck RW, Bode BW, et al. Continuous Glucose Monitoring and Intensive Treatment of Type 1 Diabetes. N Engl J Med. 2008;359(14):1464-1476. doi:10.1056/NEJMoa0805017.
15. Umpierrez GE, Kovatchev BP. Glycemic Variability: How to Measure and Its Clinical Implication for Type 2 Diabetes. Am J Med Sci. 2018;356(6):518-527. doi:10.1016/j.amjms.2018.09.010.
16. Vigers T, Chan CL, Snell-Bergeon J, et al. cgmanalysis: An R package for descriptive analysis of continuous glucose monitor data. Bethin KE, ed. PLoS One. 2019;14(10):e0216851. doi:10.1371/journal.pone.0216851.
17. Wojcicki JM. "J"-index. A new proposition of the assessment of current glucose control in diabetic patients. Horm Metab Res. [...] Bioinformatics. 2018;34(9):1609-1611. doi:10.1093/bioinformatics/btx826.

Figure 1. Visualizing longitudinal CGM data with the cgmquantify Python package.
Shown are (a) a visualization with indicators of the mean glucose level and 1 SD from the mean, (b) a visualization with indicators of the hyperglycemic (>180 mg/dL glucose) and hypoglycemic (<70 mg/dL glucose) ranges, and (c) a plot with LOWESS smoothing of the glucose data.
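
The thresholds named in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the plotting code of cgmquantify: the function names `classify_readings` and `percent_in_range` are hypothetical, and the sketch assumes glucose readings are supplied as a plain list of values in mg/dL, with 70 and 180 mg/dL as the hypoglycemic and hyperglycemic cutoffs.

```python
def classify_readings(glucose, low=70, high=180):
    """Label each reading (mg/dL) as 'hypo', 'hyper', or 'in_range'.

    Hypothetical helper: readings below `low` are hypoglycemic and
    readings above `high` are hyperglycemic, as in the Figure 1 caption.
    """
    labels = []
    for g in glucose:
        if g < low:
            labels.append("hypo")
        elif g > high:
            labels.append("hyper")
        else:
            labels.append("in_range")
    return labels


def percent_in_range(glucose, low=70, high=180):
    """Percentage of readings that fall between `low` and `high` inclusive."""
    labels = classify_readings(glucose, low, high)
    return 100.0 * labels.count("in_range") / len(labels)


if __name__ == "__main__":
    readings = [65, 70, 120, 180, 181]  # illustrative values, not real data
    print(classify_readings(readings))
    print(percent_in_range(readings))
```

With evenly spaced readings (for example, one every 5 minutes), the percentage of readings in range equals the percentage of time in range; with irregular sampling, each reading would need to be weighted by its time interval.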
Figure 2. Visualizing longitudinal CGM data with the cgmquantify R package.
Shown is a visualization available in the cgmquantify R package that enables visualization of CGM data by time of day for each day specified.

Table 1. Glucose and Glucose Variability Metrics

Metric | Description | Equation
interdaySD | Interday standard deviation of glucose | $\sigma_{\text{interday}} = \sqrt{\frac{\sum (G_i - \mu)^2}{N}}$, where $N$ = total days, $G$ = glucose value
interdayCV | Interday coefficient of variation of glucose | $CV_{\text{interday}} = \frac{\sigma_{\text{interday}}}{\mu}$
intradaySD Mean | Intraday standard deviation of glucose (mean across all days) | $\sigma_{\text{intraday mean}} = \frac{\sum \sigma_i}{N}$
intradayCV Mean | Intraday coefficient of variation of glucose (mean across all days) | $CV_{\text{intraday mean}} = \frac{\sum CV_i}{N}$
intradaySD Median | Intraday standard deviation of glucose (median across all days) | $\sigma_{\text{intraday median}} = \text{median}(\sigma_i)$
intradayCV Median | Intraday coefficient of variation of glucose (median across all days) | $CV_{\text{intraday median}} = \text{median}(CV_i)$
intradaySD Standard Deviation | Intraday standard deviation of glucose (standard deviation across all days) | $\sigma_{\text{intraday standard deviation}} = SD(\sigma_i)$
intradayCV Standard Deviation | Intraday coefficient of variation of glucose (standard deviation across all days) | $CV_{\text{intraday standard deviation}} = SD(CV_i)$
CONGA24 | Continuous overall net glycemic action over 24 hours | $CONGA24 = SD(\lvert G_t - G_{t - 24\,\text{hours}} \rvert)$
GMI | Glucose management indicator | $GMI = 3.31 + (0.02392 \times \mu\ (\text{mg/dL}))$
HBGI | High Blood Glucose Index | $HBGI = \frac{1}{n}\sum_{i=1}^{n} rh_i$, where $rh_i = 22.77 \times f(G_i)^2$ if $f(G_i) > 0$ and $rh_i = 0$ if $f(G_i) \le 0$, with $f(G_i) = \ln(G_i)^{1.084} - 5.381$ and $n$ = number of glucose values
LBGI | Low Blood Glucose Index | $LBGI = \frac{1}{n}\sum_{i=1}^{n} rl_i$, where $rl_i = 22.77 \times f(G_i)^2$ if $f(G_i) \le 0$ and $rl_i = 0$ if $f(G_i) > 0$, with $f(G_i) = \ln(G_i)^{1.084} - 5.381$ and $n$ = number of glucose values
ADRR | Average Daily Risk Range, assessment of total daily glucose variations within risk space | $ADRR = \frac{\sum_{\text{days}} (LR_i + HR_i)}{N_{\text{days}}}$, where $LR_i = \max(rl)$ and $HR_i = \max(rh)$ on day $i$
J-index | Measure of both the mean level and variability of glycemia | $J = 0.001 \times (\mu + \sigma)^2$
MAGE | Mean amplitude of glucose excursions (default = 1SD) | (1) Local minima/maxima determined. (2) Max/min pairs assessed against SD. (3) If difference from min to max > SD, mean measure is retained. (4) Otherwise excluded. (5) Troughs are retained and summed.
MGE | Mean of glucose outside range (default = 1SD) | $\mu_{\text{glucose outside}}$, in contrast to $\mu_{\text{glucose inside}}$, the mean of glucose inside the range
MODD | Mean of daily differences in glucose | $MODD = \frac{\sum \lvert G_t - G_{t - 24\,\text{hours}} \rvert}{\text{total matched observations}}$
TIR | Time spent in range (minutes), default = 1SD | $TIR = \sum \text{time}_{\text{inside},\,i}$
TOR | Time spent outside range (minutes), default = 1SD | $TOR = \sum \text{time}_{\text{outside},\,i}$
POR | Percent of time spent outside range | $POR = \frac{TOR}{\text{total time}} \times 100\%$
PIR | Percent of time spent inside range, default = 1SD | $PIR = \frac{TIR}{\text{total time}} \times 100\%$
eA1c | Estimated A1c (according to American Diabetes Association) | $eA1c = \frac{46.7 + \mu}{28.7}$
meanG | Mean glucose over all days | $\mu = \frac{\sum \bar{x}_i}{N}$
medianG | Median glucose over all days | $\text{median}(\text{Glucose})$
minG | Minimum glucose over all days | $\min(\text{Glucose})$
maxG | Maximum glucose over all days | $\max(\text{Glucose})$
Q1G | First quartile glucose value over all days | $\text{first quartile}(\text{Glucose})$
Q3G | Third quartile glucose value over all days | $\text{third quartile}(\text{Glucose})$
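
As a rough illustration of how several Table 1 metrics can be computed, the following Python sketch uses only the standard library. It is not the cgmquantify implementation: the function names `summary_metrics` and `risk_indices` are hypothetical, the input is assumed to be a plain list of glucose values in mg/dL, and the standard deviation is assumed to be the population standard deviation. The risk indices assume the Kovatchev risk transform $f(G) = \ln(G)^{1.084} - 5.381$ with risk $22.77 \times f(G)^2$.

```python
import math
import statistics


def summary_metrics(glucose):
    """Compute a few Table 1 summary metrics from glucose values in mg/dL.

    Assumption: sigma is the population standard deviation of all values.
    """
    mu = statistics.fmean(glucose)
    sigma = statistics.pstdev(glucose)
    return {
        "meanG": mu,
        "SD": sigma,
        "CV": sigma / mu,                      # coefficient of variation
        "GMI": 3.31 + 0.02392 * mu,            # glucose management indicator
        "eA1c": (46.7 + mu) / 28.7,            # estimated A1c
        "J_index": 0.001 * (mu + sigma) ** 2,  # J-index
    }


def risk_indices(glucose):
    """Return (LBGI, HBGI) as means of the low and high risk values.

    Assumption: f(G) = ln(G)^1.084 - 5.381 and risk = 22.77 * f(G)^2,
    assigned to the low side when f <= 0 and to the high side when f > 0.
    """
    low, high = [], []
    for g in glucose:
        f = math.log(g) ** 1.084 - 5.381
        risk = 22.77 * f ** 2
        low.append(risk if f <= 0 else 0.0)
        high.append(risk if f > 0 else 0.0)
    return statistics.fmean(low), statistics.fmean(high)


if __name__ == "__main__":
    readings = [95, 110, 150, 210, 68, 130]  # illustrative values, not real data
    print(summary_metrics(readings))
    print(risk_indices(readings))
```

Keeping each metric as a small expression over the mean and standard deviation makes it straightforward to check a result against the corresponding equation in Table 1.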