Quantitative Biology Quantitative Methods

cgmquantify: Python and R packages for comprehensive analysis of interstitial glucose and glycemic variability from continuous glucose monitor data

Brinnae Bent,  Maria Henriquez,  Jessilyn Dunn

Abstract
Continuous glucose monitoring (CGM) systems provide real-time, dynamic glucose information by tracking interstitial glucose values throughout the day (values are typically recorded every 5 minutes). CGMs are commonly used in diabetes management by clinicians and patients, and in research to understand how longitudinal glucose and glucose variability relate to disease onset and severity and to the efficacy of interventions. CGM data presents unique bioinformatic challenges because the data is longitudinal and temporal, and there are nearly infinite possible ways to summarize and use it. There are over 20 metrics of glucose variability, no standardization of metrics, and little validation across studies. Here we present open source Python and R packages called cgmquantify, which contain over 20 functions with over 25 clinically validated metrics of glucose and glucose variability, as well as functions for visualizing longitudinal CGM data. The packages are expected to be useful for researchers and may provide additional insights to patients and clinicians about glucose patterns.

Affiliations: Department of Biomedical Engineering, Duke University, Durham, North Carolina; Department of Statistical Science, Duke University, Durham, North Carolina; Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina

TAGS
Open source, Python, R, bioinformatics, data analysis, glucose, glucose variability, continuous glucose monitoring, type I diabetes, type II diabetes, diabetes, prediabetes, Open APS

INTRODUCTION
Continuous glucose monitoring (CGM) systems provide real-time, dynamic glucose information by tracking interstitial glucose values throughout the day. CGMs are commonly used in diabetes management, with 1.2 million diabetic patients using a CGM. CGM use has been associated with improved glycemic control in adults with type 1 diabetes (T1D).
These devices have been used extensively by the T1D community, including in the Open Artificial Pancreas System Project (OpenAPS), a project developed to create a patient-implemented closed-loop system between a CGM and an insulin pump.

CGM data is commonly provided by CGM manufacturers either as raw glucose values (in .csv format) or as summary reports that use proprietary methods to plot and summarize glucose statistics (e.g., Dexcom Clarity currently shows overall mean glucose, standard deviation of glucose, time in range, and hypoglycemia risk, as well as daily minimum, maximum, mean, and standard deviation of glucose). Because these algorithms are proprietary, they cannot be properly validated by clinical researchers. Additionally, the provided glucose summaries are extremely limited and do not usually contain any information about an important clinical metric, glycemic variability.

Glycemic variability, also known as glucose variability, is an established risk factor for hypoglycemia and has been shown to be a risk factor in diabetes complications. Glucose variability appears in over 26,000 publications indexed in PubMed at the time of this publication and is a significant metric in clinical research. Over 20 metrics of glucose variability have been identified (Table 1), which makes it difficult to compare results across the many research studies that analyze and draw conclusions about glucose variability.

There is a need for an open source resource with algorithms that are utilized and validated in clinical research studies. Such a resource would enable standardized glucose variability metrics and make it possible to compare findings from studies that use different metrics of glucose variability. It should be available in an open source programming language with a low barrier to entry, to encourage researchers, clinicians, and patients alike to explore CGM data. Previous open-source resources have been implemented in Excel and R.
There is currently no comprehensive resource for CGM data in Python, the third most common programming language used globally and the leading language among newcomers. Additionally, previous implementations of open source CGM data analysis offer limited metrics of glucose variability. Further, these methods are typically developed for a specific purpose and are therefore not extensible (e.g., they do not provide simple functions that let users customize their metrics and visualizations).

We have developed a package written and published in Python under the MIT license and a package written and published in R under the MIT license. The packages, both named cgmquantify, contain over 20 functions with more than 25 metrics summarizing glucose and glucose variability. Code for visualizing CGM data is also included in both packages. Both packages are available in the Digital Biomarker Discovery Pipeline (DBDP), the open source software platform for digital biomarker discovery. The Python package is available on the Python Package Index (PyPI) (https://pypi.org/project/cgmquantify/). Source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify. The R package is available on the Comprehensive R Archive Network (CRAN) (https://CRAN.R-project.org/package=cgmquantify) and source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify-R.

METHODS
cgmquantify is a Python package and an R package composed of 20+ functions with over 25 clinically validated metrics of glucose and glucose variability, as shown in Table 1. Customizable visualizations (Figure 1, Figure 2) are also included as easy-to-implement functions. cgmquantify is version controlled through GitHub and distributed through PyPI or CRAN. This allows for single-line installation in either language. Source code and an extensive user guide are available on GitHub to facilitate ease of use and enable customization based on user needs.
Issue tracking on GitHub is monitored closely by the Digital Biomarker Discovery Pipeline to allow for rapid feedback. Tests are available on GitHub under the tests subdirectory to allow for manual testing of all functions.

RESULTS
We have included import functions to format data for use with the cgmquantify package. These functions currently support Dexcom CGM devices, with plans to add import functions for other CGM manufacturers, including Medtronic and Abbott. Our user guide also outlines how to format data so that any data input is compatible with the functions in cgmquantify. Functions are available for all the commonly studied glucose and glucose variability metrics (Table 1). Additionally, functions for visualizing the longitudinal CGM data are provided. These visualizations are easily customizable. We have also implemented a function that enables LOWESS smoothing over the CGM data (Figure 1).

DISCUSSION
cgmquantify is a package that simplifies the process of calculating metrics and thus allows for easy comparison across research studies that use different metrics summarizing glucose and glucose variability. Functions have been developed using equations from clinically validated research studies so users can compare their results to previous findings. The cgmquantify package is easily implemented with a one-line installation and an extensive user guide in both the Python and R languages. Detailed documentation facilitates modification of existing code for customization of input and visualizations. This package also has the potential to build a community of developers who contribute to the literature in this burgeoning field. This is a much-needed resource for the community of researchers, clinicians, and patients using CGM.
Currently, little is understood about the relationships between glucose and glucose variability metrics from CGM data and diseases including, but not limited to, prediabetes, T2D, and severity of symptoms in T1D. As more researchers and clinicians look to CGM data to answer these questions, a standardized resource in a nearly ubiquitous programming language becomes necessary. As we have seen with the OpenAPS community, analysis of CGM data is not limited to researchers and clinicians but includes patients themselves. By providing this as an open source resource, we hope to encourage patients to interact with their own data, determine personalized insights, and make meaningful contributions to the digital health landscape.

FUTURE IMPLEMENTATIONS
Future contributions will include additional import functions customized to all the CGM manufacturers, including but not limited to Medtronic and Abbott. We are exploring methods to incorporate food logs into visualizations of CGM data.

CODE AVAILABILITY
The cgmquantify Python package is available on the Python Package Index (PyPI) (https://pypi.org/project/cgmquantify/). Source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify. The cgmquantify R package is available on the Comprehensive R Archive Network (CRAN) (https://CRAN.R-project.org/package=cgmquantify) and source code can be found at https://github.com/DigitalBiomarkerDiscoveryPipeline/cgmquantify-R. We encourage others to expand on our ideas and contribute their own glucose and glucose variability metrics to cgmquantify. Documentation for contributing is available in our User Guide.

REFERENCES
1. Bent B, Wang K, Grzesiak E, et al. The Digital Biomarker Discovery Pipeline: An open source software platform for the development of digital biomarkers using mHealth and wearables data. J Clin Transl Sci. 2020;(11):1-28. doi:10.1017/cts.2020.511.
2. Cho P, Bent B, Wittmann A, et al. Expanding the Definition of Intraday Glucose Variability. In: Diabetes. American Diabetes Association; 2020.
3. eAG/A1C Conversion Calculator | American Diabetes Association. https://professional.diabetes.org/diapro/glucose_calc. Accessed February 15, 2020.
4. Goldsack J, Coravos A, Bakker J, Bent B, Dowling AV, Fitzer-Attas C, Godfrey A, Godino JG, Gujar N, Izmailova E, Manta C, Peterson B, Vandendressche BV, Wood WA, Wang KW DJ. Verification, Analytical Validation, and Clinical Validation (V3): The Foundation of Determining Fit-for-Purpose for Biometric Monitoring Technologies (BioMeTs). JMIR Prepr. 2020. https://preprints.jmir.org/preprint/17264.
5. De Groot M, Drangsholt M, Martin-Sanchez F, Wolf G. Single Subject (N-of-1) Research Design, Data Processing, and Personal Science. Methods Inf Med. 2017;56(06):416-418. doi:10.3414/ME17-03-0001.
6. Hill NR, Oliver NS, Choudhary P, Levy JC, Hindmarsh P, Matthews DR. Normal reference range for mean tissue glucose and glycemic variability derived from continuous glucose monitoring for subjects without diabetes in different ethnic groups. Diabetes Technol Ther. 2011;13(9):921-928. doi:10.1089/dia.2010.0247.
7. Kovatchev B. Glycemic Variability: Risk Factors, Assessment, and Control. J Diabetes Sci Technol. 2019;13(4):627-635. doi:10.1177/1932296819826111.
8. Kovatchev BP. Metrics for glycaemic control - from HbA1c to continuous glucose monitoring. Nat Rev Endocrinol [...]
[...] Diabetes Technol Ther. 2011;13(12):1241-1248. doi:10.1089/dia.2011.0099.
12. Service FJ. Glucose variability. Diabetes. 2013;62(5):1398-1404. doi:10.2337/db12-1396.
13. Suh S, Kim JH. Glycemic variability: How do we measure it and why is it important? Diabetes Metab J. 2015;39(4):273-282. doi:10.4093/dmj.2015.39.4.273.
14. Tamborlane WV, Beck RW, Bode BW, et al. Continuous Glucose Monitoring and Intensive Treatment of Type 1 Diabetes. N Engl J Med. 2008;359(14):1464-1476. doi:10.1056/NEJMoa0805017.
15. Umpierrez GE, Kovatchev BP. Glycemic Variability: How to Measure and Its Clinical Implication for Type 2 Diabetes. Am J Med Sci. 2018;356(6):518-527. doi:10.1016/j.amjms.2018.09.010.
16. Vigers T, Chan CL, Snell-Bergeon J, et al. cgmanalysis: An R package for descriptive analysis of continuous glucose monitor data. Bethin KE, ed. PLoS One. 2019;14(10):e0216851. doi:10.1371/journal.pone.0216851.
17. Wojcicki JM. 'J'-index. A new proposition of the assessment of current glucose control in diabetic patients. Horm Metab Res [...]
[...] Bioinformatics. 2018;34(9):1609-1611. doi:10.1093/bioinformatics/btx826.

Figure 1. Visualizing longitudinal CGM data with the cgmquantify Python package.

Shown are (a) a visualization with indicators of the mean glucose level and 1 SD from the mean, (b) a visualization with indicators of hyperglycemia (>180 mg/dL glucose) and hypoglycemia (<70 mg/dL glucose), and (c) a plot with LOWESS smoothing of the glucose data.
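The quantities behind the three panels of Figure 1 can be sketched in plain Python. This is an illustrative sketch only and not the cgmquantify API: the function names `mean_sd_band`, `flag_excursions`, and `lowess` are hypothetical, the population standard deviation is an assumption, and the smoother is a simplified, non-robust LOWESS (tricube-weighted local linear fits without robustness iterations), which may differ from the smoother the package uses.

```python
import math
import statistics


def mean_sd_band(glucose, n_sd=1.0):
    """Return (lower, mean, upper) reference lines, as in panel (a)."""
    mu = statistics.fmean(glucose)
    sd = statistics.pstdev(glucose)  # population SD (assumption)
    return mu - n_sd * sd, mu, mu + n_sd * sd


def flag_excursions(glucose, low=70.0, high=180.0):
    """Label each reading as 'hypo', 'hyper', or 'in_range', as in panel (b)."""
    labels = []
    for g in glucose:
        if g < low:
            labels.append("hypo")
        elif g > high:
            labels.append("hyper")
        else:
            labels.append("in_range")
    return labels


def lowess(x, y, frac=0.3):
    """Simplified LOWESS: one tricube-weighted linear fit per x value."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbours per local fit
    smoothed = []
    for x0 in x:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed
```

Here `x` would be elapsed time (for example, minutes since the first reading) and `y` the glucose values in mg/dL; a larger `frac` gives a smoother curve.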

Figure 2. Visualizing longitudinal CGM data with the cgmquantify R package.

Shown is a visualization available in the cgmquantify R package that displays CGM data by time of day for each day specified.

Table 1. Glucose and Glucose Variability Metrics

| Metric | Description | Equation |
|---|---|---|
| interdaySD | Interday standard deviation of glucose | $\sigma_{\text{interday}} = \sqrt{\frac{\sum (G_i - \mu)^2}{N}}$, where $N$ = total days and $G$ = glucose value |
| interdayCV | Interday coefficient of variation of glucose | $CV_{\text{interday}} = \frac{\sigma_{\text{interday}}}{\mu}$ |
| intradaySD Mean | Intraday standard deviation of glucose (mean across all days) | $\sigma_{\text{intraday mean}} = \frac{\sum \sigma_i}{N}$ |
| intradayCV Mean | Intraday coefficient of variation of glucose (mean across all days) | $CV_{\text{intraday mean}} = \frac{\sum CV_i}{N}$ |
| intradaySD Median | Intraday standard deviation of glucose (median across all days) | $\sigma_{\text{intraday median}} = \text{median}(\sigma_i)$ |
| intradayCV Median | Intraday coefficient of variation of glucose (median across all days) | $CV_{\text{intraday median}} = \text{median}(CV_i)$ |
| intradaySD Standard Deviation | Intraday standard deviation of glucose (standard deviation across all days) | $\sigma_{\text{intraday standard deviation}} = SD(\sigma_i)$ |
| intradayCV Standard Deviation | Intraday coefficient of variation of glucose (standard deviation across all days) | $CV_{\text{intraday standard deviation}} = SD(CV_i)$ |
| CONGA24 | Continuous overall net glycemic action over 24 hours | $CONGA24 = SD(\lvert G_t - G_{t-24\,\text{hours}} \rvert)$ |
| GMI | Glucose management indicator | $GMI = 3.31 + (0.02392 \times \mu)$, with $\mu$ in mg/dL |
| HBGI | High Blood Glucose Index | $HBGI = \frac{\sum r_h}{n}$, where $r_h = 22.77\, f(G_i)^2$ if $f(G_i) > 0$ and $r_h = 0$ otherwise, with $f(G_i) = (\ln G_i)^{1.084} - 5.381$ |
| LBGI | Low Blood Glucose Index | $LBGI = \frac{\sum r_l}{n}$, where $r_l = 22.77\, f(G_i)^2$ if $f(G_i) \le 0$ and $r_l = 0$ otherwise, with $f(G_i) = (\ln G_i)^{1.084} - 5.381$ |
| ADRR | Average Daily Risk Range, assessment of total daily glucose variations within risk space | $ADRR = \frac{\sum_{\text{all days}} (LR_j + HR_j)}{N_{\text{days}}}$, where $LR_j = \max(r_l)$ and $HR_j = \max(r_h)$ |
| J-index | Measure of both the mean level and variability of glycemia | $J = 0.001 \times (\mu + \sigma)^2$ |
| MAGE | Mean amplitude of glucose excursions (default = 1SD) | 1. Local minima/maxima determined. 2. Assess max/min pairs against SD. 3. If difference from min to max > SD, mean measure is retained. 4. Otherwise excluded. 5. Troughs are retained and summed. |
| MGE | Mean of glucose outside range (default = 1SD) | $\mu_{\text{glucose outside}}$, where $\mu_{\text{glucose inside}}$ [...] |
| MODD | Mean of daily differences in glucose | $MODD = \frac{\sum \lvert G_t - G_{t-24\,\text{hours}} \rvert}{\text{total matched observations}}$ |
| TIR | Time spent in range (minutes), default = 1SD | $TIR = \sum^{N} \text{time}_{\text{inside}}$ |
| TOR | Time spent outside range (minutes), default = 1SD | $TOR = \sum^{N} \text{time}_{\text{outside}}$ |
| POR | Percent of time spent outside range | $POR = \frac{TOR}{\text{total time}} \times 100\%$ |
| PIR | Percent of time spent inside range, default = 1SD | $PIR = \frac{TIR}{\text{total time}} \times 100\%$ |
| eA1c | Estimated A1c (according to American Diabetes Association) | $eA1c = \frac{46.7 + \mu}{28.7}$ |
| meanG | Mean glucose over all days | $\mu = \frac{\sum^{N} \bar{x}_i}{N}$ |
| medianG | Median glucose over all days | $\text{median}(\text{Glucose})$ |
| minG | Minimum glucose over all days | $\min(\text{Glucose})$ |
| maxG | Maximum glucose over all days | $\max(\text{Glucose})$ |
| Q1G | First quartile glucose value over all days | first quartile (Glucose) |
| Q3G | Third quartile glucose value over all days | third quartile (Glucose) |
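Several of the metrics in Table 1 can be sketched in a few lines of standard-library Python. The block below is a hedged illustration, not the cgmquantify implementation: the function names are hypothetical, the population standard deviation and a 5-minute sampling interval are assumptions, and the risk-index constants (22.77, 1.084, 5.381) follow the commonly published form of the symmetrized blood glucose risk function for values in mg/dL. The demo readings are made up.

```python
import math
import statistics
from collections import defaultdict
from datetime import datetime


def summary_metrics(glucose):
    """Mean-based metrics for a list of glucose values in mg/dL."""
    mu = statistics.fmean(glucose)
    sd = statistics.pstdev(glucose)  # population SD (assumption)
    return {
        "meanG": mu,
        "interdaySD": sd,
        "interdayCV": sd / mu,
        "GMI": 3.31 + 0.02392 * mu,
        "eA1c": (46.7 + mu) / 28.7,
        "J_index": 0.001 * (mu + sd) ** 2,
    }


def time_in_range(glucose, n_sd=1.0, minutes_per_reading=5):
    """TIR/TOR in minutes and PIR/POR in percent; range = mean +/- n_sd * SD."""
    mu = statistics.fmean(glucose)
    sd = statistics.pstdev(glucose)
    low, high = mu - n_sd * sd, mu + n_sd * sd
    inside = sum(1 for g in glucose if low <= g <= high)
    total = len(glucose) * minutes_per_reading
    tir = inside * minutes_per_reading
    tor = total - tir
    return {"TIR": tir, "TOR": tor,
            "PIR": 100 * tir / total, "POR": 100 * tor / total}


def risk_indices(glucose):
    """LBGI and HBGI; assumes every glucose value is well above 1 mg/dL."""
    low_risk, high_risk = [], []
    for g in glucose:
        f = math.log(g) ** 1.084 - 5.381  # symmetrized glucose scale
        r = 22.77 * f ** 2                # risk value of this reading
        low_risk.append(r if f <= 0 else 0.0)
        high_risk.append(r if f > 0 else 0.0)
    return {"LBGI": statistics.fmean(low_risk),
            "HBGI": statistics.fmean(high_risk)}


def intraday_sd(readings):
    """Mean, median, and SD of per-day SDs from (timestamp, glucose) pairs."""
    by_day = defaultdict(list)
    for timestamp, g in readings:
        by_day[timestamp.date()].append(g)
    daily = [statistics.pstdev(values) for values in by_day.values()]
    return {
        "intradaySD_mean": statistics.fmean(daily),
        "intradaySD_median": statistics.median(daily),
        "intradaySD_sd": statistics.pstdev(daily),
    }


# Made-up readings spanning two days, for illustration only.
demo = [
    (datetime(2020, 1, 1, 8, 0), 100), (datetime(2020, 1, 1, 8, 5), 120),
    (datetime(2020, 1, 2, 8, 0), 100), (datetime(2020, 1, 2, 8, 5), 140),
]
demo_result = intraday_sd(demo)
```

Grouping by calendar date is one simple choice for "day"; a rolling 24-hour window would be an equally reasonable design and would change the intraday values slightly.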

Submitted on 8 Feb 2021 Updated

arXiv.org Original Source