CardioLearn: A Cloud Deep Learning Service for Cardiac Disease Detection from Electrocardiogram
Shenda Hong, Zhaoji Fu, Rongbo Zhou, Jie Yu, Yongkui Li, Kai Wang, Guanlin Cheng
HeartVoice Medical Technology, Hefei, China; University of Science and Technology of China, Hefei, China
ABSTRACT
The electrocardiogram (ECG) is one of the most convenient non-invasive tools for monitoring people's heart condition, and it can be used to diagnose a wide range of heart diseases, including cardiac arrhythmia and acute coronary syndrome. However, traditional ECG disease detection models show substantial rates of misdiagnosis due to the limited ability of the extracted features. Recent deep learning methods have shown significant advantages, but they are not offered as publicly available services for those who have no training data or computational resources.
In this paper, we demonstrate our work on building, training, and serving such an out-of-the-box cloud deep learning service for cardiac disease detection from ECG, named CardioLearn. The analytic ability of any other ECG recording device can be enhanced by connecting it to the Internet and invoking our open API. As a practical example, we also design a portable smart hardware device along with an interactive mobile program, which can collect ECG and detect potential cardiac diseases anytime and anywhere.
KEYWORDS
Deep learning, Healthcare, Electrocardiogram
The electrocardiogram (ECG) is one of the most convenient non-invasive tools for monitoring people's heart condition. It is a physiological signal that records the electrical activity of the cardiac muscle over a period of time through electrodes and leads placed on the human body. ECG can be used to diagnose a wide range of heart diseases, including cardiac arrhythmia and acute coronary syndrome [2, 18]. It is estimated that more than 300 million ECGs are recorded worldwide every year [4], which is a tremendous amount of data for cardiologists to analyze. Thus, many computer-aided ECG disease detection methods based on feature extraction and machine learning have been proposed over the past 50 years, and they have been used in commercial medical devices. However, existing commercial ECG disease detection methods still show substantial rates of misdiagnosis [3, 13, 14], due to the limited ability of the extracted features and a lack of generalizability, since the methods are tuned for their specific medical devices.

Recently, deep learning methods have shown great potential in healthcare and medicine [11, 16]. Specifically, several pioneering works have shown the success of deep learning methods on ECG disease detection [1, 4, 6, 7, 15, 17, 19] (see [8] for a survey). However, these methods are still far from practical application because none of these models has been deployed to provide publicly available ECG disease detection services. Besides, most of these models were trained only on single-lead ECG data and thus support only single-lead ECG disease detection, which is insufficient for most medical applications.

This demonstration provides CardioLearn, a publicly available out-of-the-box cloud deep learning service that can be used for cardiac disease detection from ECG. Any existing ECG recording device can be enhanced with cardiac disease detection ability by connecting it to the Internet and invoking our open API. To further demonstrate such practical usage, we also design a portable smart hardware device along with an interactive mobile program, so that people can easily collect their ECG records and detect potential cardiac diseases anytime and anywhere.

Figure 1: The workflow of the CardioLearn ECG disease detection cloud deep learning service.
This section introduces our system in detail. We first introduce the details of our deep learning model and its performance on a publicly available open-source dataset. Then we show how we deploy the model and serve it as a cloud service.

The framework of CardioLearn is shown in Figure 1. We first build and train two deep neural network models to support applications in the healthcare environment (outside the hospital, usually single lead) and the medical environment (inside the hospital, usually 12-lead). We then serve the models through an open API over the HTTP protocol. Moreover, we also design a portable hardware device along with an interactive mobile program as a practical application.

A Deep Learning Model for ECG Disease Detection

ECG data usually comes in two forms: single lead (outside the hospital, see Figure 6) and 12-lead (inside the hospital, see Figure 5). Here “lead” has the same meaning as “channel”. To handle both, we build two deep neural network models, one for each form, using TensorFlow. Their input layers are different, while the other layers remain the same. In detail, as shown in Figure 2, the input ECG recording is cut into several short segments, and each segment goes through 32 stacked one-dimensional convolutional layers (CNN) to capture local ECG patterns and shifts. One recurrent layer (RNN) is then built on top of the convolutional layers to capture long-term variations. Finally, the model applies multiple dense layers (Dense) to the output of the recurrent layer to get the prediction for each disease. The model is trained as a multi-label learning task because multiple diseases might occur in the same ECG recording. Moreover, we introduce shortcut connections [5] at every two convolutional layers to address the problem of vanishing or exploding gradients when training a very deep neural network. The input dimension is downsampled at every four convolutional layers, and the number of filters increases at every eight convolutional layers. The CNN weights are shared across segments.
Figure 2: Model architecture.
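As a rough illustration of the convolutional stack described above, the following sketch computes a per-layer schedule for 32 layers with a shortcut every two layers, downsampling every four layers, and more filters every eight layers. The concrete numbers are assumptions made only for this example and are not stated in the paper: a base width of 32 filters, doubling of the filter count, and stride-2 downsampling. The names `ConvLayerSpec`, `conv_schedule`, and `output_length` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayerSpec:
    index: int            # 1-based position in the stack
    filters: int          # number of convolution filters at this layer
    stride: int           # 2 means the time axis is halved at this layer
    opens_shortcut: bool  # True if a two-layer shortcut block starts here


def conv_schedule(n_layers: int = 32,
                  base_filters: int = 32,      # assumed, not from the paper
                  downsample_every: int = 4,
                  widen_every: int = 8,
                  shortcut_every: int = 2) -> List[ConvLayerSpec]:
    """Build the layer-by-layer plan for the stacked 1-D convolutions."""
    specs = []
    filters = base_filters
    for i in range(1, n_layers + 1):
        # Assumed rule: double the width at the start of each group of eight.
        if i > 1 and (i - 1) % widen_every == 0:
            filters *= 2
        # Assumed rule: downsample with stride 2 on every fourth layer.
        stride = 2 if i % downsample_every == 0 else 1
        specs.append(ConvLayerSpec(i, filters, stride,
                                   (i - 1) % shortcut_every == 0))
    return specs


def output_length(input_length: int, specs: List[ConvLayerSpec]) -> int:
    """Length of the time axis after the stack, assuming 'same' padding."""
    length = input_length
    for spec in specs:
        length = -(-length // spec.stride)  # ceiling division
    return length


if __name__ == "__main__":
    plan = conv_schedule()
    print(plan[-1].filters, output_length(2560, plan))
```

Under these assumptions the stack contains eight downsampling layers, so a segment is shortened by a factor of 256 before it reaches the recurrent layer.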
To train the deep model, we collected training data from several hospitals: 12-lead ECG recordings lasting from 20 s to several minutes. The corresponding diagnoses were written by cardiologists in narrative language; we extract keywords and integrate them into diagnostic labels based on standard ECG diagnostic statement lists such as [10]. We use Lead I as single-lead data and also collect extra single-lead data from mobile devices. Finally, our 12-lead model supports 43 types of diseases and our single-lead model supports 18 types of diseases, both covering over 99% of the total abnormalities in our training data. While optimizing the loss function, we reduce the learning rate by a factor of 0.1 whenever the validation performance of a task has not improved for 5,000 batches. We continuously save and update the best model for each label and use it as the final model.

We tested our model on a publicly available open-source dataset from the 2018 China Physiological Signal Challenge [9] (http://2018.icbeb.org/Challenge.html). The challenge ECG recordings were collected from 11 hospitals and sampled at 500 Hz; the dataset contains 6,877 (3,178 female, 3,699 male) 12-lead ECG recordings lasting from 6 s to 60 s. These recordings were never used to train our model. We test model performance on detecting atrial fibrillation (AF), first-degree atrioventricular block (AVBI), left bundle branch block (LBBB), right bundle branch block (RBBB), premature atrial contraction (PAC), and premature ventricular contraction (PVC). We report the Receiver Operating Characteristic curve (ROC curve) and the area under the ROC curve (ROC-AUC score) for each disease. The results are shown in Table 1 and Figure 3. Both models achieve ROC-AUC scores higher than 0.93 on every disease except RBBB with the single-lead model (0.8927). AF detection is even higher than 0.97 for both models.

              AF      AVBI    LBBB    RBBB    PAC     PVC
Single lead   0.9857  0.9508  0.9597  0.8927  0.9343  0.9578
12-lead       0.9789  0.9579  0.9385  0.9655  0.9462  0.9609

Table 1: ROC-AUC scores on 2018 China Physiological Signal Challenge dataset.

Figure 3: ROC curves on 2018 China Physiological Signal Challenge dataset.
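For readers unfamiliar with the metric in Table 1, the sketch below shows one standard way to compute a per-disease ROC-AUC score in a multi-label setting: the fraction of (positive, negative) recording pairs that the model ranks correctly, with ties counted as half. This is a generic definition of ROC-AUC, not the authors' evaluation code; the function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def per_label_auc(label_rows: List[List[int]],
                  score_rows: List[List[float]],
                  names: List[str]) -> Dict[str, float]:
    """One ROC-AUC per disease; each row is one ECG recording."""
    return {
        name: roc_auc([row[j] for row in label_rows],
                      [row[j] for row in score_rows])
        for j, name in enumerate(names)
    }


if __name__ == "__main__":
    # Toy data: four recordings, two labels (values are made up).
    y = [[1, 0], [0, 1], [1, 1], [0, 0]]
    p = [[0.9, 0.2], [0.1, 0.7], [0.8, 0.6], [0.3, 0.4]]
    print(per_label_auc(y, p, ["AF", "PVC"]))
```

The pairwise loop is quadratic in the number of recordings, which is fine for illustration; a rank-based implementation would be preferable for large test sets.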
We deploy our models using TensorFlow Serving [12] on four cloud servers. Specifically, we use the Java client of the TensorFlow Serving gRPC API. Each cloud server is equipped with a 4-core Intel Xeon Skylake 6146 3.2 GHz CPU, 16 GB RAM, and 3 Tesla P4 GPUs for model inference. Communication between servers and clients is based on the HTTP protocol. Each HTTP request consists of a HEADER and a POST body. The HEADER includes the content type (for example, “JSON”) and authorization information; the authorization is a unique token given by the server. The POST body follows that content type and includes sampleRate (Hz, the sampling frequency of the ECG signal), adcGain (analog-to-digital converter gain), and dataI, dataII, dataIII, dataAVR, dataAVL, dataAVF, dataV1, dataV2, dataV3, dataV4, dataV5, dataV6, which represent the 12 standard leads. To request the single-lead model, one can fill in only dataI and leave the others null. The returned result is also in JSON format, so one can easily parse it and integrate it into one's own system.

The master server maintains a global task queue Q, which holds all requests from clients. The task queue follows First In First Out (FIFO) order. Each request contains an authorization token T, analysis parameters P (the sampling rate, for example), and ECG data D. The disease detection process includes three steps: (1) authorization, (2) preprocessing, and (3) invoking the deep model. Authorization validates the request by checking its unique token T. Once authorization succeeds, the request is added to Q and waits for computation resources. Then, the engine resamples the original ECG and removes high-frequency noise as well as low-frequency baseline wander with band-pass filters. After that, the engine invokes the GPU to run inference and obtain the results. If any step raises an exception, the request is added to the task queue Q again and waits for the next round of analysis. If the process fails or times out, the server returns failure information. Note that models are served by multiple concurrent processes, so the delay caused by a retry is expected to be short.

We ran a stress test with Apache JMeter to validate the performance of the cloud service. We prepared 20,000 12-lead, 30-second ECG recordings. The number of concurrent processes on each server is set to 15. The results are shown in Figure 4. The left panel shows the distribution of processing time: almost 99% of requests are returned within 2.5 seconds. This time cost is promising for real-world application because it reduces the time for one ECG disease detection report from minutes (by cardiologists) to seconds. The right panel shows the test summary. The total throughput is 11.5 records per second, which means CardioLearn can analyze nearly 1 million 30-second ECG recordings per day (11.5 × 86,400 s ≈ 994,000). Note that when handling long-term ECG recordings, we can still keep a comparable execution time by cutting them into short recordings and batching them, where each batch can contain hundreds of short recordings.

Figure 4: Results of stress test.
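The request handling described in this section (token check, FIFO queueing, preprocessing, inference, and re-queueing on error) can be sketched as follows. This is a minimal single-process illustration, not the deployed service: the `preprocess` and `infer` callables stand in for the band-pass filtering and the GPU model, the `max_attempts` limit is an assumption introduced here so that a request eventually fails, and timeouts are not modeled. All class and function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict, List, Set


@dataclass
class Request:
    token: str                 # authorization token T
    params: Dict[str, float]   # analysis parameters P, e.g. sampling rate
    data: List[float]          # ECG data D
    attempts: int = 0          # how many times processing has failed


class TaskQueue:
    """FIFO task queue Q with authorization and retry on exception."""

    def __init__(self,
                 valid_tokens: Set[str],
                 preprocess: Callable[[List[float], Dict[str, float]], Any],
                 infer: Callable[[Any], Dict[str, float]],
                 max_attempts: int = 3):  # assumed limit, not from the paper
        self.valid_tokens = valid_tokens
        self.preprocess = preprocess
        self.infer = infer
        self.max_attempts = max_attempts
        self.queue: Deque[Request] = deque()

    def submit(self, request: Request) -> bool:
        """Step 1: only requests with a known token enter the queue."""
        if request.token not in self.valid_tokens:
            return False
        self.queue.append(request)
        return True

    def process_next(self) -> Dict[str, Any]:
        """Steps 2 and 3 for the oldest request; re-queue it on error."""
        request = self.queue.popleft()
        try:
            signal = self.preprocess(request.data, request.params)
            return {"status": "ok", "result": self.infer(signal)}
        except Exception as exc:
            request.attempts += 1
            if request.attempts < self.max_attempts:
                self.queue.append(request)  # wait for the next round
                return {"status": "retry"}
            return {"status": "failed", "reason": str(exc)}
```

A re-queued request goes to the back of the queue, so other waiting requests are served before it is retried.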
Apache JMeter, used for the stress test: https://jmeter.apache.org

Our demonstration consists of two parts: 1) a website offering an ECG analytics tool that provides an out-of-the-box cloud deep learning service, and 2) a portable hardware device to record ECG together with an interactive mobile application to display results. For more information, please visit https://github.com/hsd1503/CardioLearn.

Figure 5 shows the webpage of the ECG analytics tool. Getting analysis results takes three steps. First, users provide the necessary parameters of their ECG data, including sampling frequency, ADC gain, and baseline voltage. Then, users have two ways to upload their ECG data: 1) upload a comma-separated values (CSV) file from their computer, or 2) copy and paste the CSV content directly into the textbox on the website. We also provide a variety of ECG records as example data. Finally, after the user clicks the “Analysis” button, the formatted ECG report, including ECG records and analysis results, is shown at the bottom of the page, as in Figure 5. In the ECG report, the main middle part shows the ECG recordings on a pink mesh grid background that helps cardiologists measure and review the report. The bottom left shows the disease detection results given by CardioLearn, which is atrial fibrillation in this case. The upper part shows ECG measurements such as PR interval and QRS width, which also help cardiologists review the report.

Figure 5: Demo of Webpage for Cloud Deep Learning Service (12-lead ECG).

Figure 6 shows our portable hardware device and interactive mobile application. The portable hardware device weighs around 10 g and measures 75 mm in length, 25 mm in width, and 4.5 mm in height, which makes it convenient to carry in daily life. A Bluetooth module on the chip connects the device to a mobile phone. The interactive mobile application is developed as a WeChat Mini Program, so it supports any mobile phone, Android or iOS, that can install WeChat. Usage is simplified to one step: the user places their fingers on the metal electrodes and waits for 30 seconds. The device records the ECG and sends the data to the application via Bluetooth; the mobile application then sends a request to CardioLearn for analysis and finally displays the returned results on the user's mobile phone. Past ECG records and results can also be reviewed in the mobile application.
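To make the client side concrete, the sketch below turns pasted CSV text into a JSON request body using the field names listed in the serving section (sampleRate, adcGain, dataI through dataV6). Several details are assumptions made for illustration and are not specified in the paper: the CSV has no header row, one row per sample, and one column per lead in the order shown; each lead is sent as a list of numbers; and the token is carried in a header named `Authorization`. The function names are hypothetical, and no endpoint URL is given here because the paper does not state one.

```python
import csv
import io
import json
from typing import Dict

# Field names for the 12 standard leads, as listed in the request format.
LEAD_FIELDS = ["dataI", "dataII", "dataIII", "dataAVR", "dataAVL", "dataAVF",
               "dataV1", "dataV2", "dataV3", "dataV4", "dataV5", "dataV6"]


def build_request_body(csv_text: str, sample_rate: float, adc_gain: float) -> str:
    """Convert CSV samples (assumed layout: one column per lead) to JSON."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("no ECG samples found")
    n_leads = len(rows[0])
    if n_leads not in (1, 12):
        raise ValueError("expected 1 column (single lead) or 12 columns")
    body = {"sampleRate": sample_rate, "adcGain": adc_gain}
    for j, field in enumerate(LEAD_FIELDS):
        # Leads that are not provided stay null, which selects the
        # single-lead model when only dataI is filled in.
        body[field] = [float(r[j]) for r in rows] if j < n_leads else None
    return json.dumps(body)


def build_headers(token: str) -> Dict[str, str]:
    """HTTP headers; the 'Authorization' header name is an assumption."""
    return {"Content-Type": "application/json", "Authorization": token}


if __name__ == "__main__":
    print(build_request_body("10\n20\n30\n", sample_rate=500, adc_gain=1000))
```

The body and headers can then be sent with any HTTP client as a POST request to the service endpoint.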
In this paper, we introduce CardioLearn, a publicly available out-of-the-box cloud deep learning service for cardiac disease detection from ECG, which can help improve the analytic ability of existing ECG recording devices. We also design a portable smart hardware device along with an interactive mobile program to demonstrate such practical usage. We hope that everyone can easily detect potential cardiac diseases early, anytime and anywhere.

Figure 6: Demo of Portable Hardware Device and Mobile Application (Single Lead ECG).
REFERENCES
[1] Zachi I Attia, Suraj Kapa, Francisco Lopez-Jimenez, Paul M McKie, Dorothy J Ladewig, Gaurav Satam, Patricia A Pellikka, Maurice Enriquez-Sarano, Peter A Noseworthy, Thomas M Munger, et al. 2019. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nature medicine 25, 1 (2019), 70.
[2] Tomas B Garcia. 2014. Introduction to 12-lead ECG: The art of interpretation. Jones & Bartlett Publishers.
[3] Maya E Guglin and Deepak Thatai. 2006. Common errors in computer electrocardiogram interpretation. International journal of cardiology
Nature medicine 25, 1 (2019), 65.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Identity mappings in deep residual networks. In European conference on computer vision. Springer, 630–645.
[6] Shenda Hong, Meng Wu, Yuxi Zhou, Qingyun Wang, Junyuan Shang, Hongyan Li, and Junqing Xie. 2017. ENCASE: An ENsemble ClASsifiEr for ECG classification using expert features and deep neural networks. In . IEEE, 1–4.
[7] Shenda Hong, Cao Xiao, Tengfei Ma, Hongyan Li, and Jimeng Sun. 2019. MINA: multilevel knowledge-guided attention for modeling electrocardiography signals. In Proceedings of the 28th International Joint Conference on Artificial Intelligence. AAAI Press, 5888–5894.
[8] Shenda Hong, Yuxi Zhou, Junyuan Shang, Cao Xiao, and Jimeng Sun. 2019. Opportunities and Challenges in Deep Learning Methods on Electrocardiogram Data: A Systematic Review. arXiv preprint arXiv:2001.01550 (2019).
[9] Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, et al. 2018. An Open Access Database for Evaluating the Algorithms of Electrocardiogram Rhythm and Morphology Abnormality Detection. Journal of Medical Imaging and Health Informatics 8, 7 (2018), 1368–1373.
[10] Jay W Mason, E William Hancock, and Leonard S Gettes. 2007. Recommendations for the standardization and interpretation of the electrocardiogram: part II: electrocardiography diagnostic statement list a scientific statement from the American Heart Association Electrocardiography and Arrhythmias Committee, Council on Clinical Cardiology; the American College of Cardiology Foundation; and the Heart Rhythm Society Endorsed by the International Society for Computerized Electrocardiology. Journal of the American College of Cardiology 49, 10 (2007), 1128–1135.
[11] Riccardo Miotto, Fei Wang, Shuang Wang, Xiaoqian Jiang, and Joel T Dudley. 2017. Deep learning for healthcare: review, opportunities and challenges. Briefings in bioinformatics 19, 6 (2017), 1236–1246.
[12] Christopher Olston, Noah Fiedel, Kiril Gorovoy, Jeremiah Harmsen, Li Lao, Fangwei Li, Vinu Rajashekhar, Sukriti Ramesh, and Jordan Soyke. 2017. Tensorflow-serving: Flexible, high-performance ml serving. arXiv preprint arXiv:1712.06139 (2017).
[13] Jürg Schläpfer and Hein J Wellens. 2017. Computer-interpreted electrocardiograms: benefits and limitations. Journal of the American College of Cardiology
Journal of electrocardiology 40, 5 (2007), 385–390.
[15] Supreeth P Shashikumar, Amit J Shah, Gari D Clifford, and Shamim Nemati. 2018. Detection of paroxysmal atrial fibrillation using attention-based bidirectional recurrent neural networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 715–723.
[16] Cao Xiao, Edward Choi, and Jimeng Sun. 2018. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. Journal of the American Medical Informatics Association 25, 10 (2018), 1419–1428.
[17] Yanbo Xu, Siddharth Biswal, Shriprasad R Deshpande, Kevin O Maher, and Jimeng Sun. 2018. Raim: Recurrent attentive and intensive model of multimodal patient monitoring data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2565–2573.
[18] Frank G Yanowitz. 2012. Introduction to ECG interpretation. LDS Hospital and Intermountain Medical Center (2012).
[19] Yuxi Zhou, Shenda Hong, Junyuan Shang, Meng Wu, Qingyun Wang, Hongyan Li, and Junqing Xie. 2019. K-margin-based residual-convolution-recurrent neural network for atrial fibrillation detection. In