Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches
Philip John Gorinski, Honghan Wu, Claire Grover, Richard Tobin, Conn Talbot, Heather Whalley, Cathie Sudlow, William Whiteley, Beatrice Alex
Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh; Usher Institute, University of Edinburgh; Centre for Clinical Brain Sciences, University of Edinburgh; Edinburgh Futures Institute, University of Edinburgh
Abstract
This work investigates multiple approaches to Named Entity Recognition (NER) for text in Electronic Health Record (EHR) data. In particular, we look into the application of (i) rule-based, (ii) deep learning and (iii) transfer learning systems for the task of NER on brain imaging reports with a focus on records from patients with stroke. We explore the strengths and weaknesses of each approach, develop rules and train on a common dataset, and evaluate each system’s performance on common test sets of Scottish radiology reports from two sources (brain imaging reports in ESS – Edinburgh Stroke Study data collected by NHS Lothian as well as radiology reports created in NHS Tayside). Our comparison shows that a hand-crafted system is the most accurate way to automatically label EHR, but machine learning approaches can provide a feasible alternative where resources for a manual system are not readily available.
Introduction
Named Entity Recognition (NER) is an area of Natural Language Processing (NLP) that addresses the identification and classification of entities in written text. It has been applied using a large variety of methods and across a multitude of domains.
Electronic Health Records (EHR) typically contain not only structured information about a patient but also written, unstructured text describing health professionals’ opinions. Named entities in this domain include names of diseases, symptoms and anatomical locations. A radiology report is the opinion of a radiologist on a scan or X-ray. Figure 1 shows an example of an anonymised radiology report of a brain scan with identified named entities.
Report: There is loss of the neuronal tissue in the left inferior frontal and superior temporal lobes, consistent with a prior infarct. There is generalised cerebral volume loss which appears within normal limits for the patient’s age, with no focal element to the generalised atrophy. Major intracranial vessels appear patent. White matter of the brain appears largely normal, with no evidence of significant small vessel disease. No mass lesion, hydrocephalus or extra axial collection.
Summary: Old left hemispheric infarct. No other significant finding.
Entity types: location:cortical, time:old, ischaemic stroke, atrophy, small vessel disease, tumour, subdural haematoma
Figure 1: Example of a brain imaging report with annotated entities and their types below.
Related Work
NER is a well-studied field of NLP.
In 2003, Tjong Kim Sang and De Meulder introduced a shared NER task at the Conference on Computational Natural Language Learning (CoNLL), which established a widely accepted benchmark for the evaluation of NER systems. This led to research into machine learning methods, such as the Stanford NER tagger. Other NER systems follow a rule-based approach, such as the ANNIE NER tagger. NLP for the medical domain has been an active field of research since the early 2000s. BioCreative and BioNLP provided shared tasks for NER and Relation Extraction (RE), with several systems applying NLP to biomedical text.
An overview of approaches to information extraction from EHR data was conducted by Meystre et al. (2008), and Pons et al. (2016) provide a recent review of NLP in radiology. Most relevant for the systems investigated in the current work in terms of domain is work conducted by Flynn et al. (2010), who present a system for the analysis of brain scan radiology reports. While not dealing with NER as their main task, the authors applied keyword matching to analyse reports from the Tayside dataset and assigned document-level labels differentiating between stroke types (ischaemic stroke versus intracerebral haemorrhage).
There are many Machine Learning/Deep Learning architectures proposed in the literature for NER. In this paper, we draw from the line of work presented in Huang et al. (2015), who employed Conditional Random Fields (CRF) on top of bidirectional Long Short-Term Memories (LSTM). Cornegruta et al. (2016) evaluated an NER method on radiology reports. They employed a bidirectional LSTM (BiLSTM) neural network architecture, which they contrasted with a simple baseline of string matching against external lexicons.
Transfer learning methods reuse machine learning models originally trained for a source task in a new target task. This idea has been adopted for NER, e.g. Arnold et al. (2008) exploited a feature hierarchy, Nothman et al. (2013) utilised the text and structure of Wikipedia, and Collobert and Weston (2008) created a convolutional neural network to jointly train multiple tasks.
Data
The datasets we used to perform NER consist of anonymised radiology reports from brain MRI and CT scans conducted as part of the Edinburgh Stroke Study (ESS) (n=1,168) and routine scans conducted by NHS Tayside (n=156,619). From the ESS data, a subset of 630 reports was annotated. From the Tayside collection, two subsets (Tay and TayExt) were selected and annotated. Each subset consists of reports for training/development of NER systems, as well as held-out test reports for testing and system comparison (see Table 1 for some statistics).
[Table 1 columns: ESS dev, ESS test, Tay dev, Tay test, TayExt dev, TayExt test]
Table 1: Number of reports, sentences and named entities per subset and development/training (dev) and test splits.
ESS was the first set to be annotated by domain experts, and the rule-based EdIE-R system was developed on this dataset. Data from NHS Tayside (Tay) was subsequently annotated with the same annotation scheme. This not only provides us with additional data, but also introduces different distributions of entities. This difference in data was further amplified by a second round of annotation on Tayside (TayExt), with reports specifically selected to include low-frequency entities. TayExt is a subset of Tayside brain imaging reports which mention one of a list of keywords (e.g. bleed*, subarachnoid, subdural, haemorrh*, hemorrh*, mass, tumour or tumor). This filtering was done to ensure that certain entities which appeared infrequently in the previous datasets would be more frequent. Detailed frequency counts for entities annotated in ESS, Tay and TayExt are shown in Table 2.
Each set contains rich annotations of named entities in the text but also includes negated entities, entity relations and document-level labels. In this paper, we only focus on the entity annotation, not distinguishing between positive and negative entities. To ensure consistency, a first round of annotations from different annotators was compared before annotators carried out their work for the full datasets. Annotators showed very high inter-annotator agreement (IAA) on the test data sets (see IAA column in Tables 4 and 5 in the Experiments section). We report IAA figures for the entire ESS test data and for a subset of 100 reports from the Tayside test data. We do not report IAA results for the TayExt data because double annotation for that dataset has yet to be carried out.
Entity Type                   ESS dev  ESS test  Tay dev  Tay test  TayExt dev  TayExt test
ischaemic stroke                  697       455      369       306         668          214
haemorrhagic stroke               344       267      428       294         890          280
stroke                             60        26       32         9          33            5
glioma tumour                       0         0       10         9          32           12
meningioma tumour                   4         8        9         2          32            6
metastasis tumour                  24        12       61       119         117           35
tumour                            297       166      146       303         432          117
subdural haematoma                244       109       75        95         968          309
small vessel disease              427       276       61       173         288           74
atrophy                           246       153      105       168         350           90
microhaemorrhage                   12        10        0         6           1            2
subarachnoid haemorrhage           13        10       49        16         135           54
haemorrhagic transformation         5         2       16         1          44           10
location:cortical                 516       412      924       476        1775          665
location:deep                     524       343      299       574         697          273
time:old                          527       321      250       158         558          218
time:recent                       392       354      163       277         622          273
Table 2: Per-entity frequency counts in the ESS, Tay and TayExt development (dev) and test datasets.
NER System Descriptions
For our comparative experiments on NER performance, we chose to evaluate a rule-based, a deep learning and a transfer learning system which we introduce here.
EdIE-R
EdIE-R (Edinburgh Information Extraction for Radiology reports) is a rule-based system. It consists of a full pipeline that starts with the raw input text, and subsequently adds sectioning, tokenisation, sentence-splitting and linguistic annotation such as part-of-speech (POS) tagging and shallow syntactic analysis. Figure 2 provides a schematic overview of the EdIE-R pipeline.
[Figure 2 component labels: Raw Input EHR; Conversion to XML; Document Zoning; Tokenization, Sentence splitting; Chunking; Named Entity Recognition; POS Tagging, Lemmatization; Relation Extraction; Document Labelling; Annotated Output EHR]
Figure 2: Overview of the EdIE-R pipeline.
Of particular interest to the comparative evaluation presented in this work is the NER step of EdIE-R. At this stage in the pipeline, the raw text has already been tokenised and POS tagged. Making use of hand-crafted rules and lexicons created in consultation with radiology experts, the system then uses the information derived during the previous steps to perform NER for specific target entities.
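As a loose illustration of lexicon-driven entity recognition over already tokenised text, the sketch below matches multi-word lexicon entries against a token sequence and prefers the longest match. It is not EdIE-R's rule set: the `LEXICON` entries, the `tag_entities` function and the greedy longest-match strategy are assumptions made for this example, with entity type names borrowed from Table 2.

```python
# Hypothetical lexicon: lower-cased token tuples mapped to entity types.
# The entries are illustrative only and are not taken from EdIE-R.
LEXICON = {
    ("infarct",): "ischaemic stroke",
    ("small", "vessel", "disease"): "small vessel disease",
    ("atrophy",): "atrophy",
    ("mass", "lesion"): "tumour",
    ("old",): "time:old",
}
MAX_LEN = max(len(key) for key in LEXICON)


def tag_entities(tokens):
    """Return (start, end, type) spans, end exclusive, via greedy longest match."""
    lowered = [t.lower() for t in tokens]
    spans = []
    i = 0
    while i < len(lowered):
        match = None
        # Try the longest candidate first so multi-word entries win.
        for n in range(min(MAX_LEN, len(lowered) - i), 0, -1):
            etype = LEXICON.get(tuple(lowered[i:i + n]))
            if etype is not None:
                match = (i, i + n, etype)
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


tokens = "No evidence of significant small vessel disease .".split()
print(tag_entities(tokens))
```

For the example sentence this prints `[(4, 7, 'small vessel disease')]`. The sketch labels the mention even though the sentence negates it, which is in line with the entity-level setup described in the Data section, where positive and negative entities are not distinguished.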
EdIE-R has been shown to recognise named entities reliably and accurately in brain imaging reports in the ESS data, which was used to write the original NER rules for this domain. We have subsequently updated the rules based on the new development data from Tayside and all the results reported here are from the updated version. Performance on the ESS data has dropped very slightly from the earlier version but EdIE-R performs very well on the new data. The reliance on hand-crafted rules makes it potentially costly and time-consuming to adapt the system to a different dataset, for example, radiology reports for scans of other body parts or for other diseases as well as other types of raw text records such as pathology reports.
The initial EdIE-R rule writing was done iteratively in parallel with rounds of annotation done by domain experts before settling on an annotation scheme. Several rounds of annotation were carried out to create gold data for system development and evaluation (ESS, Tay and TayExt). Having this annotated gold data available provided us with the opportunity to try and test machine learning based methods which are typically used for NER on standard evaluation datasets (e.g. CoNLL or ACE data). The tokenised and POS tagged output of EdIE-R was used to prepare the datasets used for evaluation by all systems described in this paper.
EdIE-N
EdIE-N represents a machine learning based approach to the problem of NER for radiology reports. As opposed to EdIE-R’s hand-crafted rules, EdIE-N infers named entity annotation from training data automatically, and applies these learned “rules” captured by the trained model to new data.
[Figure 3 component labels: character embeddings c_1, c_2, ..., c_m; word embeddings w_1, w_2, ..., w_n; character-level LSTM outputs v_1, ..., v_m forming the character level representation; inputs x_1, ..., x_n; LSTM layers with hidden states h_1, ..., h_n; outputs o_1, ..., o_n]
Figure 3: Schematic of EdIE-N entity recognition network.
In particular, the system makes use of deep learning via a neural network architecture (see Figure 3).
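The architectures cited in Related Work place a CRF on top of BiLSTM outputs, so that whole label sequences are scored rather than individual tokens. As a minimal sketch of that decoding idea, the code below runs standard Viterbi decoding for a linear-chain CRF over per-token label scores. The `viterbi` function, the BIO label set and all numbers are made up for illustration; this is not EdIE-N's implementation and it says nothing about how the scores are trained.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence and its score.

    emissions[i][k]   : score of label k at token i (e.g. from a BiLSTM layer)
    transitions[j][k] : score of moving from label j to label k
    """
    n_labels = len(transitions)
    # best[k] = best score of any path that ends in label k at the current token
    best = list(emissions[0])
    backptr = []
    for scores in emissions[1:]:
        step, new_best = [], []
        for k in range(n_labels):
            prev = max(range(n_labels), key=lambda j: best[j] + transitions[j][k])
            step.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + scores[k])
        backptr.append(step)
        best = new_best
    # Follow back-pointers from the best final label.
    last = max(range(n_labels), key=lambda k: best[k])
    path = [last]
    for step in reversed(backptr):
        path.append(step[path[-1]])
    path.reverse()
    return path, best[last]


LABELS = ["O", "B-ENT", "I-ENT"]  # hypothetical BIO tag set
# Toy scores for a three-token sentence; the numbers are invented.
emissions = [[2.0, 1.0, 0.0], [0.5, 2.0, 0.3], [0.2, 0.1, 1.5]]
transitions = [
    [0.5, 0.0, -5.0],  # O -> I-ENT is heavily penalised
    [0.0, -1.0, 1.0],  # B-ENT -> I-ENT is encouraged
    [0.0, -1.0, 0.5],
]
path, score = viterbi(emissions, transitions)
print([LABELS[k] for k in path], score)
```

With these toy scores the decoder returns the sequence O, B-ENT, I-ENT with a total score of 6.5. The transition scores are what let a sequence model rule out label sequences, such as I-ENT directly after O, that a per-token classifier could still produce.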
EdIE-N employs a Conditional Random Field (CRF) on top of a bi-directional LSTM architecture to perform NER by assigning to a given sentence the sequence of consecutive output labels o with the best score s. The input features x_i of the classification network comprise word embeddings, which are concatenated with word representations derived from a character-level LSTM. This architecture is similar to that of Huang et al. (2015) and Zhou and Xu (2015) (see Figure 3).
Both word embeddings and character embeddings can either be learned during training from randomly initialized embedding matrices, or looked up in pre-trained models. At training time, the CRF output layer is conditioned on the LSTM hidden layer representations h_i. At test time, the system assigns entity annotations to each word according to the most likely entity type as determined by the CRF.
EdIE-N models can be trained either as a “monolithic” NER model, i.e. taking all possible entity types into account and potentially making use of interactions between them, or as a “bag-of-models” system, where one model is trained per unique entity type. We only report the results of the “bag-of-models” setup as using this approach makes it easier to add new entity types to an existing architecture. However, we have experimented with “monolithic” NER models, which resulted in broadly similar performance to the “bag-of-models” setup.
As opposed to EdIE-R, the machine learning approach employed by EdIE-N does not rely on hand-crafted rules for NER. Instead, the system is trained on an annotated gold standard, from which entity type assignments are learned automatically. This alleviates the need for expert knowledge for designing new rules, making it both fast and inexpensive to learn and abstract from any given dataset. However, as a fully supervised machine learning approach, it does introduce the need for annotated gold data for training. Moreover, there is a common understanding that a sufficiently large training dataset is needed for a model to learn enough examples so that it performs reasonably well on new data. Creating such data is time-consuming. The other limitation of a machine learning based system is that it is very difficult to conduct error analysis and determine the exact reason for system errors.
SemEHR
The third approach we chose to compare is an NER tool which was originally developed and trained for other purposes. The main goal is to compare the above two approaches with a generic portable tool that can be adapted for this particular stroke subtyping task. The tool picked for this purpose is SemEHR, which is an open source toolkit that integrates text mining and semantic computing for identifying mentions of UMLS (Unified Medical Language System) concepts from clinical documents. Specifically, we adopted a SemEHR instance that has been trained on EHR data of South London and Maudsley, a psychiatric hospital in London. This instance was trained for identifying physical illnesses, such as liver diseases, HIV, diabetes etc. Each mention identified by SemEHR was associated with three-dimensional contextual information, i.e. negation (whether the condition was negated or affirmed), temporality (whether it was a recent or past event) and experiencer (whether the sufferer was the patient or other people).
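To make the three contextual dimensions concrete, a mention could be represented along the following lines. This is a sketch only: the `Mention` class, its field names and the value strings are assumptions for illustration and do not reflect SemEHR's actual output schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One concept mention with the three contextual dimensions from the text.

    Field names and value sets are hypothetical, not SemEHR's schema.
    """
    text: str
    concept_id: str   # UMLS concept identifier
    negated: bool     # negation: True if the condition is negated
    temporality: str  # "recent" or "past"
    experiencer: str  # "patient" or "other"


def is_current_patient_finding(m: Mention) -> bool:
    """Keep affirmed, recent mentions that refer to the patient."""
    return (not m.negated) and m.temporality == "recent" and m.experiencer == "patient"


mentions = [
    Mention("stroke", "C0038454", negated=False, temporality="recent", experiencer="patient"),
    Mention("stroke", "C0038454", negated=True, temporality="recent", experiencer="patient"),
    Mention("stroke", "C0038454", negated=False, temporality="past", experiencer="other"),
]
print([is_current_patient_finding(m) for m in mentions])
```

Keeping the three dimensions as separate fields means a downstream step can choose which of them to use. For the entity-level evaluation in this paper, for instance, negated mentions are not treated differently from affirmed ones.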
SemEHR is based on GATE Bio-YODIE and was adapted in two steps. The first step involved generating a mapping from what SemEHR identifies (i.e. mentions of UMLS concepts) to what this study is looking for (i.e. the entity types listed in Table 2). For those entity types not in the UMLS vocabulary (e.g. small vessel disease), an additional dictionary was generated and combined with SemEHR’s existing gazetteer. There are cases where one UMLS concept is mapped to different entity types (e.g. C0038454 is mapped to stroke and ischaemic stroke). To disambiguate them, the second adaptation step was to train a machine learning model on those cases. Details and source code of the second step are made available on GitHub.
Experiments
We used strict CoNLL-style NER evaluation to compare the performance of the three systems on the different datasets and report individual scores per entity type and overall NER scores. We report precision (P), recall (R) and balanced F-score (F1), the harmonic mean of precision and recall. In the case of EdIE-N we average scores over 5 runs to account for fluctuations in classification results due to random initializations in the network models.
For EdIE-N, we first report overall scores when training is performed on the development data of ESS, ESS plus Tay and all three development sets combined (ESS+Tay+TayExt). While the model trained on ESS data performs best on its own test data, we consider EdIE-N trained on all three datasets to be the better one as it performs best on the other two test sets and only slightly worse on ESS test (see Table 3). We use this model for the subsequent comparison.
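The strict scoring described at the start of this section can be sketched as follows: a predicted entity counts as correct only if both its boundaries and its type match a gold entity exactly. The function names and the tuple layout are assumptions for this example, and the code is not the evaluation script used for the reported results.

```python
def prf1(gold, predicted):
    """Strict span-level precision, recall and F1.

    gold / predicted: sets of (doc_id, start, end, entity_type) tuples.
    A prediction is correct only if boundaries and type both match.
    """
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if tp else 0.0
    return precision, recall, f1


def mean_over_runs(run_scores):
    """Average (P, R, F1) tuples over several training runs."""
    n = len(run_scores)
    return tuple(sum(s[i] for s in run_scores) / n for i in range(3))


gold = {("r1", 4, 7, "small vessel disease"), ("r1", 10, 11, "atrophy")}
pred = {("r1", 4, 7, "small vessel disease"), ("r1", 10, 12, "atrophy")}
# The boundary error on "atrophy" is both a false positive and a false negative.
print(prf1(gold, pred))
```

In this toy case one of two predictions is correct and one of two gold entities is found, so precision, recall and F1 are all 0.5. Partial overlaps earn no credit under strict matching, which makes the measure sensitive to boundary decisions as well as to type assignment.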
[Table 3 layout: rows give the training data (first row: ESS); columns give P, R and F1 on the ESS, Tay and TayExt test data]
Table 3: EdIE-N performance under different combinations of training and test data (best scores in boldface).
In our final experiment, we compare the rule-based EdIE-R system against the EdIE-N LSTM-CRF architecture and the SemEHR transfer learning approach on the ESS, Tayside, and extended Tayside test sets (see Tables 4, 5 and 6).
The rule-based system EdIE-R outperforms both machine learning approaches, even reaching near IAA levels on the ESS data. The gap between EdIE-R and SemEHR, the transfer learning approach incorporating rich information from out-of-domain resources, is relatively small (ranging between 0.03 and 0.06 points in F1), especially on the ESS and extended Tayside data. It is a little more pronounced on the original Tayside test set. EdIE-R clearly benefits from tailored domain and data specific rules. Overall, SemEHR performs remarkably accurately on the test data, matching the hand-crafted system for a few of the entity types. On all datasets, the machine learning approach applied by EdIE-N falls behind the two other systems. However, these results are of little surprise, as EdIE-N uses no external knowledge such as access to an ontology, and relies entirely on features that are being derived automatically from the target texts. It is very likely that the performance of EdIE-N can be further improved by using additional training data, incorporating additional domain knowledge, or optimising model parameters further. While EdIE-R is the best overall system, there are certain labels, e.g. subarachnoid haemorrhage, for which SemEHR consistently performs better.
GATE Bio-YODIE: https://gate.ac.uk/applications/bio-yodie.html
[Table 4 layout: one row per entity type; columns give P, R and F1 for EdIE-R, EdIE-N, SemEHR and IAA]
Table 4: NER results and IAA scores on the ESS test data.
[Table 5 layout: one row per entity type; columns give P, R and F1 for EdIE-R, EdIE-N, SemEHR and IAA]
Table 5: NER results and IAA scores on the Tay test dataset.
Conclusions
We have presented a system comparison for the task of labelling Named Entities in Electronic Health Records. Three approaches to the task were evaluated on three data sets. A hand-written system engineered by domain experts was able to consistently outperform both a transfer learning system, which applies previously established rules to new data, and a data-driven machine learning system.
The results confirm previously established findings that a hand-written, rule-based approach is able to perform NER on written EHR data very accurately, albeit at a high development cost in terms of time and effort invested by domain experts. While the machine learning approach performed worse in our comparison, we are still slightly optimistic that such an approach can be reasonably employed where there are either no experts readily available, or a system has to be developed quickly for a relatively low cost. The transfer learning approach showed impressive results, presenting a viable alternative to an entirely hand-written system, though still requiring a good deal of human manipulation.
In the future, we would like to further improve the ways to reliably and automatically label Named Entities in EHRs. Of particular interest are more experiments on the machine learning approach, where more fine-grained tuning of hyperparameters in particular promises to yield better performance. Additionally, we would like to explore the possibility of combining the approaches presented in this paper. The modular nature of the overall EHR processing pipeline could enable us to employ the different NER systems according to their individual strengths and weaknesses, potentially leading to a better overall performance downstream, e.g., at the document labelling stage.
Another interesting future direction is the rapid development of new systems in multiple iterations. When required, one could start by rapidly and inexpensively adding a new label to the system using the machine learning approach, and subsequently improving on it by utilising its results to guide the transfer or hand-crafting of reliable rules.
[Table 6 layout: one row per entity type; columns give P, R and F1 for EdIE-R, EdIE-N and SemEHR]
Table 6: NER results on the TayExt test dataset.
Acknowledgements
Gorinski, Tobin, Grover, Alex and Whalley are supported by the MRC Mental Health Data Pathfinder Award (MRC- MC PC 17209). Wu is an MRC/Rutherford fellow of HDR UK (MR/S004149/1). Grover was and Alex is supported by The Alan Turing Institute (EPSRC grant EP/N510129/1). Whiteley was supported by an MRC Clinician Scientist Award (G0902303) and is supported by a Scottish Senior Clinical Fellowship (CAF/17/01). Sudlow is Chief Scientist of UK Biobank and Director of HDR UK Scotland.
References
1. Nadeau D, Sekine S. A survey of named entity recognition and classification. Lingvisticae Investigationes. 2007, 30(1):3-26.
2. Ritter A, Clark S, Etzioni O. Named entity recognition in tweets: an experimental study. In: Proceedings of the conference on empirical methods in natural language processing, 2011, pp. 1524-34.
3. Rocktäschel T, Weidlich M, Leser U. ChemSpot: a hybrid system for chemical named entity recognition. Bioinformatics. 2012, 28(12):1633-40.
4. Leaman R, Gonzalez G. BANNER: an executable survey of advances in biomedical named entity recognition. In: Biocomputing 2008, pp. 652-63.
5. Schuster M, Paliwal KK. Bidirectional Recurrent Neural Networks. IEEE Transactions on Signal Processing, 1997, pp. 2673-81.
6. Cornegruta S, Bakewell R, Withey S, Montana G. Modelling radiological language with bidirectional long short-term memory networks. In: Proceedings of the 7th International Workshop on Health Text Mining and Information Analysis, 2016, pp. 17-27.
7. Tjong Kim Sang EF, De Meulder F. Introduction to the CoNLL-2003 Shared Task: Language-independent Named Entity Recognition. In: Proceedings of CoNLL, 2003, pp. 142-7.
8. Ratinov L, Roth D. Design challenges and misconceptions in named entity recognition. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning, 2009, pp. 147-55.
9. McCallum A, Li W. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In: Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4, 2003, pp. 188-91.
10. Finkel JR, Grenager T, Manning C. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. 2005, pp. 363-70.
11. Cunningham H, Maynard D, Bontcheva K, Tablan V. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002, pp. 168-75.
12. Settles B. Biomedical named entity recognition using conditional random fields and rich feature sets. In: Proceedings of the international joint workshop on natural language processing in biomedicine and its applications 2004 Aug 28, pp. 104-7.
13. Ogren PV, Savova GK, Chute CG. Constructing evaluation corpora for automated clinical named entity recognition. In: Medinfo 2007: Proceedings of the 12th World Congress on Health (Medical) Informatics; Building Sustainable Health Systems 2007, IOS Press, p. 2325.
14. Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, Chute CG. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. Journal of the American Medical Informatics Association. 2010 Sep 1, 17(5):507-13.
15. Alex B, Haddow B, Grover C. Recognising nested named entities in biomedical text. In: Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing. 2007, pp. 65-72.
16. Grover C, Haddow B, Klein E, Matthews M, Nielsen LA, Tobin R, Wang X. Adapting a relation extraction pipeline for the BioCreAtIvE II task. Proceedings of the BioCreAtIvE II Workshop. 2007, pp. 273-86.
17. Meystre SM, Savova GK, Kipper-Schuler KC, Hurdle JF. Extracting information from textual documents in the electronic health record: a review of recent research. Yearbook of Medical Informatics. 2008(01);17:128-44.
18. Pons E, Braun LMM, Hunink MGM, Kors JA. Natural language processing in radiology: a systematic review. Radiology. 2016(2);279:329-343.
19. Flynn RWV, Macdonald TM, Schembri N, Murray GD, Doney ASF. Automated data capture from free-text radiology reports to enhance accuracy of hospital inpatient stroke codes. Pharmacoepidemiology and drug safety. 2010, 19(8):843-7.
20. Pan SJ, Yang Q. A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 2010 Oct 1, 22(10):1345-59.
21. Nothman J, Ringland N, Radford W, Murphy T, Curran JR. Learning multilingual named entity recognition from Wikipedia. Artificial Intelligence. 2013 Jan 1, 194:151-75.
22. Arnold A, Nallapati R, Cohen WW. Exploiting feature hierarchy for transfer learning in named entity recognition. In: Proceedings of ACL-08: HLT. 2008:245-53.
23. Collobert R, Weston J. A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, ACM, 2008, pp. 160-7.
24. Jackson C, Crossland L, Dennis M, Wardlaw J, Sudlow C. Assessing the impact of the requirement for explicit consent in a hospital-based stroke study. QJM: Monthly Journal of the Association of Physicians. 2008, 101(4):281-9.
25. Alex B, Grover C, Tobin R, Sudlow C, Mair G, Whiteley W. Text Mining Brain Imaging Reports. Journal of Biomedical Semantics, accepted for a special issue to appear in 2019, preprint available at .
26. Lafferty J, McCallum A, Pereira F. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: Proceedings of the 18th International Conference on Machine Learning, Morgan Kaufmann, 2001, pp. 282-9.
27. Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv preprint arXiv:1508.01991, 2015.
28. Zhou J, Xu W. End-to-end Learning of Semantic Role Labeling Using Recurrent Neural Networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 2015, pp. 1127-37.
29. Wu H, Toti G, Morley KI, Ibrahim ZM, Folarin A, Jackson R, Kartoglu I, Agrawal A, Stringer C, Gale D, Gorrell G, et al. SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research. Journal of the American Medical Informatics Association, 2018, 25(5):530-7.
30. Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic acids research, 2004, 32:D267–D270.