Kerstin Denecke
Leipzig University
Publication
Featured research published by Kerstin Denecke.
Information Sciences | 2009
Kerstin Denecke; Wolfgang Nejdl
It is still an open question where to search to satisfy a specific information need, given the large amount and diversity of information available. In this paper, a content analysis of health-related information provided on the Web is performed to get an overview of the medical content available. In particular, the content of medical Question & Answer Portals, medical weblogs, medical reviews and Wikis is compared. For this purpose, medical concepts are extracted from the text material with existing extraction technology. Based on these concepts, the content of the different knowledge resources is compared. Since medical weblogs describe experiences as well as information, it is of great interest to be able to distinguish between informative and affective posts. For this reason, a method to classify blogs based on their information content is presented, which exploits high-level features describing the medical and affective content of blog posts. The results show that there are substantial differences in the content of various health-related Web resources. Weblogs and answer portals mainly deal with diseases and medications. The Wiki and the encyclopedia provide more information on anatomy and procedures. While patients and nurses describe personal aspects of their life, doctors aim to present health-related information in their blog posts. Knowledge of content differences and information content can be exploited by search engines to improve ranking and search and to direct users to appropriate knowledge sources.
Milbank Quarterly | 2014
Edward Velasco; Tumacha Agheneza; Kerstin Denecke; Göran Kirchner; Tim Eckmanns
Context: The exchange of health information on the Internet has been heralded as an opportunity to improve public health surveillance. In a field that has traditionally relied on an established system of mandatory and voluntary reporting of known infectious diseases by doctors and laboratories to governmental agencies, innovations in social media and so-called user-generated information could lead to faster recognition of cases of infectious disease. More direct access to such data could enable surveillance epidemiologists to detect potential public health threats such as rare, new diseases or early-level warnings for epidemics. But how useful are data from social media and the Internet, and what is the potential to enhance surveillance? The challenges of using these emerging surveillance systems for infectious disease epidemiology, including the specific resources needed, technical requirements, and acceptability to public health practitioners and policymakers, have wide-reaching implications for public health surveillance in the 21st century. Methods: This article divides public health surveillance into indicator-based surveillance and event-based surveillance and provides an overview of each. We did an exhaustive review of published articles indexed in the databases PubMed, Scopus, and Scirus between 1990 and 2011 covering contemporary event-based systems for infectious disease surveillance. Findings: Our literature review uncovered no event-based surveillance systems currently used in national surveillance programs. While much has been done to develop event-based surveillance, the existing systems have limitations. Accordingly, there is a need for further development of automated technologies that monitor health-related information on the Internet, especially to handle large amounts of data and to prevent information overload. The dissemination to health authorities of new information about health events is not always efficient and could be improved. 
No comprehensive evaluations show whether event-based surveillance systems have been integrated into actual epidemiological work during real-time health events. Conclusions: The acceptability of data from the Internet and social media as a regular part of public health surveillance programs varies and is related to a circular challenge: the willingness to integrate is rooted in a lack of effectiveness studies, yet such effectiveness can be proved only through a structured evaluation of integrated systems. Issues related to changing technical and social paradigms in both individual perceptions of and interactions with personal health data, as well as social media and other data from the Internet, must be further addressed before such information can be integrated into official surveillance systems.
Artificial Intelligence in Medicine | 2015
Kerstin Denecke; Yihan Deng
OBJECTIVE Clinical documents reflect a patient's health status in terms of observations and contain objective information such as descriptions of examination results, diagnoses and interventions. To evaluate this information properly, assessing positive or negative clinical outcomes or judging the impact of a medical condition on patients' well-being are essential. Although methods of sentiment analysis have been developed to address these tasks, they have not yet found broad application in the medical domain. METHODS AND MATERIAL In this work, we characterize the facets of sentiment in the medical sphere and identify potential use cases. Through a literature review, we summarize the state of the art in healthcare settings. To determine the linguistic peculiarities of sentiment in medical texts and to collect open research questions of sentiment analysis in medicine, we perform a quantitative assessment with respect to word usage and sentiment distribution of a dataset of clinical narratives and medical social media derived from six different sources. RESULTS Word usage in clinical narratives differs from that in medical social media: Nouns predominate. Even though adjectives are also frequently used, they mainly describe body locations. Between 12% and 15% of sentiment terms are determined in medical social media datasets when applying existing sentiment lexicons. In contrast, in clinical narratives only between 5% and 11% opinionated terms were identified. This proves the less subjective use of language in clinical narratives, requiring adaptations to existing methods for sentiment analysis. CONCLUSIONS Medical sentiment concerns the patient's health status, medical conditions and treatment. Its analysis and extraction from texts has multiple applications, even for clinical narratives, which have so far remained unconsidered. 
Given the varying usage and meanings of terms, sentiment analysis from medical documents requires a domain-specific sentiment source and complementary context-dependent features to be able to correctly interpret the implicit sentiment.
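The lexicon-based measurement described in the abstract above (the share of tokens that match an existing sentiment lexicon) can be illustrated with a minimal sketch. The lexicon contents, the function name `sentiment_term_share`, and the simple tokenizer are assumptions made for illustration; the study itself applied existing sentiment lexicons that are not reproduced here.

```python
import re

# Hypothetical toy lexicon, for illustration only; the study used
# existing sentiment lexicons, which are not reproduced here.
TOY_LEXICON = frozenset(
    {"good", "bad", "pain", "severe", "worried", "happy", "terrible", "improved"}
)


def sentiment_term_share(text, lexicon=TOY_LEXICON):
    """Return the fraction of word tokens in `text` that occur in `lexicon`."""
    # Lowercase the text and keep alphabetic runs as tokens (a crude tokenizer).
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)
```

Applied to a clinical sentence and a social-media sentence, such a function would be expected to return a lower share for the clinical text, which is the kind of difference the abstract reports at corpus level.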
International World Wide Web Conferences | 2010
Mikalai Tsytsarau; Themis Palpanas; Kerstin Denecke
Our study addresses the problem of large-scale contradiction detection and management, from data extracted from the Web. We describe the first systematic solution to the problem, based on a novel statistical measure for contradictions, which exploits first- and second-order moments of sentiments. Our approach enables the interactive analysis and online identification of contradictions under multiple levels of time granularity. The proposed algorithm can be used to analyze and track opinion evolution over time and to identify interesting trends and patterns. It uses an incrementally updatable data structure to achieve computational efficiency and scalability. Experiments with real datasets show promising time performance and accuracy.
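The idea of scoring contradiction from first- and second-order moments of sentiment, kept in an incrementally updatable structure, can be sketched as follows. This is not the measure defined in the paper: the ratio of variance to squared mean, the smoothing constant `eps`, and the class name `SentimentMoments` are assumptions chosen only to show why running sums make updates and merges across time windows cheap.

```python
from dataclasses import dataclass


@dataclass
class SentimentMoments:
    """Running count, sum, and sum of squares of sentiment scores in [-1, 1]."""

    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, score: float) -> None:
        # Constant-time update: no past scores need to be stored.
        self.n += 1
        self.total += score
        self.total_sq += score * score

    def merge(self, other: "SentimentMoments") -> "SentimentMoments":
        # Two time windows combine by adding their sums, which supports
        # coarser time granularities without rescanning the data.
        return SentimentMoments(
            self.n + other.n,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    def mean(self) -> float:
        return self.total / self.n if self.n else 0.0

    def variance(self) -> float:
        if self.n == 0:
            return 0.0
        m = self.mean()
        # Population variance; clamp tiny negative values from rounding.
        return max(self.total_sq / self.n - m * m, 0.0)

    def contradiction(self, eps: float = 0.05) -> float:
        # Illustrative score (an assumption, not the paper's formula):
        # high when opinions are spread out and the mean is near neutral.
        return self.variance() / (eps + self.mean() ** 2)
```

With this sketch, a window of uniformly positive scores yields a score near zero, while a window split between strongly positive and strongly negative scores yields a large one.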
Methods of Information in Medicine | 2008
Kerstin Denecke
OBJECTIVES This paper introduces SeReMeD (Semantic Representation of Medical Documents), a method for automatically generating knowledge representations from natural language documents. The suitability of the Unified Medical Language System (UMLS) as domain knowledge for this method is analyzed. METHODS SeReMeD combines existing language engineering methods and semantic transformation rules for mapping syntactic information to semantic roles. In this way, the relevant content of medical documents is mapped to semantic structures. In order to extract specific data, these semantic structures are searched for concepts and semantic roles. A study is carried out that uses SeReMeD to detect specific data in medical narratives such as documented diagnoses or procedures. RESULTS The system is tested on chest X-ray reports. In first evaluations of the system's performance, the generation of semantic structures achieves a correctness of 80%, whereas the extraction of documented findings obtains values of 93% precision and 83% recall. CONCLUSIONS The results suggest that the methods described here can be used to accurately extract data from medical narratives, although there is also some potential for improving the results. The proposed methods provide two main benefits. By using existing language engineering methods, the effort required to construct a medical information extraction system is reduced. It is also possible to change the domain knowledge and therefore to create a more (or less) specialized system, capable of handling various medical sub-domains.
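The abstract above reports precision and recall separately. As a reading aid, the two can be combined into the F1 score (their harmonic mean); the helper below is a generic sketch, and the F1 value is derived here by arithmetic rather than reported in the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: 93% precision, 83% recall.
findings_f1 = f1_score(0.93, 0.83)
```

For the reported values this gives an F1 of about 0.88.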
Conference on Information and Knowledge Management | 2010
Marco Fisichella; Avaré Stewart; Kerstin Denecke; Wolfgang Nejdl
Recent pandemics such as Swine Flu have caused concern for public health officials. Given the ever-increasing pace at which infectious diseases can spread globally, officials must be prepared to react sooner and with greater epidemic intelligence gathering capabilities. However, state-of-the-art systems for Epidemic Intelligence have not kept pace with the growing need for more robust public health event detection. In this paper, we propose a game-changing approach where public health events are detected in an unsupervised manner. We address the problems associated with adapting an unsupervised learner to the medical domain and, in doing so, propose an approach which combines aspects from different feature-based event detection methods. We evaluate our approach with a real-world dataset with respect to the quality of article clusters. Our results show that we are able to achieve a precision of 66% and a recall of 81% when evaluated using manually annotated, real-world data. This shows promising results for the use of such techniques in this new problem setting.
International World Wide Web Conferences | 2012
Ernesto Diaz-Aviles; Avaré Stewart; Edward Velasco; Kerstin Denecke; Wolfgang Nejdl
In the presence of sudden outbreaks, how can social media streams be used to strengthen surveillance capabilities? In May 2011, Germany reported one of the largest described outbreaks of Enterohemorrhagic Escherichia coli (EHEC). By the end of June, 47 persons had died. After the detection of the outbreak, authorities investigating the cause and the impact in the population were interested in the analysis of micro-blog data related to the event. Since thousands of tweets related to this outbreak were produced every day, this task was overwhelming for experts participating in the investigation. In this work, we propose a Personalized Tweet Ranking algorithm for Epidemic Intelligence (PTR4EI) that provides users with a personalized, short list of tweets based on the user's context. PTR4EI is based on a learning-to-rank framework and exploits, as features, complementary context information extracted from the social hash-tagging behavior in Twitter. Our experimental evaluation on a dataset, collected in real-time during the EHEC outbreak, shows the superior ranking performance of PTR4EI. We believe our work can serve as a building block for an open early warning system based on Twitter, helping to realize the vision of Epidemic Intelligence for the Crowd, by the Crowd.
Social Media Tools and Platforms in Learning Environments | 2011
Kerstin Denecke; Avaré Stewart
The amount of social media data dealing with medical and health issues has increased significantly in the last couple of years. Patients, physicians, and other health professionals are willing to share their knowledge and experiences on the Web. Medical social media data now provides a new source of information within a learning context, for various learners. The variety of such content provides opportunities for a broad range of applications to exploit this data and support these learners in gaining knowledge. A potential benefit is that communication barriers are much lower for social media tools than for communication through traditional channels. The objective of this chapter is to highlight the potential for learning from medical social media data. Various characteristics of learning from this data will be presented and their impact on groups of learners is highlighted. Further, potential real-world applications are described. Taking this as a basis, the challenges for technology development in this context will be discussed.
Methods of Information in Medicine | 2017
Stefan Kropf; P. Krücken; W. Mueller; Kerstin Denecke
BACKGROUND Clinical information is often stored as free text, e.g. in discharge summaries or pathology reports. These documents are semi-structured using section headers, numbered lists, items and classification strings. However, it is still challenging to retrieve relevant documents since keyword searches applied on complete unstructured documents result in many false positive retrieval results. OBJECTIVES We are concentrating on the processing of pathology reports as an example for unstructured clinical documents. The objective is to transform reports semi-automatically into an information structure that enables an improved access and retrieval of relevant data. The data is expected to be stored in a standardized, structured way to make it accessible for queries that are applied to specific sections of a document (section-sensitive queries) and for information reuse. METHODS Our processing pipeline comprises information modelling, section boundary detection and section-sensitive queries. For enabling a focused search in unstructured data, documents are automatically structured and transformed into a patient information model specified through openEHR archetypes. The resulting XML-based pathology electronic health records (PEHRs) are queried by XQuery and visualized by XSLT in HTML. RESULTS Pathology reports (PRs) can be reliably structured into sections by a keyword-based approach. The information modelling using openEHR allows saving time in the modelling process since many archetypes can be reused. The resulting standardized, structured PEHRs allow accessing relevant data by retrieving data matching user queries. CONCLUSIONS Mapping unstructured reports into a standardized information model is a practical solution for a better access to data. Archetype-based XML enables section-sensitive retrieval and visualisation by well-established XML techniques. 
Focussing retrieval on particular sections has the potential to save retrieval time and improve retrieval accuracy.
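The keyword-based section boundary detection and section-sensitive querying described in the abstract above can be sketched in a few lines. The header keywords, the requirement that a header ends with a colon, and the function names `split_into_sections` and `query_section` are assumptions for illustration; the published pipeline maps sections to openEHR archetypes and queries the resulting XML with XQuery, which this sketch does not attempt.

```python
import re

# Hypothetical section header keywords; real pathology reports and the
# keyword list used in the paper may differ.
SECTION_KEYWORDS = ("Clinical Information", "Macroscopy", "Microscopy", "Diagnosis")


def split_into_sections(report, keywords=SECTION_KEYWORDS):
    """Split a free-text report into {section name: text} using header keywords."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    # A header is a line that starts with a keyword followed by a colon.
    header = re.compile(r"^\s*(" + alternatives + r")\s*:\s*(.*)$", re.IGNORECASE)
    canonical = {k.lower(): k for k in keywords}
    sections = {}
    current = None
    for line in report.splitlines():
        match = header.match(line)
        if match:
            current = canonical[match.group(1).lower()]
            sections.setdefault(current, [])
            rest = match.group(2).strip()
            if rest:
                sections[current].append(rest)
        elif current is not None and line.strip():
            # Non-header lines belong to the most recent section;
            # text before the first header is ignored in this sketch.
            sections[current].append(line.strip())
    return {name: " ".join(lines) for name, lines in sections.items()}


def query_section(sections, name, term):
    """Section-sensitive query: is `term` mentioned in the named section?"""
    return term.lower() in sections.get(name, "").lower()
```

Restricting a query to one section avoids hits from other parts of the report, for example a suspected condition mentioned only in the clinical information section.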
Archive | 2013
Kerstin Denecke; Nazli Soltani
Medical social-media data provides a wealth of data generated by both healthcare professionals and patients alike. In fact, there are many medical social-media sites such as forums, where patients freely dialog with a healthcare professional or with other patients, often posing questions and responding to advice, or Weblogs, where groups of people describe their experiences with medical conditions and the various treatment plans to treat those conditions. All in all, one can no longer ignore the fact that social media has dramatically changed the structure of healthcare delivery in many ways. Simply from a medical data standpoint alone, social-media platforms have altered the way medical information is disseminated. That is, important medical information is no longer found exclusively in patients’ clinical narratives, commonly shared by physicians and other healthcare workers at regular professional meetings and conferences. Instead, user-generated content on the Web has become a new source of useful information to be added to the conventional methods of collecting clinical data. The challenge we face, however, is to design information extraction tools that can make the rich resources of medical data found in social-media postings exploitable. In this chapter we analyze the linguistic features of medical social-media postings juxtaposed to the linguistic features of both clinical narratives (e.g., discharge summaries, chart reviews, and operative reports) and biomedical literature, for which there already exist tools for performing information extraction. We show the shortcomings of these mapping tools when applied to medical social-media postings, and propose ways to improve such tools so that the wealth of medical data located in medical social media can be made available to healthcare providers, pharmaceutical companies, and government-supported epidemiological agencies.