Gergely Héja | Researchain

Publication

Featured research published by Gergely Héja.

International Journal of Medical Informatics | 2003

Using n-gram method in the decomposition of compound medical diagnoses

Gergely Héja; György Surján

OBJECTIVE Our goal in this study was to find an easy-to-implement method to detect compound medical diagnoses in Hungarian medical language and decompose them into expressions that each refer to a single disease.

METHODS A corpus of clinical diagnoses extracted from discharge reports (3,079 expressions, each referring to only one disease) was represented in an n-gram tree (an n-gram being a series of n consecutive words). A matching algorithm was implemented in software that identifies sensible n-grams present both in the test expressions and in the n-gram tree. A test sample of another 92 diagnoses was decomposed by two independent humans and by the software. The decompositions were compared to measure the recall and precision of the method.

RESULTS The two human decompositions did not fully agree (which underlines the relevance of the problem). Consensus was reached on every disputed point through a third opinion and open discussion. The resulting decomposition was used as a gold standard and compared with the decomposition produced by the computer. Recall was 82.6% and precision 37.2%. After correction of spelling errors in the test sample, recall increased to 88.6% while precision decreased slightly to 36.7%.

CONCLUSION The proposed method seems to be useful for decomposing compound diagnostic expressions and can improve the quality of diagnostic coding of clinical cases. Other statistical methods (such as vector space methods or neural networks) usually offer a ranked list of candidate codes for either single or compound expressions, and do not tell the user how many codes should be chosen. We propose our method especially for situations where formal NLP techniques are not available, as is the case with less widely spoken languages like Hungarian.

Compound diagnoses are often assigned to just one disease code. This is a known cause of coding error. Our paper outlines an efficient, cheap and easy-to-implement method for semi-automatic decomposition of such diagnostic expressions. The proposed method is based on n-grams. To verify the method, two human encoders were asked to analyse the same set of 92 clinical diagnoses. Their results were compared with the analysis produced by the method. The results suggest that the approach is reasonable.
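The abstract does not give the matching algorithm in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes whitespace tokenisation, a trie of word n-grams up to a hypothetical `max_n`, and greedy longest-match segmentation; the function names and the English toy corpus are illustrative assumptions. The second helper shows how recall and precision can be computed against a gold-standard decomposition.

```python
from typing import Dict, List, Tuple


def build_ngram_tree(corpus: List[str], max_n: int = 3) -> Dict[str, dict]:
    """Insert every word n-gram (1 <= n <= max_n) of each expression into a trie."""
    root: Dict[str, dict] = {}
    for expression in corpus:
        words = expression.lower().split()
        for start in range(len(words)):
            node = root
            for word in words[start:start + max_n]:
                node = node.setdefault(word, {})
    return root


def longest_match(tree: Dict[str, dict], words: List[str], start: int) -> int:
    """Count how many consecutive words from `start` form a path in the trie."""
    node = tree
    length = 0
    for word in words[start:]:
        if word not in node:
            break
        node = node[word]
        length += 1
    return length


def decompose(tree: Dict[str, dict], expression: str) -> List[str]:
    """Greedily split an expression into maximal n-grams known to the trie.

    Assumption: words absent from the corpus (e.g. conjunctions) act as
    separators between single-disease candidates. Matches are at most
    max_n words long because the trie is only that deep.
    """
    words = expression.lower().split()
    parts: List[str] = []
    i = 0
    while i < len(words):
        n = longest_match(tree, words, i)
        if n == 0:
            i += 1  # unknown word: skip it and start a new candidate
        else:
            parts.append(" ".join(words[i:i + n]))
            i += n
    return parts


def precision_recall(predicted: List[str], gold: List[str]) -> Tuple[float, float]:
    """Precision and recall of predicted components against a gold standard."""
    true_positives = len(set(predicted) & set(gold))
    precision = true_positives / len(set(predicted)) if predicted else 0.0
    recall = true_positives / len(set(gold)) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy single-disease corpus (made up for illustration, not the study data).
    tree = build_ngram_tree(["diabetes mellitus", "essential hypertension"])
    print(decompose(tree, "Diabetes mellitus and essential hypertension"))
```

A method of this kind can over-generate candidate fragments, which would be consistent with high recall alongside low precision, although the abstract itself does not state the cause of the low precision.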

computer based medical systems | 2002

Semi-automatic classification of clinical diagnoses with hybrid approach

Gergely Héja; György Surján

The authors present a hybrid approach to assist with the laborious work of coding medical reports. The system consists of four components: an n-gram based module, a modified vector-space module, a neural module, and an XML representation of the ICD coding system. It supports the coding of clinical diagnoses to ICD.
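The abstract does not say how the vector-space module was modified. As a rough illustration of the unmodified idea only, the sketch below ranks candidate codes by cosine similarity between bag-of-words vectors of the diagnosis and of each code label. The function names and the tiny label table are assumptions made for this example, and the labels are shortened stand-ins rather than an official ICD extract.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def vectorize(text: str) -> Counter:
    """Bag-of-words term-frequency vector (whitespace tokenisation assumed)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_codes(diagnosis: str, code_labels: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (code, similarity) pairs sorted from best to worst match."""
    query = vectorize(diagnosis)
    scored = [(code, cosine(query, vectorize(label)))
              for code, label in code_labels.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative label table; not taken from the paper.
    labels = {
        "I10": "essential hypertension",
        "J42": "chronic bronchitis",
        "E11": "diabetes mellitus",
    }
    for code, score in rank_codes("chronic bronchitis", labels):
        print(code, round(score, 3))
```

A ranking like this always returns a full list of candidates, so it gives no indication of how many codes a compound diagnosis needs.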

BMC Medical Informatics and Decision Making | 2008

Ontological analysis of SNOMED CT

Gergely Héja; György Surján; Péter Varga

International Journal of Medical Informatics | 2007

GALEN based formal representation of ICD10

Gergely Héja; György Surján; Gergely Lukácsy; Péter Pallinger; Miklós Gergely

medical informatics europe | 2006

Restructuring the foundational model of anatomy.

Gergely Héja; Péter Varga; Péter Pallinger; György Surján

medical informatics europe | 2008

Design principles of DOLCE-based formal representation of ICD10.

Gergely Héja; Péter Varga; György Surján

medical informatics europe | 2003

About the language of Hungarian discharge reports.

György Surján; Gergely Héja

medical informatics europe | 1999

Maintenance of self-consistency of coding tables by statistical analysis of word co-occurrences.

György Surján; Gergely Héja

medical informatics europe | 2005

GALEN Based Formal Representation of ICD10.

Gergely Héja; György Surján; Gergely Lukácsy; Péter Pallinger; Miklós Gergely

Studies in health technology and informatics | 2001

Indexing of medical diagnoses by word affinity method.

György Surján; Gergely Héja

Collaboration

Top Co-Authors

Péter Pallinger

Hungarian Academy of Sciences

Péter Varga

Eötvös Loránd University

Gergely Lukácsy

Budapest University of Technology and Economics
