Sophia Katrenko
University of Amsterdam
Publications
Featured research published by Sophia Katrenko.
Genome Biology | 2008
Larry Smith; Lorraine K. Tanabe; Rie Johnson nee Ando; Cheng-Ju Kuo; I-Fang Chung; Chun-Nan Hsu; Yu-Shi Lin; Roman Klinger; Christoph M. Friedrich; Kuzman Ganchev; Manabu Torii; Hongfang Liu; Barry Haddow; Craig A. Struble; Richard J. Povinelli; Andreas Vlachos; William A. Baumgartner; Lawrence Hunter; Bob Carpenter; Richard Tzong-Han Tsai; Hong-Jie Dai; Feng Liu; Yifei Chen; Chengjie Sun; Sophia Katrenko; Pieter W. Adriaans; Christian Blaschke; Rafael Torres; Mariana Neves; Preslav Nakov
Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
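The F1 scores quoted above are the harmonic mean of precision and recall over predicted gene mentions. The following is a minimal sketch of that computation, assuming mentions are represented as (start, end) character spans and that only exact span matches count as correct. The function name `span_f1` and the sample spans are hypothetical, and the official task scoring may differ (for example, by accepting alternative boundaries).

```python
def span_f1(gold, predicted):
    """Return (precision, recall, F1) for two sets of (start, end) spans.

    Assumption: a predicted span is correct only if it matches a gold span exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans for one sentence: two of three predictions are correct.
gold_spans = {(0, 4), (10, 15), (20, 25)}
predicted_spans = {(0, 4), (10, 15), (30, 35)}
precision, recall, f1 = span_f1(gold_spans, predicted_spans)
```

In this toy case precision and recall are both 2/3, so F1 is also 2/3.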
Bioinformatics | 2011
Quoc-Chinh Bui; Sophia Katrenko; Peter M. A. Sloot
MOTIVATION: Protein-protein interactions (PPIs) play an important role in understanding biological processes. Although recent research in text mining has achieved significant progress in automatic PPI extraction from literature, the performance of existing systems still needs to be improved. RESULTS: In this study, we propose a novel two-phase algorithm for extracting PPIs from literature. First, we automatically categorize the data into subsets based on their semantic properties and extract candidate PPI pairs from these subsets. Second, we apply support vector machines (SVMs) to classify candidate PPI pairs using features specific to each subset. We obtain promising results on five benchmark datasets (AIMed, BioInfer, HPRD50, IEPA and LLL), with F-scores ranging from 60% to 84%, which are comparable with state-of-the-art PPI extraction systems. Furthermore, our system achieves the best performance in cross-corpus evaluation and comparable performance in terms of computational efficiency. AVAILABILITY: The source code and scripts used in this article are available for academic use at http://staff.science.uva.nl/~bui/PPIs.zip
International Semantic Web Conference | 2005
Willem Robert van Hage; Sophia Katrenko; Guus Schreiber
We discuss four linguistic ontology-mapping techniques and evaluate them on real-life ontologies in the domain of food. Furthermore, we propose a method to combine ontology-mapping techniques with high Precision and Recall to reduce the necessary amount of manual labor and computation.
KDECB'06: Proceedings of the 1st International Conference on Knowledge Discovery and Emergent Complexity in Bioinformatics | 2006
Sophia Katrenko; Pieter W. Adriaans
In this paper we address the relation learning problem in the biomedical domain. We propose a representation which takes syntactic information into account and allows for the use of different machine learning methods. To carry out the syntactic analysis, three parsers were used: LinkParser, Minipar and the Charniak parser. The results we have obtained are comparable to the performance of relation learning systems in the biomedical domain and in some cases outperform them. In addition, we have studied the impact of ensemble methods on learning relations using the proposed representation. Given that recall is very important for relation learning, we explored ways of improving it. We show that ensemble methods provide higher recall and precision than individual classifiers alone.
Journal of Artificial Intelligence Research | 2010
Sophia Katrenko; Pieter W. Adriaans; Maarten van Someren
This paper discusses the problem of marrying structural similarity with semantic relatedness for Information Extraction from text. Aiming at accurate recognition of relations, we introduce local alignment kernels and explore various possibilities of using them for this task. We give a definition of a local alignment (LA) kernel based on the Smith-Waterman score as a sequence similarity measure and proceed with a range of possibilities for computing similarity between elements of sequences. We show how distributional similarity measures obtained from unlabeled data can be incorporated into the learning task as semantic knowledge. Our experiments suggest that the LA kernel yields promising results on various biomedical corpora, outperforming two baselines by a large margin. An additional series of experiments was conducted on data sets covering seven general relation types, where the performance of the LA kernel is comparable to the current state-of-the-art results.
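The Smith-Waterman score mentioned in the abstract can be illustrated with a short dynamic-programming sketch. This is not the authors' LA kernel itself, only the underlying best-local-alignment score, written under stated assumptions: a linear gap penalty, and a pluggable element similarity function `sim` standing in for the distributional similarity measures the paper describes. The names `smith_waterman` and `exact_match`, the parameter values, and the example token sequences are all hypothetical.

```python
def smith_waterman(seq_a, seq_b, sim, gap=1.0):
    """Return the best local alignment score between two sequences.

    sim(x, y) scores a pair of elements; gap is a linear gap penalty (an assumption).
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # table[i][j] holds the best score of an alignment ending at seq_a[i-1], seq_b[j-1].
    table = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                0.0,                                   # start a new local alignment
                table[i - 1][j - 1] + sim(seq_a[i - 1], seq_b[j - 1]),  # align elements
                table[i - 1][j] - gap,                 # gap in seq_b
                table[i][j - 1] - gap,                 # gap in seq_a
            )
            best = max(best, table[i][j])
    return best


def exact_match(x, y):
    """Toy element similarity: reward identical tokens, penalize others."""
    return 2.0 if x == y else -1.0


# Hypothetical token sequences sharing "A binds ... B".
left = ["protein", "A", "binds", "protein", "B"]
right = ["gene", "A", "binds", "gene", "B"]
score = smith_waterman(left, right, exact_match)
```

Replacing `exact_match` with a graded similarity between words is where semantic knowledge from unlabeled data would enter in a setup like the one the abstract describes.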
International Conference on Computational Linguistics | 2008
Sophia Katrenko; Pieter W. Adriaans
This paper discusses local alignment kernels in the context of the relation extraction task. We define a local alignment kernel based on the Smith-Waterman measure as a sequence similarity metric and proceed with a range of possibilities for computing a similarity between elements of sequences. We propose to use distributional similarity measures on elements, and by doing so we are able to incorporate extra information from unlabeled data into the learning task. Our experiments suggest that an LA kernel provides promising results on some biomedical corpora, substantially outperforming a baseline.
Meeting of the Association for Computational Linguistics | 2008
Sophia Katrenko; Pieter W. Adriaans
Meeting of the Association for Computational Linguistics | 2007
Sophia Katrenko; Pieter W. Adriaans
Information Systems | 2004
Sophia Katrenko
This paper focuses primarily on applying and evaluating phrase-based representations for document classification. This issue has been discussed over the last decades, but unfortunately the use of such representations has not always improved the accuracy of existing systems. We try to explain why and carry out experiments aimed at improving document categorization results.
Annals of Information Systems | 2010
M. Scott Marshall; Marco Roos; Edgar Meij; Sophia Katrenko; Willem Robert van Hage; Pieter W. Adriaans
The Virtual Laboratory for e-Science (VL-e) project serves as a backdrop for the ideas described in this chapter. VL-e is a project with academic and industrial partners where e-science has been applied to several domains of scientific research. Adaptive Information Disclosure (AID), a subprogram within VL-e, is a multi-disciplinary group that concentrates expertise in information extraction, machine learning, and Semantic Web – a powerful combination of technologies that can be used to extract and store knowledge in a Semantic Web framework. In this chapter, the authors explain what “semantic disclosure” means and how it is essential to knowledge sharing in e-Science. The authors describe several Semantic Web applications and how they were built using components of the AIDA Toolkit (AID Application Toolkit). The lessons learned and the future of e-Science are also discussed.