Tamara Polajnar
University of Glasgow
Publication
Featured research published by Tamara Polajnar.
Statistics and Computing | 2010
Simon Rogers; Mark A. Girolami; Tamara Polajnar
Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation, where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels, and typically a pre-processing step (normally assigning the majority class) is performed. We propose Bayesian models for classification and ordinal regression that naturally incorporate multiple expert opinions in defining predictive distributions. The models make use of Gaussian process priors, resulting in great flexibility and particular suitability to text-based problems where the number of covariates can be far greater than the number of data instances. We show that using all labels rather than just the majority improves performance on a recent biological dataset.
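The information lost by the majority-class pre-processing step can be illustrated with a minimal sketch. This is not the Bayesian Gaussian process model proposed in the paper. The function names, class names, and toy votes below are hypothetical and only contrast a collapsed majority label with the full distribution of expert votes.

```python
from collections import Counter

def majority_label(votes):
    """Collapse expert votes into the single most common label."""
    return Counter(votes).most_common(1)[0][0]

def label_distribution(votes, classes):
    """Keep every vote by returning the fraction of experts per class."""
    counts = Counter(votes)
    total = len(votes)
    return {c: counts[c] / total for c in classes}

# Hypothetical annotations from three experts for two documents.
classes = ["relevant", "irrelevant"]
unanimous = ["relevant", "relevant", "relevant"]
contested = ["relevant", "relevant", "irrelevant"]

# Majority voting gives both documents the same training label ...
same_majority = majority_label(unanimous) == majority_label(contested)

# ... while the vote distributions keep the disagreement visible.
unanimous_dist = label_distribution(unanimous, classes)
contested_dist = label_distribution(contested, classes)
print(same_majority, unanimous_dist, contested_dist)
```

Under these assumptions both documents receive the label "relevant" after majority voting, although one third of the experts disagreed on the second document. A model that consumes all labels can treat the two cases differently.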
Information Interaction in Context | 2010
Richard Glassey; Desmond Elliott; Tamara Polajnar; Leif Azzopardi
This paper presents an interaction-based information filtering system designed for the needs of children accessing multiple streams of information. This is an emerging problem due to the increased information access and engagement by children for their education and entertainment, and the explosion of stream-based information sources on most topics. It has been shown that children have difficulties formulating text-based queries and using interfaces primarily designed for adults. The in-progress system presented in this paper attempts to address these difficulties by employing an interaction-based interface that simplifies the expression of information needs and adapts itself to user interests over time. To overcome issues of content moderation, the system aggregates multiple child-friendly information feeds and performs offline processing to facilitate topic filtering. A set of standing topics is created for initial interaction, and subsequent interactions are used to infer and refine which topics the child would most likely want to have presented. The simple, easy-to-use interface uses relevance information to set the display size of each document title, which acts as a relevance cue to the user. The planned research focuses on validating the interaction-based approach with both child and adult populations to discover the differences and similarities that may exist.
European Conference on Information Retrieval | 2013
Tamara Polajnar; Nitish Aggarwal; Kartik Asooja; Paul Buitelaar
Explicit semantic analysis (ESA) is a technique for computing semantic relatedness between natural language texts. It is a document-based distributional model similar to latent semantic analysis (LSA), and it is often built on the Wikipedia database when it is required for general English usage. Unlike LSA, however, ESA does not use dimensionality reduction, and therefore it is sometimes unable to account for similarity between words that do not co-occur with the same concepts, even if the concepts themselves cover similar subjects. In the Wikipedia implementation, ESA concepts are Wikipedia articles, and the Wikilinks between the articles are used to overcome the concept-similarity problem. In this paper, we provide two general solutions for the integration of concept-concept similarities into the ESA model, ones that do not rely on a particular corpus structure and do not alter the explicit concept-mapping properties that distinguish ESA from models like LSA and latent Dirichlet allocation (LDA).
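The concept-similarity problem can be made concrete with a small sketch. The abstract does not describe the two solutions, so the code below is only one generic way to inject concept-concept similarity into a cosine-style relatedness score (a bilinear form normalised like a cosine), not the authors' method. The three-concept space, the similarity values, and all function names are hypothetical.

```python
import math

def bilinear(u, v, concept_sim):
    """Compute u^T S v, where S holds concept-concept similarities."""
    return sum(
        u[i] * concept_sim[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )

def relatedness(u, v, concept_sim):
    """Cosine-style relatedness of two concept vectors under S."""
    denom = math.sqrt(bilinear(u, u, concept_sim) * bilinear(v, v, concept_sim))
    return bilinear(u, v, concept_sim) / denom if denom else 0.0

# Hypothetical concept space with three concepts; concepts 0 and 1
# are assumed to cover similar subjects.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
concept_sim = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Two words that never co-occur with the same concept.
word_a = [1.0, 0.0, 0.0]
word_b = [0.0, 1.0, 0.0]

plain = relatedness(word_a, word_b, identity)        # ordinary cosine
informed = relatedness(word_a, word_b, concept_sim)  # uses concept similarity
print(plain, informed)
```

With the identity matrix the score reduces to the ordinary cosine and the two words appear unrelated. With the assumed similarity matrix the same vectors receive a non-zero score, while each dimension still maps to an explicit concept.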
Information Interaction in Context | 2012
Carsten Eickhoff; Leif Azzopardi; Djoerd Hiemstra; Franciska de Jong; Arjen P. de Vries; Doug Dowie; Sérgio Duarte; Richard Glassey; Karl Gyllstrom; Frea Kruisinga; Kelly Marshall; Sien Moens; Tamara Polajnar; Frans van der Sluis
When undergoing medical treatment in combination with extended stays in hospitals, children have been frequently found to develop an interest in their condition and the course of treatment. A natural means of searching for related information would be to use a web search engine. The medical domain, however, imposes several key challenges on young and inexperienced searchers, such as difficult terminology, potentially frightening topics or non-objective information offered by lobbyists or pharmaceutical companies. To address these problems, we present the design and usability study of EmSe, a search service for children in a hospital environment.
European Conference on Information Retrieval | 2011
Carsten Eickhoff; Tamara Polajnar; Karl Gyllstrom; Sergio Duarte Torres; Richard Glassey
The Internet plays an important role in people's daily lives. This is true not only for adults but also for children; however, current web search engines are designed with adult users and their cognitive abilities in mind. Consequently, children face considerable barriers when using these information systems. In this work, we demonstrate the use of query assistance and search moderation techniques as well as appropriate interface design to overcome or mitigate these challenges.
Journal of Biomedical Semantics | 2011
Tamara Polajnar; Theodoros Damoulas; Mark A. Girolami
Background: Detection of sentences that describe protein-protein interactions (PPIs) in biomedical publications is a challenging and unresolved pattern recognition problem. Many state-of-the-art approaches for this task employ kernel classification methods, in particular support vector machines (SVMs). In this work we propose a novel data integration approach that utilises semantic kernels and a kernel classification method that is a probabilistic analogue to SVMs. Semantic kernels are created from statistical information gathered from large amounts of unlabelled text using lexical semantic models. Several semantic kernels are then fused into an overall composite classification space. In this initial study, we use simple features in order to examine whether the use of combinations of kernels constructed using word-based semantic models can improve PPI sentence detection.
Results: We show that combinations of semantic kernels lead to statistically significant improvements in recognition rates and receiver operating characteristic (ROC) scores over the plain Gaussian kernel, when applied to a well-known labelled collection of abstracts. The proposed kernel composition method also allows us to automatically infer the most discriminative kernels.
Conclusions: The results from this paper indicate that using semantic information from unlabelled text, and combinations of such information, can be valuable for classification of short texts such as PPI sentences. This study, however, is only a first step in evaluation of semantic kernels and probabilistic multiple kernel learning in the context of PPI detection. The method described herein is modular, and can be applied with a variety of feature types, kernels, and semantic models, in order to facilitate full extraction of interacting proteins.
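The idea of fusing several kernels into one composite space can be sketched as a weighted sum of base kernels. This is a simplified illustration, not the probabilistic multiple kernel learning method of the paper: the weights here are fixed by hand, whereas the paper infers which kernels are most discriminative. The word-similarity matrix, the two-word vocabulary, and all names are hypothetical.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """Plain Gaussian (RBF) kernel on bag-of-words vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def semantic_kernel(x, y, word_similarity):
    """Kernel x^T S y, where S holds word-word similarities that would
    be estimated from unlabelled text (assumed given here)."""
    return sum(
        x[i] * word_similarity[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )

def composite_kernel(x, y, base_kernels, weights):
    """Weighted sum of base kernels; non-negative weights keep the
    result a valid kernel when every base kernel is valid."""
    return sum(w * k(x, y) for k, w in zip(base_kernels, weights))

# Hypothetical two-word vocabulary with a positive semi-definite
# similarity matrix.
word_similarity = [[1.0, 0.5], [0.5, 1.0]]
base_kernels = [
    gaussian_kernel,
    lambda x, y: semantic_kernel(x, y, word_similarity),
]
weights = [0.25, 0.75]  # fixed by hand for this sketch

sentence_a = [1.0, 0.0]
sentence_b = [0.0, 1.0]
value = composite_kernel(sentence_a, sentence_b, base_kernels, weights)
print(value)
```

A learned weight close to zero would indicate that the corresponding kernel contributes little, which is the sense in which a composition method can reveal the most discriminative kernels.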
Data Mining in Bioinformatics | 2011
Tamara Polajnar; Simon Rogers; Mark A. Girolami
Support vector machines (SVMs), which are non-parametric and deterministic, achieve high performance in text classification. This article offers a much-needed evaluation of the Gaussian process (GP) classifier, a non-parametric probabilistic analogue to SVMs that has rarely been applied to text classification. We provide an extensive experimental comparison of the performance and properties of these competing classifiers on the challenging problem of protein interaction detection in biomedical publications. Our results show that GPs can match the performance of SVMs without the need for costly margin parameter tuning, whilst offering the advantage of an extendable probabilistic framework for text classification.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2010
Desmond Elliott; Richard Glassey; Tamara Polajnar; Leif Azzopardi
Children face several challenges when using information access systems. These include formulating queries, judging the relevance of documents, and focusing attention on interface cues, such as query suggestions, while typing queries. It has also been shown that children want a personalised Web experience and prefer content presented to them that matches their long-term entertainment and education needs. To this end, we have developed an interaction-based information filtering system to address these challenges.
European Conference on Information Retrieval | 2012
Tamara Polajnar; Richard Glassey; Leif Azzopardi
Identifying child-appropriate web content is an important yet difficult classification task. This novel task is characterised by attempting to determine age/child appropriateness (which is not necessarily topic-based), despite the presence of unbalanced class sizes and the lack of quality training data with human judgements of appropriateness. Classification of feeds, a subset of web content, presents further challenges due to their temporal nature and short document format. In this paper, we discuss these challenges and present baseline results for this task through an empirical study that classifies incoming news stories as appropriate (or not) for children. We show that while the naive Bayes approach produces a higher AUC, it is vulnerable to the imbalanced data problem, and that the support vector machine provides a more robust overall solution. Our research shows that classifying children's content is a non-trivial task that has greater complexities than standard text-based classification. While the F-score values are consistent with other research examining age-appropriate text classification, we introduce a new problem with a new dataset.
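One way a classifier can score well on AUC yet struggle on imbalanced data is sketched below. AUC depends only on how documents are ranked, whereas the F-score depends on thresholded decisions. The labels, scores, and thresholds are invented for illustration and are not taken from the paper's dataset or experiments.

```python
def auc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def f1(labels, scores, threshold=0.5):
    """F-score of the positive class after thresholding the scores."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical imbalanced sample: 2 appropriate stories out of 10.
# The scores rank both positives first, but class imbalance has pushed
# every score below the default 0.5 threshold.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.40, 0.35, 0.30, 0.20, 0.20, 0.10, 0.10, 0.05, 0.05, 0.05]

ranking_quality = auc(labels, scores)
default_f1 = f1(labels, scores)                 # threshold 0.5
tuned_f1 = f1(labels, scores, threshold=0.33)   # threshold chosen by hand
print(ranking_quality, default_f1, tuned_f1)
```

In this toy case the ranking is perfect, but no story is labelled appropriate at the default threshold, so the F-score collapses until the threshold is adjusted.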
European Conference on Information Retrieval | 2012
Leif Azzopardi; Doug Dowie; Sérgio Duarte; Carsten Eickhoff; Richard Glassey; Karl Gyllstrom; Djoerd Hiemstra; Franciska de Jong; Frea Kruisinga; Kelly Marshall; Sien Moens; Tamara Polajnar; Frans van der Sluis; Arjen P. de Vries
The Emma Search (EmSe) demonstrator developed for the Emma Children's Hospital showcases the PuppyIR project and the PuppyIR framework for building information services for children.