
Publication


Featured research published by Diarmuid Ó Séaghdha.


Web Search and Data Mining | 2012

Auralist: introducing serendipity into music recommendation

Yuan Cao Zhang; Diarmuid Ó Séaghdha; Daniele Quercia; Tamas Jambor

Recommendation systems exist to help users discover content in a large body of items. An ideal recommendation system should mimic the actions of a trusted friend or expert, producing a personalised collection of recommendations that balances the desired goals of accuracy, diversity, novelty and serendipity. We introduce the Auralist recommendation framework, a system that - in contrast to previous work - attempts to balance and improve all four factors simultaneously. Using a collection of novel algorithms inspired by principles of serendipitous discovery, we demonstrate a method of successfully injecting serendipity, novelty and diversity into recommendations whilst limiting the impact on accuracy. We evaluate Auralist quantitatively over a broad set of metrics and, with a user study on music recommendation, show that Auralist's emphasis on serendipity indeed improves user satisfaction.
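
The general idea of hybrid re-ranking that trades a little accuracy for novelty and serendipity can be sketched in a few lines. The Python snippet below is a minimal illustration under assumed inputs; the helper names, weights and scoring terms are hypothetical and do not reproduce the Auralist algorithms themselves.

```python
import numpy as np

def rerank_with_serendipity(base_scores, popularity, profile_sim,
                            alpha=0.7, beta=0.15):
    """Blend a base relevance score with novelty and serendipity terms.

    base_scores : dict item -> predicted relevance (higher is better)
    popularity  : dict item -> fraction of users who consumed the item
    profile_sim : dict item -> mean similarity to the user's listening history
    alpha, beta : illustrative mixing weights (hypothetical values)
    """
    ranked = []
    for item, relevance in base_scores.items():
        novelty = -np.log(popularity.get(item, 1e-6))        # rarer items score higher
        serendipity = 1.0 - profile_sim.get(item, 0.0)       # unlike the user's history
        score = alpha * relevance + beta * novelty + (1 - alpha - beta) * serendipity
        ranked.append((score, item))
    return [item for _, item in sorted(ranked, reverse=True)]

# Toy usage with three candidate tracks.
print(rerank_with_serendipity(
    base_scores={"a": 0.9, "b": 0.8, "c": 0.6},
    popularity={"a": 0.5, "b": 0.01, "c": 0.001},
    profile_sim={"a": 0.9, "b": 0.4, "c": 0.1}))
```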


North American Chapter of the Association for Computational Linguistics | 2009

SemEval-2010 Task 9: The Interpretation of Noun Compounds Using Paraphrasing Verbs and Prepositions

Cristina Butnariu; Su Nam Kim; Preslav Nakov; Diarmuid Ó Séaghdha; Stan Szpakowicz; Tony Veale

We present a brief overview of the main challenges in understanding the semantics of noun compounds and consider some known methods. We introduce a new task to be part of SemEval-2010: the interpretation of noun compounds using paraphrasing verbs and prepositions. The task is meant to provide a standard testbed for future research on noun compound semantics. It should also promote paraphrase-based approaches to the problem, which can benefit many NLP applications.


Meeting of the Association for Computational Linguistics | 2017

Neural Belief Tracker: Data-Driven Dialogue State Tracking

Nikola Mrksic; Diarmuid Ó Séaghdha; Tsung-Hsien Wen; Blaise Thomson; Steve J. Young

One of the core components of modern spoken dialogue systems is the belief tracker, which estimates the user's goal at every step of the dialogue. However, most current approaches have difficulty scaling to larger, more complex dialogue domains. This is due to their dependency on either: a) Spoken Language Understanding models that require large amounts of annotated training data; or b) hand-crafted lexicons for capturing some of the linguistic variation in users' language. We propose a novel Neural Belief Tracking (NBT) framework which overcomes these problems by building on recent advances in representation learning. NBT models reason over pre-trained word vectors, learning to compose them into distributed representations of user utterances and dialogue context. Our evaluation on two datasets shows that this approach surpasses past limitations, matching the performance of state-of-the-art models which rely on hand-crafted semantic lexicons and outperforming them when such lexicons are not provided.
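
As a rough illustration of reasoning over pre-trained word vectors, the sketch below composes toy embeddings into utterance and candidate representations and scores them by cosine similarity. This is a simplification under assumed inputs, not the NBT architecture; the embeddings and the composition function are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder "pre-trained" vectors; a real system would load e.g. GloVe embeddings.
emb = {w: rng.random(50) for w in
       ["i", "want", "cheap", "food", "price", "inexpensive"]}

def represent(tokens, dim=50):
    """Compose word vectors into one distributed representation (simple sum)."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.sum(vecs, axis=0) if vecs else np.zeros(dim)

def score_candidate(utterance_tokens, slot, value):
    """Score a (slot, value) hypothesis against the utterance representation."""
    u = represent(utterance_tokens)
    c = represent([slot, value])
    return float(u @ c / (np.linalg.norm(u) * np.linalg.norm(c) + 1e-9))

print(score_candidate(["i", "want", "cheap", "food"], "price", "inexpensive"))
```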


International Conference on Computational Linguistics | 2008

Semantic Classification with Distributional Kernels

Diarmuid Ó Séaghdha; Ann A. Copestake

Distributional measures of lexical similarity and kernel methods for classification are well-known tools in Natural Language Processing. We bring these two methods together by introducing distributional kernels that compare co-occurrence probability distributions. We demonstrate the effectiveness of these kernels by presenting state-of-the-art results on datasets for three semantic classification tasks: compound noun interpretation, identification of semantic relations between nominals, and semantic classification of verbs. Finally, we consider explanations for the impressive performance of distributional kernels and sketch some promising generalisations.
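
One way to realise such a kernel is to compare co-occurrence distributions with an information-theoretic divergence and feed the resulting Gram matrix to an SVM. The sketch below uses a Jensen-Shannon-based kernel with scikit-learn's precomputed-kernel interface; the toy data and the exact kernel form are illustrative choices, not necessarily those used in the paper.

```python
import numpy as np
from sklearn.svm import SVC

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two probability distributions."""
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    return 0.5 * np.sum(p * np.log2(p / m)) + 0.5 * np.sum(q * np.log2(q / m))

def jsd_kernel(X, Y):
    """Gram matrix K[i, j] = exp(-JSD(x_i, y_j)) over co-occurrence distributions."""
    return np.array([[np.exp(-jsd(x, y)) for y in Y] for x in X])

# Toy data: six "co-occurrence distributions" over 20 context words, two classes.
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(20), size=6)
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="precomputed")
clf.fit(jsd_kernel(X, X), y)
print(clf.predict(jsd_kernel(X, X)))  # kernel between test items and training items
```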


Proceedings of the Workshop on A Broader Perspective on Multiword Expressions | 2007

Co-occurrence Contexts for Noun Compound Interpretation

Diarmuid Ó Séaghdha; Ann A. Copestake

Contextual information extracted from corpora is frequently used to model semantic similarity. We discuss distinct classes of context types and compare their effectiveness for compound noun interpretation. Contexts corresponding to word-word similarity perform better than contexts corresponding to relation similarity, even when relational co-occurrences are extracted from a much larger corpus. Combining word-similarity and relation-similarity kernels further improves SVM classification performance.
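
The distinction between the two context types can be made concrete with a toy extraction step: word-similarity contexts count the neighbours of a single word, while relation-similarity contexts count the material linking the two nouns when they co-occur. The helper functions and the tiny corpus below are hypothetical simplifications, not the paper's extraction pipeline.

```python
from collections import Counter

def word_contexts(corpus_sents, target, window=2):
    """Count words co-occurring with `target` within a small window
    (the 'word-similarity' style of context)."""
    counts = Counter()
    for sent in corpus_sents:
        for i, tok in enumerate(sent):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(t for j, t in enumerate(sent[lo:hi], lo) if j != i)
    return counts

def relation_contexts(corpus_sents, head, modifier):
    """Count the word strings linking the two nouns when both occur in a
    sentence (a crude stand-in for 'relation-similarity' contexts)."""
    counts = Counter()
    for sent in corpus_sents:
        if head in sent and modifier in sent:
            i, j = sorted((sent.index(head), sent.index(modifier)))
            counts[" ".join(sent[i + 1:j])] += 1
    return counts

sents = [["the", "fruit", "knife", "cuts", "fruit"],
         ["a", "knife", "for", "cutting", "fruit"]]
print(word_contexts(sents, "knife"))
print(relation_contexts(sents, "knife", "fruit"))
```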


International Joint Conference on Natural Language Processing | 2015

Multi-domain Dialog State Tracking using Recurrent Neural Networks

Nikola Mrksic; Diarmuid Ó Séaghdha; Blaise Thomson; Milica Gasic; Pei-Hao Su; David Vandyke; Tsung-Hsien Wen; Steve J. Young

Dialog state tracking is a key component of many modern dialog systems, most of which are designed with a single, well-defined domain in mind. This paper shows that dialog data drawn from different dialog domains can be used to train a general belief tracking model which can operate across all of these domains, exhibiting superior performance to each of the domain-specific models. We propose a training procedure which uses out-of-domain data to initialise belief tracking models for entirely new domains. This procedure leads to improvements in belief tracking performance regardless of the amount of in-domain data available for training the model.
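
The training procedure can be pictured as an ordinary pretrain-then-fine-tune loop: first fit a shared tracker on dialogues pooled from other domains, then continue training on whatever in-domain data exists. The PyTorch sketch below is a generic stand-in under assumed data loaders (the model and the batch variables are hypothetical), not the recurrent architecture used in the paper.

```python
import torch
import torch.nn as nn

class TrackerRNN(nn.Module):
    """A generic GRU-based tracker head used only to illustrate the procedure."""
    def __init__(self, vocab=1000, dim=64, n_values=10):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_values)

    def forward(self, tokens):
        _, h = self.rnn(self.emb(tokens))   # h: (num_layers, batch, dim)
        return self.out(h[-1])              # slot-value scores per dialogue turn

def train(model, batches, epochs=1, lr=1e-3):
    """Standard supervised training over (token, label) batches."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for tokens, labels in batches:
            opt.zero_grad()
            loss = loss_fn(model(tokens), labels)
            loss.backward()
            opt.step()

model = TrackerRNN()
# Hypothetical loaders; the two-stage procedure is the point being illustrated:
# train(model, out_of_domain_batches)   # 1. initialise on pooled other domains
# train(model, in_domain_batches)       # 2. fine-tune on the target domain
```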


PLOS ONE | 2012

Text Mining for Literature Review and Knowledge Discovery in Cancer Risk Assessment and Research

Anna Korhonen; Diarmuid Ó Séaghdha; Ilona Silins; Lin Sun; Johan Högberg; Ulla Stenius

Research in biomedical text mining is starting to produce technology which can make information in biomedical literature more accessible to bio-scientists. One of the current challenges is to integrate and refine this technology to support real-life scientific tasks in biomedicine, and to evaluate its usefulness in the context of such tasks. We describe CRAB – a fully integrated text mining tool designed to support chemical health risk assessment. This task is complex and time-consuming, requiring a thorough review of existing scientific data on a particular chemical. Covering human, animal, cellular and other mechanistic data from various fields of biomedicine, this information is highly varied and therefore difficult to harvest from literature databases by manual means. Our tool automates the process by extracting relevant scientific data from published literature and classifying it according to multiple qualitative dimensions. Developed in close collaboration with risk assessors, the tool allows users to navigate the classified dataset in various ways and to share the data with other users. We present a direct and user-based evaluation which shows that the technology integrated in the tool is highly accurate, and report a number of case studies which demonstrate how the tool can be used to support scientific discovery in cancer risk assessment and research. Our work demonstrates the usefulness of a text mining pipeline in facilitating complex research tasks in biomedicine. We discuss further development and application of our technology to other types of chemical risk assessment in the future.
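
At its core, the classification step can be viewed as multi-label text classification over risk-assessment dimensions. The scikit-learn sketch below shows that pattern on toy data; the example labels ("animal", "human", "cellular") and documents are invented for illustration and do not reproduce CRAB's actual taxonomy or models.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy abstracts labelled along hypothetical risk-assessment dimensions.
docs = ["carcinogenic effects observed in rat liver tissue",
        "human cohort study of occupational exposure",
        "in vitro assay shows DNA damage in cell lines"]
labels = [{"animal"}, {"human"}, {"cellular"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One binary classifier per dimension over TF-IDF features.
clf = make_pipeline(TfidfVectorizer(),
                    OneVsRestClassifier(LogisticRegression(max_iter=1000)))
clf.fit(docs, Y)
print(mlb.inverse_transform(clf.predict(["mouse liver tumour study"])))
```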


Meeting of the Association for Computational Linguistics | 2009

Using Lexical and Relational Similarity to Classify Semantic Relations

Diarmuid Ó Séaghdha; Ann A. Copestake

Many methods are available for computing semantic similarity between individual words, but certain NLP tasks require the comparison of word pairs. This paper presents a kernel-based framework for application to relational reasoning tasks of this kind. The model presented here combines information about two distinct types of word pair similarity: lexical similarity and relational similarity. We present an efficient and flexible technique for implementing relational similarity and show the effectiveness of combining lexical and relational models by demonstrating state-of-the-art results on a compound noun interpretation task.
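
Because a weighted sum of valid kernels is itself a valid kernel, lexical and relational Gram matrices can simply be combined before training an SVM. The sketch below illustrates this with hand-made toy matrices and scikit-learn's precomputed-kernel interface; the numbers and the equal weighting are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def combine_kernels(K_lex, K_rel, w=0.5):
    """Weighted sum of a lexical and a relational Gram matrix (still a kernel)."""
    return w * K_lex + (1 - w) * K_rel

# Toy precomputed Gram matrices for four word pairs (values are illustrative).
K_lex = np.array([[1.0, 0.8, 0.2, 0.1], [0.8, 1.0, 0.3, 0.2],
                  [0.2, 0.3, 1.0, 0.7], [0.1, 0.2, 0.7, 1.0]])
K_rel = np.array([[1.0, 0.6, 0.1, 0.0], [0.6, 1.0, 0.2, 0.1],
                  [0.1, 0.2, 1.0, 0.9], [0.0, 0.1, 0.9, 1.0]])
y = [0, 0, 1, 1]

clf = SVC(kernel="precomputed")
clf.fit(combine_kernels(K_lex, K_rel), y)
print(clf.predict(combine_kernels(K_lex, K_rel)))
```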


BMC Bioinformatics | 2011

Exploring subdomain variation in biomedical language

Thomas Lippincott; Diarmuid Ó Séaghdha; Anna Korhonen

Background: Applications of Natural Language Processing (NLP) technology to biomedical texts have generated significant interest in recent years. In this paper we identify and investigate the phenomenon of linguistic subdomain variation within the biomedical domain, i.e., the extent to which different subject areas of biomedicine are characterised by different linguistic behaviour. While variation at a coarser domain level, such as between newswire and biomedical text, is well-studied and known to affect the portability of NLP systems, we are the first to conduct an extensive investigation into more fine-grained levels of variation.

Results: Using the large OpenPMC text corpus, which spans the many subdomains of biomedicine, we investigate variation across a number of lexical, syntactic, semantic and discourse-related dimensions. These dimensions are chosen for their relevance to the performance of NLP systems. We use clustering techniques to analyse commonalities and distinctions among the subdomains.

Conclusions: We find that while patterns of inter-subdomain variation differ somewhat from one feature set to another, robust clusters can be identified that correspond to intuitive distinctions such as that between clinical and laboratory subjects. In particular, subdomains relating to genetics and molecular biology, which are the most common sources of material for training and evaluating biomedical NLP tools, are not representative of all biomedical subdomains. We conclude that an awareness of subdomain variation is important when considering the practical use of language processing applications by biomedical researchers.
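
The clustering step can be pictured as representing each subdomain by a feature profile and applying agglomerative clustering over profile distances. In the sketch below the subdomain names and random feature vectors are placeholders; the study itself derives lexical, syntactic, semantic and discourse features from the corpus.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical profiles: one row per subdomain, columns would hold normalised
# lexical / syntactic / semantic / discourse feature frequencies.
subdomains = ["genetics", "molecular_biology", "clinical_oncology", "nursing"]
rng = np.random.default_rng(1)
profiles = rng.random((len(subdomains), 30))

# Average-linkage agglomerative clustering over cosine distances.
Z = linkage(profiles, method="average", metric="cosine")
print(dict(zip(subdomains, fcluster(Z, t=2, criterion="maxclust"))))
```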


Meeting of the Association for Computational Linguistics | 2007

Annotating and Learning Compound Noun Semantics

Diarmuid Ó Séaghdha

There is little consensus on a standard experimental design for the compound interpretation task. This paper introduces well-motivated general desiderata for semantic annotation schemes, and describes such a scheme for in-context compound annotation accompanied by detailed publicly available guidelines. Classification experiments on an open-text dataset compare favourably with previously reported results and provide a solid baseline for future research.

Collaboration


Dive into Diarmuid Ó Séaghdha's collaboration.

Top Co-Authors

Preslav Nakov, Qatar Computing Research Institute
Su Nam Kim, University of Melbourne
Zornitsa Kozareva, Information Sciences Institute