Derek Greene | Researchain

Publication

Featured research published by Derek Greene.

advances in social networks analysis and mining | 2010

Tracking the Evolution of Communities in Dynamic Social Networks

Derek Greene; Dónal Doyle; Pádraig Cunningham

Real-world social networks from a variety of domains can naturally be modelled as dynamic graphs. However, approaches to detecting communities have largely focused on identifying communities in static graphs. Recently, researchers have begun to consider the problem of tracking the evolution of groups of users in dynamic scenarios. Here we describe a model for tracking the progress of communities over time in a dynamic network, where each community is characterised by a series of significant evolutionary events. This model is used to motivate a community-matching strategy for efficiently identifying and tracking dynamic communities. Evaluations on synthetic graphs containing embedded events demonstrate that this strategy can successfully track communities over time in volatile networks. In addition, we describe experiments exploring the dynamic communities detected in a real mobile operator network containing millions of users.
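A community-matching strategy of this kind can be sketched in a few lines. The abstract does not state the similarity measure or the matching rule, so the Jaccard coefficient, the threshold value, and the names `jaccard` and `track` below are assumptions made for illustration, not the authors' implementation. The sketch covers only the birth and continuation of communities; splits and deaths are not modelled.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of node identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def track(steps, threshold=0.3):
    """Link communities found at successive time steps into timelines.

    steps: list (one entry per time step) of lists of sets of nodes.
    Returns a list of timelines, each a list of (step_index, community).
    The threshold is an illustrative assumption.
    """
    timelines = []
    for t, communities in enumerate(steps):
        # Most recent observation of every timeline known before this step.
        fronts = [timeline[-1][1] for timeline in timelines]
        for community in communities:
            matched = [i for i, front in enumerate(fronts)
                       if jaccard(community, front) > threshold]
            if matched:
                # Continuation: extend every timeline whose front matches.
                for i in matched:
                    timelines[i].append((t, community))
            else:
                # Birth: no existing timeline is similar enough.
                timelines.append([(t, community)])
    return timelines


steps = [
    [{1, 2, 3}, {4, 5, 6}],
    [{1, 2, 3, 7}, {8, 9}],
    [{1, 2, 7}, {4, 5, 6}],
]
for timeline in track(steps):
    print([step for step, _ in timeline])
```

In this toy input the second community is not observed at step 1 but is matched again at step 2, because a timeline's front is kept until something replaces it.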

Molecular Cell | 2012

Hierarchical Modularity and the Evolution of Genetic Interactomes across Species

Colm J. Ryan; Assen Roguev; Kristin L. Patrick; Jiewei Xu; Harlizawati Jahari; Zongtian Tong; Pedro Beltrao; Michael Shales; Hong Qu; Sean R. Collins; Joseph I. Kliegman; Lingli Jiang; Dwight Kuo; Elena Tosti; Hyun Soo Kim; Winfried Edelmann; Michael Christopher Keogh; Derek Greene; Chao Tang; Pádraig Cunningham; Kevan M. Shokat; Gerard Cagney; J. Peter Svensson; Christine Guthrie; Peter J. Espenshade; Trey Ideker; Nevan J. Krogan

To date, cross-species comparisons of genetic interactomes have been restricted to small or functionally related gene sets, limiting our ability to infer evolutionary trends. To facilitate a more comprehensive analysis, we constructed a genome-scale epistasis map (E-MAP) for the fission yeast Schizosaccharomyces pombe, providing phenotypic signatures for ~60% of the nonessential genome. Using these signatures, we generated a catalog of 297 functional modules, and we assigned function to 144 previously uncharacterized genes, including mRNA splicing and DNA damage checkpoint factors. Comparison with an integrated genetic interactome from the budding yeast Saccharomyces cerevisiae revealed a hierarchical model for the evolution of genetic interactions, with conservation highest within protein complexes, lower within biological processes, and lowest between distinct biological processes. Despite the large evolutionary distance and extensive rewiring of individual interactions, both networks retain conserved features and display similar levels of functional crosstalk between biological processes, suggesting general design principles of genetic interactomes.

web search and data mining | 2013

Unsupervised graph-based topic labelling using DBpedia

Ioana Hulpuş; Conor Hayes; Marcel Karnstedt; Derek Greene

Automated topic labelling brings benefits for users aiming at analysing and understanding document collections, as well as for search engines targeting the linkage between groups of words and their inherent topics. Current approaches to achieve this suffer in quality, but we argue their performance might be improved by focusing on the structure in the data. Building upon research for concept disambiguation and linking to DBpedia, we are taking a novel approach to topic labelling by making use of structured data exposed by DBpedia. We start from the hypothesis that words co-occurring in text likely refer to concepts that belong closely together in the DBpedia graph. Using graph centrality measures, we show that we are able to identify the concepts that best represent the topics. We comparatively evaluate our graph-based approach and the standard text-based approach, on topics extracted from three corpora, based on results gathered in a crowd-sourcing experiment. Our research shows that graph-based analysis of DBpedia can achieve better results for topic labelling in terms of both precision and topic coverage.

Expert Systems With Applications | 2012

SMS spam filtering

Sarah Jane Delany; Mark Buckley; Derek Greene

Highlights: We motivate the need for content-based SMS spam filtering. We discuss similarities/differences between email and SMS spam filtering. We review recent research in SMS spam filtering. We analyse recent SMS spam messages and make a dataset available. Early days, no consensus yet on best techniques but significant challenges exist.

Mobile or SMS spam is a real and growing problem, primarily due to the availability of very cheap bulk pre-pay SMS packages and the fact that SMS engenders higher response rates as it is a trusted and personal service. SMS spam filtering is a relatively new task which inherits many issues and solutions from email spam filtering. However, it poses its own specific challenges. This paper motivates work on filtering SMS spam and reviews recent developments in SMS spam filtering. The paper also discusses the issues with data collection and availability for furthering research in this area, analyses a large corpus of SMS spam, and provides some initial benchmark results.

european conference on artificial intelligence | 2010

Using Crowdsourcing and Active Learning to Track Sentiment in Online Media

Anthony Brew; Derek Greene; Pádraig Cunningham

Tracking sentiment in the popular media has long been of interest to media analysts and pundits. With the availability of news content via online syndicated feeds, it is now possible to automate some aspects of this process. There is also great potential to crowdsource much of the annotation work that is required to train a machine learning system to perform sentiment scoring. (Crowdsourcing is a term, sometimes associated with Web 2.0 technologies, that describes outsourcing of tasks to a large, often anonymous community.) We describe such a system for tracking economic sentiment in online media that has been deployed since August 2009. It uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items. We report on the design challenges addressed in managing the effort of the annotators and in making annotation an interesting experience.

international conference on machine learning | 2006

Practical solutions to the problem of diagonal dominance in kernel document clustering

Derek Greene; Pádraig Cunningham

In supervised kernel methods, it has been observed that the performance of the SVM classifier is poor in cases where the diagonal entries of the Gram matrix are large relative to the off-diagonal entries. This problem, referred to as diagonal dominance, often occurs when certain kernel functions are applied to sparse high-dimensional data, such as text corpora. In this paper we investigate the implications of diagonal dominance for unsupervised kernel methods, specifically in the task of document clustering. We propose a selection of strategies for addressing this issue, and evaluate their effectiveness in producing more accurate and stable clusterings.
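To make the notion of diagonal dominance concrete, the sketch below builds a linear-kernel Gram matrix for a few sparse documents and compares the mean diagonal entry with the mean off-diagonal entry. The ratio used here, the toy documents, and the function names are illustrative assumptions; the abstract does not specify how dominance is measured or which strategies the paper proposes.

```python
def linear_kernel(docs):
    """Gram matrix of dot products between sparse term-weight dicts."""
    def dot(a, b):
        # Iterate over the shorter vector for efficiency.
        if len(a) > len(b):
            a, b = b, a
        return sum(weight * b.get(term, 0.0) for term, weight in a.items())
    return [[dot(x, y) for y in docs] for x in docs]


def dominance_ratio(gram):
    """Mean diagonal entry divided by mean off-diagonal entry.

    A hypothetical summary statistic: values far above 1 indicate that
    each document is much more similar to itself than to any other.
    """
    n = len(gram)
    diag = sum(gram[i][i] for i in range(n)) / n
    off = sum(gram[i][j] for i in range(n) for j in range(n) if i != j)
    off /= n * (n - 1)
    return diag / off if off else float("inf")


# Toy documents with little term overlap, as is typical of sparse text data.
docs = [
    {"kernel": 2.0, "svm": 1.0},
    {"kernel": 1.0, "text": 2.0},
    {"text": 1.0, "cluster": 2.0},
]
print(dominance_ratio(linear_kernel(docs)))
```

With so little shared vocabulary, the self-similarities dominate the matrix, which leaves a clustering algorithm with few informative pairwise similarities to work from.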

computer-based medical systems | 2004

Ensemble clustering in medical diagnostics

Derek Greene; Alexey Tsymbal; Nadia Bolshakova; Pádraig Cunningham

Ensemble techniques have been successfully applied in the context of supervised learning to increase the accuracy and stability of classification. Recently, analogous techniques for cluster analysis have been suggested. Research has demonstrated that, by combining a collection of dissimilar clusterings, an improved solution can be obtained. In this paper, we examine the potential of applying ensemble clustering techniques with a focus on the area of medical diagnostics. We present several ensemble generation and integration strategies, and evaluate each approach on a number of synthetic and real-world datasets. In addition, we show that diversity among ensemble members is necessary, but not sufficient to yield an improved solution without the selection of an appropriate integration method.
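One common way to integrate an ensemble of clusterings is a co-association matrix, which records how often each pair of items is placed in the same cluster. The abstract does not name the integration strategies the paper evaluates, so the following is only a minimal sketch of that general idea; the 0.5 threshold and the connected-components consensus step are assumptions.

```python
def co_association(labelings):
    """Fraction of base clusterings that put items i and j together."""
    n = len(labelings[0])
    m = len(labelings)
    return [[sum(lab[i] == lab[j] for lab in labelings) / m
             for j in range(n)] for i in range(n)]


def consensus(labelings, threshold=0.5):
    """Group items linked by co-association above the threshold.

    Items are connected when they co-occur in more than `threshold` of
    the base clusterings; each connected component becomes one cluster.
    """
    co = co_association(labelings)
    n = len(co)
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base clusterings of six items; cluster ids need not agree.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
print(consensus(ensemble))
```

The second base clustering disagrees with the other two about the third item, and the majority view prevails in the consensus.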

knowledge discovery and data mining | 2010

Distortion as a validation criterion in the identification of suspicious reviews

Guangyu Wu; Derek Greene; Barry Smyth; Pádraig Cunningham

Assessing the trustworthiness of reviews is a key issue for the maintainers of opinion sites such as TripAdvisor. In this paper we propose a distortion criterion for assessing the impact of methods for uncovering suspicious hotel reviews in TripAdvisor. The principle is that dishonest reviews will distort the overall popularity ranking for a collection of hotels. Thus a mechanism that deletes dishonest reviews will distort the popularity ranking significantly, when compared with the removal of a similar set of reviews at random. This distortion can be quantified by comparing popularity rankings before and after deletion, using rank correlation. We present an evaluation of this strategy in the assessment of shill detection mechanisms on a dataset of hotel reviews collected from TripAdvisor.
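The distortion idea can be illustrated with a small calculation. In this sketch the popularity ranking is approximated by mean rating per hotel and the rank correlation is Spearman's coefficient; both choices, the toy data, and all function names are assumptions for illustration, since the abstract specifies only that rankings are compared using rank correlation. The sketch also assumes every hotel keeps at least one review after deletion and that there are no tied ranks.

```python
def mean_rating(reviews):
    """reviews: list of (hotel, rating, review_id) -> {hotel: mean rating}."""
    totals = {}
    for hotel, rating, _ in reviews:
        total, count = totals.get(hotel, (0, 0))
        totals[hotel] = (total + rating, count + 1)
    return {hotel: total / count for hotel, (total, count) in totals.items()}


def ranking(scores):
    """Return {hotel: rank}, where rank 1 is the highest score."""
    ordered = sorted(scores, key=lambda hotel: (-scores[hotel], hotel))
    return {hotel: position + 1 for position, hotel in enumerate(ordered)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d2 / (n * (n * n - 1))


def distortion(reviews, removed_ids):
    """1 minus the rank correlation between rankings before and after deletion."""
    before = ranking(mean_rating(reviews))
    kept = [review for review in reviews if review[2] not in removed_ids]
    after = ranking(mean_rating(kept))
    return 1 - spearman(before, after)


reviews = [
    ("A", 5, 1), ("A", 5, 2), ("A", 1, 3),
    ("B", 4, 4), ("B", 4, 5),
    ("C", 3, 6), ("C", 2, 7),
]
print(distortion(reviews, {1, 2}))  # delete the two suspicious 5-star reviews
print(distortion(reviews, {5, 7}))  # delete two reviews chosen arbitrarily
```

Under the criterion described above, a detection method is credible when the reviews it flags produce a larger distortion than a comparable set removed at random.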

Machine Learning Techniques for Multimedia | 2008

Unsupervised Learning and Clustering

Derek Greene; Pádraig Cunningham; Rudolf Mayer

Unsupervised learning is very important in the processing of multimedia content as clustering or partitioning of data in the absence of class labels is often a requirement. This chapter begins with a review of the classic clustering techniques of k-means clustering and hierarchical clustering. Modern advances in clustering are covered with an analysis of kernel-based clustering and spectral clustering. One of the most popular unsupervised learning techniques for processing multimedia content is the self-organizing map, so a review of self-organizing maps and variants is presented in this chapter. The absence of class labels in unsupervised learning makes the question of evaluation and cluster quality assessment more complicated than in supervised learning. So this chapter also includes a comprehensive analysis of cluster validity assessment techniques.

web science | 2013

Producing a unified graph representation from multiple social network views

Derek Greene; Pádraig Cunningham

In many social networks, several different link relations will exist between the same set of users. Additionally, attribute or textual information will be associated with those users, such as demographic details or user-generated content. For many data analysis tasks, such as community finding and data visualisation, the provision of multiple heterogeneous types of user data makes the analysis process more complex. We propose an unsupervised method for integrating multiple data views to produce a single unified graph representation, based on the combination of the k-nearest neighbour sets for users derived from each view. These views can be either relation-based or feature-based. The proposed method is evaluated on a number of annotated multi-view Twitter datasets, where it is shown to support the discovery of the underlying community structure in the data.
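The combination of per-view neighbour sets can be sketched as follows. The abstract says only that the unified graph is based on combining the k-nearest neighbour sets from each view, so the specific rule used here (take the union of neighbour pairs and weight each edge by the number of views that support it) is an assumption, as are the function names and the toy similarity values.

```python
def knn_sets(sim, k):
    """sim: {user: {other: similarity}} -> {user: set of k most similar others}."""
    result = {}
    for user, row in sim.items():
        ordered = sorted(row, key=lambda other: (-row[other], other))
        result[user] = set(ordered[:k])
    return result


def unified_graph(views, k):
    """Merge views into undirected weighted edges {frozenset({u, v}): weight}.

    The weight counts the views in which at least one endpoint lists the
    other among its k nearest neighbours (an illustrative choice).
    """
    edges = {}
    for sim in views:
        supported = set()
        for user, neighbours in knn_sets(sim, k).items():
            for other in neighbours:
                supported.add(frozenset((user, other)))
        for edge in supported:
            edges[edge] = edges.get(edge, 0) + 1
    return edges


# Two hypothetical views over the same four users: one relation-based
# (e.g. follower overlap) and one feature-based (e.g. content similarity).
relation_view = {
    "a": {"b": 0.9, "c": 0.1, "d": 0.0},
    "b": {"a": 0.9, "c": 0.2, "d": 0.1},
    "c": {"a": 0.1, "b": 0.2, "d": 0.8},
    "d": {"a": 0.0, "b": 0.1, "c": 0.8},
}
content_view = {
    "a": {"b": 0.7, "c": 0.3, "d": 0.1},
    "b": {"a": 0.7, "c": 0.6, "d": 0.1},
    "c": {"a": 0.3, "b": 0.6, "d": 0.5},
    "d": {"a": 0.1, "b": 0.1, "c": 0.5},
}
graph = unified_graph([relation_view, content_view], k=1)
for edge, weight in sorted(graph.items(), key=lambda item: sorted(item[0])):
    print(sorted(edge), weight)
```

Because every view is reduced to neighbour sets before merging, relation-based and feature-based similarities never have to be placed on a common numeric scale.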
