Publications

Featured research published by Ilias N. Flaounas.


Digital Journalism | 2013

Research Methods in the Age of Digital Journalism

Ilias N. Flaounas; Omar Ali; Thomas Lansdall-Welfare; Tijl De Bie; Nicholas Alexander Mosdell; Justin Matthew Wren Lewis; Nello Cristianini

News content analysis is usually preceded by a labour-intensive coding phase, where experts extract key information from news items. The cost of this phase imposes limitations on the sample sizes that can be processed, and therefore on the kinds of questions that can be addressed. In this paper we describe an approach that incorporates text-analysis technologies to automate some of these tasks, enabling us to analyse data sets many orders of magnitude larger than those normally used. The patterns detected by our method include: (1) similarities in writing style among several outlets, which reflect reader demographics; (2) gender imbalance in media content and its relation to topic; (3) the relationship between topic and popularity of articles.


International Conference on Management of Data | 2011

NOAM: news outlets analysis and monitoring system

Ilias N. Flaounas; Omar Ali; Marco Turchi; Tristan Snowsill; Florent Nicart; Tijl De Bie; Nello Cristianini

We present NOAM, an integrated platform for the monitoring and analysis of news media content. NOAM is the data management system behind various applications and scientific studies aiming at modelling the mediasphere. The system is also intended to address the need in the AI community for platforms where various AI technologies are integrated and deployed in the real world. It combines a relational database (DB) with state-of-the-art AI technologies, including data mining, machine learning and natural language processing. These technologies are organised in a robust, distributed architecture of collaborating modules that are used to populate and annotate the DB. NOAM manages tens of millions of news items in multiple languages, automatically annotating them to enable queries based on their semantic properties. The system also includes a unified user interface for interacting with its various modules.
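The abstract describes a store of news items that independent modules progressively annotate so that later queries can filter on semantic properties. The sketch below illustrates that modular-annotation idea in miniature; all names (`NewsItem`, `language_module`, `topic_module`, `query`) and the toy detection rules are invented here, not NOAM's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class NewsItem:
    text: str
    annotations: dict = field(default_factory=dict)

def language_module(item: NewsItem) -> None:
    # Toy stand-in for a real language detector.
    item.annotations["lang"] = "en" if " the " in f" {item.text.lower()} " else "unknown"

def topic_module(item: NewsItem) -> None:
    # Toy stand-in for a real topic classifier.
    item.annotations["topic"] = "finance" if "market" in item.text.lower() else "other"

def annotate(items, modules):
    """Run every annotation module over every stored item."""
    for item in items:
        for module in modules:
            module(item)
    return items

def query(items, **conditions):
    """Semantic query: items whose annotations match every condition."""
    return [i for i in items
            if all(i.annotations.get(k) == v for k, v in conditions.items())]
```

In the real system the modules are distributed processes writing annotations back to a relational database; here a plain list stands in for the store.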


PLOS ONE | 2010

The Structure of the EU Mediasphere

Ilias N. Flaounas; Marco Turchi; Omar Ali; Nick Fyson; Tijl De Bie; Nicholas Alexander Mosdell; Justin Matthew Wren Lewis; Nello Cristianini

Background: A trend towards automation of scientific research has recently resulted in what has been termed “data-driven inquiry” in various disciplines, including physics and biology. The automation of many tasks has been identified as a possible future also for the humanities and the social sciences, particularly in those disciplines concerned with the analysis of text, due to the recent availability of millions of books and news articles in digital format. In the social sciences, the analysis of news media is done largely by hand and in a hypothesis-driven fashion: the scholar needs to formulate a very specific assumption about the patterns that might be in the data, and then set out to verify whether they are present or not.

Methodology/Principal Findings: In this study, we report what we believe is the first large-scale content analysis of cross-linguistic text in the social sciences, using various artificial intelligence techniques. We analyse 1.3 million news articles in 22 languages, detecting a clear structure in the choice of stories covered by the various outlets. This is significantly affected by objective national, geographic, economic and cultural relations among outlets and countries, e.g., outlets from countries sharing strong economic ties are more likely to cover the same stories. We also show that the deviation from average content is significantly correlated with membership of the eurozone, as well as with the year of accession to the EU.

Conclusions/Significance: While independently making a multitude of small editorial decisions, the leading media of the 27 EU countries, over a period of six months, shaped the contents of the EU mediasphere in a way that reflects its deep geographic, economic and cultural relations. Detecting these subtle signals in a statistically rigorous way would be out of the reach of traditional methods. This analysis demonstrates the power of the available methods for significant automation of media content analysis.


Expert Systems with Applications | 2014

Efficient classification of multi-labeled text streams by clashing

Ricardo Ñanculef; Ilias N. Flaounas; Nello Cristianini

We present a method for the classification of multi-labeled text documents explicitly designed for data stream applications that must process a virtually infinite sequence of data using constant memory and constant processing time. Our method is composed of an online procedure used to efficiently map text into a low-dimensional feature space and a partition of this space into a set of regions, for which the system extracts and keeps statistics used to predict multi-label text annotations. Documents are fed into the system as a sequence of words, mapped to a region of the partition, and annotated using the statistics computed from the labeled instances colliding in the same region. This approach is referred to as clashing. We illustrate the method on real-world text data, comparing the results with those obtained using other text classifiers. In addition, we provide an analysis of the effect of the representation space dimensionality on the predictive performance of the system. Our results show that the online embedding indeed approximates the geometry of the full corpus-wise TF and TF-IDF space. The model obtains competitive F-measures with respect to the most accurate methods, using significantly fewer computational resources. In addition, the method achieves a higher macro-averaged F-measure than methods with similar running time. Furthermore, the system is able to learn faster than the other methods from partially labeled streams.
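The core mechanism above can be sketched in a few lines: documents are mapped into a fixed number of regions, and each region keeps label statistics from the labeled documents that collide there. A crude word hash below stands in for the paper's online low-dimensional embedding, and all names (`region_of`, `ClashingClassifier`, the threshold) are assumptions for illustration, not the published algorithm.

```python
import zlib
from collections import Counter, defaultdict

NUM_REGIONS = 64  # fixed partition size => constant memory

def region_of(text: str, num_regions: int = NUM_REGIONS) -> int:
    """Map a document to a region via a cheap, deterministic word hash."""
    h = 0
    for word in text.lower().split():
        h ^= zlib.crc32(word.encode())
    return h % num_regions

class ClashingClassifier:
    def __init__(self):
        self.label_counts = defaultdict(Counter)  # region -> label frequencies
        self.totals = Counter()                   # region -> documents seen

    def learn(self, text: str, labels: set) -> None:
        r = region_of(text)
        self.totals[r] += 1
        for label in labels:
            self.label_counts[r][label] += 1

    def predict(self, text: str, threshold: float = 0.5) -> set:
        """Predict every label seen in the region in at least half its documents."""
        r = region_of(text)
        if self.totals[r] == 0:
            return set()
        return {lab for lab, c in self.label_counts[r].items()
                if c / self.totals[r] >= threshold}
```

Because the number of regions and the per-region counters are fixed, both memory and per-document processing time stay constant regardless of stream length, which is the property the abstract emphasises.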


European Conference on Machine Learning | 2009

Inference and validation of networks

Ilias N. Flaounas; Marco Turchi; Tijl De Bie; Nello Cristianini

We develop a statistical methodology to validate the results of network inference algorithms, based on principles of statistical testing and machine learning. The comparison of results with reference networks, by means of similarity measures and null models, allows us to measure the significance of results, as well as their predictive power. The use of Generalised Linear Models allows us to explain the results in terms of available ground truth, which we expect to be partially relevant. We present these methods for the case of inferring a network of news outlets based on the stories they prefer to cover. We compare three simple network inference methods and show how our technique can be used to choose between them. All the methods presented here can be directly applied to other domains where network inference is used.
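The null-model principle described above can be illustrated with a toy example: score the inferred network against a reference by edge overlap, then ask how often random networks of the same size score at least as well. The function names, the Jaccard similarity measure, and the uniform random-edge null model are choices made here for illustration; the paper's actual measures and null models may differ.

```python
import random

def jaccard(edges_a, edges_b):
    """Edge-set overlap between two networks."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def null_model_p_value(inferred, reference, nodes, trials=500, seed=0):
    """Fraction of same-sized random edge sets that match the reference
    at least as well as the inferred network does."""
    rng = random.Random(seed)
    observed = jaccard(inferred, reference)
    # All possible undirected edges over the node set.
    possible = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]
    hits = sum(
        jaccard(rng.sample(possible, len(inferred)), reference) >= observed
        for _ in range(trials)
    )
    return hits / trials
```

A small p-value says the inferred network resembles the reference far more than chance alone would produce, which is the sense in which the result is "significant".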


Artificial Intelligence Applications and Innovations | 2010

Learning the Preferences of News Readers with SVM and Lasso Ranking

Elena Hensinger; Ilias N. Flaounas; Nello Cristianini

We attack the task of predicting which news stories are more appealing to a given audience by comparing ‘most popular stories’, gathered from various online news outlets over a period of seven months, with stories that did not become popular despite appearing on the same page at the same time. We cast this as a learning-to-rank task, and train two different learning algorithms to reproduce the preferences of the readers within each of the outlets. The first method is based on Support Vector Machines, the second on the Lasso. By just using words as features, SVM ranking can reach significant accuracy in correctly predicting the preference of readers for a given pair of articles. Furthermore, by exploiting the sparsity of the solutions found by the Lasso, we can also generate lists of keywords that are expected to trigger the attention of the outlets’ readers.
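The pairwise reduction at the heart of learning-to-rank can be sketched with a simple stand-in learner: represent each article as word counts and learn a linear scoring function so that the popular article of each pair scores higher. The paper uses SVM and Lasso; the perceptron-style update below is a deliberately simplified substitute, and all names here are invented for illustration.

```python
from collections import Counter

def features(text: str) -> Counter:
    """Bag-of-words representation of an article."""
    return Counter(text.lower().split())

def train_ranker(pairs, epochs=10) -> Counter:
    """pairs: list of (popular_text, unpopular_text) shown at the same time.
    Learns weights so that popular articles score above their rivals."""
    w = Counter()
    for _ in range(epochs):
        for popular, unpopular in pairs:
            diff = features(popular)
            diff.subtract(features(unpopular))
            # Update only when the popular article does not already win.
            if sum(w[t] * c for t, c in diff.items()) <= 0:
                for t, c in diff.items():
                    w[t] += c
    return w

def score(w: Counter, text: str) -> float:
    return sum(w[t] * c for t, c in features(text).items())
```

Comparing two articles from the same page then reduces to comparing their scores, and inspecting the largest weights gives a rough analogue of the keyword lists the Lasso's sparsity provides.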


2nd International Workshop on Cognitive Information Processing | 2010

Predicting relations in news-media content among EU countries

Ilias N. Flaounas; Nick Fyson; Nello Cristianini

We investigate the complex relations existing within news content in the 27 countries of the European Union (EU). In particular we are interested in detecting and modelling any biases in the patterns of content that appear in news outlets of different countries.


European Conference on Machine Learning | 2010

Detecting events in a million New York Times articles

Tristan Snowsill; Ilias N. Flaounas; Tijl De Bie; Nello Cristianini

We present a demonstration of a newly developed text stream event detection method on over a million articles from the New York Times corpus. The event detection is designed to operate in a predominantly on-line fashion, reporting new events within a specified timeframe. The event detection is achieved by detecting significant changes in the statistical properties of the text where those properties are efficiently stored and updated in a suffix tree. This particular demonstration shows how our method is effective at discovering both short- and long-term events (which are often denoted topics), and how it automatically copes with topic drift on a corpus of 1 035 263 articles.
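The underlying idea of flagging significant changes in the text's statistical properties can be shown with a much-simplified sketch: flag terms whose frequency in the current window of articles rises sharply against their historical rate. The actual system maintains these statistics efficiently in a suffix tree over n-grams; the thresholds and function name below are assumptions for illustration only.

```python
from collections import Counter

def detect_events(history_docs, window_docs, min_ratio=5.0, min_count=3):
    """Return terms whose rate in the current window exceeds their
    historical rate by at least min_ratio (toy unigram version)."""
    hist = Counter(w for d in history_docs for w in d.lower().split())
    wind = Counter(w for d in window_docs for w in d.lower().split())
    hist_total = sum(hist.values()) or 1
    wind_total = sum(wind.values()) or 1
    events = []
    for term, count in wind.items():
        if count < min_count:
            continue
        hist_rate = (hist[term] + 1) / hist_total   # add-one smoothing
        wind_rate = count / wind_total
        if wind_rate / hist_rate >= min_ratio:
            events.append(term)
    return events
```

Sliding the window forward and letting old counts decay is one simple way to cope with topic drift, since the "historical" baseline then tracks the corpus as it changes.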


Web Intelligence | 2009

Detecting Macro-patterns in the European Mediasphere

Ilias N. Flaounas; Marco Turchi; Nello Cristianini

The analysis of the contents of news outlets has been the focus of social scientists for a long time. However, content analysis is often performed on hand-coded documents, which limits the size of the data accessible to the investigation and consequently limits the possibility of detecting macro-trends. The use of text categorisation, clustering and statistical machine translation (SMT) enables us to operate automatically on vast amounts of news items, and consequently to analyse patterns in the content of outlets in different languages, over long time periods. We report on experiments involving hundreds of European media in 22 different languages, demonstrating how it is possible to detect similarities and differences between outlets, and between countries, based on the contents of their articles.


Archive | 2013

Modelling and Explaining Online News Preferences

Elena Hensinger; Ilias N. Flaounas; Nello Cristianini

We use Machine Learning techniques to model the reading preferences of audiences of 14 online news outlets. The models, describing the appeal of a given article to each audience, are formed by linear functions of word frequencies, and are obtained by comparing articles that became “Most Popular” on a given day in a given outlet with articles that did not. We make use of 2,432,148 such article pairs, collected over a period of more than 1.5 years. These models are shown to be predictive of user choices, and are then used to compare both the audiences and the contents of various news outlets. In the first case, we find a significant correlation between the demographic profiles of audiences and their preferences. In the second case, we find that content appeal is related both to writing style, with more sentimentally charged language being preferred, and to topic, with “Public Affairs” content, such as “Finance” and “Politics”, being less preferred.

Collaboration


An overview of Ilias N. Flaounas's collaborations.

Top Co-Authors


Omar Ali

University of Bristol
