Martin Leginus
Aalborg University
Publication
Featured research published by Martin Leginus.
international conference on user modeling adaptation and personalization | 2012
Martin Leginus; Peter Dolog; Valdas Žemaitis
Social tagging systems (STS) model three types of entities (tags, users and items), and the relationships between them are encoded in a third-order tensor. Latent relationships and patterns can be discovered by applying tensor factorization techniques such as Higher Order Singular Value Decomposition (HOSVD) or Canonical Decomposition. STS accumulate large amounts of sparse data, which limits the ability of factorization techniques to detect latent relations and significantly slows the factorization process. We propose to reduce the tag space by exploiting clustering techniques, so that recommendation quality and execution time improve and memory requirements decrease. The clustering is motivated by the fact that many tags in a tag space are semantically similar and can therefore be grouped. Finally, promising experimental results are presented.
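The tag-space reduction described in this abstract can be illustrated with a minimal sketch. It assumes the tensor is stored sparsely as a dictionary keyed by (user, item, tag) and that a tag-to-cluster mapping has already been computed by some clustering step; the function name `merge_tags`, the data layout and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def merge_tags(assignments, tag_cluster):
    """Collapse the tag mode of a sparse (user, item, tag) tensor onto clusters.

    assignments: dict mapping (user, item, tag) -> weight (hypothetical layout).
    tag_cluster: dict mapping tag -> cluster id, assumed to come from any
                 clustering algorithm run beforehand.
    """
    reduced = defaultdict(float)
    for (user, item, tag), weight in assignments.items():
        # Tags without a cluster are kept as their own group.
        cluster = tag_cluster.get(tag, tag)
        reduced[(user, item, cluster)] += weight
    return dict(reduced)


# Illustrative data only: two near-synonymous tags share one cluster.
assignments = {
    ("u1", "i1", "nyc"): 1.0,
    ("u1", "i1", "newyork"): 1.0,
    ("u2", "i2", "python"): 1.0,
}
tag_cluster = {"nyc": "c0", "newyork": "c0", "python": "c1"}
reduced = merge_tags(assignments, tag_cluster)
```

The reduced tensor has one slice per cluster instead of one per tag, so a subsequent factorization would operate on a smaller and denser third mode.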
international conference on web engineering | 2012
Martin Leginus; Peter Dolog; Ricardo Lage; Frederico Araujo Durão
Tag clouds are a useful means of navigation in social web systems. Such systems usually generate the tag cloud from tag popularity, which is not always the best method. In this paper we propose methodologies for incorporating clustering into tag cloud generation to improve coverage and overlap. We study several clustering algorithms for generating tag clouds. We show that extending popularity-based cloud generation with clustering slightly improves coverage. We also show that generating the cloud by clustering, independently of the tag popularity baseline, minimizes overlap and increases coverage. In the first case we therefore provide more items for a user to explore; in the second case we provide more diverse items. We experiment with the methodologies on two different datasets: Delicious and Bibsonomy. The methodologies perform slightly better on Bibsonomy due to its specific focus. Hierarchical clustering performs best.
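The abstract does not define coverage and overlap, so the following is only a sketch under commonly used assumptions: coverage is taken as the fraction of all items reachable through at least one tag in the cloud, and overlap as the average, over tag pairs, of the shared items divided by the size of the smaller item set. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def coverage(cloud, tag_items, all_items):
    """Fraction of all items annotated by at least one tag in the cloud."""
    if not all_items:
        return 0.0
    covered = set()
    for tag in cloud:
        covered |= tag_items.get(tag, set())
    return len(covered & all_items) / len(all_items)


def overlap(cloud, tag_items):
    """Average pairwise overlap between the item sets of the cloud's tags."""
    pairs = list(combinations(cloud, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        items_a = tag_items.get(a, set())
        items_b = tag_items.get(b, set())
        smaller = min(len(items_a), len(items_b))
        if smaller:
            total += len(items_a & items_b) / smaller
    return total / len(pairs)


# Illustrative data only.
tag_items = {"python": {1, 2, 3}, "code": {2, 3}, "travel": {4}}
all_items = {1, 2, 3, 4, 5}

popular_cloud = ["python", "code"]    # redundant tags
diverse_cloud = ["python", "travel"]  # tags from different groups
```

Under these assumed definitions, the cloud with redundant tags covers fewer items and has higher overlap than the cloud whose tags come from different groups, which is the trade-off the abstract refers to.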
international conference on web engineering | 2011
Frederico Araujo Durão; Peter Dolog; Martin Leginus; Ricardo Lage
Tag clouds are a means of navigating and exploring information resources on the web, provided by social web sites. The most widely used approach to generating a tag cloud so far is based on the popularity of tags among the users who annotate with them. This approach has several limitations, however: it suppresses tags that are rarely used but could lead to interesting resources, as well as tags that are cut off by the default number of tags presented in the tag cloud. In this paper we propose SimSpectrum, a similarity-based spectral clustering approach to generating a tag cloud that improves on the current state of the art with respect to these limitations. Our approach is based on determining the extent to which tags are related using a similarity calculus. Based on the results of the similarity calculation, the spectral clustering algorithm finds clusters of tags that are strongly related to each other and loosely related to the other tags. By doing so, we can cover some of the tags that are discarded by traditional tag cloud generation approaches and therefore present the user with more opportunities to find related, interesting web resources. We also show that, in terms of metrics that capture the structural properties of a tag cloud, such as coverage and relevance, our method achieves significant results compared to the baseline tag cloud that relies on tag popularity. In terms of the overlap measure, our method shows improvements over the baseline approach. The proposed approach is evaluated on the MedWorm medical article collection.
acm conference on hypertext | 2013
Martin Leginus; Peter Dolog; Ricardo Lage
Tag clouds are one of the navigation aids for exploring documents. They also link documents through user-defined terms. We explore various graph-based techniques to improve tag cloud generation. Moreover, we introduce relevance measures based on underlying data, such as ratings or citation counts, to better measure the relevance of tag clouds. We show that, on the given datasets, our approach outperforms state-of-the-art baseline methods with respect to such relevance by 41% on the MovieLens dataset and by 11% on the Bibsonomy dataset.
international conference on user modeling, adaptation, and personalization | 2014
Ricardo Lage; Peter Dolog; Martin Leginus
In this paper we present an analysis of improvements to a web-based Graphical User Interface (GUI) for health surveillance systems. Such systems are designed to provide means to detect and suggest outbreaks, and corresponding information about them, from both formal (e.g., hospital reports) and informal (e.g., news sites) sources. However, despite the availability of several such systems, few studies have discussed the elements of the system’s GUI and how it can support users in their tasks. To this end, we investigate techniques for adapting, structuring and browsing information in an intuitive and user-friendly way, focusing on a transition from a static to a dynamic, adapted web experience. We conduct a case study with health surveillance experts in which we present a case for recommendations matching the user’s preferences within a system and discuss improvements to the presented GUI. We discuss improvements in the light of the feedback provided by these users, proposing how adapted elements of a GUI can be used to improve the user experience in a surveillance task.
international conference on web information systems and technologies | 2015
Martin Leginus; Leon Derczynski; Peter Dolog
Intuitive and effective access to large volumes of information is increasingly important. As social media grows rapidly as a useful source of information, methods are required to access these large volumes of user-generated content. Word clouds are an effective information access tool. However, those generated over social media data often depict redundant and mis-ranked entries. This limits the users’ ability to browse and explore datasets. This paper proposes a method for improving word cloud generation over social streams. Named entity expressions in tweets are detected, disambiguated and aggregated into entity clusters. A word cloud is generated from terms that represent the most relevant entity clusters. We find that word clouds with grouped named entities attain significantly broader coverage and significantly decreased content duplication. Further, access to relevant entries in the collection is improved. An extrinsic crowdsourced user evaluation of generated word clouds was performed. Word clouds with grouped named entities are rated as significantly more relevant and more diverse than the baseline. In addition, we found that word clouds with higher levels of Mean Average Precision (MAP) are more likely to be rated by users as being relevant to the concepts reflected. Critically, this supports MAP as a tool for predicting word cloud quality without requiring a human in the loop.
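For readers unfamiliar with the metric, the sketch below shows the standard information-retrieval definition of Mean Average Precision over ranked lists with binary relevance. How the authors derive rankings and relevance judgements for a word cloud is not stated in the abstract, so the function names, inputs and sample data here are assumptions for illustration only.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list, given the set of relevant entries."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, entry in enumerate(ranked, start=1):
        if entry in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant entry occurs.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of the average precision over (ranked list, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Illustrative data only: two ranked lists with their relevant entries.
runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "a"], {"a"}),
]
map_score = mean_average_precision(runs)
```

Because average precision rewards relevant entries that appear early in the ranking, a cloud whose top-ranked terms are relevant scores higher than one that buries them, which is consistent with the abstract's use of MAP as a proxy for perceived relevance.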
international conference on web information systems and technologies | 2015
Martin Leginus; Leon Derczynski; Peter Dolog
Word clouds have proven to be an effective tool for information access in different domains. As social media is a main driver of the large increase in available user-generated content, means of accessing information in such content are needed. We study word clouds as a means of information access in social media. Currently used clouds that are generated from social media data include redundant and mis-ranked entries, harming their utility. We propose a method for generating improved word clouds over social streams. In this method, named entities are detected, disambiguated and aggregated into clusters, which in turn inform cloud construction. We show that word clouds using named entity clusters attain broader coverage and decreased content duplication. Further, an extrinsic evaluation shows improved access to data, with word clouds having grouped named entities being rated more relevant and diverse. Additionally, we find that word clouds with higher Mean Average Precision (MAP) tend to be more relevant to underlying concepts. Critically, this supports MAP as a tool for predicting cloud quality without needing a human.
international conference on web engineering | 2015
Martin Leginus; ChengXiang Zhai; Peter Dolog
Social media is ubiquitous. There is a need for intelligent retrieval interfaces that will enable a better understanding, exploration and browsing of social media data. A novel two dimensional ad hoc topic map is proposed called Beomap. The main novelty of Beomap is that it allows a user to define an ad hoc semantic dimension with a keyword query when visualizing topics in text data. This not only helps to impose more meaningful spatial dimensions for visualization, but also allows users to steer browsing and exploration of the topic map through ad hoc defined queries. We developed a system to implement Beomap for exploring Twitter data, and evaluated the proposed Beomap in two ways, including an offline simulation and a user study. Results of both evaluation strategies show that the new Beomap interface is better than a standard interactive interface.
international conference on web information systems and technologies | 2013
Ricardo Lage; Peter Dolog; Martin Leginus
In this chapter we review vector space models and propose a new one based on the Jensen-Shannon divergence, with the goal of classifying ignored short messages on a social network service. We assume that ignored messages are published messages that received no interaction. Our goal is then to classify messages that are about to be published as ignored, so that they can be discarded from the set of messages available to a recommender system. To evaluate our model, we conduct experiments comparing different models on a Twitter dataset with more than 13,000 Twitter accounts. Results show that the best model we tested obtained an average accuracy of 0.77, compared to 0.74 for a model from the literature. Similarly, this method obtained an average precision of 0.74, compared to 0.58 for the second-best-performing model.
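The abstract does not describe how the model uses the Jensen-Shannon divergence, so the sketch below only shows the divergence itself, computed between the term distributions of two short messages. It uses the standard definition with base-2 logarithms, which bounds the value between 0 and 1; the function names and sample messages are hypothetical.

```python
import math
from collections import Counter


def distribution(tokens):
    """Relative term frequencies of a tokenized message."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; q must cover p's support."""
    return sum(pv * math.log2(pv / q[term]) for term, pv in p.items() if pv > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    vocab = set(p) | set(q)
    mid = {term: 0.5 * (p.get(term, 0.0) + q.get(term, 0.0)) for term in vocab}
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


# Illustrative messages only.
msg_a = distribution("new paper on tag clouds".split())
msg_b = distribution("new paper on topic maps".split())
msg_c = distribution("lunch was great today".split())

similar = js_divergence(msg_a, msg_b)
unrelated = js_divergence(msg_a, msg_c)
```

Unlike the plain Kullback-Leibler divergence, this measure is symmetric and stays finite when the two messages share no terms, which makes it usable as a distance-like score between sparse short texts.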