Heri Ramampiaro
Norwegian University of Science and Technology
Publication
Featured research published by Heri Ramampiaro.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013
Krisztian Balog; Heri Ramampiaro
Cumulative citation recommendation refers to the task of filtering a time-ordered corpus for documents that are highly relevant to a predefined set of entities. This task was introduced at the TREC Knowledge Base Acceleration track in 2012, where two main families of approaches emerged: classification and ranking. In this paper we perform an experimental comparison of these two strategies using supervised learning with a rich feature set. Our main finding is that ranking outperforms classification in all evaluation settings and on all metrics. Our analysis also reveals that a ranking-based approach has more potential for future improvements.
Multimedia Tools and Applications | 2014
Massimiliano Ruocco; Heri Ramampiaro
The event detection problem, which is closely related to clustering, has gained a lot of attention in the context of textual documents. However, although image clustering has been treated extensively in both Content-Based Image Retrieval (CBIR) and Text-Based Image Retrieval (TBIR) systems, event detection within image management is a relatively new area. With this in mind, we propose a novel approach for event extraction and clustering of images, taking into account textual annotations, time and geographical positions. Our goal is to develop a clustering method based on the fact that an image may belong to an event cluster. Here, we stress the necessity of having event clustering and cluster extraction algorithms that are both scalable and suitable for online applications. To achieve this, we extend a well-known clustering algorithm called Suffix Tree Clustering (STC), originally developed to cluster text documents using document snippets. The idea is to consider an image along with its annotation as a document. Further, we extend the algorithm to also include time and geographical position so that we can capture the contextual information from each image during the clustering process. This has proved particularly useful on images gathered from online photo-sharing applications such as Flickr. Hence, our STC-based approach is aimed at dealing with the challenges of capturing contextual information from Flickr images and extracting related events. We evaluate our algorithm using different annotated datasets mainly gathered from Flickr. As part of this evaluation we investigate and compare the effects of using different parameters, such as time and space granularities. In addition, we evaluate the performance of our algorithm with respect to mining events from image collections. Our experimental results demonstrate the effectiveness of our STC-based algorithm in extracting and clustering events.
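The idea of treating an annotated image as a document and adding time and location to the clustering key can be illustrated with a small sketch. This is not the authors' algorithm: instead of building a suffix tree it enumerates short tag phrases directly, and the function name `base_clusters`, the tuple layout of an image, and the default granularities (one day, 0.1 degree cells) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def base_clusters(images, time_bucket_s=86400, cell_deg=0.1, max_phrase_len=3):
    """Group images that share a tag phrase, a time bucket and a geographic cell.

    images: iterable of (image_id, tags, unix_timestamp, lat, lon).
    All parameter names and defaults are hypothetical.
    """
    clusters = defaultdict(set)
    for image_id, tags, ts, lat, lon in images:
        # Coarsen time and space to the chosen granularities.
        bucket = int(ts // time_bucket_s)
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        # Enumerate contiguous tag phrases; a suffix tree would index the
        # same phrases more efficiently for large collections.
        for i in range(len(tags)):
            for j in range(i + 1, min(len(tags), i + max_phrase_len) + 1):
                clusters[(tuple(tags[i:j]), bucket, cell)].add(image_id)
    # Keep only phrases shared by at least two images.
    return {key: ids for key, ids in clusters.items() if len(ids) >= 2}


images = [
    ("a", ["oslo", "marathon"], 1000, 59.91, 10.75),
    ("b", ["oslo", "marathon", "finish"], 2000, 59.92, 10.76),
    ("c", ["oslo", "marathon"], 1000 + 10 * 86400, 59.91, 10.75),
]
clusters = base_clusters(images)
```

In this toy input, images "a" and "b" share tags, day and cell, while "c" has the same tags but was taken ten days later, so it does not join their cluster. Changing `time_bucket_s` or `cell_deg` is a simple way to see how granularity affects the grouping.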
Information Technology & Management | 2004
Heri Ramampiaro; Mads Nygård
The theme of this paper is transactional support for cooperative work environments, focusing on data sharing, i.e., providing suitable mechanisms to manage concurrent access to shared resources. The subject is not new per se. In fact, in terms of transaction models and frameworks, several solutions already exist. Still, some problems remain unsolved. Among these are the problems that result from the dynamic and heterogeneous nature of cooperative work. Our solution is to provide adaptable transactional support, i.e., support that not only can be tailored to suit different situations, but can also be modified following changes in the actual environment while the work is being performed. As part of this, we have identified and extracted the beneficial features from existing models and extended these to form a transactional framework called CAGISTrans. This is a framework for the specification of transaction models suiting specific applications. To handle dynamic environments we propose a new way of organizing the elements of a transaction model to allow runtime refinement. In addition, we have developed a transaction management system, built on the middleware principle, to allow interoperability and database independence. This addresses the problems induced by the heterogeneous nature of cooperative environments.
International Symposium on Multimedia | 2010
Massimiliano Ruocco; Heri Ramampiaro
Image clustering is a problem that has been treated extensively in both Content-Based (CBIR) and Text-Based (TBIR) Image Retrieval Systems. In this paper, we propose a new image clustering approach that takes annotation, time and geographical position into account. Our goal is to develop a clustering method that allows an image to be part of an event cluster. We extend a well-known clustering algorithm called Suffix Tree Clustering (STC), which was originally developed to cluster text documents using document snippets. To be able to use this algorithm, we consider an image with its annotation as a document. Then, we extend the algorithm to also include time and geographical position. This appears to be particularly useful on images gathered from online photo-sharing applications such as Flickr, where image tags are often subjective and incomplete. For this reason, clustering based on textual annotations alone is not enough to capture all context information related to an image. Our approach is designed to address this challenge. In addition, we propose a novel algorithm to extract event clusters. The algorithm is evaluated using an annotated dataset from Flickr, and a comparison between different granularities of time and space is provided.
Applied Intelligence | 2018
Quang-Huy Duong; Philippe Fournier-Viger; Heri Ramampiaro; Kjetil Nørvåg; Thu-Lan Dam
Discovering high utility itemsets in transaction databases is a key task for studying the behavior of customers. It consists of finding groups of items bought together that yield a high profit. Several algorithms have been proposed to mine high utility itemsets using various approaches and more or less complex data structures. Among existing algorithms, one-phase algorithms employing the utility-list structure have been shown to be the most efficient. In recent years, the simplicity of the utility-list structure has led to the development of numerous utility-list based algorithms for various tasks related to utility mining. However, a major limitation of utility-list based algorithms is that creating and maintaining utility-lists is time consuming and can consume a huge amount of memory. The reasons are that numerous utility-lists are built and that the utility-list intersection/join operation used to construct a utility-list is costly. This paper addresses this issue by proposing an improved utility-list structure, called utility-list buffer, to reduce memory consumption and speed up the join operation. This structure is integrated into a novel algorithm named ULB-Miner (Utility-List Buffer for high utility itemset Miner), which introduces several new ideas to discover high utility itemsets more efficiently. ULB-Miner uses the utility-list buffer structure to efficiently store and retrieve utility-lists, and to reuse memory during the mining process. Moreover, the paper introduces a linear time method for constructing utility-list segments in a utility-list buffer. An extensive experimental study on various datasets shows that the proposed algorithm relying on the novel utility-list buffer structure is highly efficient in terms of both execution time and memory consumption. The ULB-Miner algorithm is up to 10 times faster than the FHM and HUI-Miner algorithms and consumes up to 6 times less memory. Moreover, it performs well on both dense and sparse datasets.
European Conference on Machine Learning | 2017
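To make the cost of the join operation concrete, the following is a minimal sketch of the standard utility-list join used by HUI-Miner-style algorithms, not of the buffered version proposed in ULB-Miner. A utility-list is modelled here as a dictionary from transaction id to an `(iutil, rutil)` pair (the itemset's utility in that transaction and the remaining utility of the items after it); this dictionary layout and the function name `join_utility_lists` are assumptions for illustration.

```python
def join_utility_lists(ux, uy, uprefix=None):
    """Join the utility-lists of itemsets Px and Py into the list of Pxy.

    ux, uy: dict mapping tid -> (iutil, rutil); y follows x in the item order.
    uprefix: utility-list of the shared prefix P, or None when x and y are
    single items (empty prefix).
    """
    joined = {}
    for tid, (iutil_x, _rutil_x) in ux.items():
        if tid not in uy:
            continue  # Pxy only occurs in transactions containing both
        iutil_y, rutil_y = uy[tid]
        # The prefix utility is counted in both ux and uy, so subtract it once.
        iutil_p = uprefix[tid][0] if uprefix is not None else 0
        joined[tid] = (iutil_x + iutil_y - iutil_p, rutil_y)
    return joined


def total_utility(ulist):
    """Sum of iutil over all transactions: the utility of the itemset."""
    return sum(iutil for iutil, _rutil in ulist.values())


ux = {1: (5, 10), 2: (3, 4), 3: (2, 0)}
uy = {1: (4, 6), 3: (1, 0)}
uxy = join_utility_lists(ux, uy)
```

Every candidate itemset explored during mining produces a new list like `uxy`, which is why allocating and intersecting these lists dominates time and memory in this family of algorithms.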
Eliezer de Souza da Silva; Helge Langseth; Heri Ramampiaro
We introduce Poisson Matrix Factorization with Content and Social trust information (PoissonMF-CS), a latent variable probabilistic model for recommender systems with the objective of jointly modeling social trust, item content and users' preferences using the Poisson matrix factorization framework. This probabilistic model is equivalent to collectively factorizing a non-negative user–item interaction matrix and a non-negative item–content matrix. The user–item matrix consists of sparse implicit (or explicit) interaction counts between users and items, and the item–content matrix consists of word or tag counts per item. The model imposes additional constraints given by the social ties between users and by the homophily effect in social networks, i.e., the tendency of people with similar preferences to be socially connected. Using this model we can account for and fine-tune the weight of content-based and social-based factors in the user preference. We develop an approximate variational inference algorithm and perform experiments comparing PoissonMF-CS with competing models. The experimental evaluation indicates that PoissonMF-CS achieves superior predictive performance on held-out data for the top-M recommendation task. We also observe that PoissonMF-CS generates compact latent representations compared with alternative models while maintaining superior predictive performance.
MIKE | 2014
Hai Thanh Nguyen; Thomas Almenningen; Martin Havig; Herman Schistad; Anders Kofod-Petersen; Helge Langseth; Heri Ramampiaro
Fashion e-commerce is a fast growing area in online shopping. The fashion domain has several interesting properties, which make personalised recommendations more difficult than in more traditional domains. To avoid potential bias when using explicit user ratings, which are also expensive to obtain, this work approaches fashion recommendations by analysing implicit feedback from users in an app. A user's actual behaviour, such as Clicks, Wants and Purchases, is used to infer her implicit preference score for an item she has interacted with. This score is then blended with information about the item's price and popularity as well as the recency of the user's action with respect to the item. Based on these implicit preference scores, we infer the user's ranking of other fashion items by applying different recommendation algorithms. Experimental results show that the proposed method outperforms the most popular baseline approach, thus demonstrating its effectiveness and viability.
Workshop on Location-Based Social Networks | 2012
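One way to picture an implicit preference score is as a per-event weight that decays with the age of the action. The sketch below is only an illustration under stated assumptions: the weights for clicks, wants and purchases, the 30-day half-life, and the names `implicit_score` and `rank_items` are all hypothetical, and the price and popularity terms mentioned in the abstract are left out.

```python
from collections import defaultdict

# Hypothetical base weights per interaction type (not taken from the paper).
EVENT_WEIGHT = {"click": 1.0, "want": 2.0, "purchase": 3.0}


def implicit_score(event, days_since_action, half_life_days=30.0):
    """Score one interaction: base weight times an exponential recency decay."""
    if event not in EVENT_WEIGHT:
        raise ValueError(f"unknown event type: {event!r}")
    decay = 0.5 ** (days_since_action / half_life_days)
    return EVENT_WEIGHT[event] * decay


def rank_items(interactions):
    """Sum scores per item and return item ids, highest score first.

    interactions: iterable of (item_id, event, days_since_action).
    """
    totals = defaultdict(float)
    for item_id, event, days in interactions:
        totals[item_id] += implicit_score(event, days)
    return sorted(totals, key=totals.get, reverse=True)


ranking = rank_items([
    ("shoe", "click", 0),
    ("coat", "purchase", 60),
    ("shoe", "want", 30),
])
```

Here two recent, weaker signals on "shoe" outweigh one old purchase of "coat", which shows how recency can change the inferred preference order.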
Massimiliano Ruocco; Heri Ramampiaro
The availability of a huge amount of geotagged resources on the web can be exploited to extract new useful information. We propose a set of estimators that are able to evaluate the degree of clustering of the spatial distribution of terms used to tag such geotagged resources. We introduce the concept of a tag point pattern to derive indexes from exploratory analysis based on the second-order Ripley's K-function, previously used in epidemiology, geo-statistics and ecology. The derived model estimates the degree of aggregation of the geotagged resources, taking into account the heterogeneity of the spatial distribution of the underlying population. Further, thanks to subsampling techniques, our approach is able to handle large datasets. Without loss of generality, we perform our experiments on a dataset derived from Flickr pictures as a use case. This dataset consists of tags that were extracted from a set of 1.2 million pictures. We evaluate our proposed indexes with respect to their ability to extract tags related to geographical landmarks and hotspots. Our experiments show that we get good results using our estimators.
Transactions on Large-Scale Data- and Knowledge-Centered Systems IV | 2011
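For readers unfamiliar with Ripley's K-function, the sketch below computes its simplest empirical estimate for a set of planar points. It is a naive version under stated assumptions: no edge correction, a homogeneous background (the abstract's correction for the heterogeneous underlying population is not modelled), coordinates treated as planar, and a quadratic-time pair loop. The function name `ripley_k` is hypothetical.

```python
import math


def ripley_k(points, r, area):
    """Naive estimate of Ripley's K at radius r.

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose distance is at most r. Under complete spatial randomness K(r) is
    close to pi * r**2; larger values suggest clustering at scale r.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))


# Locations of pictures carrying one tag, in an arbitrary 10 x 10 study region.
tag_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
k_value = ripley_k(tag_points, r=1.5, area=100.0)
```

Comparing `k_value` with `math.pi * r ** 2` gives a rough indication of whether a tag is spatially concentrated, which is the intuition behind using such indexes to find landmark and hotspot tags.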
Heri Ramampiaro; Chen Li
The large amount and diversity of available biomedical information has put a high demand on existing search systems. Such a system should be able to not only retrieve the sought information, but also filter out irrelevant documents, while giving the relevant ones the highest ranking. Focusing on biomedical information, this work investigates how to improve the ability of a system to find and rank relevant documents. To achieve this goal, we apply a series of information retrieval techniques to search in biomedical information and combine them in an optimal manner. These techniques include extending and using well-established information retrieval (IR) similarity models such as the Vector Space Model (VSM) and BM25 and their underlying scoring schemes. The techniques also allow users to affect the ranking according to their view of relevance. The techniques have been implemented and tested in a proof-of-concept prototype called BioTracer, which extends a Java-based open source search engine library. The results from our experiments using the TREC 2004 Genomic Track collection are promising. Our investigation has also revealed that involving the user in the search process has positive effects on the ranking of search results, and that the approaches used in BioTracer can be used to meet users' information needs.
International Conference on Information Technology | 2010
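As a reference point for the BM25 model mentioned above, here is a minimal sketch of one common form of the BM25 scoring function over tokenized documents. It is not BioTracer's implementation: the optional `boost` dictionary is a hypothetical stand-in for letting a user weight terms by their own view of relevance, and the defaults `k1=1.2` and `b=0.75` are conventional values rather than values from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75, boost=None):
    """Score each tokenized document in docs against the query terms.

    boost: optional dict term -> multiplier (hypothetical user preference).
    """
    boost = boost or {}
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += boost.get(term, 1.0) * idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    ["gene", "expression", "mouse"],
    ["protein", "binding"],
    ["gene", "gene", "mutation"],
]
plain = bm25_scores(["gene"], docs)
boosted = bm25_scores(["gene", "mouse"], docs, boost={"mouse": 5.0})
```

Raising the multiplier for a term such as "mouse" moves documents containing it up the ranking, which is one simple way a user could influence the ordering without changing the underlying model.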
Heri Ramampiaro
With the large amount of biomedical information available today, providing a good search tool is vital. Such a tool should not only be able to retrieve the sought information, but also to filter out irrelevant documents, while giving the relevant ones the highest ranking. Focusing on biomedical information, the main goal of this work has been to investigate how to improve the ability of a system to find and rank relevant documents. To achieve this, we apply a series of information retrieval techniques to search in biomedical information and combine them in an optimal manner. These techniques include extending and using well-established information retrieval (IR) similarity models like the Vector Space Model (VSM) and BM25 and their underlying scoring schemes, and allowing users to affect the ranking according to their view of relevance. The techniques have been implemented and tested in a proof-of-concept prototype called BioTracer, extending a Java-based open source search engine library. The results from our experiments using the TREC 2004 Genomic Track collection seem promising. Our investigation has also revealed that involving the user in the search has positive effects on the ranking of search results, and that the approaches used in BioTracer can be used to meet users' information needs.