Giovanni Semeraro | Researchain

Publication

Featured research published by Giovanni Semeraro.

Recommender Systems Handbook | 2011

Content-based Recommender Systems: State of the Art and Trends

Pasquale Lops; Marco de Gemmis; Giovanni Semeraro

Recommender systems have the effect of guiding users in a personalized way to interesting objects in a large space of possible options. Content-based recommendation systems try to recommend items similar to those a given user has liked in the past. Indeed, the basic process performed by a content-based recommender consists in matching up the attributes of a user profile in which preferences and interests are stored, with the attributes of a content object (item), in order to recommend to the user new interesting items. This chapter provides an overview of content-based recommender systems, with the aim of imposing a degree of order on the diversity of the different aspects involved in their design and implementation. The first part of the chapter presents the basic concepts and terminology of content-based recommender systems, a high level architecture, and their main advantages and drawbacks. The second part of the chapter provides a review of the state of the art of systems adopted in several application domains, by thoroughly describing both classical and advanced techniques for representing items and user profiles. The most widely adopted techniques for learning user profiles are also presented. The last part of the chapter discusses trends and future research which might lead towards the next generation of systems, by describing the role of User Generated Content as a way for taking into account evolving vocabularies, and the challenge of feeding users with serendipitous recommendations, that is to say surprisingly interesting items that they might not have otherwise discovered.
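The matching step described in this abstract (user profile attributes against item attributes) can be illustrated with a minimal sketch. It assumes a bag-of-words representation and cosine similarity, which are common choices but are not stated in the abstract; the function names and the toy catalog are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Term-frequency vector of a description (lowercased, split on whitespace)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_profile(liked_descriptions):
    """User profile: summed term frequencies of the items the user liked."""
    profile = Counter()
    for description in liked_descriptions:
        profile.update(vectorize(description))
    return profile

def recommend(profile, catalog, k=2):
    """Return up to k item titles ranked by similarity to the profile.

    Items with zero similarity are never recommended.
    """
    scored = [(cosine(profile, vectorize(desc)), title)
              for title, desc in catalog.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:k] if score > 0]

# Hypothetical toy data, for illustration only.
liked = ["space opera with aliens and starships", "aliens invade earth"]
catalog = {
    "Star Quest": "starships and aliens in deep space",
    "Garden Tips": "growing tomatoes in small gardens",
    "Invasion": "earth under attack by aliens",
}
profile = build_profile(liked)
top_items = recommend(profile, catalog, k=3)
```

Because the ranking rewards overlap with what the user already liked, a recommender of this kind tends to stay close to past preferences, which is the over-specialization drawback that the serendipity discussion in the abstract addresses.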

IEEE Transactions on Pattern Analysis and Machine Intelligence | 1997

A comparative analysis of methods for pruning decision trees

Floriana Esposito; Donato Malerba; Giovanni Semeraro; J. Kay

In this paper, we address the problem of retrospectively pruning decision trees induced from data, according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in literature. We make a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, a wide experimentation performed on several data sets leads us to opposite conclusions on the predictive accuracy of simplified trees from some drawn in the literature. We attribute this divergence to differences in experimental designs. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune/underprune observed in each method.
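Reduced error pruning, the method whose property the paper uses for its evaluation, can be sketched as follows. This is a simplified illustration and not the authors' implementation: the `Node` structure, the dictionary-based examples, and the convention of pruning on ties are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    label: str                      # majority class of the training examples at this node
    feature: Optional[str] = None   # attribute tested here; None marks a leaf
    children: dict = field(default_factory=dict)  # attribute value -> child Node

def predict(node, example):
    """Follow the tests down the tree; fall back to the node label on unseen values."""
    while node.feature is not None:
        child = node.children.get(example.get(node.feature))
        if child is None:
            return node.label
        node = child
    return node.label

def errors(node, examples):
    """Number of (example, class) pairs misclassified by the subtree rooted at node."""
    return sum(predict(node, x) != y for x, y in examples)

def reduced_error_prune(node, pruning_set):
    """Bottom-up pruning: replace a subtree with a leaf whenever the leaf makes
    no more errors on the pruning set than the subtree does (ties prune)."""
    if node.feature is None:
        return node
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in pruning_set if x.get(node.feature) == value]
        node.children[value] = reduced_error_prune(child, subset)
    leaf_errors = sum(y != node.label for _, y in pruning_set)
    if leaf_errors <= errors(node, pruning_set):
        return Node(label=node.label)
    return node

# Hypothetical toy tree and pruning set, for illustration only.
tree = Node(label="no", feature="outlook", children={
    "sunny": Node(label="yes", feature="windy", children={
        True: Node(label="no"),
        False: Node(label="yes"),
    }),
    "rainy": Node(label="no"),
})
pruning_set = [
    ({"outlook": "sunny", "windy": True}, "yes"),
    ({"outlook": "sunny", "windy": False}, "yes"),
    ({"outlook": "rainy"}, "no"),
]
pruned = reduced_error_prune(tree, pruning_set)
```

In this toy case the `windy` test is replaced by a leaf because it misclassifies one pruning example, while the root test is kept because collapsing it would add errors.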

User Modeling and User-adapted Interaction | 2007

A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation

Marco Degemmis; Pasquale Lops; Giovanni Semeraro

Collaborative and content-based filtering are the recommendation techniques most widely adopted to date. Traditional collaborative approaches compute a similarity value between the current user and each other user by taking into account their rating style, that is the set of ratings given on the same items. Based on the ratings of the most similar users, commonly referred to as neighbors, collaborative algorithms compute recommendations for the current user. The problem with this approach is that the similarity value is only computable if users have common rated items. The main contribution of this work is a possible solution to overcome this limitation. We propose a new content-collaborative hybrid recommender which computes similarities between users relying on their content-based profiles, in which user preferences are stored, instead of comparing their rating styles. In more detail, user profiles are clustered to discover current user neighbors. Content-based user profiles play a key role in the proposed hybrid recommender. Traditional keyword-based approaches to user profiling are unable to capture the semantics of user interests. A distinctive feature of our work is the integration of linguistic knowledge in the process of learning semantic user profiles representing user interests in a more effective way, compared to classical keyword-based profiles, due to a sense-based indexing. Semantic profiles are obtained by integrating machine learning algorithms for text categorization, namely a naïve Bayes approach and a relevance feedback method, with a word sense disambiguation strategy based exclusively on the lexical knowledge stored in the WordNet lexical database. Experiments carried out on a content-based extension of the EachMovie dataset show an improvement of the accuracy of sense-based profiles with respect to keyword-based ones, when coping with the task of classifying movies as interesting (or not) for the current user. 
An experimental session has been also performed in order to evaluate the proposed hybrid recommender system. The results highlight the improvement in the predictive accuracy of collaborative recommendations obtained by selecting like-minded users according to user profiles.
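The idea of selecting neighbors from content-based profiles, rather than from co-rated items, can be sketched as below. Note the substitution: the abstract says profiles are clustered, whereas this sketch uses plain nearest-neighbor selection by cosine similarity to keep the example short. The sense identifiers, user names, and ratings are hypothetical.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse profiles (feature -> weight)."""
    dot = sum(p[f] * q[f] for f in p.keys() & q.keys())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)

def neighbors(target, profiles, k=2):
    """Up to k users whose content-based profiles are most similar to the target's."""
    sims = [(cosine(profiles[target], prof), user)
            for user, prof in profiles.items() if user != target]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(user, sim) for sim, user in sims[:k] if sim > 0]

def predict_rating(target, item, profiles, ratings, k=2):
    """Similarity-weighted average of the neighbors' ratings for the item.

    Returns None when no neighbor has rated the item.
    """
    numerator = denominator = 0.0
    for user, sim in neighbors(target, profiles, k):
        if item in ratings.get(user, {}):
            numerator += sim * ratings[user][item]
            denominator += sim
    return numerator / denominator if denominator else None

# Hypothetical sense-based profiles; keys stand in for word-sense identifiers.
profiles = {
    "alice": {"science_fiction#1": 0.9, "space#2": 0.8},
    "bob": {"science_fiction#1": 0.8, "space#2": 0.7},
    "carol": {"romance#1": 0.9},
}
# alice has rated nothing, so rating-based similarity could not be computed for her.
ratings = {"bob": {"Dune": 5}, "carol": {"Dune": 1}}
alice_neighbors = neighbors("alice", profiles)
alice_dune = predict_rating("alice", "Dune", profiles, ratings)
```

The point of the example is that `alice` still obtains a neighbor and a prediction even though she shares no rated items with anyone, which is the limitation of traditional collaborative filtering that the abstract describes.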

International Conference on Hybrid Intelligent Systems | 2008

Introducing Serendipity in a Content-Based Recommender System

Leo Iaquinta; M. de Gemmis; Pasquale Lops; Giovanni Semeraro; M. Filannino; Piero Molino

Today recommenders are commonly used with various purposes, especially dealing with e-commerce and information filtering tools. Content-based recommenders rely on the concept of similarity between the bought/ searched/ visited item and all the items stored in a repository. It is a common belief that the user is interested in what is similar to what she has already bought/searched/visited. We believe that there are some contexts in which this assumption is wrong: it is the case of acquiring unsearched but still useful items or pieces of information. This is called serendipity. Our purpose is to stimulate users and facilitate these serendipitous encounters to happen. This paper presents the design and implementation of a hybrid recommender system that joins a content-based approach and serendipitous heuristics in order to mitigate the over-specialization problem with surprising suggestions.

Machine Learning | 2000

Multistrategy Theory Revision: Induction and Abduction in INTHELEX

Floriana Esposito; Giovanni Semeraro; Nicola Fanizzi; Stefano Ferilli

This paper presents an integration of induction and abduction in INTHELEX, a prototypical incremental learning system. The refinement operators perform theory revision in a search space whose structure is induced by a quasi-ordering, derived from Plotkin's θ-subsumption, compliant with the principle of Object Identity. A reduced complexity of the refinement is obtained, without a major loss in terms of expressiveness. These inductive operators have been proven ideal for this search space. Abduction supports the inductive operators in the completion of the incoming new observations. Experiments have been run on a standard dataset about family trees as well as in the domain of document classification to prove the effectiveness of such multistrategy incremental learning system with respect to a classical batch algorithm.
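The difference between plain θ-subsumption and its Object Identity variant can be made concrete with a brute-force check over small Datalog clauses. This is an illustrative sketch and not INTHELEX code. It assumes that a clause is a set of `(predicate, arguments)` literals, that variables start with an uppercase letter, and that Object Identity means distinct terms of the subsuming clause must stay distinct after substitution.

```python
from itertools import permutations, product

def is_var(term):
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()

def terms(clause):
    """All terms occurring in a clause given as a set of (predicate, args) literals."""
    return {t for _, args in clause for t in args}

def apply_substitution(clause, sub):
    """Replace variables according to sub; other terms are left unchanged."""
    return {(pred, tuple(sub.get(t, t) for t in args)) for pred, args in clause}

def subsumes(c, d):
    """Plain theta-subsumption: some substitution maps c into a subset of d."""
    variables = sorted(t for t in terms(c) if is_var(t))
    targets = sorted(terms(d))
    for image in product(targets, repeat=len(variables)):
        if apply_substitution(c, dict(zip(variables, image))) <= set(d):
            return True
    return False

def subsumes_oi(c, d):
    """Subsumption under Object Identity: the substitution must keep distinct
    terms of c distinct, so variables get pairwise different images and never
    collapse onto a constant that already occurs in c."""
    c_terms = terms(c)
    variables = sorted(t for t in c_terms if is_var(t))
    constants = {t for t in c_terms if not is_var(t)}
    targets = sorted(terms(d) - constants)
    for image in permutations(targets, len(variables)):
        if apply_substitution(c, dict(zip(variables, image))) <= set(d):
            return True
    return False

# Hypothetical clauses, for illustration only.
general = {("parent", ("X", "Y"))}
reflexive = {("parent", ("a", "a"))}
ordinary = {("parent", ("a", "b")), ("male", ("a",))}

plain_reflexive = subsumes(general, reflexive)    # X and Y may both map to a
oi_reflexive = subsumes_oi(general, reflexive)    # X and Y must differ
oi_ordinary = subsumes_oi(general, ordinary)
```

Restricting the search to injective substitutions is what shrinks the space explored by the refinement operators: `general` covers `reflexive` under plain θ-subsumption but not under Object Identity.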

First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. | 2004

Machine learning methods for automatically processing historical documents: from paper acquisition to XML transformation

Floriana Esposito; Donato Malerba; Giovanni Semeraro; Stefano Ferilli; Oronzo Altamura; Teresa Maria Altomare Basile; Margherita Berardi; Michelangelo Ceci; N. Di Mauro

One of the aims of the EU project COLLATE is to design and implement a Web-based collaboratory for archives, scientists and end-users working with digitized cultural material. Since the originals of such a material are often unique and scattered in various archives, severe problems arise for their wide fruition. A solution would be to develop intelligent document processing tools that automatically transform printed documents into a Web-accessible form such as XML. Here, we propose the use of a document processing system, WISDOM++, which uses heavily machine learning techniques in order to perform such a task, and report promising results obtained in preliminary experiments.

Applied Artificial Intelligence | 1994

MULTISTRATEGY LEARNING FOR DOCUMENT RECOGNITION

Floriana Esposito; Donato Malerba; Giovanni Semeraro

In this paper, a methodology for document classification and understanding is proposed. It is based on a multistrategy approach to learning from examples. By document classification, we mean the process of identification of the particular class to which a document belongs. Document understanding is defined as the process of detecting the logical structure of a document. The multistrategy approach for document classification and understanding has been implemented in a system called PLRS, which embeds two empirical learning systems: RES and INDUBIIH. Given a set of documents whose layout structure has already been detected and such that the membership class has been defined by the user, RES generates the knowledge base of an expert system devoted to the classification of a document. The language used to describe both the layout of the training documents and the learned rules is a first-order language. The learning methodology adopted for the problem of learning classification rules integrates both a paramet...

Logic-Based Program Synthesis and Transformation | 1997

A Logic Framework for the Incremental Inductive Synthesis of Datalog Theories

Giovanni Semeraro; Floriana Esposito; Donato Malerba; Nicola Fanizzi; Stefano Ferilli

This paper presents a logic framework for the incremental inductive synthesis of Datalog theories. It allows us to cast the problem as a process of abstract diagnosis and debugging of an incorrect theory. This process involves a search in a space, whose algebraic structure (conferred by the notion of object identity) makes easy the definition of algorithms that meet several properties which are deemed as desirable from the point of view of the theoretical computer science. Such algorithms embody two ideal refinement operators, one for generalizing incomplete clauses, and the other one for specializing inconsistent clauses.

International Conference on Pattern Recognition | 1990

An experimental page layout recognition system for office document automatic classification: an integrated approach for inductive generalization

Floriana Esposito; Donato Malerba; Giovanni Semeraro; E. Annese; G. Scafuro

A novel approach to automatic classification of digitized office documents based on the inductive generalization of their layout style, is presented. It is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.

Conference on Recommender Systems | 2015

Semantics-Aware Content-Based Recommender Systems

Marco de Gemmis; Pasquale Lops; Cataldo Musto; Fedelucio Narducci; Giovanni Semeraro

Content-based recommender systems (CBRSs) rely on item and user descriptions (content) to build item representations and user profiles that can be effectively exploited to suggest items similar to those a target user already liked in the past. Most content-based recommender systems use textual features to represent items and user profiles, hence they suffer from the classical problems of natural language ambiguity. This chapter presents a comprehensive survey of semantic representations of items and user profiles that attempt to overcome the main problems of the simpler approaches based on keywords. We propose a classification of semantic approaches into top-down and bottom-up. The former rely on the integration of external knowledge sources, such as ontologies, encyclopedic knowledge and data from the Linked Data cloud, while the latter rely on a lightweight semantic representation based on the hypothesis that the meaning of words depends on their use in large corpora of textual documents. The chapter shows how to make recommender systems aware of semantics to realize a new generation of content-based recommenders.