Massimo De Santo | Researchain

Publication

Featured research published by Massimo De Santo.

Pattern Recognition | 2003

Automatic classification of clustered microcalcifications by a multiple expert system

Massimo De Santo; Mario Molinara; Francesco Tortorella; Mario Vento

Mammography is a non-invasive diagnostic technique widely used for the early detection of breast cancer in women. A significant visual clue of the disease is the presence of clusters of microcalcifications. The automatic recognition of malignant clusters of microcalcifications, which could be very helpful for diagnostic purposes, is a very difficult task because of the small size of the microcalcifications and the poor quality of the mammographic images. In this paper we propose a novel approach for classifying clusters of microcalcifications, based on a Multiple Expert System; this system aggregates several experts, some of which are devoted to classifying single microcalcifications while others aim to classify the cluster considered as a whole. The final output results from a suitable combination of the two groups of experts. Tests performed on a standard database of 40 mammographic images have confirmed the effectiveness of the approach.
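The abstract does not say how the two groups of experts are combined. As a minimal sketch only, assuming each expert returns a malignancy score in [0, 1] and that the groups are merged by a weighted average, the idea could look like this (the function name, the weight, and the threshold are hypothetical and not taken from the paper):

```python
from statistics import mean


def combine_experts(single_scores, cluster_scores, cluster_weight=0.5, threshold=0.5):
    """Combine two groups of expert scores into one cluster-level decision.

    single_scores: malignancy scores in [0, 1], one per microcalcification
        (hypothetical output of the experts that classify single objects).
    cluster_scores: malignancy scores in [0, 1] from the experts that
        classify the cluster as a whole.
    cluster_weight: assumed weight of the cluster-level group (not from the paper).
    threshold: assumed decision threshold (not from the paper).
    """
    if not single_scores or not cluster_scores:
        raise ValueError("both groups of experts must provide at least one score")
    single_opinion = mean(single_scores)
    cluster_opinion = mean(cluster_scores)
    # Weighted average of the two group opinions.
    combined = (1 - cluster_weight) * single_opinion + cluster_weight * cluster_opinion
    label = "malignant" if combined >= threshold else "benign"
    return label, combined


label, score = combine_experts([0.9, 0.7, 0.8], [0.6, 0.4])
print(label, round(score, 2))
```

A weighted average is only one of many possible combination rules; voting or a trained combiner would fit the same interface.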

IEEE Transactions on Education | 2010

Ontology for E-Learning: A Bayesian Approach

Francesco Colace; Massimo De Santo

In the last decade, the evolution of educational technologies has driven extraordinary interest in new methods for delivering learning content to learners. Today, distance education represents an effective way of supporting and sometimes substituting the traditional formative processes, thanks to the technological improvements achieved in the field in recent years. However, the role of technology has often been overestimated. The amount of information students can obtain from the Internet is huge, and as a result, they can easily be confused. Teachers can also be disconcerted by this vast quantity of content and are often unable to suggest the correct content to their students. In the open scientific literature, it is widely recognized that an important factor for success in delivering learning content is the capability to customize the learning process for the specific needs of a given learner. This task is still far from having been fully accomplished, and there is a real interest in investigating new approaches and tools to adapt the formative process to specific individual needs. In this scenario, the introduction of ontology formalism can improve the quality of the formative process, allowing the introduction of new and effective services. Ontologies can lead to important improvements in the definition of a course's knowledge domain, in the generation of an adapted learning path, and in the assessment phase. This paper provides an initial discussion of the role of ontologies in the context of e-learning. The improvements related to the introduction of ontology formalism in the e-learning field are discussed, and a novel algorithm for ontology building through the use of Bayesian networks is shown. Finally, the application of this algorithm in the assessment process and some experimental results are illustrated.

Computers in Human Behavior | 2014

Text classification using a few labeled examples

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

Supervised text classifiers need to learn from many labeled examples to achieve high accuracy. However, in a real context, sufficient labeled examples are not always available because human labeling is enormously time-consuming. For this reason, there has been recent interest in methods that are capable of obtaining high accuracy when the size of the training set is small. In this paper we introduce a new single-label text classification method that performs better than baseline methods when the number of labeled examples is small. Unlike most existing methods, which use a vector of features composed of weighted words, the proposed approach uses a structured vector of features composed of weighted pairs of words. The proposed vector of features is automatically learned, given a set of documents, using a global method for term extraction based on the Latent Dirichlet Allocation implemented as the Probabilistic Topic Model. Experiments performed using a small percentage of the original training set (about 1%) confirmed our theories.
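To make the idea of a feature vector built from weighted pairs of words more concrete, here is a minimal sketch. It does not implement the LDA-based extraction described in the abstract; as a stand-in, it weights each pair by the fraction of documents in which both words occur. The function name and the weighting scheme are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def weighted_word_pairs(documents, top_k=5):
    """Return the top_k word pairs weighted by document co-occurrence.

    Stand-in for the LDA-based term extraction mentioned in the abstract:
    the weight is the fraction of documents containing both words, which is
    an assumption made for illustration only.
    """
    pair_counts = Counter()
    for text in documents:
        # Unique, sorted words so each unordered pair is counted once per document.
        words = sorted(set(text.lower().split()))
        pair_counts.update(combinations(words, 2))
    n_docs = len(documents)
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(pair_counts.items(), key=lambda item: (-item[1], item[0]))
    return [(pair, count / n_docs) for pair, count in ranked[:top_k]]


docs = [
    "oil prices rise",
    "oil prices fall",
    "wheat prices rise",
]
pairs = weighted_word_pairs(docs, top_k=3)
for pair, weight in pairs:
    print(pair, round(weight, 2))
```

The resulting list of (pair, weight) tuples plays the role of the structured feature vector; a real system would derive the weights from a topic model rather than from raw co-occurrence.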

Information Processing and Management | 2015

Weighted Word Pairs for query expansion

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

This paper proposes a novel query expansion method to improve the accuracy of text retrieval systems. Our method makes use of minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for selecting pairs of words based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.

Journal of Visual Languages and Computing | 2014

Terminological ontology learning and population using latent Dirichlet allocation

Francesco Colace; Massimo De Santo; Luca Greco; Flora Amato; Vincenzo Moscato; Antonio Picariello

The success of the Semantic Web will heavily rely on the availability of formal ontologies to structure machine-understandable data. However, there is still a lack of general methodologies for automatic ontology learning and population, i.e. the generation of domain ontologies from various kinds of resources by applying natural language processing and machine learning techniques. In this paper, the authors present an ontology learning and population system that combines both statistical and semantic methodologies. Several experiments have been carried out, demonstrating the effectiveness of the proposed approach.

Highlights: A graph of terms can be effectively used for ontology building. Such a graph is extracted from documents using an LDA-based methodology. Ontology learning involves the use of annotated lexicons (WordNet). The proposed method achieves good performance on standard datasets.

Affective Computing and Intelligent Interaction | 2013

A Probabilistic Approach to Tweets' Sentiment Classification

Francesco Colace; Massimo De Santo; Luca Greco

Prior to 2003, mankind generated a total of about 5 exabytes of content. Now, we generate this amount of content in about two days! The spread of generic (such as Twitter, Facebook or Google+) or specialized (such as LinkedIn or Viadeo) social networks allows people to share opinions on different aspects of life every day. This information is therefore a rich source of data for opinion mining and sentiment analysis. This paper introduces a novel approach to sentiment analysis based on the Weighted Word Pairs obtained by the use of the Latent Dirichlet Allocation (LDA) approach. The proposed methodology aims at identifying a word-based graphical model for depicting and mining a positive or negative attitude towards a topic. For the evaluation of the proposed approach a challenging scenario has been set: the real-time analysis of tweets. The experimental evaluation shows that the proposed approach is effective and satisfactory.

Complex, Intelligent and Software Intensive Systems | 2012

Text Classification Using a Graph of Terms

Paolo Napoletano; Francesco Colace; Massimo De Santo; Luca Greco

It is well known that supervised text classification methods need to learn from many labeled examples to achieve high accuracy. However, in a real context, sufficient labeled examples are not always available. For this reason, there has been recent interest in methods that are capable of obtaining high accuracy even if the training set is small. The main purpose of text mining techniques is to identify common patterns through the observation of vectors of features and then to use such patterns to make predictions. Most existing methods make use of a vector of features made up of weighted words, which unfortunately are insufficiently discriminative when the number of features is much higher than the number of labeled examples. In this paper we demonstrate that, to obtain greater accuracy in the analysis and revelation of common patterns, we could employ more complex features than simple weighted words. The proposed vector of features considers a hierarchical structure, named a mixed Graph of Terms, composed of a directed and an undirected sub-graph of words, that can be automatically constructed from a set of documents through the probabilistic Topic Model. The method has been tested on the top 10 classes of the ModApte split from the Reuters-21578 dataset, learned on several subsets of the original training set, and shows better performance than a method using a list of weighted words as a vector of features and linear support vector machines.

Storage and Retrieval for Image and Video Databases | 1999

Algorithm for video cut detection in MPEG sequences

Giuseppe Boccignone; Massimo De Santo; Gennaro Percannella

In this paper we address the problem of detecting abrupt shot changes in videos. Unlike the majority of the techniques in the literature, we perform this task directly on the stream coded in the MPEG format, without resorting to any decoding procedure. The proposed algorithm proceeds according to a step-wise refinement strategy and combines different cut detection criteria. Experimental results are presented and discussed.

International Conference on Advances in Pattern Recognition | 2001

A Neural Multi-expert Classification System for MPEG Audio Segmentation

Massimo De Santo; Gennaro Percannella; C. Sansone; Mario Vento

Current research efforts in the field of video parsing and analysis are mainly focused on the use of pictorial information, while neglecting an important supplementary source of content information such as the embedded audio or soundtrack. In contrast, in this paper we address the issue of exploiting audio information that can be used jointly with video information for scene change detection. The proposed method works directly on MPEG encoded sequences so as to avoid computationally intensive decoding procedures. It is based on a multi-expert classification system made up of a hierarchical ensemble of neural networks. Finally, after the presentation of a large audio database, suitably designed for assessing the performance of the approach, preliminary experimental results are discussed.

Journal of the Association for Information Science and Technology | 2015

Improving relevance feedback-based query expansion by the use of a weighted word pairs approach

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

In this article, the use of a new term extraction method for query expansion (QE) in text retrieval is investigated. The new method expands the initial query with a structured representation made of weighted word pairs (WWP) extracted from a set of training documents (relevance feedback). Standard text retrieval systems can handle a WWP structure through custom Boolean weighted models. We experimented with both the explicit and pseudorelevance feedback schemas and compared the proposed term extraction method with others in the literature, such as KLD and RM3. Evaluations have been conducted on a number of test collections (Text REtrieval Conference [TREC]‐6, ‐7, ‐8, ‐9, and ‐10). Results demonstrated that the QE method based on this new structure outperforms the baseline.
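One way to picture how a set of weighted word pairs might be passed to a retrieval system as a weighted Boolean query is sketched below. The input format, the function name, and the choice of a Lucene-style "(a AND b)^weight" boost notation are assumptions made for illustration; the abstract does not state which query syntax or weighting model the authors used.

```python
def expand_query(initial_terms, weighted_pairs):
    """Build a boosted Boolean query string from an initial query and word pairs.

    initial_terms: list of words in the original query.
    weighted_pairs: iterable of ((word_a, word_b), weight) tuples, e.g. pairs
        extracted from relevance-feedback documents (hypothetical input format).
    The "(a AND b)^w" notation follows Lucene-style query syntax; this choice
    is an assumption, not something stated in the article.
    """
    clauses = []
    if initial_terms:
        # Keep the original query as its own unboosted clause.
        clauses.append("(" + " OR ".join(initial_terms) + ")")
    for (word_a, word_b), weight in weighted_pairs:
        # Each pair becomes a conjunctive clause boosted by its weight.
        clauses.append(f"({word_a} AND {word_b})^{weight:.2f}")
    return " OR ".join(clauses)


query = expand_query(
    ["oil", "prices"],
    [(("oil", "opec"), 0.8), (("prices", "barrel"), 0.35)],
)
print(query)
```

Keeping the original terms as a separate clause means documents matching the initial query are still retrieved, while documents matching a weighted pair are ranked higher in proportion to the pair's weight.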
