Massimo Nicosia
University of Trento
Publications
Featured research published by Massimo Nicosia.
North American Chapter of the Association for Computational Linguistics | 2015
Massimo Nicosia; Simone Filice; Alberto Barrón-Cedeño; Iman Saleh; Hamdy Mubarak; Wei Gao; Preslav Nakov; Giovanni Da San Martino; Alessandro Moschitti; Kareem Darwish; Lluís Màrquez; Shafiq R. Joty; Walid Magdy
This paper describes QCRI’s participation in SemEval-2015 Task 3 “Answer Selection in Community Question Answering”, which targeted real-life Web forums and was offered in both Arabic and English. We apply a supervised machine learning approach that considers a rich set of features, including word n-grams, text similarity, sentiment analysis, the presence of specific words, and the context of a comment. Our approach was the best-performing one in the Arabic subtask and the third best in the two English subtasks.
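For illustration, here is a minimal sketch of a feature-based comment classifier in the spirit of this abstract, written with scikit-learn; the specific feature extractors, training examples and choice of logistic regression are assumptions, not the system's actual components.

    # Feature-based comment classifier sketch (all components are illustrative).
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics.pairwise import cosine_similarity

    def extract_features(question, comment, vectorizer):
        """A few of the feature families named in the abstract."""
        sim = cosine_similarity(vectorizer.transform([question]),
                                vectorizer.transform([comment]))[0, 0]  # text similarity
        has_url = 1.0 if "http" in comment else 0.0                     # specific words
        length = float(len(comment.split()))                            # crude context proxy
        return np.array([sim, has_url, length])

    # Hypothetical training triples: (question, comment, is_good_answer).
    train = [("How do I renew my visa?", "Go to the immigration office with your passport.", 1),
             ("How do I renew my visa?", "lol no idea", 0)]

    vectorizer = TfidfVectorizer().fit([q for q, _, _ in train] + [c for _, c, _ in train])
    X = np.vstack([extract_features(q, c, vectorizer) for q, c, _ in train])
    y = [label for _, _, label in train]

    clf = LogisticRegression().fit(X, y)
    print(clf.predict_proba(X)[:, 1])  # probability that each comment is a good answer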
Conference on Information and Knowledge Management | 2013
Aliaksei Severyn; Massimo Nicosia; Alessandro Moschitti
This paper shows that learning-to-rank models can be applied to automatically learn complex patterns, such as relational semantic structures occurring in questions and their answer passages. This is achieved by providing the learning algorithm with a tree representation derived from the syntactic trees of questions and passages connected by relational tags, where the latter are provided by automatic classifiers, i.e., question and focus classifiers and Named Entity Recognizers. This way, effective structural relational patterns are implicitly encoded in the representation and can be automatically exploited by powerful machine learning models such as kernel methods. We conduct an extensive experimental evaluation of our models on well-known benchmarks from the question answering (QA) track of the TREC challenges. The comparison with state-of-the-art systems and BM25 shows a relative improvement in MAP of more than 14% and 45%, respectively. A further comparison on the task restricted to answer sentence reranking shows an improvement in MAP of more than 8% over the state of the art.
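The tree-kernel machinery itself is not reproduced here, but the pairwise preference-reranking setup it plugs into can be sketched with plain feature vectors standing in for the structural representations; the candidate data and the use of a linear SVM are illustrative assumptions.

    # Pairwise preference reranking sketch: correct and incorrect candidates for
    # the same question are compared through feature differences.
    import numpy as np
    from sklearn.svm import LinearSVC

    def preference_pairs(candidates):
        """candidates: list of (feature_vector, is_correct) for one question."""
        pos = [f for f, y in candidates if y == 1]
        neg = [f for f, y in candidates if y == 0]
        X, labels = [], []
        for p in pos:
            for n in neg:
                X.append(p - n); labels.append(1)    # correct preferred over incorrect
                X.append(n - p); labels.append(-1)   # mirrored pair keeps classes balanced
        return np.vstack(X), labels

    # Invented features for three candidate answer passages of one question.
    cands = [(np.array([0.9, 1.0]), 1),
             (np.array([0.4, 0.0]), 0),
             (np.array([0.2, 1.0]), 0)]

    X, y = preference_pairs(cands)
    ranker = LinearSVC().fit(X, y)

    # At test time, candidates are sorted by the model's decision score.
    scores = ranker.decision_function(np.vstack([f for f, _ in cands]))
    print(np.argsort(-scores))  # candidate indices, best first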
Conference on Computational Natural Language Learning | 2014
Gianni Barlacchi; Massimo Nicosia; Alessandro Moschitti
In this paper, we study the impact of relational and syntactic representations on an interesting and challenging task: the automatic resolution of crossword puzzles. Automatic solvers are typically based on two answer retrieval modules: (i) a web search engine, e.g., Google or Bing, and (ii) a database (DB) system for accessing previously resolved crossword puzzles. We show that learning-to-rank models based on relational syntactic structures defined between the clues and the answer can improve both modules. In particular, our approach accesses the DB using a search engine and reranks its output by modeling paraphrasing. This improves the MRR of the previous system by up to 53% in ranking answer candidates and increases the resolution accuracy of crossword puzzles by up to 15%.
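A toy rendition of the retrieve-then-rerank idea over a clue database follows, assuming a TF-IDF index and plain cosine similarity as a stand-in for the learned paraphrasing model; the clue entries are invented.

    # Toy "query the clue DB, then rerank" loop: TF-IDF retrieval followed by a
    # cosine rerank that stands in for the learning-to-rank paraphrase model.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    clue_db = [("capital of italy", "ROME"),
               ("italian capital city", "ROME"),
               ("capital of france", "PARIS")]

    texts = [clue for clue, _ in clue_db]
    vectorizer = TfidfVectorizer().fit(texts)
    index = vectorizer.transform(texts)

    def answer_candidates(target_clue, k=3):
        """Return up to k (answer, score) pairs for a target clue."""
        sims = cosine_similarity(vectorizer.transform([target_clue]), index)[0]
        order = sims.argsort()[::-1][:k]
        return [(clue_db[i][1], float(sims[i])) for i in order]

    print(answer_candidates("the capital of Italy"))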
Empirical Methods in Natural Language Processing | 2014
Francisco Guzmán; Shafiq R. Joty; Lluís Màrquez; Alessandro Moschitti; Preslav Nakov; Massimo Nicosia
We present a pairwise learning-to-rank approach to machine translation evaluation that learns to differentiate better from worse translations in the context of a given reference. We integrate several layers of linguistic information encapsulated in tree-based structures, making use of both the reference and the system output simultaneously, thus bringing our ranking closer to how humans evaluate translations. Most importantly, instead of deciding upfront which types of features are important, we use the learning framework of preference re-ranking kernels to learn the features automatically. The evaluation results show that learning in the proposed framework yields better correlation with humans than computing the direct similarity over the same type of structures. Also, we show that our structural kernel learning (SKL) can serve as a general framework for MT evaluation, in which syntactic and semantic information can be naturally incorporated.
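One building block of such a pairwise approach, turning human ranking annotations into preference training examples, can be sketched as follows; the representation of each translation is left abstract, and the sentences and ranks are invented.

    # Build pairwise preference examples from human ranking annotations; each
    # example says "given this reference, translation A is better than B".
    from itertools import combinations

    def pairwise_examples(reference, outputs):
        """outputs: list of (system_translation, human_rank); lower rank = better."""
        examples = []
        for (t1, r1), (t2, r2) in combinations(outputs, 2):
            if r1 == r2:
                continue  # ties carry no preference signal
            better, worse = (t1, t2) if r1 < r2 else (t2, t1)
            examples.append((reference, better, worse))
        return examples

    reference = "the cat sat on the mat"
    outputs = [("the cat sat on a mat", 1), ("cat the mat sat", 3), ("a cat is on the mat", 2)]
    for example in pairwise_examples(reference, outputs):
        print(example)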
Meeting of the Association for Computational Linguistics | 2015
Alessandro Moschitti; Massimo Nicosia; Gianni Barlacchi
In this paper, we present our Crossword Puzzle Resolution System (SACRY), which exploits syntactic structures for clue reranking and answer extraction. SACRY uses a database (DB) containing previously solved crossword puzzles (CPs) in order to generate the list of candidate answers. Additionally, it uses innovative features, such as the answer position in the rank and aggregated information such as the minimum, maximum and average clue reranking scores. Our system is based on WebCrow, one of the most advanced systems for automatic crossword puzzle resolution. Our extensive experiments over our two million clue dataset show that our approach substantially improves the quality of the answer list, enabling unprecedented results on the complete CP resolution task, i.e., an accuracy of 99.17%.
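The aggregated answer-level features mentioned above (rank position and minimum/maximum/average clue reranking scores) can be sketched as follows; the input ranking is invented and the feature names are illustrative.

    # Answer-level aggregate features: for each candidate answer, collect the
    # reranking scores of the clues mapped to it and compute simple statistics.
    from collections import defaultdict

    def aggregate_features(ranked_clues):
        """ranked_clues: list of (answer, rerank_score), best first."""
        scores, best_position = defaultdict(list), {}
        for position, (answer, score) in enumerate(ranked_clues, start=1):
            scores[answer].append(score)
            best_position.setdefault(answer, position)
        return {answer: {"min": min(s), "max": max(s), "avg": sum(s) / len(s),
                         "best_position": best_position[answer]}
                for answer, s in scores.items()}

    ranked = [("ROME", 0.92), ("PARIS", 0.55), ("ROME", 0.41)]
    print(aggregate_features(ranked))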
International Joint Conference on Natural Language Processing | 2015
Aliaksei Severyn; Massimo Nicosia; Gianni Barlacchi; Alessandro Moschitti
Automatic resolution of Crossword Puzzles (CPs) heavily depends on the quality of the answer candidate lists produced by a retrieval system for each clue of the puzzle grid. Previous work has shown that such lists can be generated using Information Retrieval (IR) search algorithms applied to the databases containing previously solved CPs and reranked with tree kernels (TKs) applied to a syntactic tree representation of the clues. In this paper, we create a labelled dataset of 2 million clues on which we apply an innovative Distributional Neural Network (DNN) for reranking clue pairs. Our DNN is computationally efficient and can thus take advantage of such large datasets showing a large improvement over the TK approach, when the latter uses small training data. In contrast, when data is scarce, TKs outperform DNNs.
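The paper's DNN architecture is not reproduced here; as a rough stand-in, the sketch below scores a pair of clues with averaged word embeddings and a small feed-forward head in PyTorch, with invented toy inputs.

    # Generic neural clue-pair scorer: averaged word embeddings for each clue,
    # concatenated and passed through a small feed-forward head.
    import torch
    import torch.nn as nn

    class CluePairScorer(nn.Module):
        def __init__(self, vocab_size, dim=50):
            super().__init__()
            self.embed = nn.EmbeddingBag(vocab_size, dim)  # mean-pools word embeddings
            self.head = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

        def forward(self, clue_a, clue_b):
            a, b = self.embed(clue_a), self.embed(clue_b)
            return self.head(torch.cat([a, b], dim=1)).squeeze(1)  # higher = more related

    model = CluePairScorer(vocab_size=100)
    # Toy batch of two clue pairs, with words already mapped to integer ids.
    clue_a = torch.tensor([[1, 2, 3], [4, 5, 6]])
    clue_b = torch.tensor([[1, 2, 9], [7, 8, 9]])
    labels = torch.tensor([1.0, 0.0])
    loss = nn.BCEWithLogitsLoss()(model(clue_a, clue_b), labels)
    loss.backward()
    print(float(loss))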
Conference on Information and Knowledge Management | 2017
Massimo Nicosia; Alessandro Moschitti
Recent neural network approaches to sentence matching compute the probability of two sentences being similar by minimizing a logistic loss. In this paper, we learn sentence representations by means of a siamese network, which: (i) uses encoders that share parameters; and (ii) enables the comparison between two sentences in terms of their Euclidean distance, by minimizing a contrastive loss. Moreover, we add a multilayer perceptron to the architecture to simultaneously optimize the contrastive and the logistic losses. This way, our network can exploit the more informative feedback given by the logistic loss, which is also quantified by the distance between the two sentence representations in Euclidean space. We show that jointly minimizing the two losses yields higher accuracy than minimizing them independently. We verify this finding by evaluating several baseline architectures on two sentence matching tasks: question paraphrasing and textual entailment recognition. Our network approaches the state of the art, while being much simpler and faster to train, and having fewer parameters than its competitors.
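A condensed sketch of the joint-loss idea follows, assuming simple feed-forward encoders and placeholder hyperparameters (margin, loss weighting); it mirrors the description above (shared-weight encoders compared by Euclidean distance under a contrastive loss, plus an MLP trained with a logistic loss) but is not the paper's actual architecture.

    # Siamese matcher sketch: shared encoder, Euclidean distance with a contrastive
    # loss, and an MLP head with a logistic loss, minimized jointly.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SiameseMatcher(nn.Module):
        def __init__(self, in_dim, hidden=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # shared weights
            self.mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

        def forward(self, x1, x2):
            h1, h2 = self.encoder(x1), self.encoder(x2)
            distance = F.pairwise_distance(h1, h2)                   # Euclidean distance
            logit = self.mlp(torch.cat([h1, h2], dim=1)).squeeze(1)
            return distance, logit

    def joint_loss(distance, logit, y, margin=1.0, alpha=0.5):
        # Contrastive: pull similar pairs together, push dissimilar ones past the margin.
        contrastive = (y * distance.pow(2) + (1 - y) * F.relu(margin - distance).pow(2)).mean()
        logistic = F.binary_cross_entropy_with_logits(logit, y)
        return alpha * contrastive + (1 - alpha) * logistic

    model = SiameseMatcher(in_dim=10)
    x1, x2 = torch.randn(4, 10), torch.randn(4, 10)  # placeholder sentence encodings
    y = torch.tensor([1.0, 0.0, 1.0, 0.0])           # 1 = matching pair
    distance, logit = model(x1, x2)
    joint_loss(distance, logit, y).backward()

Training on the weighted sum lets the gradient of the logistic head also shape the shared encoder, which is the joint optimization the abstract contrasts with minimizing the two losses independently.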
European Conference on Information Retrieval | 2015
Massimo Nicosia; Gianni Barlacchi; Alessandro Moschitti
In this paper, we study methods for improving the quality of automatic extraction of answer candidates for the automatic resolution of crossword puzzles (CPs), which we frame as a new IR task. Since automatic systems use databases containing previously solved CPs, we define a new effective approach that queries the database (DB) with a search engine for clues similar to the target one. We rerank the obtained clue list using state-of-the-art methods and go beyond them by defining new learning-to-rank approaches for aggregating similar clues associated with the same answer.
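The clue-aggregation step can be illustrated by grouping the retrieved clues by their answer and scoring each answer over simple aggregate features; the feature set and the fixed weight vector below are placeholders for a trained learning-to-rank model.

    # Group retrieved clues by answer and score each answer with aggregate
    # features; the fixed weights stand in for a trained ranking model.
    from collections import defaultdict
    import numpy as np

    def rank_answers(retrieved, weights=np.array([0.2, 0.5, 0.3])):
        """retrieved: list of (similar_clue, answer, retrieval_score)."""
        grouped = defaultdict(list)
        for _, answer, score in retrieved:
            grouped[answer].append(score)
        scored = {answer: float(weights @ np.array([len(s), max(s), sum(s) / len(s)]))
                  for answer, s in grouped.items()}
        return sorted(scored.items(), key=lambda item: item[1], reverse=True)

    hits = [("capital of italy", "ROME", 2.1), ("italian capital", "ROME", 1.7),
            ("capital of france", "PARIS", 0.8)]
    print(rank_answers(hits))  # answers ordered by aggregated evidence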
Exploiting Linked Data and Knowledge Graphs in Large Organisations | 2017
Alessandro Moschitti; Kateryna Tymoshenko; Panos Alexopoulos; Andrew D. Walker; Massimo Nicosia; Guido Vetere; Alessandro Faraotti; Marco Monti; Jeff Z. Pan; Honghan Wu; Yuting Zhao
In the Digital and Information Age, companies and government agencies are highly digitalized, as are the information exchanges happening in their processes. They store information both as natural language text and as structured data, e.g., relational databases or knowledge graphs. In this scenario, methods for organizing, finding, and selecting relevant information, beyond the capabilities of classic Information Retrieval, remain active topics of research and development.
Meeting of the Association for Computational Linguistics | 2017
Kateryna Tymoshenko; Alessandro Moschitti; Massimo Nicosia; Aliaksei Severyn
We present a highly flexible UIMA-based pipeline for developing structural kernel-based systems for relational learning from text, i.e., for generating training and test data for ranking, classifying short text pairs, or measuring similarity between pieces of text. For example, the proposed pipeline can represent input question and answer sentence pairs as syntactic-semantic structures, enrich them with relational information, e.g., links between the question class, the focus and named entities, and serialize them as training and test files for the tree kernel-based reranking framework. The pipeline generates a number of dependency and shallow chunk-based representations shown to achieve competitive results in previous work. It also enables easy evaluation of the models thanks to cross-validation facilities.
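A toy illustration of the relational-linking idea is given below: words shared by the question and the candidate answer are marked inside flat, bracketed tree strings and the pair is written out as one training line. The tree shape, the REL tag and the tab-separated layout are illustrative, not the pipeline's actual annotators or serialization format.

    # Toy relational linking: words shared by the question and the answer sentence
    # are marked with a REL tag inside flat, bracketed tree strings, and the pair
    # is emitted as one tab-separated training line.
    def shallow_tree(sentence, shared):
        leaves = []
        for token in sentence.split():
            tag = "REL-WORD" if token.lower() in shared else "WORD"
            leaves.append(f"({tag} {token})")
        return "(S " + " ".join(leaves) + ")"

    def serialize_pair(question, answer, label):
        shared = set(question.lower().split()) & set(answer.lower().split())
        # Illustrative tab-separated layout, not the framework's real file format.
        return "\t".join([str(label), shallow_tree(question, shared), shallow_tree(answer, shared)])

    print(serialize_pair("What is the capital of Italy ?", "Rome is the capital of Italy .", 1))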