Salvatore Romeo

Qatar Computing Research Institute

Publication

Featured researches published by Salvatore Romeo.

north american chapter of the association for computational linguistics | 2016

ConvKN at SemEval-2016 Task 3: Answer and Question Selection for Question Answering on Arabic and English Fora.

Alberto Barrón-Cedeño; Giovanni Da San Martino; Shafiq R. Joty; Alessandro Moschitti; Fahad Al-Obaidli; Salvatore Romeo; Kateryna Tymoshenko; Antonio Uva

We describe our system, ConvKN, which participated in SemEval-2016 Task 3, “Community Question Answering”. The task targeted the reranking of questions and comments in real-life web fora in both English and Arabic. ConvKN combines convolutional tree kernels with convolutional neural networks and additional manually designed features, including text similarity and thread-specific features. For the first time, we applied tree kernels to syntactic trees of Arabic sentences for a reranking task. Our approaches obtained the second-best results in three out of four tasks. The only task on which our performance was merely average is the one where we did not use tree kernels in our classifier.

conference on information and knowledge management | 2016

Learning to Re-Rank Questions in Community Question Answering Using Advanced Features

Giovanni Da San Martino; Alberto Barrón Cedeño; Salvatore Romeo; Antonio Uva; Alessandro Moschitti

We study the impact of different types of features for question ranking in community Question Answering: bag-of-words models (BoW), syntactic tree kernels (TKs), and rank features. Structural kernels had not previously been applied to the question reranking task, i.e., question-to-question similarity, where they have to model paraphrase relations. Additionally, the informal text typically found in forums poses new challenges to the use of TKs. We compare our learning-to-rank (L2R) algorithms against a strong baseline given by the Google rank (GR). The results show that (i) the shallow structures we use in TKs are robust to noisy data and (ii) improving GR requires effective BoW features and TKs, along with an accurate model of GR features in the L2R algorithm used.

international conference on tools with artificial intelligence | 2013

A Versatile Graph-Based Approach to Package Recommendation

Roberto Interdonato; Salvatore Romeo; Andrea Tagarelli; George Karypis

An emerging trend in research on recommender systems is the design of methods capable of recommending packages instead of single items. The problem is challenging due to a variety of critical aspects, including context-based and user-provided constraints on the items constituting a package, as well as the high sparsity and limited accessibility of the primary data used to solve the problem. Most existing works on the topic have focused on a specific application domain (e.g., travel package recommendation), thus often providing ad hoc solutions that cannot be adapted to other domains. By contrast, in this paper we propose a versatile package recommendation approach that is substantially independent of the peculiarities of a particular application domain. A key aspect of our framework is the exploitation of prior knowledge on the content type models of the packages being generated, which express what the users expect from the recommendation task. Packages are learned for each package model, while the recommendation stage is accomplished by performing a PageRank-style method personalized w.r.t. the target user's preferences, possibly including a limited budget. The method has been tested on a TripAdvisor dataset and compared with a recently proposed method for learning composite recommendations.
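The abstract does not specify the ranking procedure beyond "PageRank-style" and "personalized". As a generic illustration only, and not the authors' actual method, the following minimal Python sketch runs a standard personalized PageRank by power iteration over a small item graph. The graph, the item names, the preference weights, and the function name `personalized_pagerank` are all hypothetical, and budget constraints are not modeled.

```python
def personalized_pagerank(graph, preferences, damping=0.85, iterations=100, tol=1e-10):
    """Generic personalized PageRank via power iteration (illustrative sketch).

    graph: dict mapping each node to a list of its out-neighbors
           (every neighbor must also appear as a key).
    preferences: dict mapping nodes to non-negative preference weights;
                 the random walk teleports to nodes in proportion to them.
    """
    nodes = list(graph)
    total = sum(preferences.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive preference weight")
    # Teleport distribution: normalized user preferences.
    teleport = {n: preferences.get(n, 0.0) / total for n in nodes}
    scores = dict(teleport)

    for _ in range(iterations):
        # Every node first receives its share of the teleport mass.
        new_scores = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            neighbors = graph[node]
            if neighbors:
                # Spread this node's damped score evenly over its out-links.
                share = damping * scores[node] / len(neighbors)
                for neighbor in neighbors:
                    new_scores[neighbor] += share
            else:
                # Dangling node: return its mass via the teleport distribution.
                for n in nodes:
                    new_scores[n] += damping * scores[node] * teleport[n]
        delta = sum(abs(new_scores[n] - scores[n]) for n in nodes)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical toy graph of items; the user is assumed to prefer "museum".
graph = {
    "hotel": ["museum", "restaurant"],
    "museum": ["hotel", "restaurant"],
    "restaurant": ["hotel", "museum"],
    "park": ["hotel"],
}
scores = personalized_pagerank(graph, {"museum": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the preferred item ends up ranked first, and items that cannot be reached from it (here "park") receive no score. A real package recommender would additionally need the package models and constraints described in the abstract.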

european conference on information retrieval | 2015

Knowledge-Based Representation for Transductive Multilingual Document Classification

Salvatore Romeo; Dino Ienco; Andrea Tagarelli

Multilingual document classification is often addressed by approaches that rely on language-specific resources (e.g., bilingual dictionaries and machine translation tools) to evaluate cross-lingual document similarities. However, the required transformations may alter the original document semantics, adding further issues to the known difficulty of obtaining high-quality labeled datasets. To overcome such issues, we propose a new framework for multilingual document classification under a transductive learning setting. We exploit a large-scale multilingual knowledge base, BabelNet, to model documents written in different languages in a common conceptual space, without requiring any language translation process. We resort to a state-of-the-art transductive learner to produce the document classification. Results on two real-world multilingual corpora have highlighted the effectiveness of the proposed document model w.r.t. document representations usually involved in multilingual and cross-lingual analysis, and the robustness of the transductive setting for multilingual document classification.

empirical methods in natural language processing | 2014

Semantic-Based Multilingual Document Clustering via Tensor Modeling

Salvatore Romeo; Andrea Tagarelli; Dino Ienco

A major challenge in document clustering research arises from the growing amount of text data written in different languages. Previous approaches depend on language-specific solutions (e.g., bilingual dictionaries, sequential machine translation) to evaluate document similarities, and the required transformations may alter the original document semantics. To cope with this issue we propose a new document clustering approach for multilingual corpora that (i) exploits a large-scale multilingual knowledge base, (ii) takes advantage of the multi-topic nature of the text documents, and (iii) employs a tensor-based model to deal with high dimensionality and sparseness. Results have shown the significance of our approach and its better performance w.r.t. classic document clustering approaches, in both a balanced and an unbalanced corpus evaluation.

international acm sigir conference on research and development in information retrieval | 2017

Cross-Language Question Re-Ranking

Giovanni Da San Martino; Salvatore Romeo; Alberto Barrón-Cedeño; Shafiq R. Joty; Lluís Màrquez; Alessandro Moschitti; Preslav Nakov

We study how to find relevant questions in community forums when the language of the new questions is different from that of the existing questions in the forum. In particular, we explore the Arabic-English language pair. We compare a kernel-based system with a feed-forward neural network in a scenario where a large parallel corpus is available for training a machine translation system, bilingual dictionaries, and cross-language word embeddings. We observe that the performance of both approaches degrades when working on the translated text, especially that of the kernel-based system, which depends heavily on a syntactic kernel. We address this issue using a cross-language tree kernel, which compares the original Arabic tree to the English trees of the related questions. We show that this kernel almost closes the performance gap with respect to the monolingual system. On the neural network side, we use the parallel corpus to train cross-language embeddings, which we then use to represent the Arabic input and the English related questions in the same space. The results likewise come close to those of the monolingual neural network. Overall, the kernel system outperforms the neural network in all cases.

international symposium on methodologies for intelligent systems | 2014

Clustering View-Segmented Documents via Tensor Modeling

Salvatore Romeo; Andrea Tagarelli; Dino Ienco

We propose a clustering framework for view-segmented documents, i.e., relatively long documents made up of smaller fragments that can be provided according to a target set of views or aspects. The framework is designed to exploit a view-based document segmentation into a third-order tensor model, whose decomposition result would enable any standard document clustering algorithm to better reflect the multi-faceted nature of the documents. Experimental results on document collections featuring paragraph-based, metadata-based, or user-driven views have shown the significance of the proposed approach, highlighting performance improvement in the document clustering task.

international acm sigir conference on research and development in information retrieval | 2017

ECIR 2016 Workshop on Modeling, Learning and Mining for Cross/Multilinguality (MultiLingMine '16)

Dino Ienco; Mathieu Roche; Salvatore Romeo; Paolo Rosso; Andrea Tagarelli

The First International Workshop on Modeling, Learning and Mining for Cross/Multilinguality (MultiLingMine) was held in conjunction with the 2016 European Conference on Information Retrieval (ECIR), in Padua, Italy. This report presents an overview of the motivations and objectives underlying the establishment of this workshop. It also provides a summary of the contributing papers and of the main research topics and trends discussed among the participants.

european conference on information retrieval | 2017

A Multiple-Instance Learning Approach to Sentence Selection for Question Ranking

Salvatore Romeo; Giovanni Da San Martino; Alberto Barrón-Cedeño; Alessandro Moschitti

In example-based retrieval a system is queried with a document aiming to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation.

european conference on information retrieval | 2016

MultiLingMine 2016: Modeling, Learning and Mining for Cross/Multilinguality

Dino Ienco; Mathieu Roche; Salvatore Romeo; Paolo Rosso; Andrea Tagarelli

The increasing availability of text information coded in many different languages poses new challenges to modern information retrieval and mining systems that aim to discover and exchange knowledge at a larger, world-wide scale. The 1st International Workshop on Modeling, Learning and Mining for Cross/Multilinguality (dubbed MultiLingMine 2016) provides a venue to discuss research advances in cross-/multilingual related topics, focusing on new multidisciplinary research questions that have not been deeply investigated so far (e.g., in CLEF and related events relevant to CLIR). This includes ongoing theoretical and experimental work on novel representation models, learning algorithms, and knowledge-based methodologies for emerging trends and applications, such as cross-view cross-/multilingual information retrieval and document mining, (knowledge-based) translation-independent cross-/multilingual corpora, applications in social network contexts, and more.
