Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Marco Dinarelli is active.

Publication


Featured research published by Marco Dinarelli.


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages

Stefan Hahn; Marco Dinarelli; Christian Raymond; Fabrice Lefèvre; Patrick Lehnen; R. De Mori; Alessandro Moschitti; Hermann Ney; Giuseppe Riccardi

One of the first steps in building a spoken language understanding (SLU) module for dialogue systems is the extraction of flat concepts out of a given word sequence, usually provided by an automatic speech recognition (ASR) system. In this paper, six different modeling approaches are investigated to tackle the task of concept tagging. These methods include classical, well-known generative and discriminative methods like Finite State Transducers (FSTs), Statistical Machine Translation (SMT), Maximum Entropy Markov Models (MEMMs), and Support Vector Machines (SVMs), as well as techniques recently applied to natural language processing such as Conditional Random Fields (CRFs) and Dynamic Bayesian Networks (DBNs). Following a detailed description of the models, experimental and comparative results are presented on three corpora in different languages and of different complexity. The French MEDIA corpus has already been exploited during an evaluation campaign, so a direct comparison with existing benchmarks is possible. Recently collected Italian and Polish corpora are used to test the robustness and portability of the modeling approaches. For all tasks, manual transcriptions as well as ASR inputs are considered. In addition to single systems, methods for system combination are investigated. The best performing model on all tasks is based on conditional random fields. On the MEDIA evaluation corpus, a concept error rate of 12.6% could be achieved. Here, in addition to attribute names, attribute values have been extracted using a combination of a rule-based and a statistical approach. Applying system combination using weighted ROVER with all six systems, the concept error rate (CER) drops to 12.0%.
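The concept error rate reported above is, like word error rate, conventionally computed from the edit distance between the reference and hypothesized concept sequences, normalized by the reference length. A minimal sketch of that metric (the concept labels below are invented for illustration):

```python
# Illustrative sketch: concept error rate (CER) as normalized Levenshtein
# distance between reference and hypothesized concept sequences.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences; insertions,
    deletions and substitutions all cost 1."""
    d = list(range(len(hyp) + 1))          # d[j] = D(i, j), row by row
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i               # prev holds D(i-1, j-1)
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution
    return d[len(hyp)]

def concept_error_rate(ref, hyp):
    return edit_distance(ref, hyp) / len(ref)

# Example: one substitution and one deletion against a 4-concept reference.
ref = ["command", "city", "date", "time"]
hyp = ["command", "city-name", "date"]
print(concept_error_rate(ref, hyp))  # 2 errors / 4 concepts = 0.5
```

The same alignment also yields the per-system error counts that a ROVER-style voting combination operates on.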


Proceedings of SRSL 2009, the 2nd Workshop on Semantic Representation of Spoken Language | 2009

Annotating Spoken Dialogs: From Speech Segments to Dialog Acts and Frame Semantics

Marco Dinarelli; Silvia Quarteroni; Sara Tonelli; Alessandro Moschitti; Giuseppe Riccardi

We are interested in extracting semantic structures from spoken utterances generated within conversational systems. Current Spoken Language Understanding systems rely either on hand-written semantic grammars or on flat attribute-value sequence labeling. While the former approach is known to be limited in coverage and robustness, the latter lacks detailed relations amongst attribute-value pairs. In this paper, we describe and analyze the human annotation process for rich semantic structures used to train semantic statistical parsers. We have annotated spoken conversations from both a human-machine and a human-human spoken dialog corpus. Given a sentence from the transcribed corpora, domain concepts and other linguistic features are annotated, ranging from part-of-speech tagging and constituent chunking to more advanced annotations such as syntactic structure, dialog acts, and predicate argument structure. In particular, the latter two annotation layers appear promising for the design of complex dialog systems. Statistics and mutual information estimates amongst such features are reported and compared across corpora.


IEEE Transactions on Audio, Speech, and Language Processing | 2012

Discriminative Reranking for Spoken Language Understanding

Marco Dinarelli; Alessandro Moschitti; Giuseppe Riccardi

Spoken language understanding (SLU) is concerned with the extraction of meaning structures from spoken utterances. Recent computational approaches to SLU, e.g., conditional random fields (CRFs), optimize local models by encoding several features, mainly based on simple n-grams. In contrast, recent work has shown that the accuracy of CRFs can be significantly improved by modeling long-distance dependency features. In this paper, we propose novel approaches to encode all possible dependencies between features and, most importantly, among parts of the meaning structure, e.g., concepts and their combinations. We rerank hypotheses generated by local models, e.g., stochastic finite state transducers (SFSTs) or CRFs, with a global model. The latter encodes a very large number of dependencies (in the form of trees or sequences) by applying kernel methods to the space of all meaning (sub)structures. We performed comparative experiments between SFSTs, CRFs, support vector machines (SVMs), and our proposed discriminative reranking models (DRMs) on representative conversational speech corpora in three different languages: the ATIS (English), MEDIA (French), and LUNA (Italian) corpora. These corpora were collected within three domain applications of increasing complexity: informational, transactional, and problem-solving tasks, respectively. The results show that our DRMs consistently outperform the state-of-the-art models based on CRFs.
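The reranking idea above can be sketched in miniature. The actual DRMs score hypotheses with tree/sequence kernels and SVMs; here a plain linear model over concept bigrams stands in for the global feature space, and all concept names, scores, and weights are invented for illustration:

```python
# Hypothetical sketch of discriminative reranking over an n-best list.
# A local model (e.g. a CRF or SFST) supplies hypotheses and scores;
# a "global" model adds features spanning the whole concept sequence.

def global_features(concepts):
    """Concept-bigram features over a whole hypothesis, a simple stand-in
    for dependencies a purely local model cannot capture."""
    return [(a, b) for a, b in zip(concepts, concepts[1:])]

def rerank(nbest, weights, alpha=0.5):
    """Interpolate the local score with a global feature score and
    return the best-scoring hypothesis."""
    def score(hyp):
        glob = sum(weights.get(f, 0.0) for f in global_features(hyp["concepts"]))
        return alpha * hyp["local_score"] + (1 - alpha) * glob
    return max(nbest, key=score)

# Toy 2-best list from a local model:
nbest = [
    {"concepts": ["command", "date", "city"], "local_score": 1.0},
    {"concepts": ["command", "city", "date"], "local_score": 0.9},
]
# A learned global model might prefer "city" right after "command":
weights = {("command", "city"): 2.0}
print(rerank(nbest, weights)["concepts"])  # ['command', 'city', 'date']
```

With an empty weight dictionary the local ranking is preserved, so the reranker can only help where the global features fire.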


meeting of the association for computational linguistics | 2009

Re-Ranking Models for Spoken Language Understanding

Marco Dinarelli; Alessandro Moschitti; Giuseppe Riccardi

Spoken Language Understanding aims at mapping a natural language spoken sentence into a semantic representation. In the last decade two main approaches have been pursued: generative and discriminative models. The former is more robust to overfitting, whereas the latter is more robust to many irrelevant features. Additionally, the ways in which these approaches encode prior knowledge are very different, and their relative performance changes with the task. In this paper we describe a machine learning framework in which both models are used: a generative model produces a list of ranked hypotheses, and a discriminative model based on structure kernels and Support Vector Machines re-ranks this list. We tested our approach on the MEDIA corpus (human-machine dialogs) and on a new corpus (human-machine and human-human dialogs) produced in the European LUNA project. The results show a large improvement over the state of the art in concept segmentation and labeling.


empirical methods in natural language processing | 2009

Re-Ranking Models Based-on Small Training Data for Spoken Language Understanding

Marco Dinarelli; Alessandro Moschitti; Giuseppe Riccardi

The design of practical language applications by means of statistical approaches requires annotated data, which is one of the most critical constraints. This is particularly true for Spoken Dialog Systems, since considerable domain-specific conceptual annotation is needed to obtain accurate Language Understanding models. Since data annotation is usually costly, methods to reduce the amount of data needed are required. In this paper, we show that better feature representations serve this purpose and that structure kernels provide the needed improved representation. Given the relatively high computational cost of kernel methods, we apply them only to re-rank the list of hypotheses provided by a fast generative model. Experiments with Support Vector Machines and different kernels on two different dialog corpora show that our re-ranking models can achieve better results than state-of-the-art approaches when little training data is available.


international conference on acoustics, speech, and signal processing | 2010

The LUNA Spoken Dialogue System: Beyond utterance classification

Marco Dinarelli; Evgeny A. Stepanov; Sebastian Varges; Giuseppe Riccardi

We present a call routing application for complex problem solving tasks. Work on call routing to date has mainly dealt with call-type classification. In this paper we take call routing further: initial call classification is done in parallel with a robust statistical Spoken Language Understanding module. This is followed by a dialogue to elicit further task-relevant details from the user before passing on the call. The dialogue capability also allows us to obtain clarifications of the initial classifier guess. Based on an evaluation, we show that conducting a dialogue significantly improves upon call routing based on call classification alone. We present both subjective and objective evaluation results of the system according to standard metrics on real users.


spoken language technology workshop | 2008

Joint generative and discriminative models for spoken language understanding

Marco Dinarelli; Alessandro Moschitti; Giuseppe Riccardi

Spoken Language Understanding aims at mapping a natural language spoken sentence into a semantic representation. In the last decade two main approaches have been pursued: generative and discriminative models. The former is more robust to overfitting, whereas the latter is more robust to many irrelevant features. Additionally, the ways in which these approaches encode prior knowledge are very different, and their relative performance changes with the task. In this paper we describe a training framework in which both models are used: a generative model produces a list of ranked hypotheses, and a discriminative model based on string kernels and Support Vector Machines re-ranks this list. We tested this approach on a new corpus produced in the European LUNA project. The results show a large improvement over the state of the art in concept segmentation and labeling.
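One simple member of the string-kernel family mentioned above is the n-gram "spectrum" kernel, which scores two sequences by the number of n-grams they share. The kernels used in the paper are richer; this sketch (with invented concept labels) only illustrates the idea of comparing structured hypotheses without building explicit feature vectors:

```python
# Hedged sketch of a string kernel: the n-gram spectrum kernel is the
# inner product of the n-gram count vectors of two sequences.
from collections import Counter

def spectrum_kernel(x, y, n=2):
    """Number of shared n-grams between sequences x and y, counted
    with multiplicity."""
    grams = lambda s: Counter(tuple(s[i:i + n]) for i in range(len(s) - n + 1))
    gx, gy = grams(x), grams(y)
    return sum(c * gy[g] for g, c in gx.items())

# Concept sequences from two SLU hypotheses:
a = ["command", "city", "date", "time"]
b = ["command", "city", "time"]
print(spectrum_kernel(a, b))  # one shared bigram ("command", "city") -> 1
```

An SVM trained with such a kernel can then separate good from bad hypotheses directly in this implicit n-gram space.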


spoken language technology workshop | 2008

Semantic annotations for conversational speech: From speech transcriptions to predicate argument structures

Arianna Bisazza; Marco Dinarelli; Silvia Quarteroni; Sara Tonelli; Alessandro Moschitti; Giuseppe Riccardi

In this paper, we describe semantic content that can be automatically generated for the design of advanced dialog systems. Since the latter will be based on machine learning approaches, we created training data by annotating a corpus with the needed content. Given a sentence from our transcribed corpus, domain concepts and other linguistic levels are annotated, ranging from basic ones, i.e. the part-of-speech tagging and constituent chunking levels, to more advanced ones, i.e. the syntactic and predicate argument structure (PAS) levels. In particular, the proposed PAS and taxonomy of dialog acts appear promising for the design of more complex dialog systems. Statistics about our semantic annotation are reported.


ieee automatic speech recognition and understanding workshop | 2009

Ontology-based grounding of Spoken Language Understanding

Silvia Quarteroni; Marco Dinarelli; Giuseppe Riccardi

Current Spoken Language Understanding models rely on either hand-written semantic grammars or flat attribute-value sequence labeling. In most cases, no relations between concepts are modeled, and both concepts and relations are domain-specific, making it difficult to expand or port the domain model. In contrast, we expand our previous work on a domain model based on an ontology where concepts follow the predicate-argument semantics and domain-independent classical relations are defined on such concepts. We conduct a thorough study on a spoken dialog corpus collected within a customer care problem-solving domain, and we evaluate the coverage and impact of the ontology for the interpretation, grounding and re-ranking of spoken language understanding interpretations.


spoken language technology workshop | 2010

Hypotheses selection for re-ranking semantic annotations

Marco Dinarelli; Alessandro Moschitti; Giuseppe Riccardi

Discriminative reranking has been successfully used for several tasks in Natural Language Processing (NLP). Recently it has also been applied to Spoken Language Understanding, improving the state of the art for some applications. However, these models can be further improved by considering: (i) a better selection of the initial n-best hypotheses to be re-ranked, and (ii) a strategy that decides when the reranking model should be used, i.e. in some cases only the basic approach should be applied.
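Point (ii) can be sketched as a gating rule: invoke the (expensive) reranker only when the local model seems unsure, here approximated by the score margin between its top two hypotheses. The threshold, scores, and field names are invented for illustration, not taken from the paper:

```python
# Hypothetical sketch of selective reranking: keep the local model's
# top hypothesis when its margin over the runner-up is large, and fall
# back to the global reranker otherwise.

def understand(nbest, reranker, margin_threshold=0.2):
    """Return the chosen hypothesis, reranking only low-margin lists."""
    ranked = sorted(nbest, key=lambda h: h["local_score"], reverse=True)
    margin = ranked[0]["local_score"] - ranked[1]["local_score"]
    if margin >= margin_threshold:
        return ranked[0]          # local model is confident: keep it
    return reranker(ranked)       # otherwise let the global model decide

def reranker_stub(hyps):
    raise AssertionError("reranker should not be called here")

# With a confident local model the reranker is never invoked:
nbest = [{"local_score": 0.9, "concepts": ["a"]},
         {"local_score": 0.3, "concepts": ["b"]}]
print(understand(nbest, reranker_stub)["concepts"])  # ['a']
```

This keeps the kernel-based model off the easy cases, where reranking can only cost time or introduce errors.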

Collaboration


Dive into Marco Dinarelli's collaborations.

Top Co-Authors

Alessandro Moschitti | Qatar Computing Research Institute

Hermann Ney | RWTH Aachen University