Cristian Ursu | Researchain

Publication

Featured research published by Cristian Ursu.

Natural Language Engineering | 2002

Architectural elements of language engineering robustness

Diana Maynard; Valentin Tablan; Hamish Cunningham; Cristian Ursu; Horacio Saggion; Kalina Bontcheva; Yorick Wilks

We discuss robustness in LE systems from the perspective of engineering, and the predictability of both outputs and construction process that this entails. We present an architectural system that contributes to engineering robustness and low-overhead systems development (GATE, a General Architecture for Text Engineering). To verify our ideas we present results from the development of a multi-purpose cross-genre Named Entity recognition system. This system aims to be robust across diverse input types, and to reduce the need for costly and time-consuming adaptation of systems to new applications, with its capability to process texts from widely differing domains and genres.

Meeting of the Association for Computational Linguistics | 2004

Data-Driven Strategies for an Automated Dialogue System

Hilda Hardy; Tomek Strzalkowski; Min Wu; Cristian Ursu; Nick Webb; Alan W. Biermann; R. Bryce Inouye; Ashley McKenzie

We present a prototype natural-language problem-solving application for a financial services call center, developed as part of the Amities multilingual human-computer dialogue project. Our automated dialogue system, based on empirical evidence from real call-center conversations, features a data-driven approach that allows for mixed system/customer initiative and spontaneous conversation. Preliminary evaluation results indicate efficient dialogues and high user satisfaction, with performance comparable to or better than that of current conversational travel information systems.

Literary and Linguistic Computing | 2004

Corpus linguistics and South Asian languages: corpus creation and tool development

Paul Baker; Andrew Hardie; Tony McEnery; Richard Xiao; Kalina Bontcheva; Hamish Cunningham; Robert J. Gaizauskas; Oana Hamza; Diana Maynard; Valentin Tablan; Cristian Ursu; B. D. Jayaram; Mark Leisher

This paper describes the work carried out on the EMILLE Project (Enabling Minority Language Engineering), which was undertaken by the Universities of Lancaster and Sheffield. The primary resource developed by the project is the EMILLE Corpus, which consists of a series of monolingual corpora for fourteen South Asian languages, totalling more than 96 million words, and a parallel corpus of English and five of these languages. The EMILLE Corpus also includes an annotated component, namely, part-of-speech tagged Urdu data, together with twenty written Hindi corpus files annotated to show the nature of demonstrative use in Hindi. In addition, the project has had to address a number of issues related to establishing a language engineering (LE) environment for South Asian language processing, such as translating 8-bit language data into Unicode and producing a number of basic LE tools. The development of tools for EMILLE has contributed to the ongoing development of the LE architecture GATE, which has been extended to make use of Unicode. GATE thus plugs some of the gaps for language processing R&D necessary for the exploitation of the EMILLE corpora.

Applications of Natural Language to Data Bases | 2002

Access to Multimedia Information through Multisource and Multilanguage Information Extraction

Horacio Saggion; Hamish Cunningham; Kalina Bontcheva; Diana Maynard; Cristian Ursu; Oana Hamza; Yorick Wilks

We describe our work on information extraction from multiple sources for the Multimedia Indexing and Searching Environment, a project aiming at developing technology to produce formal annotations about essential events in multimedia programme material. The creation of a composite index from multiple and multi-lingual sources is a unique aspect of this project. The domain chosen for tuning the software components and testing is football. Our information extraction system is based on the use of finite state machinery pipelined with full semantic analysis and discourse interpretation.

Recent Advances in Natural Language Processing | 2001