Publications


Featured research published by Jean-Pierre Chanod.


Natural Language Engineering | 2002

Robustness beyond shallowness: incremental deep parsing

Salah Ait-Mokhtar; Jean-Pierre Chanod; Claude Roux

Robustness is a key issue for natural language processing in general and parsing in particular, and many approaches have been explored in the last decade for the design of robust parsing systems. Among those approaches is shallow or partial parsing, which produces minimal and incomplete syntactic structures, often in an incremental way. We argue that with a systematic incremental methodology one can go beyond shallow parsing to deeper language analysis, while preserving robustness. We describe a generic system based on such a methodology and designed for building robust analyzers that tackle deeper linguistic phenomena than those traditionally handled by the now widespread shallow parsers. The rule formalism allows the recognition of n-ary linguistic relations between words or constituents on the basis of global or local structural, topological and/or lexical conditions. It offers the advantage of accepting various types of inputs, ranging from raw to chunked or constituent-marked texts, so for instance it can be used to process existing annotated corpora, or to perform a deeper analysis on the output of an existing shallow parser. It has been successfully used to build a deep functional dependency parser, as well as for the task of co-reference resolution, in a modular way.
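The rule formalism described above recognizes relations between words or constituents in chunked input. A minimal sketch of that idea, assuming a hypothetical chunk notation and a single invented rule (the actual system's formalism is far richer):

```python
import re

# Toy dependency rule in the spirit of the paper's formalism: extract a
# binary SUBJ relation from chunk-marked input. The chunk notation and
# the rule are hypothetical simplifications, not the system's own syntax.
def extract_subj(chunked):
    """If an NP chunk immediately precedes a VG chunk, link head noun to verb."""
    m = re.search(r"\[NP [^\]]*?(\w+)/N\] \[VG (\w+)/V\]", chunked)
    return ("SUBJ", m.group(2), m.group(1)) if m else None

print(extract_subj("[NP the cat/N] [VG sleeps/V]"))  # ('SUBJ', 'sleeps', 'cat')
```

Because the rule matches over already-chunked input, it can run on the output of any shallow parser that produces this bracketing, which is the modularity the abstract emphasizes.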


Natural Language Engineering | 1996

Regular expressions for language engineering

Lauri Karttunen; Jean-Pierre Chanod; Gregory Grefenstette; Anne Schiller

Many of the processing steps in natural language engineering can be performed using finite state transducers. An optimal way to create such transducers is to compile them from regular expressions. This paper is an introduction to the regular expression calculus, extended with certain operators that have proved very useful in natural language applications ranging from tokenization to light parsing. The examples in the paper illustrate in concrete detail some of these applications.
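One of the applications the paper mentions, tokenization, can be roughly illustrated with an ordinary regular expression. This is only a stand-in sketch using Python's re module rather than the finite-state calculus the paper introduces, and the pattern below is an invented example:

```python
import re

# Toy regex-based tokenizer: numbers (with grouping separators),
# hyphenated words, and single punctuation marks, matched left to right.
TOKEN = re.compile(r"\d+(?:[.,]\d+)*|\w+(?:-\w+)*|[^\w\s]")

def tokenize(text):
    """Return the list of tokens matched left to right."""
    return TOKEN.findall(text)

print(tokenize("Jean-Pierre's parser costs 1,000.50 euros!"))
```

Compiling such expressions into finite-state transducers, as the paper describes, lets the same machinery also rewrite or mark up the input rather than merely match it.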


Conference on Applied Natural Language Processing | 1997

Incremental finite-state parsing

Salah Ait-Mokhtar; Jean-Pierre Chanod

This paper describes a new finite-state shallow parser. It merges constructive and reductionist approaches within a highly modular architecture. Syntactic information is added at the sentence level in an incremental way, depending on the contextual information available at a given stage. This approach overcomes the inefficiency of previous fully reductionist constraint-based systems, while maintaining broad coverage and linguistic granularity. The implementation relies on a sequence of networks built with the replace operator. Given the high level of modularity, the core grammar is easily augmented with corpus-specific sub-grammars. The current system is implemented for French and is being expanded to new languages.
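The sequence-of-networks idea can be sketched as a cascade of rewrite stages, each adding bracketing over the output of the previous one. The tag set and the two rules below are invented illustrations using plain regex substitution, not the paper's replace-operator networks or its French grammar:

```python
import re

# Toy cascade in the spirit of incremental finite-state parsing: each
# stage marks constituents over a POS-tagged string, and later stages
# see what earlier stages produced.
STAGES = [
    # 1. mark simple noun phrases: determiner + noun, or a bare noun
    (re.compile(r"\b(DET N|N)\b"), r"[NP \1]"),
    # 2. mark verb groups
    (re.compile(r"\bV\b"), r"[VG V]"),
]

def parse(tagged):
    """Apply each replace stage in order to a POS-tagged string."""
    for pattern, repl in STAGES:
        tagged = pattern.sub(repl, tagged)
    return tagged

print(parse("DET N V DET N"))  # [NP DET N] [VG V] [NP DET N]
```

Because each stage is independent, a corpus-specific sub-grammar can be added simply by appending another (pattern, replacement) pair, which mirrors the modularity claim in the abstract.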


Conference of the European Chapter of the Association for Computational Linguistics | 1995

Tagging French: comparing a statistical and a constraint-based method

Jean-Pierre Chanod; Pasi Tapanainen

In this paper we compare two competing approaches to part-of-speech tagging, statistical and constraint-based disambiguation, using French as our test language. We imposed a time limit on our experiment: the amount of time spent on the design of our constraint system was about the same as the time we used to train and test the easy-to-implement statistical model. We describe the two systems and compare the results. The accuracy of the statistical method is reasonably good, comparable to taggers for English. But the constraint-based tagger seems to be superior even with the limited time we allowed ourselves for rule development.
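The contrast between the two approaches can be sketched on the classic French ambiguity of "la" (determiner vs. clitic pronoun, as in "je la porte", "I carry it"). The lexicon, frequencies, and the single constraint below are invented for illustration; the paper's actual taggers are far richer:

```python
# Toy contrast: a unigram statistical baseline vs. a constraint-based
# tagger that eliminates readings by context.
FREQ = {
    "je":    {"PRON": 1.0},
    "la":    {"DET": 0.9, "PRON": 0.1},
    "porte": {"N": 0.6, "V": 0.4},
}

def statistical_tag(words):
    """Unigram baseline: pick each word's most frequent tag in isolation."""
    return [max(FREQ[w], key=FREQ[w].get) for w in words]

def constraint_tag(words):
    """Start from all lexicon readings, then remove readings ruled out by context."""
    tags = [set(FREQ[w]) for w in words]
    for i in range(1, len(tags)):
        if tags[i - 1] == {"PRON"}:
            # invented toy constraint: an unambiguous pronoun is not
            # followed by a determiner or a bare noun in this fragment
            tags[i] -= {"DET", "N"}
    return [t.pop() if len(t) == 1 else sorted(t) for t in tags]

sentence = ["je", "la", "porte"]
print(statistical_tag(sentence))  # ['PRON', 'DET', 'N']  (misreads the clitic)
print(constraint_tag(sentence))   # ['PRON', 'PRON', 'V']
```

The toy example shows why a handful of well-chosen contextual constraints can beat context-free frequencies, which is the direction of the paper's finding.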


European Conference on Research and Advanced Technology for Digital Libraries | 2005

From legacy documents to XML: a conversion framework

Jean-Pierre Chanod; Boris Chidlovskii; Hervé Déjean; Olivier Fambon; Jérôme Fuselier; Thierry Jacquin; Jean-Luc Meunier

We present an integrated framework for converting documents from legacy formats to XML. We describe the LegDoC project, aimed at automating the conversion of layout-oriented formats such as PDF, PS and HTML into semantic-oriented annotations. A toolkit of components covers complementary techniques for logical document analysis and semantic annotation based on machine learning methods. We use a real conversion project as a driving example to illustrate the different techniques implemented in the project.
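The core layout-to-semantic mapping idea can be sketched as a tag-renaming pass over a parsed document. The mapping table and input below are invented; LegDoC itself combines document analysis with learned annotators rather than a fixed table:

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from layout-oriented tags to semantic ones.
MAPPING = {"b": "emphasis", "h1": "title", "p": "paragraph"}

def to_semantic(fragment):
    """Rename layout tags to semantic tags, leaving unknown tags untouched."""
    root = ET.fromstring(fragment)
    for el in root.iter():
        el.tag = MAPPING.get(el.tag, el.tag)
    return ET.tostring(root, encoding="unicode")

print(to_semantic("<doc><h1>Report</h1><p>See <b>section 2</b>.</p></doc>"))
```

In practice the hard part is deciding the mapping for each element from its layout features, which is where the machine-learning components of the framework come in.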


Archive | 2001

Robust Parsing and Beyond

Jean-Pierre Chanod

Robust analysis covers the processing not only of partially ungrammatical input (Carbonell & Hayes 1983, Weng 1993) but, more generally, of unrestricted text as actually produced by end-users in a variety of situations.


Lecture Notes in Computer Science | 1999

Natural Language Processing and Digital Libraries

Jean-Pierre Chanod

As one envisions a document model where language, physical location and medium - electronic, paper or other - impose no barrier to effective use, natural language processing will play an increasing role, especially in the context of digital libraries. This paper presents language components based mostly on finite-state technology that improve our capabilities for exploring, enriching and interacting in various ways with documents. This ranges from morphology to part-of-speech tagging, NP extraction and shallow parsing. We then focus on a series of on-going projects which illustrate how this technology is already impacting the building and sharing of knowledge through digital libraries.


European Conference on Research and Advanced Technology for Digital Libraries | 2010

Xeproc©: a model-based approach towards document process preservation

Thierry Jacquin; Hervé Déjean; Jean-Pierre Chanod

Developed in the context of the EU Integrated Project SHAMAN, Xeproc© technology lets one define and design document processes while producing an abstract representation that is independent of the implementation. These representations capture the intent behind the workflow and can be preserved for reuse in future unknown infrastructures. Xeproc© is available under Eclipse Public Licence.


Business Information Systems | 2015

Preserving Consistency in Domain-Specific Business Processes Through Semantic Representation of Artefacts

Nikolaos Lagos; Adrian Mos; Jean-Yves Vion-Dury; Jean-Pierre Chanod

Large organizations today face a growing challenge of managing heterogeneous process collections containing business processes. Explicit semantics inherent to domain-specific models can help alleviate some of the management challenges. Starting with concept definitions, designers can create domain-specific processes and eventually generate industry-standard BPMN for use in BPMS solutions. However, any of these artefacts (concepts, domain processes and BPMN) can be modified by various stakeholders, and changes made by one person may influence models used by others. There is therefore a need for tool support to keep track of changes and their impacts on different stakeholders. In this paper we present an approach towards providing such support, based on a semantic layer that records the provenance of the information and accordingly propagates the impact of changes to related resources, and we demonstrate its applicability with an illustrative example.
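The change-propagation idea can be sketched as a transitive walk over recorded derivation links. The artefact names and the derivation table below are invented; the actual system records provenance in a semantic layer rather than a Python dictionary:

```python
# Toy impact propagation: when an artefact changes, flag everything
# transitively derived from it.
DERIVED_FROM = {
    "domain_process_orders": ["concept_order"],     # domain process from concept
    "bpmn_orders": ["domain_process_orders"],       # BPMN generated from process
}

def impacted(changed):
    """Return all artefacts transitively derived from `changed`."""
    hit = set()
    frontier = [changed]
    while frontier:
        node = frontier.pop()
        for artefact, sources in DERIVED_FROM.items():
            if node in sources and artefact not in hit:
                hit.add(artefact)
                frontier.append(artefact)
    return hit

print(sorted(impacted("concept_order")))  # ['bpmn_orders', 'domain_process_orders']
```

A change to the concept thus reaches both the domain process and the generated BPMN, while a change to the BPMN alone impacts nothing upstream, matching the directionality of provenance.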


IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2011

XML Processing in the Cloud: Large-Scale Digital Preservation in Small Institutions

Peter Wittek; Thierry Jacquin; Hervé Déjean; Jean-Pierre Chanod; Sándor Darányi

Digital preservation deals with the problem of retaining the meaning of digital information over time to ensure its accessibility. The process often involves a workflow that transforms the digital objects. The workflow defines document pipelines containing transformations and validation checkpoints, either to facilitate migration for persistent archival or to extract metadata. The transformations, however, are computationally expensive, so digital preservation can be out of reach for an organization whose core operation is not data conservation. Moreover, the operations described in the document workflow do not recur frequently. This paper combines an implementation-independent workflow designer with cloud computing to support small institutions in the ad hoc peak computing needs that stem from their digital preservation efforts.
