Publication


Featured research published by Luka Nerima.


Workshop on Statistical Machine Translation | 2009

Deep Linguistic Multilingual Translation and Bilingual Dictionaries

Eric Wehrli; Luka Nerima; Yves Scherrer

This paper describes the MulTra project, which aims at the development of an efficient multilingual translation technology based on an abstract and generic linguistic model as well as on object-oriented software design. In particular, we address the issue of the rapid growth of both the transfer modules and the bilingual databases. For the latter, we show that a significant part of bilingual lexical databases can be derived automatically through transitivity, with corpus validation.


Conference of the European Chapter of the Association for Computational Linguistics | 2003

Creating a multilingual collocation dictionary from large text corpora

Luka Nerima; Violeta Seretan; Eric Wehrli

This paper describes a system of terminological extraction capable of handling multi-word expressions, using a powerful syntactic parser. The system includes a concordancing tool enabling the user to display the context of the collocation, i.e. the sentence or the whole document where the collocation occurs. Since the corpora are multilingual, the system also offers an alignment mechanism for the corresponding translated documents.


International Workshop on the Web and Databases | 1998

Language and Tools to Specify Hypertext Views on Databases

Gilles Falquet; Jacques Guyot; Luka Nerima

We present a declarative language for the construction of hypertext views on databases. The language is based on an object-oriented data model and a simple hypertext model with reference and inclusion links. A hypertext view specification consists of a collection of parameterized node schemes which specify how to construct node and link instances from the database contents. We show how this language can express different issues in hypertext view design. These include: the direct mapping of objects to nodes; the construction of complex nodes based on sets of objects; the representation of polymorphic sets of objects; and the representation of tree and graph structures. We have defined sublanguages corresponding to particular database models (relational, semantic, object-oriented) and implemented tools to generate Web views for these database models.
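The core idea of a parameterized node scheme, a template that builds hypertext node instances and their links from database objects, can be sketched roughly as follows. This is an illustrative sketch in Python, not the paper's actual specification language; the data, function names, and link syntax are hypothetical.

```python
# Toy "database" contents: objects that node schemes will map to nodes.
books = [
    {"id": 1, "title": "Databases", "author_id": 10},
    {"id": 2, "title": "Hypertext", "author_id": 10},
]
authors = {10: {"id": 10, "name": "A. Author"}}

def author_node(author_id):
    """A parameterized node scheme: given an author id, construct a node
    instance whose content comes from the database and whose reference
    links point to the (hypothetical) book nodes of that author."""
    a = authors[author_id]
    return {
        "content": f"Author: {a['name']}",
        "links": [("reference", f"book/{b['id']}")
                  for b in books if b["author_id"] == author_id],
    }

node = author_node(10)
```

The scheme is declarative in spirit: it states what a node contains and which links it carries, leaving instantiation to the view generator.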


Archive | 2015

The Fips Multilingual Parser

Eric Wehrli; Luka Nerima

This paper reports on the Fips parser, a multilingual constituent parser that has been developed over the last two decades. After a brief historical overview of the numerous modifications and adaptations made to this system over the years, we provide a description of its main characteristics. The linguistic framework that underlies the Fips system has been much influenced by Generative Grammar, but drastically simplified in order to make it easier to implement in an efficient manner. The parsing procedure is a one pass (no preprocessing, no postprocessing) scan of the input text, using rules to build up constituent structures and (syntactic) interpretation procedures to determine the dependency relations between constituents (grammatical functions, etc.), including cases of long-distance dependencies. The final section offers a description of the rich lexical database developed for Fips. The lexical model assumes two distinct levels for lexical units: words, which are inflected forms of lexical units, and lexemes, which are more abstract units, roughly corresponding to a particular reading of a word. Collocations are defined as an association of two lexical units (lexeme or collocation) in a specific grammatical relation such as adjective-noun or verb-object.
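The two-level lexical model described above (inflected words pointing to abstract lexemes, with collocations built from pairs of lexical units) can be sketched as simple data structures. This is a minimal illustrative sketch, not the actual Fips database schema; all names and field choices are assumptions.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Lexeme:
    # Abstract unit, roughly one particular reading of a word.
    lemma: str
    category: str   # e.g. "N" (noun), "V" (verb)
    reading: str    # gloss distinguishing readings of the same lemma

@dataclass(frozen=True)
class Word:
    # Inflected surface form, linked to its lexeme.
    form: str       # e.g. "banks"
    lexeme: Lexeme
    features: str   # e.g. "plural"

@dataclass(frozen=True)
class Collocation:
    # Association of two lexical units (lexeme or collocation)
    # in a specific grammatical relation.
    head: Union[Lexeme, "Collocation"]
    dependent: Union[Lexeme, "Collocation"]
    relation: str   # e.g. "adjective-noun", "verb-object"

heavy = Lexeme("heavy", "A", "of great weight or intensity")
rain = Lexeme("rain", "N", "precipitation")
heavy_rain = Collocation(heavy, rain, "adjective-noun")
```

Because a `Collocation` may itself contain a collocation, the model naturally covers nested expressions.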


ACM Conference on Hypertext | 2004

Towards digital libraries of virtual hyperbooks

Gilles Falquet; Luka Nerima; Jean-Claude Ziswiler

This paper describes a technique for integrating several (possibly many) virtual hyperbooks in a digital library. We consider a virtual hyperbook model that comprises a domain ontology. By interconnecting the hyperbooks' ontologies, we can create a multi-point-of-view ontology that describes a set of hyperbooks. A hypertext interface specification language can use this ontology to construct new, semantically and narratively coherent hyperdocuments based on the content of several hyperbooks.


International Multiconference on Computer Science and Information Technology | 2009

On-line and off-line translation aids for non-native readers

Eric Wehrli; Luka Nerima; Violeta Seretan; Yves Scherrer

Twic and TwicPen are reading aid systems for readers of material in foreign languages. Although they include a sentence translation engine, both systems are primarily conceived to give word and expression translations to readers with a basic knowledge of the language they read. Twic has been designed for on-line material and consists of a plug-in for internet browsers communicating with our server. TwicPen offers similar assistance for readers of printed material. It consists of a hand-held scanner connected to a laptop (or desktop) computer running our parsing and translation software. Both systems provide readers with a limited number of translations selected on the basis of a linguistic analysis of the whole scanned text fragment (a phrase, part of the sentence, etc.). The use of a morphological and syntactic parser makes it possible (i) to disambiguate to a large extent the word selected by the user (and hence to drastically reduce the noise in the response), and (ii) to handle expressions (compounds, collocations, idioms), often a major source of difficulty for non-native readers. The systems are available for the following language pairs: English-French, French-English, German-French, German-English, Italian-French, Spanish-French. Several other pairs are under development.


International Conference on Computational Linguistics | 2014

When Rules Meet Bigrams

Eric Wehrli; Luka Nerima

This paper discusses an on-going project aiming at improving the quality and the efficiency of a rule-based parser through the addition of a statistical component. The proposed technique relies on bigrams of word+category pairs, selected from the homographs contained in our lexical database and computed over a large, previously tagged section of the Hansard corpus. The bigram table is used by the parser to rank and prune the set of alternatives. To evaluate the gains obtained by the hybrid system, we conducted two manual evaluations: one over a small subset of the Hansard corpus, the other over a corpus of about 50 articles taken from the magazine The Economist. In both cases, we compare the analyses obtained by the parser with and without the statistical component, focusing on one important source of mistakes: the confusion between nominal and verbal readings of ambiguous words such as announce, sets, costs, labour, etc.
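The general idea of using word+category bigram counts to rank the readings of ambiguous words can be sketched as follows. This is an illustrative toy, not the authors' implementation: the counts, lexicon, and scoring function are invented for the example.

```python
from itertools import product

# Hypothetical bigram counts over (word, category) pairs, as might be
# derived from a tagged corpus such as the Hansard.
bigram_counts = {
    (("the", "DET"), ("costs", "N")): 120,
    (("the", "DET"), ("costs", "V")): 1,
    (("costs", "N"), ("rose", "V")): 80,
    (("costs", "V"), ("rose", "N")): 2,
}

def score(tagging):
    # Sum the bigram counts along consecutive (word, category) pairs.
    return sum(bigram_counts.get((a, b), 0) for a, b in zip(tagging, tagging[1:]))

def rank(sentence, lexicon):
    # Enumerate every category assignment licensed by the lexicon
    # and rank the alternatives by bigram evidence.
    alternatives = product(*[[(w, c) for c in lexicon[w]] for w in sentence])
    return sorted(alternatives, key=score, reverse=True)

lexicon = {"the": ["DET"], "costs": ["N", "V"], "rose": ["V", "N"]}
best = rank(["the", "costs", "rose"], lexicon)[0]
# The nominal reading of "costs" wins thanks to the bigram evidence.
```

A real parser would use such scores to prune low-ranked alternatives early rather than enumerate all of them, but the ranking principle is the same.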


Extended Semantic Web Conference | 2013

NERITS - A Machine Translation Mashup System Using Wikimeta and DBpedia

Kamel Nebhi; Luka Nerima; Eric Wehrli

Recently, Machine Translation (MT) has become a popular everyday technology through Web services such as Google Translate. Although the different MT approaches provide good results, none of them exploits contextual information such as Named Entities (NEs) to help user comprehension.


Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) | 2017

Parsing and MWE Detection: Fips at the PARSEME Shared Task

Luka Nerima; Vasiliki Foufi; Eric Wehrli

Identifying multiword expressions (MWEs) in a sentence, in order to ensure their proper processing in subsequent applications like machine translation, and performing the syntactic analysis of the sentence are interrelated processes. In our approach, priority is given to parsing alternatives involving collocations, and hence collocational information helps the parser through the maze of alternatives, with the aim of leading to substantial improvements in the performance of both tasks (collocation identification and parsing), as well as of a subsequent task (machine translation). In this paper, we present our system and the procedure that we followed in order to participate in the open track of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs) in running texts.


EUROPHRAS 2017 - Computational and Corpus-based Phraseology: Recent Advances and Interdisciplinary Approaches | 2017

Automatic Annotation of Verbal Collocations in Modern Greek

Vasiliki Foufi; Luka Nerima; Eric Wehrli

Identifying multiword expressions (MWEs) in a sentence and performing the syntactic analysis of the sentence are interrelated processes. In our approach, priority is given to parsing alternatives involving collocations, and hence collocational information helps the parser through the maze of alternatives, with the aim of leading to substantial improvements in the performance of both tasks (collocation identification and parsing), as well as of a subsequent task (automatic annotation). In this paper, we present our system and the procedure that we followed to perform the automatic annotation of Greek verbal multiword expressions (VMWEs) in running texts.

Collaboration


An overview of Luka Nerima's collaborations.

Top Co-Authors


Seongbin Park

Information Sciences Institute


Christine Vanoirbeek

École Polytechnique Fédérale de Lausanne
