
Lidia Moreno

Polytechnic University of Valencia


Publication

Featured research published by Lidia Moreno.

Computational Linguistics | 2001

An algorithm for anaphora resolution in Spanish texts

Manuel Palomar; Lidia Moreno; Jesús Peral; Rafael Muñoz; Antonio Ferrández; Patricio Martínez-Barco; Maximiliano Saiz-Noeda

This paper presents an algorithm for identifying noun phrase antecedents of third person personal pronouns, demonstrative pronouns, reflexive pronouns, and omitted pronouns (zero pronouns) in unrestricted Spanish texts. We define a list of constraints and preferences for different types of pronominal expressions, and we document in detail the importance of each kind of knowledge (lexical, morphological, syntactic, and statistical) in anaphora resolution for Spanish. The paper also provides a definition for syntactic conditions on Spanish NP-pronoun noncoreference using partial parsing. The algorithm has been evaluated on a corpus of 1,677 pronouns and achieved a success rate of 76.8%. We have also implemented four competitive algorithms and tested their performance in a blind evaluation on the same test corpus. This new approach could easily be extended to other languages such as English, Portuguese, Italian, or Japanese.
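The general "constraints and preferences" idea mentioned in this abstract can be illustrated with a toy sketch. It is not the algorithm from the paper: the `Mention` fields, the agreement constraint, and the scoring weights below are all hypothetical and chosen only to show how hard constraints filter candidates before soft preferences rank them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A pronoun or candidate antecedent (all fields are illustrative assumptions)."""
    text: str
    gender: str             # "masc" or "fem"
    number: str             # "sg" or "pl"
    sentence_distance: int  # sentences between the candidate and the pronoun
    is_subject: bool


def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """Hard constraint: discard candidates that disagree in gender or number."""
    return pronoun.gender == candidate.gender and pronoun.number == candidate.number


def preference_score(candidate: Mention) -> int:
    """Soft preferences with made-up weights: closer candidates and subjects rank higher."""
    score = 0
    if candidate.sentence_distance == 0:
        score += 2
    elif candidate.sentence_distance == 1:
        score += 1
    if candidate.is_subject:
        score += 1
    return score


def resolve(pronoun: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Apply the constraint first, then return the best-scoring survivor (or None)."""
    compatible = [c for c in candidates if agrees(pronoun, c)]
    if not compatible:
        return None
    return max(compatible, key=preference_score)
```

For "Juan vio a María. Ella sonrió.", the pronoun "ella" is feminine singular, so the agreement constraint rules out "Juan" and only "María" remains for the preferences to rank.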

Machine Translation | 1999

An Empirical Approach to Spanish Anaphora Resolution

Antonio Ferrández; Manuel Palomar; Lidia Moreno

This paper documents the development of an empirically-based system implemented in Prolog that automatically resolves several kinds of anaphora in Spanish texts. These are pronominal references, surface-count anaphora, one-anaphora and elliptical zero-subject constructions (i.e., sentences that omit their pronominal subject). The resolution is based on representations resulting from either partial or full parsing. The system developed can also work on the output of a POS tagger or with different dictionaries, without changing the grammar. This grammar represents the syntactic information of each language by means of the Slot Unification Grammar formalism. The different kinds of information used for anaphora resolution in full and partial parsing are shown, as well as evaluation results. The system has been adapted to English texts, obtaining encouraging results that prove that it can be applied with only a very few refinements to other languages as well as Spanish and English. In addition, the differences between English and Spanish anaphora are noted.

Meeting of the Association for Computational Linguistics | 1998

Anaphor Resolution In Unrestricted Texts With Partial Parsing

Antonio Ferrández; Manuel Palomar; Lidia Moreno

In this paper we deal with several kinds of anaphora in unrestricted texts. These kinds of anaphora are pronominal references, surface-count anaphora and one-anaphora. In order to solve these anaphors we work on the output of a part-of-speech tagger, to which we automatically apply partial parsing based on the Slot Unification Grammar formalism, which has been implemented in Prolog. We only use the following kinds of information: lexical (the lemma of each word), morphological (person, number, gender) and syntactic. Finally we show the experimental results, and the restrictions and preferences that we have used for anaphor resolution with partial parsing.

International Conference Natural Language Processing | 2011

Towards the detection of cross-language source code reuse

Enrique Flores; Alberto Barrón-Cedeño; Paolo Rosso; Lidia Moreno

The Internet has made huge amounts of information available, including source code. Source code repositories and, in general, programming-related websites facilitate its reuse. In this work, we propose a simple approach to the detection of cross-language source code reuse, a nearly uninvestigated problem. Our preliminary experiments, based on character n-gram comparison, show that considering different sections of the code (e.g., comments, code, reserved words) leads to different results. When considering three programming languages (C++, Java, and Python), the best result is obtained when comments are discarded and the entire source code is considered.
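A character n-gram comparison of the kind this abstract mentions can be sketched as follows. This is a minimal illustration, not the authors' system: the trigram size, the whitespace and case normalization, the cosine measure, and the function names `char_ngrams`, `cosine_similarity`, and `reuse_score` are all assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reuse_score(source_a: str, source_b: str, n: int = 3) -> float:
    """Similarity in [0, 1] between two source files, regardless of their language."""
    return cosine_similarity(char_ngrams(source_a, n), char_ngrams(source_b, n))
```

Because the comparison works on raw characters rather than on a language-specific parse, a Java loop and its Python translation still share n-grams from identifiers and operators such as `total += values[i]`, which is what makes a cross-language comparison possible at all.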

International Conference on Computational Linguistics | 2005

Integrating natural language techniques in OO-Method

Isabel Díaz; Lidia Moreno; Inmaculada Fuentes; Oscar Pastor

An approach that involves natural language analysis techniques for the treatment of software system functional requirements is described in this paper. This approach is used as the basis for a process developed to generate sequence diagrams automatically from the textual specification of use cases. This facility has been integrated in the Requirements Engineering Phase of OO-Method, an automatic software production environment. For this purpose, a translator that is based on a natural language parser is used. The translator provides grammatical information for each use case sentence and identifies the corresponding interaction. The automatic transformation is conceived and specified following an orientation that is based on models and patterns. The results of the validation of the transformation patterns are presented.

Computer Applications in Engineering Education | 2015

Uncovering source code reuse in large-scale academic environments

Enrique Flores; Alberto Barrón-Cedeño; Lidia Moreno; Paolo Rosso

The advent of the Internet has caused an increase in content reuse, including source code. The purpose of this research is to uncover potential cases of source code reuse in large‐scale environments. A good example is academia, where massive courses are taught to students who must demonstrate that they have acquired the knowledge. The need to detect content reuse in quasi real‐time encourages the development of automatic systems such as the one described in this paper for source code reuse detection. Our approach is based on the comparison of programs at character level. It is able to find potential cases of reuse across a huge number of assignments. It achieved better results than JPlag, the most widely used online system to find similarities among multiple sets of source codes. The most common obfuscation operations we found were changes in identifier names, comments and indentation.

Forum for Information Retrieval Evaluation | 2014

On the Detection of SOurce COde Re-use

Enrique Flores; Paolo Rosso; Lidia Moreno; Esaú Villatoro-Tello

This paper summarizes the goals, organization and results of the first SOCO competitive evaluation campaign for systems that automatically detect the source code re-use phenomenon. The detection of source code re-use is an important research field for both the software industry and academia. Accordingly, the PAN@FIRE track named SOurce COde Re-use (SOCO) focused on the detection of re-used source codes in the C/C++ and Java programming languages. Participant systems were asked to annotate whether or not several source codes represent cases of source code re-use. In total, five teams submitted 17 runs. The training set consisted of annotations made by several experts, a feature which turns the SOCO 2014 collection into a useful data set for future evaluations and, at the same time, establishes a standard evaluation framework for future research works on the posed shared task.

International Conference Natural Language Processing | 2006

Automatic feature extraction for question classification based on dissimilarity of probability distributions

David Tomás; José L. Vicedo; Empar Bisbal; Lidia Moreno

Question classification is one of the first tasks carried out in a Question Answering system. In this paper we present a multilingual question classification system based on machine learning techniques. We use Support Vector Machines to classify the questions. All the features needed to train and test this method are automatically extracted through statistical information in an unsupervised way, comparing Poisson distributions of single words in two plain corpora of questions and documents. Thus, we need nothing but plain text to train the system, which yields a flexible approach that is easy to adapt to new languages and domains. We have tested it on a bilingual corpus of questions in English and Spanish.

Mexican International Conference on Artificial Intelligence | 2005

A multilingual SVM-based question classification system

Empar Bisbal; David Tomás; Lidia Moreno; José L. Vicedo; Armando Suárez

Question Classification (QC) is usually the first stage in a Question Answering system. This paper presents a multilingual SVM-based question classification system aiming to be language and domain independent. For this purpose, we use only surface text features. The system has been tested on the TREC QA track question set, obtaining encouraging results.

International Conference Natural Language Processing | 2005

Interaction transformation patterns based on semantic roles

Isabel Díaz; Lidia Moreno; Oscar Pastor; Alfredo Matteo

This paper presents a strategy to deduce interactions from the text of use cases. This strategy is used by Metamorphosis: an automatic software production framework, conceived to facilitate the modelling of interactions of a system. Metamorphosis follows a linguistic engineering approach that is centred on the construction of models through the successive transformation of these models, in the definition of semantic roles and the application of design patterns. To obtain the Interaction Model of a system, three transformation levels are defined: the system, the use case, and the sentence. This paper focuses on how a transformation of a sentence is performed. Each transformation pattern specifies how to obtain information from the semantic context of a sentence, to deduce its corresponding interaction fragment. Some of the results obtained from the validation of these patterns are also presented.
