Publication


Featured research published by Stelios Piperidis.


international conference on computational linguistics | 1994

A matching technique in Example-Based Machine Translation

Lambros Cranias; Harris Papageorgiou; Stelios Piperidis

This paper addresses an important problem in Example-Based Machine Translation (EBMT), namely how to measure similarity between a sentence fragment and a set of stored examples. A new method is proposed that measures similarity according to both surface structure and content. A second contribution is the use of clustering to make retrieval of the best matching example from the database more efficient. Results on a large number of test cases from the CELEX database are presented.


meeting of the association for computational linguistics | 1994

Automatic alignment in parallel corpora

Harris Papageorgiou; Lambros Cranias; Stelios Piperidis

This paper addresses the alignment issue in the framework of exploitation of large bi-/multilingual corpora for translation purposes. A generic alignment scheme is proposed that can meet the varying requirements of different applications. Depending on the level at which alignment is sought, appropriate surface linguistic information is invoked, coupled with information about possible unit delimiters. Each text unit (sentence, clause or phrase) is represented by the sum of its content tags. The results are then fed into a dynamic programming framework that computes the optimum alignment of units. The proposed scheme has been tested at sentence level on parallel corpora of the CELEX database. The success rate exceeded 99%. The next steps of the work concern testing the scheme's efficiency at lower levels, endowed with the necessary bilingual information about potential delimiters.


Natural Language Engineering | 1997

Example retrieval from a translation memory

Lambros Cranias; Harris Papageorgiou; Stelios Piperidis

Clustering of a translation memory is proposed to make the retrieval of similar translation examples from a translation memory more efficient, while a second contribution is a metric of text similarity which is based on both surface structure and content. Tests on the two proposed techniques are run on part of the CELEX database. The results reported indicate that the clustering of the translation memory results in a significant gain in the retrieval response time, while the deterioration in the retrieval accuracy can be considered to be negligible. The text similarity metric proposed is evaluated by a human expert and found to be compatible with the human perception of text similarity.


systems, man and cybernetics | 1994

Clustering: a technique for search space reduction in example-based machine translation

Lambros Cranias; Harris Papageorgiou; Stelios Piperidis

This paper addresses an important problem in example-based machine translation (EBMT), namely how to make retrieval of the example that best matches the input more efficient. The use of clustering is proposed, to enable the application of the same similarity metric to first limit the search space and then locate the best available match in a database. Evaluation results are presented on a large number of test cases.


international conference natural language processing | 2000

A System for Recognition of Named Entities in Greek

Sotiris Boutsis; Iason Demiros; Voula Giouli; Maria Liakata; Harris Papageorgiou; Stelios Piperidis

In this paper, we describe work in progress for the development of a Greek named entity recognizer. The system aims at information extraction applications where large-scale text processing is needed. Speed of analysis, system robustness, and accuracy of results have been the basic guidelines for the system's design. Pattern matching techniques have been implemented on top of an existing automated pipeline for Greek text processing, and the resulting system depends on non-recursive regular expressions in order to capture different types of named entities. For development and testing purposes, we collected a corpus of financial texts from several web sources and manually annotated part of it. Overall precision and recall are 86% and 81% respectively.


Archive | 2000

From sentences to words and clauses

Stelios Piperidis; Harris Papageorgiou; Sotiris Boutsis

This chapter addresses the issue of multilingual corpora alignment, presenting schemes which attempt alignment at sentence, clause, noun phrase and word level. Statistical inductive techniques are coupled with symbolic processing analysing specific language phenomena. Sentence alignment combines statistical techniques with the notion of semantic load of text units. Lexical equivalences are extracted based on morphosyntactic tagging and noun phrase recognition on each side of the parallel corpus. A statistical score then filters the most likely translation candidates of single and multi-word units. Similarly, clause alignment couples surface linguistic analysis with a probabilistic model based on word occurrence and cooccurrence probabilities, and word lengths. The best clause alignment is approximated by feeding all possible alignments into a dynamic programming framework. Word and clause alignment have been tested on English-Greek parallel corpora of different domains, yielding results exploitable in knowledge acquisition applications. Sentence alignment has been tested in several languages and integrated in a computer-aided translation platform maximizing translation reuse and consistency.


international conference on computational linguistics | 2000

Application of analogical modelling to example based machine translation

Christos Malavazos; Stelios Piperidis

This paper describes a self-modelling, incremental algorithm for learning translation rules from existing bilingual corpora. The notions of supracontext and subcontext are extended to encompass bilingual information through simultaneous analogy on both source and target sentences and juxtaposition of corresponding results. Analogical modelling is performed during the learning phase and translation patterns are projected in a multi-dimensional analogical network. The proposed framework was evaluated on a small training corpus, providing promising results. Suggestions to improve system performance are made. This kind of analysis unquestionably leads to systems that are more computationally expensive and more difficult to obtain. Our approach consists of a fully modular analogical framework, which can cope with a lack of resources and will perform even better when these are available.


MLR '04 Proceedings of the Workshop on Multilingual Linguistic Ressources | 2004

Building parallel corpora for eContent professionals

Maria Gavrilidou; Penny Labropoulou; Elina Desipri; Voula Giouli; Vassilis Antonopoulos; Stelios Piperidis

This paper reports on completed work carried out in the framework of the INTERA project, and specifically on the production of multilingual language resources (LRs) for eContent purposes. The paper presents the methodology adopted for the development of the corpus (acquisition and processing of the textual data), discusses how the initial assumptions diverged from the actual situation met during this procedure, and concludes with a summary of the problems encountered, which undermine the viability of multilingual parallel corpora construction.


Database | 2016

Text mining resources for the life sciences

Piotr Przybyła; Matthew Shardlow; Sophie Aubin; Robert Bossy; Richard Eckart de Castilho; Stelios Piperidis; John McNaught; Sophia Ananiadou

Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability.


language resources and evaluation | 2014

The language resource Strategic Agenda: the FLaReNet synthesis of community recommendations

Claudia Soria; Nicoletta Calzolari; Monica Monachini; Valeria Quochi; Núria Bel; Khalid Choukri; Joseph Mariani; J.E.J.M. Odijk; Stelios Piperidis

The main purpose of this paper is to serve as a landmark for future research and in particular for future strategic, infrastructural and coordination initiatives. It presents a preliminary plan for actions and infrastructures that could become the basis for future initiatives in the sector of Language Resources and Technologies (LRTs). The FLaReNet Language Resource Strategic Agenda presents a set of recommendations for the development and progress of LRT in Europe, as issued from a three-year consultation of the FLaReNet European project. Recommendations cover a broad range of topics and activities, spanning the production and use of language resources, licensing, maintenance and preservation issues, infrastructures for language resources, resource identification and sharing, evaluation and validation, interoperability and policy issues. The intended recipients belong to a large set of players and stakeholders in LRT, ranging from individuals to research and education institutions, policy-makers, funding agencies, SMEs and large companies, and service and media providers. The main goal of these recommendations is to serve as an instrument to support stakeholders in planning for and addressing the urgencies of the LRT of the future.

Collaboration


Top Co-Authors

Iason Demiros

National Technical University of Athens

Joseph Mariani

Centre national de la recherche scientifique

Maria Koutsombogera

National and Kapodistrian University of Athens

Bente Maegaard

University of Copenhagen

Sotiris Boutsis

National Technical University of Athens

Prokopis Prokopidis

National and Kapodistrian University of Athens