Alina Maria Ciobanu | Researchain

Publication

Featured research published by Alina Maria Ciobanu.

Conference of the European Chapter of the Association for Computational Linguistics | 2014

Temporal Text Ranking and Automatic Dating of Texts

Vlad Niculae; Marcos Zampieri; Liviu P. Dinu; Alina Maria Ciobanu

This paper presents a novel approach to the task of temporal text classification combining text ranking and probability for the automatic dating of historical texts. The method was applied to three historical corpora: an English, a Portuguese and a Romanian corpus. It obtained performance ranging from 83% to 93% accuracy, using a fully automated approach with very basic features.

Meeting of the Association for Computational Linguistics | 2014

Automatic Detection of Cognates Using Orthographic Alignment

Alina Maria Ciobanu; Liviu P. Dinu

Words undergo various changes when entering new languages. Based on the assumption that these linguistic changes follow certain rules, we propose a method for automatically detecting pairs of cognates employing an orthographic alignment method which proved relevant for sequence alignment in computational biology. We use aligned subsequences as features for machine learning algorithms in order to infer rules for linguistic changes undergone by words when entering new languages and to discriminate between cognates and non-cognates. Given a list of known cognates, our approach does not require any other linguistic information. However, it can be customized to integrate historical information regarding language evolution.
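As a rough illustration of the alignment step described in this abstract, the sketch below globally aligns two word forms with a Needleman-Wunsch-style dynamic program and collects the mismatched character pairs as candidate features. The scoring values, the `-` gap symbol, and the function names `align` and `mismatch_features` are assumptions made for the example, not details taken from the paper.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two strings; return both strings padded with '-' gaps.

    Scoring values are illustrative assumptions. Words that already contain
    '-' would need a different gap symbol.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def mismatch_features(aligned_a, aligned_b):
    """List the aligned positions where the two words differ, as 'x>y'."""
    return [f"{x}>{y}" for x, y in zip(aligned_a, aligned_b) if x != y]


aligned = align("abc", "ac")
print(aligned)                      # ('abc', 'a-c')
print(mismatch_features(*aligned))  # ['b>-']
```

Features of this kind could then be passed to any standard classifier to separate cognate from non-cognate pairs; which classifier and which feature windows work best is left open here.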

Empirical Methods in Natural Language Processing | 2014

An Etymological Approach to Cross-Language Orthographic Similarity. Application on Romanian

Alina Maria Ciobanu; Liviu P. Dinu

In this paper we propose a computational method for determining the orthographic similarity between Romanian and related languages. We account for etymons and cognates and we investigate not only the number of related words, but also their forms, quantifying orthographic similarities. The method we propose is adaptable to any language, provided that resources are available.

North American Chapter of the Association for Computational Linguistics | 2015

AMBRA: A Ranking Approach to Temporal Text Classification

Marcos Zampieri; Alina Maria Ciobanu; Vlad Niculae; Liviu P. Dinu

This paper describes the AMBRA system, entered in the SemEval-2015 Task 7: ‘Diachronic Text Evaluation’ subtasks one and two, which consist of predicting the date when a text was originally written. The task is valuable for applications in digital humanities, information systems, and historical linguistics. The novelty of this shared task consists of incorporating label uncertainty by assigning an interval within which the document was written, rather than assigning a clear time marker to each training document. To deal with non-linear effects and variable degrees of uncertainty, we reduce the problem to pairwise comparisons of the form “Is Document A older than Document B?”, and propose a nonparametric way to transform the ordinal output into time intervals.
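The reduction to pairwise comparisons can be sketched roughly as follows. This is a minimal illustration, not the AMBRA implementation: the function name `pairwise_examples`, the use of feature-difference vectors, and the choice to skip pairs whose date intervals overlap are all assumptions made for the example.

```python
from itertools import combinations


def pairwise_examples(docs):
    """Turn interval-labelled documents into pairwise ranking examples.

    docs: list of (features, (start_year, end_year)) tuples.
    Returns a list of (difference_vector, label) pairs, where label is +1 if
    the first document is older and -1 if it is newer. Pairs with overlapping
    intervals are skipped because their order is ambiguous (an assumption of
    this sketch).
    """
    examples = []
    for (xa, (sa, ea)), (xb, (sb, eb)) in combinations(docs, 2):
        if ea < sb:        # A ends before B starts: A is older
            label = 1
        elif eb < sa:      # B ends before A starts: A is newer
            label = -1
        else:              # intervals overlap: no reliable ordering
            continue
        diff = [p - q for p, q in zip(xa, xb)]
        examples.append((diff, label))
    return examples


docs = [
    ([1.0, 0.0], (1700, 1750)),
    ([0.0, 1.0], (1800, 1850)),
    ([0.5, 0.5], (1740, 1810)),  # overlaps both of the others
]
print(pairwise_examples(docs))  # [([1.0, -1.0], 1)]
```

A binary classifier trained on such difference vectors yields an ordering over documents; mapping that ordering back to time intervals is a separate step that this sketch does not cover.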

International Joint Conference on Natural Language Processing | 2015

Automatic Discrimination between Cognates and Borrowings

Alina Maria Ciobanu; Liviu P. Dinu

Identifying the type of relationship between words provides a deeper insight into the history of a language and allows a better characterization of language relatedness. In this paper, we propose a computational approach for discriminating between cognates and borrowings. We show that orthographic features have discriminative power and we analyze the underlying linguistic factors that prove relevant in the classification task. To our knowledge, this is the first attempt of this kind.

Conference of the European Chapter of the Association for Computational Linguistics | 2014

Predicting Romanian Stress Assignment

Alina Maria Ciobanu; Anca Dinu; Liviu P. Dinu

We train and evaluate two models for Romanian stress prediction: a baseline model which employs the consonant-vowel structure of the words and a cascaded model with averaged perceptron training consisting of two sequential models: one for predicting syllable boundaries and another for predicting stress placement. We show in this paper that Romanian stress is predictable, though not deterministic, by using data-driven machine learning techniques.
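To make the notion of consonant-vowel structure concrete, here is a small sketch that maps a word to its C/V pattern. The vowel inventory `ROMANIAN_VOWELS` and the function name `cv_pattern` are assumptions for illustration; the sketch treats every `i` and `u` as a vowel and so ignores Romanian semivowels, which a real system would need to handle.

```python
# Assumed vowel letters for this sketch; semivowel uses of i/u/e/o are not modelled.
ROMANIAN_VOWELS = set("aăâeiîou")


def cv_pattern(word):
    """Map each letter to 'V' (vowel) or 'C' (consonant); keep other characters."""
    pattern = []
    for ch in word.lower():
        if ch in ROMANIAN_VOWELS:
            pattern.append("V")
        elif ch.isalpha():
            pattern.append("C")
        else:
            pattern.append(ch)
    return "".join(pattern)


print(cv_pattern("masă"))   # CVCV
print(cv_pattern("copil"))  # CVCVC
```

A baseline of the kind the abstract mentions could, for example, predict the most frequent stress position observed for each such pattern in the training data; that particular decision rule is a guess, not something the abstract states.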

Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) | 2014

A Quantitative Insight into the Impact of Translation on Readability

Alina Maria Ciobanu; Liviu P. Dinu

In this paper we investigate the impact of translation on readability. We propose a quantitative analysis of several shallow, lexical and morpho-syntactic features that have been traditionally used for assessing readability and have proven relevant for this task. We conduct our experiments on a parallel corpus of transcribed parliamentary sessions and we investigate readability metrics for the original segments of text, written in the language of the speaker, and their translations.

Procedia Computer Science | 2016

Sequence Labeling for Cognate Production

Alina Maria Ciobanu

We propose a sequence labeling approach to cognate production based on the orthography of the words. Our approach leverages the idea that orthographic changes represent sound correspondences to a fairly large extent. Given an input word in language L1, we seek to determine its cognate pair in language L2. To this end, we employ a sequential model which captures the intuition that orthographic changes are highly dependent on the context in which they occur. We apply our method to two pairs of languages. Finally, we investigate how second language learners perceive the orthographic changes from their mother tongue to the language they learn.

Archive | 2018

Romanian Word Production: An Orthographic Approach Based on Sequence Labeling

Liviu P. Dinu; Alina Maria Ciobanu

Languages borrow words from one another for various reasons. How the borrowing process takes place and how new words enter a recipient language are key questions of historical linguistics. In this paper, we propose a multilingual method for word form production based on the orthography of the words. For borrowed words, we investigate the derivation from a donor language into a recipient language. We also address the problem of deriving genetic cognates. We experiment with Romanian as a recipient language and we investigate borrowings from multiple donor languages. The advantages of the proposed method are that it does not use any external knowledge, except for the training word pairs, and it does not require the phonetic transcriptions of the input words.

International Conference on Computational Linguistics | 2017

Towards a Map of the Syntactic Similarity of Languages

Alina Maria Ciobanu; Liviu P. Dinu; Andrea Sgarro

In this paper we propose a computational method for determining the syntactic similarity between languages. We investigate multiple approaches and metrics, showing that the results are consistent across methods. We report results on 16 languages belonging to various language families. The analysis that we conduct is adaptable to any language, provided that resources are available.
