Mihaela Colhon | Researchain

Publication

Featured research published by Mihaela Colhon.

Fundamenta Informaticae | 2012

Feeding Syntactic Versus Semantic Knowledge to a Knowledge-lean Unsupervised Word Sense Disambiguation Algorithm with an Underlying Naïve Bayes Model

Florentina Hristea; Mihaela Colhon

The present paper concentrates on the issue of feature selection for unsupervised word sense disambiguation (WSD) performed with an underlying Naive Bayes model. It introduces dependency-based feature selection which, to our knowledge, is used for the first time in conjunction with the Naive Bayes model acting as a clustering technique. The construction of the dependency-based semantic space required for the proposed task is discussed. The resulting disambiguation method, an extension of the method introduced in [15], lies at the border between unsupervised and knowledge-based techniques. Syntactic knowledge provided by dependency relations (and exemplified in the case of adjectives) is compared here to semantic knowledge offered by the semantic network WordNet (and examined in [15]). Our conclusion is that the Naive Bayes model reacts well in the presence of syntactic knowledge of this type and that dependency-based feature selection is a reliable alternative to the WordNet-based semantic one.

Knowledge Science, Engineering and Management | 2014

Relating the Opinion Holder and the Review Accuracy in Sentiment Analysis of Tourist Reviews

Mihaela Colhon; Costin Bădică; Alexandra Şendre

In this paper we propose a sentiment classification method for categorizing tourist reviews according to the sentiment expressed. We also report the results of applying our sentiment analysis method to a real data set extracted from the AmFostAcolo tourist review Web site. Our analysis focused on the relation between the opinion holder and the accuracy of the review sentiment with respect to the review score. Based on our initial experimental results, we concluded that specific characteristics of the opinion holder, such as his or her reputation, might relate to the accuracy of the opinions expressed in his or her reviews.

Symbolic and Numeric Algorithms for Scientific Computing | 2009

Mamdani FLC with Various Implications

Ion Iancu; Mihaela Colhon

The task of the standard Mamdani fuzzy logic controller is to find a crisp control action from the fuzzy rule base and from a set of crisp inputs. Because interval inputs are frequently used in various domains (online shopping, for instance), in this paper we propose an extension of this type of controller that works with intervals as inputs and with various implication operators. For each implication we obtain a crisp value as output. Finally, these outputs are combined to obtain the overall crisp output action of the system.
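
As an illustration of the standard controller mentioned in the first sentence only (not the interval-based extension the paper proposes), the following minimal sketch computes a crisp output from one crisp input. The rule base, the triangular membership functions, the min implication, the max aggregation and the sampled centroid defuzzification are all assumptions chosen for the example; the abstract does not specify any of them.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical rule base (antecedent, consequent) for a single input:
#   IF x is LOW  THEN y is SMALL
#   IF x is HIGH THEN y is LARGE
RULES = [
    (lambda x: tri(x, -1.0, 0.0, 1.0), lambda y: tri(y, -1.0, 0.0, 1.0)),
    (lambda x: tri(x, 0.0, 1.0, 2.0), lambda y: tri(y, 0.0, 1.0, 2.0)),
]


def mamdani(x, rules=RULES, lo=-1.0, hi=2.0, steps=300):
    """Crisp output using min implication, max aggregation and a sampled centroid."""
    num = den = 0.0
    for i in range(steps + 1):
        y = lo + (hi - lo) * i / steps
        # Each rule clips its consequent at the firing strength of its antecedent.
        mu = max(min(ante(x), cons(y)) for ante, cons in rules)
        num += y * mu
        den += mu
    return num / den if den else None  # None when no rule fires


crisp_output = mamdani(0.5)
```

Replacing `min` with another implication operator in the line that computes `mu` is the point at which the "various implications" of the title would enter such a sketch.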

Balkan Conference in Informatics | 2015

Tourist review analytics using complex networks

Alex Becheru; Florin Emilian Buşe; Mihaela Colhon; Costin Bădică

A number of techniques for Natural Language Processing (NLP) based on graph representations have been developed. Usually they target a specific NLP task, such as text summarisation, syntactic parsing, word sense disambiguation, ontology construction, sentiment and subjectivity analysis, or text clustering. In this paper we explore a complex network representation of tourist reviews for extracting lexical and quantitative features of the review text. The most important contribution of our proposal is a new method for keyword extraction using complex network ranking metrics.
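
To make the idea of ranking words through a network concrete, here is a minimal sketch under stated assumptions: words are linked when they co-occur within a small window, and PageRank is used as the ranking metric. The abstract does not say which network construction or which ranking metrics the authors use, so the window-based graph, the choice of PageRank and all function names are illustrative only.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Undirected graph linking words that appear within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; every node here has at least one neighbour."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-ranked words of the co-occurrence network."""
    ranks = pagerank(cooccurrence_graph(tokens, window))
    return sorted(ranks, key=ranks.get, reverse=True)[:k]


# Toy review text (hypothetical, not from the AmFostAcolo data).
review = "hotel beach hotel pool hotel staff".split()
best = top_keywords(review, k=1, window=1)
```

With a window of one, the toy review forms a star around "hotel", which is why that word ranks first.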

International Symposium on Innovations in Intelligent Systems and Applications | 2016

Polarity shifting for Romanian sentiment classification

Mihaela Colhon; Madalina Cerban; Alex Becheru; Mirela Teodorescu

There are three main classes of modifiers that can affect the polarity of the sentiments described in natural language texts: negations, intensifiers and diminishers. In this paper, we concentrate on the study of these particular words, which have a very important semantic role in any natural language description. Our study is applied to a real data set extracted from the popular Romanian Web site AmFostAcolo, dedicated to tourist impressions.
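
The effect of the three modifier classes can be sketched as follows. This is only an illustration: the tiny English lexicons, the numeric weights and the rule that a modifier applies to the next sentiment word are all assumptions, and the abstract does not describe how the authors actually model polarity shifting for Romanian.

```python
# Hypothetical toy lexicons; a real system would use much larger Romanian resources.
POLARITY = {"good": 1.0, "clean": 1.0, "bad": -1.0, "dirty": -1.0}
NEGATIONS = {"not", "never"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
DIMINISHERS = {"slightly": 0.5, "somewhat": 0.5}


def score_tokens(tokens):
    """Sum word polarities, letting preceding modifiers shift each one."""
    total = 0.0
    scale = 1.0     # accumulated intensifier/diminisher factor
    negate = False  # whether a negation is pending
    for tok in tokens:
        word = tok.lower()
        if word in NEGATIONS:
            negate = not negate
        elif word in INTENSIFIERS:
            scale *= INTENSIFIERS[word]
        elif word in DIMINISHERS:
            scale *= DIMINISHERS[word]
        elif word in POLARITY:
            value = POLARITY[word] * scale
            total += -value if negate else value
            scale, negate = 1.0, False  # modifiers apply to one sentiment word
        # Any other word is ignored and leaves pending modifiers untouched.
    return total


shifted = score_tokens("the room was not very clean".split())
```

In this sketch a negation flips the sign, an intensifier scales the magnitude up, and a diminisher scales it down, so "not very clean" ends up more negative than "not clean".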

Archive | 2015

Quo Vadis: A Corpus of Entities and Relations

Dan Cristea; Daniela Gîfu; Mihaela Colhon; Paul Diac; Anca-Diana Bibiri; Cătălina Mărănduc; Liviu-Andrei Scutelnicu

This chapter describes a collective work aimed at building a corpus that includes annotations of semantic relations on a text belonging to the belletristic genre. The paper presents annotation conventions for four categories of semantic relations and the process of building the corpus as a collaborative work. Part of the annotation, such as the token/part of speech/lemma layer, is done automatically during a preprocessing phase. Then, an entity layer (where entities of type person are marked) and a relation layer (evidencing binary relations between entities) are added manually by a team of trained annotators, the result being a heavily annotated file. A number of methods for ensuring annotation accuracy are detailed. Finally, some statistics over the corpus are drawn. The language under investigation is Romanian, but the proposed annotation conventions and methodological hints are applicable to any language and text genre.

RANLP 2017 - Workshop Knowledge Resources for the Socio-Economic Sciences and Humanities | 2017

A Multiform Balanced Dependency Treebank for Romanian

Mihaela Colhon; Cătălina Mărănduc; Cătălin Mititelu

The UAIC-RoDia-DepTb is a balanced treebank containing texts in non-standard language: 2,575 chat sentences, old Romanian texts (a Gospel printed in 1648, a codex of laws printed in 1818, a novel written in 1910), regional popular poetry, legal texts, Romanian and foreign fiction, and quotations. The proportions are comparable; each of these types of texts is represented by subsets of at least 1,000 phrases, so that the parser can be trained on their peculiarities. The annotation of the treebank started in 2007, and it has classical tags, such as those in school grammar, with the intention of using the resource for didactic purposes. The classification of circumstantial modifiers is rich in semantic information. In this paper we present the ongoing development of this resource, which has been automatically annotated and entirely manually corrected. We are trying to add new texts and to make the resource available in more formats, keeping all the annotated morphological and syntactic information and adding logical-semantic information. We describe here two conversions, from the classic syntactic format into the Universal Dependencies format and into a logical-semantic layer, which is briefly presented.

Workshop on Social Media and the Web of Linked Data | 2015

Discovering Semantic Relations Within Nominals

Mihaela Colhon; Dan Cristea; Daniela Gîfu

We are interested in developing a technology able to discover entities and the relations connecting them, as expressed in fiction texts. Deciphering these links is a major step in understanding the content of books. In this study we consider the case of imbricated entities, that is, entities realized at the surface text level by imbricated spans. For this research we use the QuoVadis corpus, whose annotation conventions we describe briefly, along with some statistics on the types of relations, features of the relations’ arguments, and words or expressions functioning as triggers. The approach to recognizing the semantic relations is based on patterns extracted from the corpus. The evaluation shows very promising results.

Web Intelligence, Mining and Semantics | 2012

Deriving a statistical syntactic parsing from a treebank

Mihaela Colhon; Radu Simionescu

The study presented in this article is dedicated to a syntactic parser for Romanian. The central goal of the presented technique is to learn a model that estimates how probable it is for one word to be the head of another word in the dependency structure of a sentence in the considered language. The model described in this paper was trained on a dependency treebank and is intended to be used for developing a dependency syntactic parser.
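
One simple way to picture a head-probability model learned from a treebank is a relative-frequency estimate over part-of-speech pairs. The sketch below assumes exactly that; the abstract does not state which features or learning method the authors use, and the toy English treebank, the tuple layout and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank: each sentence is a list of (word, POS, head_index),
# with head_index = -1 marking the root. Not taken from the paper's data.
TREEBANK = [
    [("the", "DET", 1), ("dog", "NOUN", 2), ("barks", "VERB", -1)],
    [("a", "DET", 1), ("cat", "NOUN", 2), ("sleeps", "VERB", -1)],
    [("dogs", "NOUN", 1), ("bark", "VERB", -1)],
]


def train_head_model(treebank):
    """Count (dependent POS -> head POS) attachments and normalise them."""
    counts = defaultdict(Counter)
    for sentence in treebank:
        for _, pos, head in sentence:
            if head >= 0:  # the root has no head to count
                counts[pos][sentence[head][1]] += 1
    return {
        dep: {h: c / sum(heads.values()) for h, c in heads.items()}
        for dep, heads in counts.items()
    }


def head_probability(model, dep_pos, head_pos):
    """Estimated P(head POS | dependent POS); 0.0 for unseen pairs."""
    return model.get(dep_pos, {}).get(head_pos, 0.0)


model = train_head_model(TREEBANK)
p_det_noun = head_probability(model, "DET", "NOUN")
```

A parser built on such a model would compare these estimates across candidate heads for each word; a realistic model would also need smoothing and richer features than the POS pair alone.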

International Journal of Computers Communications & Control | 2010