Ana Meštrović
University of Rijeka
Publications
Featured research published by Ana Meštrović.
International Journal on Semantic Web and Information Systems | 2016
Slobodan Beliga; Ana Meštrović; Sanda Martinčić-Ipšić
In this work the authors propose a novel Selectivity-Based Keyword Extraction (SBKE) method, which extracts keywords from a source text represented as a network. The node selectivity value is calculated from a weighted network as the average weight distributed on the links of a single node and is used to rank and extract keyword candidates. The authors show that selectivity-based keyword extraction slightly outperforms extraction based on the standard centrality measures: in/out-degree, betweenness and closeness. Therefore, they include selectivity and its modification, generalized selectivity, as node centrality measures in the SBKE method. Selectivity-based extraction does not require linguistic knowledge, as it is derived purely from statistical and structural information of the network. The experimental results indicate that selectivity-based keyword extraction has great potential for the collection-oriented keyword extraction task.
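The selectivity measure described in this abstract can be illustrated with a minimal sketch. It assumes a directed, weighted network stored as a plain dictionary of edge weights, and it takes selectivity to be node strength divided by node degree, computed separately for incoming and outgoing links. The function name `selectivity` and the toy edge weights are hypothetical; this is not the authors' SBKE implementation, and it omits the candidate extraction steps.

```python
from collections import defaultdict


def selectivity(edges):
    """Return (in_selectivity, out_selectivity) for a directed weighted network.

    edges: dict mapping (source, target) -> link weight.
    Selectivity is taken here as strength / degree, i.e. the average
    weight on the links of a node (assumed reading of the abstract).
    """
    in_strength = defaultdict(float)
    out_strength = defaultdict(float)
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)

    for (source, target), weight in edges.items():
        out_strength[source] += weight
        out_degree[source] += 1
        in_strength[target] += weight
        in_degree[target] += 1

    nodes = set(in_degree) | set(out_degree)
    # Nodes without incoming (or outgoing) links get selectivity 0.0.
    in_sel = {n: in_strength[n] / in_degree[n] if in_degree[n] else 0.0
              for n in nodes}
    out_sel = {n: out_strength[n] / out_degree[n] if out_degree[n] else 0.0
               for n in nodes}
    return in_sel, out_sel


# Toy co-occurrence weights, invented for illustration only.
example_edges = {
    ("complex", "network"): 4,
    ("network", "analysis"): 3,
    ("network", "model"): 1,
}
in_sel, out_sel = selectivity(example_edges)
# Rank keyword candidates by out-selectivity, highest first.
ranked = sorted(out_sel, key=out_sel.get, reverse=True)
```

Ranking by a single selectivity value is only the first step of such a method; how candidates are expanded into keyword sequences is not specified here and is left out of the sketch.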
CompleNet | 2014
Domagoj Margan; Sanda Martinčić-Ipšić; Ana Meštrović
This paper is an initial attempt to study the properties of Croatian word order via complex networks. We present network properties of normal and shuffled Croatian texts for different co-occurrence window sizes and different linkage boundaries. The results of the network analysis show that text shuffling decreases the network diameter, due to the establishment of previously non-existing links. This indicates that syntax does play a significant role in the Croatian language, although it is a mostly free word-order language.
International Convention on Information and Communication Technology, Electronics and Microelectronics | 2014
Sabina Sisovic; Sanda Martinčić-Ipšić; Ana Meštrović
In this paper we present a comparison of linguistic networks constructed from literature and blog texts. The linguistic networks are constructed from texts as directed and weighted co-occurrence networks of words. Words are nodes, and a link is established between two nodes if the words directly co-occur within a sentence. The comparison of the network structure is performed at the global level (network) in terms of average node degree, average shortest path length, diameter, clustering coefficient, density and number of components. Furthermore, we perform analysis at the local level (node) by comparing the rank plots of in- and out-degree, strength and selectivity. The selectivity-based results indicate that there are differences between the structure of the networks constructed from literature and blogs.
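The network construction described in this abstract can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization and lowercasing (neither is stated in the abstract) and a hypothetical `window` parameter, where `window=2` links each word only to the word that directly follows it. Links never cross sentence boundaries.

```python
from collections import Counter


def cooccurrence_network(sentences, window=2):
    """Build a directed, weighted co-occurrence network.

    sentences: iterable of sentence strings.
    window: co-occurrence window size; 2 means directly adjacent words.
    Returns a Counter mapping (source_word, target_word) -> frequency.
    """
    edges = Counter()
    for sentence in sentences:
        # Naive tokenization (assumption): lowercase and split on whitespace.
        words = sentence.lower().split()
        for i, source in enumerate(words):
            # Link the word to the next (window - 1) words in the sentence.
            for target in words[i + 1:i + window]:
                edges[(source, target)] += 1
    return edges


network = cooccurrence_network(["the cat sat", "the cat ran"])
```

The link weight is the number of times the ordered word pair occurs, so repeated pairs such as "the cat" accumulate weight rather than creating parallel links.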
International Convention on Information and Communication Technology, Electronics and Microelectronics | 2014
Domagoj Margan; Ana Meštrović; Sanda Martinčić-Ipšić
This paper studies the properties of Croatian texts via complex networks. We present network properties of normal and shuffled Croatian texts for different shuffling principles: on the sentence level and on the text level. In both experiments we preserved the vocabulary size and the word and sentence frequency distributions. Additionally, in the first shuffling approach we preserved the sentence structure of the text and the number of words per sentence. The obtained results showed that degree rank distributions exhibit no substantial deviation in shuffled networks, and strength rank distributions are preserved due to the same word frequencies. Therefore, the standard approach to studying the structure of linguistic co-occurrence networks showed no clear difference between the topologies of normal and shuffled texts. Finally, we showed that the in- and out-selectivity values from shuffled texts are consistently below the selectivity values calculated from normal texts. Our results corroborate that the node selectivity measure can capture structural differences between original and shuffled Croatian texts.
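The two shuffling principles can be illustrated with a short sketch. The sentence-level variant follows the abstract directly: words are permuted inside each sentence, so sentence lengths are preserved. The text-level variant is an assumption, since the abstract does not say how sentence boundaries are handled; here the whole text is treated as one pooled word sequence. Both function names are hypothetical.

```python
import random


def shuffle_sentence_level(sentences, rng):
    """Permute words inside each sentence; words per sentence are preserved."""
    shuffled = []
    for words in sentences:
        words = list(words)  # copy so the input is not modified
        rng.shuffle(words)
        shuffled.append(words)
    return shuffled


def shuffle_text_level(sentences, rng):
    """Pool all words of the text and permute them as one sequence.

    Assumption: sentence boundaries are dropped and a flat word list is
    returned. Word frequencies and vocabulary size are unchanged.
    """
    pooled = [word for words in sentences for word in words]
    rng.shuffle(pooled)
    return pooled


text = [["the", "cat", "sat"], ["dogs", "bark"]]
rng = random.Random(0)  # fixed seed for reproducible shuffles
by_sentence = shuffle_sentence_level(text, rng)
by_text = shuffle_text_level(text, rng)
```

Because both variants only reorder existing tokens, word frequencies stay the same, which is consistent with the abstract's observation that strength rank distributions are preserved.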
International Convention on Information and Communication Technology, Electronics and Microelectronics | 2015
Domagoj Margan; Ana Meštrović
In this paper we describe LaNCoA, a Language Networks Construction and Analysis toolkit implemented in Python. The toolkit provides various procedures for network construction from text: on the word level (co-occurrence networks, syntactic networks, shuffled networks) and on the subword level (syllable networks, grapheme networks). Furthermore, we implement functions for language network analysis on the global and local level. The toolkit is organized in several modules that enable various aspects of language analysis: analysis of global network measures for different co-occurrence windows, comparison of networks based on original and shuffled texts, comparison of networks constructed on different language levels, etc. Text manipulation methods, such as corpus cleaning, lemmatization and stopword removal, are also implemented. For the basic network representation we use available NetworkX functions and methods. However, language network analysis is specific and requires the implementation of additional functions and methods. That was the main motivation for this research.
Semantic Keyword-based Search on Structured Data Sources | 2016
Ana Meštrović; Andrea Calì
We define a general framework for ontology-based information retrieval (IR). In our approach, document and query expansion rely on a base taxonomy that is extracted from a lexical database or a Linked Data set (e.g. WordNet, Wiktionary, etc.). Each term from a document or query is modelled as a vector of base concepts from the base taxonomy. We define a set of mapping functions which map multiple ontological layers (dimensions) onto the base taxonomy. In this way, each concept from the included ontologies can also be represented as a vector of base concepts from the base taxonomy. We propose a general weighting schema which is used for the vector space model. Our framework can therefore take into account various lexical and semantic relations between terms and concepts (e.g. synonymy, hierarchy, meronymy, antonymy, geo-proximity, etc.). This allows us to avoid certain vocabulary problems (e.g. synonymy, polysemy) as well as to reduce the vector size in IR tasks.
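The idea of modelling terms as vectors of base concepts can be sketched in a few lines. Everything concrete below is hypothetical: the `TERM_TO_CONCEPTS` mapping, its weights, and the use of cosine similarity stand in for the mapping functions and weighting schema, which the abstract does not specify. The sketch only shows why synonyms stop being a vocabulary problem once both map onto the same base concept.

```python
import math
from collections import defaultdict

# Hypothetical term -> base-concept weights (invented for illustration).
TERM_TO_CONCEPTS = {
    "car": {"vehicle": 1.0},
    "automobile": {"vehicle": 1.0},
    "truck": {"vehicle": 0.8, "cargo": 0.5},
    "banana": {"fruit": 1.0},
}


def to_concept_vector(terms, mapping):
    """Sum the base-concept weights of all terms into one sparse vector."""
    vector = defaultdict(float)
    for term in terms:
        for concept, weight in mapping.get(term, {}).items():
            vector[concept] += weight
    return dict(vector)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(concept, 0.0) for concept, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


query = to_concept_vector(["car"], TERM_TO_CONCEPTS)
document = to_concept_vector(["automobile"], TERM_TO_CONCEPTS)
similarity = cosine(query, document)
```

The vectors are indexed by base concepts rather than by surface terms, so their dimensionality is bounded by the size of the base taxonomy, which matches the abstract's point about reducing vector size.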
International Symposium ELMAR | 2007
Ivo Ipšić; Maja Matetic; Sanda Martinčić-Ipšić; Ana Meštrović; Marija Brkić
Speech technologies deal with designing computer systems that can recognize spoken words, comprehend human language and generate intelligible speech. There is a wide range of applications in which speech technology systems have been successfully implemented. One of the most complex applications in speech technology is a spoken dialog system, which can be used for information inquiry services. In this paper we present work on the development of a spoken dialog system for the Croatian language. We propose an approach to developing the modules of a spoken dialog system for a limited domain which uses the same acoustic model for speech recognition and speech synthesis. For the linguistic analysis we propose an approach based on qualitative language modelling, while the development of the dialog module is based on the formalism of the object-oriented frame logic language. Some experimental results for Croatian speech recognition and understanding are presented and discussed.
International Conference on Information and Software Technologies | 2016
Sanda Martinčić-Ipšić; Tanja Miličić; Ana Meštrović
In this paper co-occurrence language network measures from literature and legal texts are compared on the global and on the local scale. Our dataset consists of four legal texts and four short novellas, all written in English. For each text we construct one directed and weighted network, where the weight of a link between two nodes represents the overall co-occurrence frequency of the corresponding words. We choose four literature-law pairs of texts with approximately the same number of different words for comparison. The aim of this experiment was to investigate how complex network measures behave for different text structures and which of them are sensitive to different text types. Our results show that on the global scale average strength is the only measure that exhibits some uniform behaviour due to the differences in textual complexity. In general, global measures may not be well suited to discriminating between the mentioned genres of texts. However, from the local perspective, rank plots of in- and out-selectivity (average node strength) indicate that there are more noticeable structural differences between legal texts and literature.
International Conference on Information and Software Technologies | 2015
Tajana Ban Kirigin; Ana Meštrović; Sanda Martinčić-Ipšić
Multilayer networks and related concepts have been used for the description and analysis of complex systems in many fields, such as biological, physical, social and information systems. In this paper we present the first steps towards defining a formal model for language network representation, the Multilayer Language Network (MLN), which is based on the multilayer network formalism and is suitable for the representation, analysis and comparison of languages, both in their entirety and in their various characteristics and complexity. The goal of this research is to define a universal formal model for languages, capturing various language levels (subsystems) and various language characteristics. As a starting point we apply standard network diagnostics to an MLN model for an English and a Croatian text, considering word, syllable and grapheme language subsystems and various construction principles, and present the obtained results.
International Conference on Intelligent Engineering Systems | 2006
Ana Meštrović; Mirko Čubrilo
The paper addresses data integration for heterogeneous knowledge sources in the Semantic Web domain. The objective of the work is to show that logic programming languages with second-order syntax and an object-oriented approach (HiLog+F-logic) provide adequate support for a data integration system. In particular, the object-oriented approach supported by the F-logic language provides a natural translation and manipulation of Web data. The paper describes elements relevant for the development of Semantic Web applications. The activities of the data integration process are explained through the example of an application developed for a specific domain: searching for data about scientific conferences.