Publication


Featured research published by Nikola Ljubešić.


information technology interfaces | 2007

Language Indentification: How to Distinguish Similar Languages?

Nikola Ljubešić; Nives Mikelić; Damir Boras

The goal of this paper is to discuss the language identification problem of Croatian, a language that even state-of-the-art language identification tools find hard to distinguish from similar languages, such as Serbian, Slovenian or Slovak. We developed a tool that implements a list of the most frequent Croatian words with a threshold that each document needs to satisfy, added a rule for eliminating specific characters, and applied second-order Markov model classification and a rule of forbidden words. Finally, we built a tool that outperforms current tools in discriminating between these similar languages.
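
The Markov model step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the model operates on characters (each character conditioned on the two preceding ones) and uses add-alpha smoothing; neither detail is confirmed by the abstract, and the class and function names (CharMarkovModel, classify) are hypothetical. The frequent-word threshold, character elimination and forbidden-word rules are not covered here.

```python
import math
from collections import defaultdict


class CharMarkovModel:
    """Second-order character Markov model with add-alpha smoothing.

    Illustrative only: the paper's actual model and smoothing are not
    specified in the abstract.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.trigram = defaultdict(int)   # counts of (two-char context, next char)
        self.context = defaultdict(int)   # counts of two-char contexts
        self.vocab = set()                # characters seen in training

    def train(self, text):
        padded = "  " + text.lower()      # two spaces as start-of-text padding
        for i in range(2, len(padded)):
            ctx, ch = padded[i - 2:i], padded[i]
            self.trigram[(ctx, ch)] += 1
            self.context[ctx] += 1
            self.vocab.add(ch)

    def log_prob(self, text):
        padded = "  " + text.lower()
        v = len(self.vocab) + 1           # reserve one slot for unseen characters
        total = 0.0
        for i in range(2, len(padded)):
            ctx, ch = padded[i - 2:i], padded[i]
            num = self.trigram.get((ctx, ch), 0) + self.alpha
            den = self.context.get(ctx, 0) + self.alpha * v
            total += math.log(num / den)
        return total


def classify(text, models):
    """Return the label of the model that assigns the text the highest log-probability."""
    return max(models, key=lambda label: models[label].log_prob(text))


if __name__ == "__main__":
    # Tiny made-up training samples; real use would need large corpora per language.
    models = {"hr": CharMarkovModel(), "sr": CharMarkovModel()}
    models["hr"].train("gdje je bilo lijepo mlijeko i bijelo dijete")
    models["sr"].train("gde je bilo lepo mleko i belo dete")
    print(classify("lijepo dijete", models))
```

In practice each model would be trained on a sizeable corpus per language, and the decision would be combined with the word-list rules the abstract describes.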


Proceedings of the 9th Web as Corpus Workshop (WaC-9) | 2014

bs,hr,srWaC - Web Corpora of Bosnian, Croatian and Serbian

Nikola Ljubešić; Filip Klubička

In this paper we present the construction process of top-level-domain web corpora of Bosnian, Croatian and Serbian. For constructing the corpora we use the SpiderLing crawler with its associated tools adapted for simultaneous crawling and processing of text written in two scripts, Latin and Cyrillic. In addition to the modified collection process we focus on two sources of noise in the resulting corpora: (1) they contain documents written in the other, closely related languages that cannot be identified with standard language identification methods and (2) like most web corpora, they partially contain low-quality data not suitable for the specific research and application objectives. We approach both problems by using language modeling on the crawled data only, omitting the need for manually validated language samples for training. On the task of discriminating between closely related languages we outperform the state-of-the-art Blacklist classifier, reducing its error to a fourth.


information technology interfaces | 2008

Comparing measures of semantic similarity

Nikola Ljubešić; Damir Boras; Nikola Bakarić; Jasmina Njavro

The aim of this paper is to compare different methods for automatic extraction of semantic similarity measures from corpora. The semantic similarity measure has proven to be very useful for many tasks in natural language processing, such as information retrieval, information extraction and machine translation. Additionally, one of the main problems in natural language processing is data sparseness, since no language sample is large enough to capture all possible language combinations. In our research we experiment with four different measures of association with context and eight different measures of vector similarity. The results show that the Jensen-Shannon divergence and the L1 and L2 norms outperform other measures of vector similarity regardless of the measure of association with context used. Maximum likelihood estimate and t-test show better results than other measures of association with context.
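
As a rough illustration of the three vector similarity measures the abstract singles out, the sketch below computes the L1 distance, the L2 distance and the Jensen-Shannon divergence between two context vectors stored as sparse dictionaries. It assumes the vectors are already normalised to probability distributions (as with maximum likelihood estimates); the function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def l1_distance(p, q):
    """Sum of absolute differences over the union of context features."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def l2_distance(p, q):
    """Euclidean distance over the union of context features."""
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in set(p) | set(q)))


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two probability distributions."""
    total = 0.0
    for k in set(p) | set(q):
        pk, qk = p.get(k, 0.0), q.get(k, 0.0)
        mk = 0.5 * (pk + qk)              # mixture distribution
        if pk > 0:
            total += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            total += 0.5 * qk * math.log2(qk / mk)
    return total


if __name__ == "__main__":
    # Made-up context distributions for two words.
    car = {"drive": 0.5, "road": 0.3, "engine": 0.2}
    truck = {"drive": 0.4, "road": 0.4, "load": 0.2}
    print(l1_distance(car, truck), l2_distance(car, truck), js_divergence(car, truck))
```

All three are distances, so lower values indicate more similar words; with base-2 logarithms the Jensen-Shannon divergence is bounded between 0 and 1.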


Biological Chemistry | 2005

Differential accumulation of plastid preprotein translocon components during spruce (Picea abies L. Karst.) needle development.

Hrvoje Fulgosi; Hrvoje Lepeduš; Vera Cesar; Nikola Ljubešić

We demonstrate that basic components of the plastid protein-import apparatus originally found in pea, Toc34, Toc159, and Tic110, are also conserved in evolutionarily younger gymnosperms. We show that multiple isoforms of the preprotein receptor Toc34 differentially accumulate in various stages of needle development, while the amounts of Toc159 drastically decrease during chloroplast morphogenesis. Spruce Toc34 and Toc159 receptors are able to recognise and interact with the angiosperm precursor of the Rubisco small subunit. Young proplastids found in closed buds contain a highly elevated number of protein translocation complexes equipped with only two types of outer envelope receptors, Toc159 and a 30-kDa Toc34-related protein. Photosystem II (PSII) can already be assembled in a fully functional complex at this very early stage of needle development, suggesting that no additional receptor isoforms are needed for translocation of all necessary PSII components. We conclude that the accumulation of evolutionarily conserved plastid preprotein translocation components is differentially regulated during spruce needle development.


international conference on computational linguistics | 2014

Standardizing Tweets with Character-Level Machine Translation

Nikola Ljubešić; Tomaž Erjavec; Darja Fišer

This paper presents the results of the standardization procedure of Slovene tweets that are full of colloquial, dialectal and foreign-language elements. With the aim of minimizing the human input required, we produced a manually normalized lexicon of the most salient out-of-vocabulary (OOV) tokens and used it to train a character-level statistical machine translation (CSMT) system. Best results were obtained by combining the manually constructed lexicon and CSMT as fallback, with an overall improvement of 9.9% on all tokens and 31.3% on OOV tokens. Manual preparation of data in a lexicon manner has proven to be more efficient than normalizing running text for the task at hand. Finally, we performed an extrinsic evaluation where we automatically lemmatized the test corpus taking as input either original or automatically standardized wordforms, and achieved 75.1% per-token accuracy with the former and 83.6% with the latter, thus demonstrating that standardization has significant benefits for upstream processing.


text speech and dialogue | 2011

Bootstrapping bilingual lexicons from comparable corpora for closely related languages

Nikola Ljubešić; Darja Fišer

In this paper we present an approach to bootstrap a Croatian-Slovene bilingual lexicon from comparable news corpora from scratch, without relying on any external bilingual knowledge resource. Instead of using a dictionary to translate context vectors, we build a seed lexicon from identical words in both languages and extend it with context-based cognates and translation candidates of the most frequent words. By enlarging the seed dictionary by only 7% we were able to improve the baseline precision from 0.597 to 0.731 on the mean reciprocal rank for the ten top-ranking translation candidates with a 50.4% recall on the gold standard of 500 entries.
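
The evaluation measure named in the abstract, mean reciprocal rank over the ten top-ranking translation candidates, can be sketched as follows. This is a generic formulation, not the authors' evaluation script: the function name, the data layout and the toy entries are hypothetical, and averaging over all evaluated source words (rather than only those with a hit) is an assumption.

```python
def mean_reciprocal_rank(ranked_candidates, gold, top_n=10):
    """Average of 1/rank of the first correct candidate within the top_n.

    ranked_candidates: dict mapping a source word to its ranked list of candidates.
    gold: dict mapping a source word to the set of accepted translations.
    A source word with no correct candidate in the top_n contributes 0.
    """
    if not ranked_candidates:
        return 0.0
    total = 0.0
    for source, candidates in ranked_candidates.items():
        accepted = gold.get(source, set())
        for rank, candidate in enumerate(candidates[:top_n], start=1):
            if candidate in accepted:
                total += 1.0 / rank
                break
    return total / len(ranked_candidates)


if __name__ == "__main__":
    # Made-up Croatian source words with ranked Slovene candidates.
    ranked = {
        "kuća": ["hiša", "dom"],
        "vlak": ["avtobus", "vlak"],
        "tjedan": ["mesec"],
    }
    gold = {"kuća": {"hiša"}, "vlak": {"vlak"}, "tjedan": {"teden"}}
    print(mean_reciprocal_rank(ranked, gold))
```

A correct translation at rank 1 contributes 1, at rank 2 contributes 0.5, and so on, so the score rewards lexicons whose correct translations appear near the top of each candidate list.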


Journal of Plant Biology | 2008

Ultrastructural characterization of the reversible differentiation of chloroplasts in cucumber fruit

Tatjana Prebeg; Mercedes Wrischer; Hrvoje Fulgosi; Nikola Ljubešić

The changes in plastid ultrastructure in the pericarp of cucumber (Cucumis sativus L.) fruit were studied during fruit yellowing (which accompanied maturation) and regreening. In the course of fruit maturation, the thylakoid system was progressively reduced, and only a small number of membranes remained in the plastids of mature fruit. At the same time, the plastoglobules increased in size, often remaining in close proximity to the degrading thylakoids. In pericarp tissue which turned green again, the thylakoid network in the plastids was gradually reconstituted. Morphological similarities between the plastids in mature and regreening fruit indicated that the chloroplasts in regreened tissue were redifferentiated from the plastids of mature fruit. Reconstitution of the thylakoid system appeared to start from two morphologically distinct types of membranes: from double membranes which resembled thylakoids and from membrane-bound bodies (MBBs). The latter appeared to form thylakoids by two mechanisms: by detachment of extensions from their surfaces and by fragmentation. The plastoglobules remained in the plastids during thylakoid system reconstitution and were often observed in close proximity to developing thylakoids. In the course of chloroplast redifferentiation, several types of membraneous structures were found to be associated with the plastid envelope: (i) vesicles which appeared to separate from the envelope and to fuse subsequently with the developing thylakoids, (ii) tubules, and (iii) double-membrane sheets which appeared as de novo forming thylakoids.


Journal of Plant Physiology | 1998

The role of carotenoids in the structural and functional stability of thylakoids in plastids of dark-grown spruce seedlings

Mercedes Wrischer; Nikola Ljubešić; Branka Salopek

The ability of cotyledons of spruce seedlings to develop their photosynthetic apparatus when grown in the dark was used to study the effect of the «bleaching» herbicide norflurazon (NF) on the formation of their etiochloroplasts. Ultrastructural analyses showed that in etiochloroplasts of cotyledons of 14-day-old seedlings grown in the dark on NF, the number of thylakoids, in particular of grana-thylakoids, was reduced. In plastids resulting after growth on a 200 μmol/L solution of NF, only single thylakoids developed, while in those developed after growth on 20 μmol/L of NF, incomplete stacking of thylakoids appeared. Lower concentrations of NF were less harmful, so that etiochloroplasts with grana regions developed. NF did not influence the formation of prolamellar bodies in the etiochloroplasts. Very strong reduction of carotenoid content, in particular of carotenes and violaxanthin, was obtained after growth of the seedlings in the dark on 200 and 20 μmol/L of NF. At the same time, the content of chlorophylls was also reduced, although to a much lesser degree than that of carotenoids. Analyses of membrane proteins indicated that in these etiochloroplasts the content of polypeptides of LHC II, the major light-harvesting complexes of PS II, was impaired. Only seedlings with less altered plastids were able to develop some photosynthetic activity. High concentrations of NF (200 and 20 μmol/L) caused bleaching of the cotyledons and damage of their plastids when seedlings germinated in the dark were transferred to the light and illuminated for several days.


meeting of the association for computational linguistics | 2016

A Global Analysis of Emoji Usage.

Nikola Ljubešić; Darja Fišer

Emojis are a quickly spreading and rather unknown communication phenomenon which occasionally receives attention in the mainstream press, but lacks the scientific exploration it deserves. This paper is a first attempt at investigating the global distribution of emojis. We perform our analysis of the spatial distribution of emojis on a dataset of ∼17 million (and growing) geo-encoded tweets containing emojis by running a cluster analysis over countries represented as emoji distributions and performing correlation analysis of emoji distributions and World Development Indicators. We show that emoji usage tends to draw quite a realistic picture of the living conditions in various parts of our world.


Plant Physiology and Biochemistry | 1998

Formation of the photosynthetic apparatus in plastids during greening of potato microtubers

Jasmina Muraja Ljubičić; Mercedes Wrischer; Nikola Ljubešić

The process of amyloplast-to-chloroplast and leucoplast-to-chloroplast transformation in potato microtubers (Solanum tuberosum L. var. Istra) exposed to light was studied. After 12 h of light exposure, the characteristic chlorophyll fluorescence was already detectable using a fluorescence microscope. At the same time, HPLC analyses showed the presence of chlorophylls a and b. Simultaneously, the assembly of the LHCII protein complex was detected by Western blot analysis. The formation of the LHCII protein and the appearance of chlorophylls and carotenoids confirmed the development of the LHCII pigmented multiprotein complexes during light-driven differentiation of the chloroplasts. Electron microscope analyses showed the development of thylakoids in amyloplasts and leucoplasts leading to the formation of chloroplasts.

Collaboration


Dive into Nikola Ljubešić's collaborations.

Top Co-Authors

Darja Fišer

University of Ljubljana

Filip Klubička

Dublin Institute of Technology

Hrvoje Lepeduš

Josip Juraj Strossmayer University of Osijek

Vera Cesar

Josip Juraj Strossmayer University of Osijek
