Darko Brodić
University of Belgrade
Publication
Featured research published by Darko Brodić.
Sensors | 2010
Darko Brodić; Dragan R. Milivojevic; Zoran N. Milivojević
Text line segmentation is an essential stage in off-line optical character recognition (OCR) systems. It is key because inaccurately segmented text lines lead to OCR failure. Text line segmentation of handwritten documents is a complex and diverse problem, complicated by the nature of handwriting. Hence, text line segmentation is a leading challenge in handwritten document image processing. Because the measurement and evaluation of text segmentation algorithm quality are inconsistent, a basic set of measurement methods is required. Currently, there is no commonly accepted set, and every algorithm evaluation is custom oriented. In this paper, a basic test framework for the evaluation of text feature extraction algorithms is proposed. This test framework consists of a few experiments primarily linked to text line segmentation, skew rate and reference text line evaluation. Although the experiments are mutually independent, the results obtained are strongly cross-linked. The main advantages of the framework are its suitability for different types of letters and languages and its adaptability. Thus, the paper presents an efficient evaluation method for text analysis algorithms.
Neural Computing and Applications | 2018
Darko Brodić; Alessia Amelio; Zoran N. Milivojević
The manuscript provides a novel method for language identification using the texture analysis of the script. The method maps each letter of the text to a certain script type, according to characteristics concerning the position of the letter in the baseline area. In order to extract features, the co-occurrence matrix is computed. Then, the texture features are calculated. The extracted measures show meaningful differences due to dissimilarities in the script and language characteristics. These differences form the basis of the decision-making process for language identification. Feature classification is performed by an extension of a state-of-the-art method called Genetic Algorithms Image Clustering for Document Analysis. The proposed method is tested on documents written in English, French, Slovenian and Serbian and compared to other well-known classification methods and feature representations in the state of the art. The results of the experiments show the superiority of the proposed approach.
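The co-occurrence and texture feature steps described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the text has already been coded as a 1-D sequence of integers in the range 0 to 3, counts adjacent pairs only, and uses four common co-occurrence measures (energy, entropy, contrast, homogeneity) that the abstract does not name. The function names are hypothetical.

```python
import math


def cooccurrence_matrix(codes, levels=4):
    """Normalized co-occurrence matrix of adjacent codes in a 1-D sequence."""
    counts = [[0] * levels for _ in range(levels)]
    # Count each pair (current code, next code).
    for a, b in zip(codes, codes[1:]):
        counts[a][b] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("need at least two codes")
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Common co-occurrence texture measures (an assumed feature set)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Arbitrary coded sequence, used only to show the call pattern.
features = texture_features(cooccurrence_matrix([0, 1, 0, 2, 0, 0, 3, 1, 0]))
```

The resulting dictionary would play the role of a document feature vector that a clustering or classification step could consume.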
The Scientific World Journal | 2013
Darko Brodić; Zoran N. Milivojević; Čedomir A. Maluckov
Any document in the Serbian language can be written in two different scripts: Latin or Cyrillic. Although the characteristics of these scripts are similar, some of their statistical measures are quite different. The paper proposes a method for identifying the script of a document according to the occurrence and co-occurrence of the script types. First, each letter is modeled with a certain script type according to characteristics concerning its position in the baseline area. Then, a frequency analysis of the script type occurrence is performed. Due to the diversity of the Latin and Cyrillic scripts, the occurrence of modeled letters shows substantial statistical dissimilarity. Furthermore, the co-occurrence matrix is computed. The analysis of the co-occurrence matrix yields a strong margin that serves as a criterion to distinguish and recognize a certain script. The proposed method is analyzed on a database that includes different types of printed and web documents. The experiments gave encouraging results.
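The letter modeling and frequency analysis steps might look roughly like the sketch below. The grouping of letters is an assumption made for illustration: it covers lowercase Latin letters only, treats capitals as ascenders and everything else as base letters, and does not reproduce the paper's actual script type definitions. All names are hypothetical.

```python
from collections import Counter

# Hypothetical grouping of lowercase Latin letters by vertical zone.
ASCENDER = set("bdfhklt")
DESCENDER = set("gjpqy")


def script_type(ch):
    """Return an illustrative script type for one letter, or None otherwise."""
    if not ch.isalpha():
        return None
    if ch.isupper() or ch in ASCENDER:
        return "ascender"
    if ch in DESCENDER:
        return "descender"
    return "base"


def script_type_frequencies(text):
    """Relative frequency of each script type among the letters of text."""
    types = [t for t in map(script_type, text) if t is not None]
    counts = Counter(types)
    total = len(types)
    return {t: counts[t] / total for t in counts} if total else {}
```

Comparing such frequency profiles between documents is one simple way to expose the statistical dissimilarity between scripts that the abstract refers to.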
Journal of Electrical Engineering-elektrotechnicky Casopis | 2011
Darko Brodić
The Evaluation of the Initial Skew Rate for Printed Text. In this manuscript, an algorithm for identification of the initial skew rate for printed text is presented. The proposed algorithm creates a rectangular hull around each text character. Nearby rectangular hulls are combined to form objects. After mathematical morphology is applied to them, the biggest object is characterized and selected. The gravity centers of the rectangular hulls form reference points on these objects, which are used as a base for the calculation, i.e., estimation, of the initial skew rate. The initial skew rate is calculated using the least squares method. A comparative analysis of the original and estimated skew rate is presented and discussed. The algorithm is examined with a number of printed text examples. The proposed algorithm showed robustness for a wide range of printed text skew.
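The least squares step can be illustrated with a short sketch. It assumes the reference points (gravity centers) are already available as (x, y) pairs and fits a straight line y = a*x + b by ordinary least squares; the function name is hypothetical and the hull and morphology stages are not reproduced.

```python
import math


def skew_angle_least_squares(points):
    """Estimate the skew angle in degrees from (x, y) reference points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two reference points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; slope is undefined")
    slope = sxy / sxx  # ordinary least squares slope
    return math.degrees(math.atan(slope))
```

In image coordinates, where the y axis points downward, the sign of the returned angle is flipped relative to the usual mathematical convention.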
international test conference | 2012
Darko Brodić; Dragan R. Milivojevic
The paper presents a methodology for the estimation of the initial skew rate of text lines. First, it splits text into groups according to the bounding boxes. Linked bounding boxes establish bigger objects called connected components. After mathematical morphology operations are applied, an enlarged group of connected components is formed. The longest connected component is extracted by the longest common subsequence method. Inside the longest connected component, the gravity center of each bounding box is determined. These centers represent the reference points, which are used for the calculation of the initial skew rate. The calculation is made by the moment-based method. A comparative analysis of the original and estimated skew rate is used to evaluate the algorithm. The proposed algorithm is examined with different printed text samples. It showed robustness for skew estimation over a wide range of resolutions. DOI: http://dx.doi.org/10.5755/j01.itc.41.3.1249
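A moment-based orientation estimate can be sketched as below. It uses the standard second-order central moment formula, theta = 0.5 * atan2(2*mu11, mu20 - mu02), applied to the reference points; whether the paper uses exactly this formulation is an assumption, and the function name is hypothetical.

```python
import math


def skew_angle_moments(points):
    """Orientation in degrees from second-order central moments of points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two reference points")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    # Angle of the principal axis of the point set.
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))
```

Unlike a slope fit, this form stays defined for steep lines, because it never divides by the horizontal spread of the points.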
Applied Intelligence | 2017
Darko Brodić; Alessia Amelio; Zoran N. Milivojević
This paper introduces a new method for clustering of documents, which have been written in a language evolving during different historical periods, with an example of the Italian language. In the first phase, the text is transformed into a string of four numerical codes, which have been derived from the energy profile of each letter, defining the height of the letters and their location in the text line. Each code represents a gray level and the text is codified as a 1-D image. In the second phase, texture features are extracted from the obtained image in order to create document feature vectors. Subsequently, a new clustering algorithm is employed on the feature vectors to discriminate documents from different historical periods of the language. Experiments are performed on a database of Italian documents given in Italian Vulgar and modern Italian. Results demonstrate that this proposed method perfectly identifies the historical periods of the language of the documents, outperforming other well-known clustering algorithms generally adopted for document categorization and other state-of-the-art text-based language models.
Applied Artificial Intelligence | 2016
Darko Brodić; Alessia Amelio; Zoran N. Milivojević
This article proposes an algorithm for script identification by textural analysis of the image corresponding to the script types. In the first phase, each letter is modeled by the equivalent script type, which is determined by its position in the baseline area. Then, feature extraction is carried out. It is based on the script type co-occurrence pattern analysis. The obtained features of the script are stored for further analysis. The difference in script characteristics contributes to the diversity of the extracted features, which simplifies the feature classification obtained by an extension of a state-of-the-art classification tool called Genetic Algorithms Image Clustering for Document Analysis. Accordingly, it represents the key element in the decision-making process of script identification. The proposed method is tested on an example of German printed documents, which contain Latin and Fraktur scripts. The experiment shows correct results, which is promising.
soft computing | 2015
Darko Brodić; Zoran N. Milivojević; Čedomir A. Maluckov
The paper deals with the problem of script discrimination in old Slavic printed documents. An algorithm for script classification and identification is proposed. It creates coded text from the initial document. Then, the coded text is subjected to statistical analysis, through which texture feature extraction is carried out. The obtained texture features are used as criteria for script classification and identification. The proposed method is tested on samples of old Slavic printed documents written in Glagolitic, Cyrillic and Latin script.
Measurement Science Review | 2015
Darko Brodić; Alessia Amelio
The paper considers the level of the extremely low-frequency magnetic field produced by laptop computers. The magnetic field, which is characterized by extremely low frequencies of up to 300 Hz, is measured because of its hazardous effects on the health of laptop users. The experiment consists of testing 13 different laptop computers in normal operating conditions. The magnetic field is measured in the immediate neighborhood of the laptop computers. The measured data are presented and then classified. The classification is performed by the K-Medians method in order to determine the critical positions of the laptop. At the end, the measured magnetic field values are compared with the critical values suggested by different safety standards. It is shown that some of the laptop computers emit a very strong magnetic field. Hence, they must be used with extreme caution.
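For illustration, K-Medians clustering of scalar measurements can be sketched as follows. This is a minimal 1-D version with a deterministic initialization chosen here for reproducibility; the paper's data layout, distance measure, and initialization are not given in the abstract, so all of these are assumptions, and the function name is hypothetical.

```python
from statistics import median


def k_medians_1d(values, k, max_iter=100):
    """Cluster scalar values into k groups around per-cluster medians."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialization: evenly spaced order statistics.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Update step: each center moves to the median of its cluster.
        new_centers = [
            median(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

Using the median rather than the mean makes each center less sensitive to a single extreme reading, which is the usual motivation for K-Medians over K-Means.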
ifip international conference on theoretical computer science | 2010
Darko Brodić
In this paper, an extended approach to the Gaussian kernel algorithm for text segmentation and for reference text line and skew rate extraction is presented. It creates a boundary growing area around the text, based on the Gaussian kernel algorithm extended by an anisotropic approach. These boundary growing areas form a control image with distinct objects that are a prerequisite for text segmentation. After text segmentation, text parameters such as the reference text line and skew rate are calculated by a numerical method. Algorithm quality is examined by experiments. The results are evaluated by the RMS method and compared with those of the isotropic Gaussian kernel method. All results are examined, analyzed and summarized. Furthermore, optimal parameter values are suggested, leading to anisotropic kernel optimization.
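To make the isotropic versus anisotropic distinction concrete, the sketch below builds a normalized 2-D Gaussian kernel with separate horizontal and vertical spreads. The parameter names and the normalization are assumptions for illustration; the paper's boundary growing procedure and its suggested parameter values are not reproduced.

```python
import math


def anisotropic_gaussian_kernel(half_width, half_height, sigma_x, sigma_y):
    """Normalized 2-D Gaussian kernel with independent x and y spreads."""
    rows = []
    for y in range(-half_height, half_height + 1):
        rows.append([
            math.exp(-(x * x / (2 * sigma_x ** 2) + y * y / (2 * sigma_y ** 2)))
            for x in range(-half_width, half_width + 1)
        ])
    total = sum(map(sum, rows))
    # Normalize so that the weights sum to one.
    return [[v / total for v in row] for row in rows]


# A kernel stretched along the text direction (sigma_x > sigma_y).
kernel = anisotropic_gaussian_kernel(4, 2, sigma_x=3.0, sigma_y=1.0)
```

Setting sigma_x equal to sigma_y recovers the isotropic kernel, while a larger sigma_x spreads the weights further along the horizontal direction than the vertical one.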