Jerzy Sas
Wrocław University of Technology
Publication
Featured research published by Jerzy Sas.
Information Sciences | 2012
Jerzy Sas; Urszula Markowska-Kaczmar
In the paper we consider the problem of segmenting continuous handwriting into individual characters. The ultimate aim is to create a set of isolated character images to be used as a training set for a writer-dependent handwriting recognizer. An analytic approach is applied, in which word recognition is based on the individual classification of characters. The input to the proposed segmentation method is a handwritten text image consisting of known words. The method consists of three stages. Initially, images of isolated words are over-segmented into sequences of graphemes. At the first stage, a genetic algorithm is used to create a set of segmentation variants that are likely to correspond to actual characters. The fitness function is based on the similarity of images within subsets of images of the same character. At the second stage, the set of segmentation variants obtained as the last generation of the genetic algorithm is refined by applying a sequence of subtle segment boundary displacements that increase the similarity of images within sets of the same characters. At the third stage, the most typical character prototypes are selected and fixed in word images. The segmentation of the remaining word fragments is achieved by maximizing the similarity to the fixed character prototypes. The accuracy of handwritten text recognition with the acquired character images was experimentally evaluated after each stage. Experiments with continuous handwriting recognition show that the application of each stage improves the word recognition accuracy.
International Conference on Artificial Intelligence and Soft Computing | 2006
Jerzy Sas
In this paper, a two-level handwriting recognition concept is presented, in which writer identification is used to increase handwriting recognition accuracy. On the upper level, author identification is performed. The lower level consists of a set of classifiers trained on samples coming from individual writers. The recognition result from the upper level is used on the lower level for selecting or combining classifiers trained for the identified writers. The feature set used on the upper level contains directional features as well as features characteristic of general writing style, such as line spacing, tendency to line skewing and proportions of text line elements, which are usually lost in the typical process of handwritten text normalization. The proposed method can be used in applications where the texts to be recognized come from a relatively small set of known writers.
Computer Recognition Systems | 2013
Jerzy Sas; Andrzej Żołnierek
In the paper, a combined approach to the problem of text region recognition is presented. We focus on the specific case of text extraction from images in which text is superimposed over the graphical layer of vector images (charts, diagrams, etc.). For such images we propose a three-stage method that uses OCR tools as a kind of feedback in the process of text region searching. Some experimental results and examples of practical applications of the recognition method are also briefly described.
Intelligent Systems Design and Applications | 2005
Jerzy Sas; Michal Luzyna
In the paper, a method of combining character classifiers for handprinted text recognition is presented. The combination rule is based on an assessment of the reliability of the member classifiers. The assessment can be based on probabilistic classifier properties, or it can use similarity measures evaluated individually for the character currently being recognized. The approach presented here follows the soft classification paradigm, in which the classifier does not merely select a single class but provides a vector of support values corresponding to character likelihoods. The proposed methods have been tested and compared in recognizing letters from the Polish alphabet, including nine difficult-to-recognize diacritic characters.
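The idea of combining soft classifiers according to their reliability can be illustrated with a small sketch. The weighted-average rule, the names `combine_support` and `best_class`, and the reliability values are assumptions made for illustration only; the abstract does not state the paper's actual combination rule.

```python
def combine_support(support_vectors, reliabilities):
    """Combine per-class support values by a reliability-weighted average.

    support_vectors: one dict per classifier, mapping class label -> support.
    reliabilities:   one non-negative weight per classifier (hypothetical
                     reliability scores, not taken from the paper).
    """
    if len(support_vectors) != len(reliabilities):
        raise ValueError("one reliability value is needed per classifier")
    total = sum(reliabilities)
    if total <= 0:
        raise ValueError("reliabilities must sum to a positive value")
    # Every class mentioned by any classifier; missing support counts as 0.
    classes = set().union(*support_vectors)
    return {
        c: sum(w * sv.get(c, 0.0) for sv, w in zip(support_vectors, reliabilities)) / total
        for c in classes
    }


def best_class(combined):
    """Return the class label with the highest combined support."""
    return max(combined, key=combined.get)


# Two classifiers disagree on whether the character is "a" or "ą".
supports = [{"a": 0.6, "ą": 0.4}, {"a": 0.2, "ą": 0.8}]
trusting_first = combine_support(supports, [0.9, 0.1])
trusting_second = combine_support(supports, [0.1, 0.9])
```

In this sketch the final decision follows whichever classifier is assumed to be more reliable, while the output remains a full support vector rather than a single label.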
Computer Analysis of Images and Patterns | 2007
Jerzy Sas; Urszula Markowska-Kaczmar
In this paper, a method of semi-automatic training set acquisition for character classifiers used in cursive handwriting recognition is described. The training set consists of character samples extracted from a training corpus by segmentation. The method first splits the word images from the corpus into sequences of graphemes. Then, a set of candidate segmentation variants is elicited with an evolutionary algorithm, where a segmentation variant determines the subdivision of the grapheme sequences of words into subsequences corresponding to consecutive letters. Segmentation variants are modeled by a chromosome population. Next, each segmentation variant from the final population is tuned in an iterative process and the best chromosome is selected. Then the character samples resulting from application of the segmentation modeled by the selected chromosome are grouped into sets corresponding to letters of the alphabet. Finally, the most outstanding samples are rejected so as to maximize the accuracy of word recognition obtained with a character classifier trained on the reduced sample set.
Document Engineering | 2006
Grzegorz Godlewski; Maciej Piasecki; Jerzy Sas
In the paper, three-level handwriting recognition using syntactic properties of the language on the upper level is presented. Isolated characters are recognized on the lowest level. The character classification from the lowest level is used in word recognition. Words are recognized using a combined classifier based on a possibly incomplete unigram lexicon. The word classifier builds a ranking of the most likely words. The rankings created for subsequent words are input to the syntactic classifier, which recognizes whole sentences. Here, local syntactic constraints are used to build a syntactically consistent sentence. The method has been applied to the recognition of handwritten medical texts describing fixed aspects of patient treatment. Due to the narrow range of topics covered in the texts and the peculiar style characteristic of physicians, the syntax of the expected sentences is relatively simple, which makes checking the syntactic consistency simpler.
Lecture Notes in Computer Science | 2001
Marek Kurzynski; Edward Puchala; Jerzy Sas
The present paper is devoted to pattern recognition procedures that simultaneously use the information contained in the empirical data (learning set) and a set of expert rules with imprecisely formulated weights understood as conditional probabilities. Adopting the probabilistic model, combined and unified recognition algorithms are derived. In the first approach the algorithm is based simply on both sets of data; in the second, however, one set of data is transformed into the other. The proposed algorithms were applied in practice to the diagnosis of acute renal failure in children. The results obtained have proved their effectiveness in computer-aided medical decision making.
International Journal of Applied Mathematics and Computer Science | 2013
Jerzy Sas; Andrzej Żołnierek
The aim of the work described in this article is to elaborate and experimentally evaluate a consistent method of Language Model (LM) construction for Polish speech recognition. In the proposed method we tried to take into account the features and specific problems experienced in practical applications of speech recognition in the Polish language: rich inflection, a loose word order and the tendency for short word deletion. The LM is created in five stages. Each successive stage takes the model prepared at the previous stage and modifies or extends it so as to improve its properties. At the first stage, typical methods of LM smoothing are used to create the initial model. The four most frequently used methods of LM construction are considered here. At the second stage the model is extended in order to take into account words indirectly co-occurring in the corpus. At the next stage, LM modifications are aimed at reducing short word deletion errors, which occur frequently in Polish speech recognition. The fourth stage extends the model by inserting words that were not observed in the corpus. Finally, the model is modified so as to assure highly accurate recognition of very important utterances. The performance of the methods applied is tested in four language domains.
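As a concrete illustration of what LM smoothing means at the first stage, here is a minimal sketch of a bigram model with add-one (Laplace) smoothing. This is the simplest textbook smoothing method and is used here only as an assumption; the abstract does not name the four methods used in the article. The class name `AddOneBigramModel` and the toy corpus are hypothetical.

```python
from collections import Counter


class AddOneBigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often a token precedes another
        self.bigram_counts = Counter()   # counts of (previous, current) pairs
        self.vocab = set()               # tokens that can be predicted
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
            self.vocab.update(tokens[1:])

    def prob(self, prev, word):
        """P(word | prev) = (c(prev, word) + 1) / (c(prev) + |V|)."""
        v = len(self.vocab)
        return (self.bigram_counts[(prev, word)] + 1) / (self.context_counts[prev] + v)


# Toy corpus (illustrative only).
model = AddOneBigramModel(["ala ma kota", "ala ma psa"])
seen = model.prob("ma", "kota")    # bigram observed in the corpus
unseen = model.prob("ma", "ala")   # bigram never observed, still non-zero
```

Smoothing gives unseen word pairs a small non-zero probability, so the recognizer is not forced to reject word sequences that are merely absent from the training corpus.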
Computer Information Systems and Industrial Management Applications | 2007
Jerzy Sas; Urszula Markowska-Kaczmar
In this paper, the problem of semi-supervised handwriting segmentation into isolated character images is considered. Semi-supervised segmentation means here that the character sequence constituting a word presented in the image is known, but the character boundaries are not given and need to be determined automatically. Semi-supervised word segmentation can be useful in the analytic writer-dependent approach to handwriting recognition, where the training set for a personalized character classifier must be created for each writer from a text corpus consisting of text samples of that individual writer. The method described here first over-segments the word images into sequences of graphemes. Then a subdivision of the grapheme sequences is sought that yields hypothetical character image sets maximizing the average similarity within subsets corresponding to characters from the alphabet. This leads to a combinatorial optimization problem with an enormously large search space. A suboptimal solution of this problem can be found using an evolutionary algorithm. The sample character images extracted in this way can be used to train character classifiers. Some preliminary results of handwriting segmentation are presented in the paper and compared with fully supervised segmentation carried out by a human.
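To give a sense of the search space, the sketch below enumerates all ways of splitting one word's grapheme sequence into consecutive, non-empty groups, one group per known character. The function name `segmentations` and the grapheme labels are hypothetical; the sketch only enumerates candidates for a single word and does not implement the similarity criterion or the evolutionary search mentioned in the abstract.

```python
from itertools import combinations
from math import comb


def segmentations(graphemes, n_chars):
    """Yield every split of `graphemes` into `n_chars` consecutive non-empty groups.

    Requires n_chars >= 1. Each split is defined by choosing n_chars - 1
    cut positions among the len(graphemes) - 1 gaps between graphemes.
    """
    n = len(graphemes)
    for cuts in combinations(range(1, n), n_chars - 1):
        bounds = (0, *cuts, n)
        yield [graphemes[a:b] for a, b in zip(bounds, bounds[1:])]


# A word with 3 known characters over-segmented into 4 graphemes.
variants = list(segmentations(["g1", "g2", "g3", "g4"], 3))

# Number of variants for a single word is C(n - 1, k - 1).
count_for_longer_word = comb(20 - 1, 8 - 1)  # 20 graphemes, 8 characters
```

For one word the number of variants is C(n - 1, k - 1). A segmentation of a whole corpus picks one variant per word, so the counts multiply across words, which is why exhaustive search is impractical.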
Computer Recognition Systems | 2005
Jerzy Sas; Marek Kurzynski
In the paper, a multilevel probabilistic approach to handprinted form recognition is described. Form recognition is decomposed into three levels: character recognition, word recognition and form contents recognition. On the word and form contents levels, probabilistic lexicons are available. The decision on the word level is made using the probabilistic properties of the character classifier and the contents of the probabilistic lexicon. A novel approach to combining these two sources of information about class (word) probabilities is proposed, which is based on lexicons and an accuracy assessment of local character classifiers. Some experimental results and examples of practical applications of the recognition method are also briefly described.
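One simple way to combine a probabilistic lexicon with character classifier outputs is a naive Bayes-style product, sketched below. This is an assumed baseline for illustration, not the novel combination rule proposed in the paper. The function name `word_posteriors`, the floor value and the example numbers are hypothetical.

```python
import math


def word_posteriors(char_supports, lexicon, floor=1e-6):
    """Score lexicon words by prior * product of per-position character supports.

    char_supports: list of dicts, one per character position, mapping
                   character -> support from the character classifier.
    lexicon:       dict mapping word -> prior probability (must be > 0).
    floor:         support assumed for characters the classifier did not list.
    """
    scores = {}
    for word, prior in lexicon.items():
        if len(word) != len(char_supports):
            continue  # only words of matching length are candidates
        log_score = math.log(prior)
        for ch, support in zip(word, char_supports):
            log_score += math.log(support.get(ch, floor))
        scores[word] = math.exp(log_score)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {word: score / total for word, score in scores.items()}


# The lexicon prefers "kot", but the last character looks more like "d".
lexicon = {"kot": 0.7, "kod": 0.3}
char_supports = [{"k": 0.9}, {"o": 0.8}, {"t": 0.2, "d": 0.8}]
posteriors = word_posteriors(char_supports, lexicon)
```

In this example the character evidence outweighs the lexicon prior, so "kod" receives the higher posterior. With weaker character evidence the prior would dominate instead.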