Dénes Paczolay
Hungarian Academy of Sciences
Publications
Featured research published by Dénes Paczolay.
International Symposium on Intelligent Systems and Informatics | 2010
Gábor Gosztolya; Dénes Paczolay; László Tóth
Wireless sensors are frequently used to record surrounding speech and send it to a base station. Because they communicate via radio waves, some form of audio compression is important, while their limited RAM and low-capacity CPU restrict the range of methods that can be applied. In this paper we test a number of such methods and show that they can indeed be effective: a 30% bandwidth saving was achieved practically without information loss, and a 50% bandwidth reduction at the cost of some negligible information loss.
Text, Speech and Dialogue | 2003
Dénes Paczolay; András Kocsor; László Tóth
Speaker normalization can significantly improve speech recognition accuracy. One such method, vocal tract length normalization (VTLN), is especially useful when the system has to work reliably for males, females and children. This is exactly the situation with our phonological awareness teaching system, the “SpeechMaster”, which aims at real-time phoneme recognition and feedback. As most VTLN algorithms work off-line, this poses the additional problem of real-time operation. This paper examines how a well-known off-line algorithm can be approximated on-line by machine learning regression techniques. We conclude that, by employing a real-time estimation of VTLN parameters, the recognition error can be reduced by some 14-24%.
International Joint Conference on Computational Cybernetics and Technical Informatics | 2010
Gábor Gosztolya; Dénes Paczolay; László Tóth
Wireless sensors are small-capacity devices with low power consumption. Their capabilities already exceed what is required for telephone-quality audio recording and processing, which invites porting a number of speech processing applications to them. To do so, however, it is vital to adjust the sensitivity of their microphone dynamically, a task known as Automatic Gain Control (AGC). In this study we employed two simple algorithms for this task and show that they can indeed be very effective in distant recording conditions, which matters especially because the task requires techniques with low computational requirements.
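The abstract does not say which two algorithms were used, so the following is only a minimal sketch of one generic, low-cost AGC scheme, not the authors' method. It assumes audio arrives as frames of samples in the range [-1, 1]; the function name `simple_agc` and the parameters `target_peak`, `step`, `min_gain` and `max_gain` are hypothetical and chosen for illustration.

```python
def simple_agc(frames, target_peak=0.5, step=0.1, min_gain=0.1, max_gain=10.0):
    """Frame-wise feedback gain control (illustrative assumption, not the paper's algorithm).

    Each frame is scaled by the current gain. The gain is then nudged down if the
    scaled peak exceeded target_peak, or up if it fell below it.
    """
    gain = 1.0
    scaled_frames = []
    gains = []
    for frame in frames:
        scaled = [gain * sample for sample in frame]
        scaled_frames.append(scaled)
        gains.append(gain)
        # Peak of the scaled frame drives the next gain update.
        peak = max((abs(sample) for sample in scaled), default=0.0)
        if peak > target_peak:
            gain = max(min_gain, gain * (1.0 - step))
        elif peak < target_peak:
            gain = min(max_gain, gain * (1.0 + step))
    return scaled_frames, gains


# Toy input (made up): a quiet, distant speaker followed by a loud one.
quiet_frames = [[0.01, -0.01, 0.005]] * 5
loud_frames = [[0.9, -0.8, 0.7]] * 5
_, quiet_gains = simple_agc(quiet_frames)
_, loud_gains = simple_agc(loud_frames)
```

Only one multiplication per sample and one comparison per frame are needed, which is the kind of cost profile a device with limited CPU and RAM can afford.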
International Journal of Speech Technology | 2006
András Bánhalmi; Dénes Paczolay; László Tóth; András Kocsor
This paper examines the susceptibility of a dictation system to various types of mismatches between the training and testing conditions. With these experiments we intend to find the best training configuration for the system and also to evaluate the efficiency of the speaker adaptation algorithm we use. The paper first presents the components of the dictation system, and then describes a set of training and recognition experiments where we vary the microphones and create gender-dependent and speaker-dependent models. In each case we examine how much the recognition performance can be improved further by speaker adaptation. We conclude that the best and most reliable scores can be obtained by using gender-dependent phone models in combination with speaker adaptation. Speaker adaptation results in great improvements in almost every case. However, our results do not confirm the assumption that the use of one microphone is better than the use of several.
Engineering of Computer Based Systems | 2013
András Bánhalmi; Dénes Paczolay; Ádám Zoltán Végh; Gabor Antal; Vilmos Bilicki
A frequent problem in system development is that various systems with similar functionality have to be integrated. System integration generally means accessing, sharing and exchanging data among applications. This data sharing can be achieved by implementing interfaces for communication and data exchange, and services for all the applications to be integrated. However, this solution requires a large amount of human resources for software development and refactoring. Here, a novel framework is proposed for data integration that makes the process much simpler by generating all the source code needed for querying the data of an integrated system. The proposed framework is based on two ontology mediators, namely, a local ontology generated for the data source to be integrated, and a global or central ontology for the common query interface. This way, the integration process is reduced to finding, automatically or semi-automatically, the mapping between the concepts of the local and global ontologies. Once this semantic matching is performed, all the predefined SPARQL queries of the global ontology are rewritten automatically into queries for the data sources to be integrated. Hence, the differences between the systems to be integrated appear only in the DAO (Data Access Object) layer that is generated according to the ontology alignments. The proposed integration technique does not need any program coding in practice (only visual concept matching has to be done semi-automatically), so the productivity of integration software development can be improved significantly. To demonstrate our concept, the framework is applied to real-world integration tasks. In this paper, the framework, methodology, experiences and some measurements relating to productivity are presented and discussed.
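The query-rewriting step can be pictured with a small sketch. This is not the framework's generated code. It assumes the alignment is available as a plain dictionary from global-ontology terms to local-ontology terms, and that terms appear in queries as prefixed names. The prefixes `global:` and `local:`, the term names, and the function `rewrite_query` are all hypothetical.

```python
import re

# Hypothetical alignment: global-ontology terms -> local-ontology terms.
ALIGNMENT = {
    "global:Customer": "local:Client",
    "global:hasName": "local:clientName",
}


def rewrite_query(query, alignment):
    """Replace each aligned global term in a SPARQL string with its local term.

    Terms without an alignment entry are left untouched.
    """
    # Matches prefixed names such as global:Customer; variables like ?n have no colon.
    prefixed_name = re.compile(r"\b[A-Za-z_][\w-]*:[A-Za-z_][\w-]*")
    return prefixed_name.sub(
        lambda match: alignment.get(match.group(0), match.group(0)), query
    )


GLOBAL_QUERY = "SELECT ?n WHERE { ?c a global:Customer ; global:hasName ?n . }"
LOCAL_QUERY = rewrite_query(GLOBAL_QUERY, ALIGNMENT)
```

A plain string substitution like this ignores SPARQL syntax (string literals, full IRIs, prefix declarations); a real rewriter would operate on a parsed query.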
Text, Speech and Dialogue | 2007
Dénes Paczolay; András Bánhalmi; András Kocsor
Speaker normalization techniques are widely used to improve the accuracy of speaker-independent speech recognition. One of the most popular groups of such methods is Vocal Tract Length Normalization (VTLN). These methods try to reduce the inter-speaker variability by transforming the input feature vectors into a more compact domain, to achieve better separation between the phonetic classes. Among others, two algorithms are commonly applied: the Maximum Likelihood criterion-based and the Linear Discriminant criterion-based normalization algorithms. Here we propose the use of the Springy Discriminant criterion for the normalization task. In addition, we propose a method for VTLN parameter determination that is based on pitch estimation. In the experiments this proves to be an efficient and swift way to initialize the normalization parameters for training, and to estimate them for the voice samples of new test speakers.
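For readers unfamiliar with VTLN, the transformation is commonly realized as a warping of the frequency axis controlled by a single speaker-dependent factor. The sketch below shows one generic piecewise-linear warp. It does not implement the Springy Discriminant criterion or the pitch-based estimation proposed in the paper, and the function name `warp_frequency`, the 8000 Hz band limit and the 0.85 breakpoint ratio are assumptions made for illustration.

```python
def warp_frequency(freq_hz, alpha, f_max=8000.0, breakpoint_ratio=0.85):
    """Piecewise-linear VTLN-style frequency warp (generic illustration).

    Below the breakpoint the frequency is scaled by alpha. Above it, the warp
    follows a straight line that ends at (f_max, f_max), so the band edge is preserved.
    """
    if not 0.0 <= freq_hz <= f_max:
        raise ValueError("freq_hz must lie in [0, f_max]")
    f_break = breakpoint_ratio * f_max
    if alpha * f_break > f_max:
        raise ValueError("alpha too large for this breakpoint")
    if freq_hz <= f_break:
        return alpha * freq_hz
    # Linear interpolation from (f_break, alpha * f_break) to (f_max, f_max).
    fraction = (freq_hz - f_break) / (f_max - f_break)
    return alpha * f_break + fraction * (f_max - alpha * f_break)


# Warp a few example frequencies with a made-up factor of 1.1.
warped = [warp_frequency(f, alpha=1.1) for f in (500.0, 1000.0, 4000.0, 8000.0)]
```

With `alpha` equal to 1.0 the warp is the identity; estimating a good `alpha` per speaker is the part the abstract's criteria and pitch-based method address.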
text speech and dialogue | 2001
András Kocsor; László Tóth; Dénes Paczolay
Acta Cybernetica | 2005
Dénes Paczolay; László Felföldi; András Kocsor
Unknown Journal | 2014
Dénes Paczolay; András Bánhalmi; László G. Nyúl; Vilmos Bilicki; Árpád Sárosi
CLEF (Working Notes) | 2014
András Bánhalmi; Dénes Paczolay; Vilmos Bilicki; László G. Nyúl; Árpád Sárosi