András Bánhalmi
Hungarian Academy of Sciences
Publication
Featured research published by András Bánhalmi.
European Conference on Machine Learning | 2007
András Bánhalmi; András Kocsor; Róbert Busa-Fekete
For One-Class Classification problems several methods have been proposed in the literature. These methods all have the common feature that the decision boundary is learnt by just using a set of the positive examples. Here we propose a method that extends the training set with a counter-example set, which is generated directly using the set of positive examples. Using the extended training set, a binary classifier (here ν-SVM) is applied to separate the positive and the negative points. The results of this novel technique are compared with those of One-Class SVM and the Gaussian Mixture Model on several One-Class Classification tasks.
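A minimal sketch of the counter-example idea described above (the outward scaling rule and the data are illustrative assumptions, not the paper's exact generation procedure): positives are pushed away from their centre to create synthetic negatives, and a binary ν-SVM is then trained on the extended set.

```python
# Sketch: generate counter-examples from the positive set, then train a nu-SVM.
import numpy as np
from sklearn.svm import NuSVC

rng = np.random.default_rng(0)
X_pos = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # stand-in positive set

centre = X_pos.mean(axis=0)
scale = 2.0                                              # assumed outward scaling factor
X_neg = centre + scale * (X_pos - centre)                # synthetic counter-examples

X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])

clf = NuSVC(nu=0.1, kernel="rbf", gamma="scale").fit(X, y)
print(clf.predict(centre.reshape(1, -1)))                # the centre should come out positive
```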
Iberian Conference on Pattern Recognition and Image Analysis | 2009
Gábor Gosztolya; András Bánhalmi; László Tóth
In this paper we focus on the anti-phoneme modelling part of segment-based speech recognition, where we have to distinguish the real phonemes from anything else that may appear (such as parts of phonemes, several consecutive phonemes, and noise). As this has to be done with only samples of the correct phonemes available, it is an example of one-class classification. To solve this problem, first all phonemes are modelled with a number of Gaussian distributions; then the problem is converted into a two-class classification task by generating counter-examples, so that a machine learning algorithm (such as an ANN) can be used to separate the two classes. We tested two such counter-example generation methods: one was a solution specific to the anti-phoneme problem, while the other used a general algorithm. By modifying the latter to reduce its time requirements, we achieved an improvement in the recognition scores of over 60% compared to having no anti-phoneme model at all, and it performed considerably better than the other two methods.
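The sketch below illustrates the general pipeline under assumed data shapes: a GMM models the phoneme samples, unlikely points are kept as counter-examples, and a small ANN separates the two classes. The candidate sampling and likelihood cut-off are hypothetical, not the specific methods compared in the paper.

```python
# Sketch: GMM over phoneme features, likelihood-based counter-examples, ANN separator.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X_phon = rng.normal(size=(500, 13))                       # stand-in phoneme feature vectors

gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(X_phon)

# Counter-examples: random candidates that the GMM scores as unlikely phonemes.
X_cand = rng.uniform(X_phon.min(0) - 2, X_phon.max(0) + 2, size=(2000, 13))
threshold = np.percentile(gmm.score_samples(X_phon), 5)   # assumed likelihood cut-off
X_anti = X_cand[gmm.score_samples(X_cand) < threshold][:500]

X = np.vstack([X_phon, X_anti])
y = np.concatenate([np.ones(len(X_phon)), np.zeros(len(X_anti))])
ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
```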
International Journal of Speech Technology | 2006
András Bánhalmi; Dénes Paczolay; László Tóth; András Kocsor
This paper examines the susceptibility of a dictation system to various types of mismatches between the training and testing conditions. With these experiments we intend to find the best training configuration for the system and also to evaluate the efficiency of the speaker adaptation algorithm we use. The paper first presents the components of the dictation system, and then describes a set of training and recognition experiments where we vary the microphones and create gender-dependent and speaker-dependent models. In each case we examine how much the recognition performance can be improved further by speaker adaptation. We conclude that the best and most reliable scores can be obtained by using gender-dependent phone models in combination with speaker adaptation. Speaker adaptation results in great improvements in almost every case. However, our results do not confirm the assumption that the use of one microphone is better than the use of several.
International Symposium on Bioinformatics Research and Applications | 2009
András Bánhalmi; Róbert Busa-Fekete; Balázs Kégl
The One-Class Classification (OCC) approach is based on the assumption that, in the training phase, samples are available only from a target class. OCC methods have been applied with success to problems where the classes are very different in size. As class-imbalance problems are typical in protein classification tasks, we were interested in testing one-class classification algorithms for the detection of distant similarities in protein sequences and structures. We found that the OCC approach brought about a small improvement in classification performance compared to binary classifiers (SVM, ANN, Random Forest). More importantly, there is a substantial (50- to 100-fold) improvement in the training time. OCCs may provide an especially useful alternative for processing those protein groups where discriminative classifiers cannot be easily trained.
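A rough illustration of the training-time argument on synthetic, imbalanced data (the class sizes, features, and classifiers are assumptions, not the paper's protein datasets): the one-class model is fitted on the small target class only, while the binary SVM must see every sample.

```python
# Sketch: one-class training on the target class vs. binary training on everything.
import time
import numpy as np
from sklearn.svm import OneClassSVM, SVC

rng = np.random.default_rng(2)
X_target = rng.normal(0, 1, size=(300, 50))              # small target group
X_rest = rng.normal(1, 2, size=(5000, 50))               # large "everything else" class

t0 = time.time()
occ = OneClassSVM(nu=0.1, gamma="scale").fit(X_target)   # trains on the target class only
t_occ = time.time() - t0

t0 = time.time()
X = np.vstack([X_target, X_rest])
y = np.concatenate([np.ones(len(X_target)), -np.ones(len(X_rest))])
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)         # needs all samples
t_bin = time.time() - t0

print(f"one-class training: {t_occ:.2f}s, binary training: {t_bin:.2f}s")
```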
Nordic Signal Processing Symposium | 2006
András Bánhalmi; András Kocsor; Kornél L. Kovács; László Tóth
Several pitch estimation algorithms have been proposed over the decades, but they have tended to become more and more complex and cumbersome, some of them requiring much more computational power than a real-time application can afford. Rather than have one sophisticated algorithm, here we propose to combine the output of several conventional and relatively simple algorithms by various dedicated combination schemes. These combination methods perform a kind of weighted majority voting that helps find the correct solution when just a few of the basic algorithms go wrong. For testing purposes we compare the performance of the methods on a pitch-annotated corpus. The results show that with the combination schemes the number of errors can be reduced by about 20-35% relative to the error of the best individual estimator.
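The following sketch shows one plausible form of such a weighted majority vote: each basic estimator casts a vote with a fixed reliability weight, and candidates within a small tolerance of one another pool their votes. The weights and tolerance are illustrative assumptions, not the paper's combination schemes.

```python
# Sketch: weighted majority voting over per-frame pitch estimates.
import numpy as np

def combine_pitch(estimates, weights, tol_cents=50.0):
    """Return the estimate with the largest weighted support."""
    estimates = np.asarray(estimates, dtype=float)
    weights = np.asarray(weights, dtype=float)
    support = []
    for f0 in estimates:
        # votes from all estimates within tol_cents of this candidate
        cents = 1200.0 * np.abs(np.log2(estimates / f0))
        support.append(weights[cents <= tol_cents].sum())
    return estimates[int(np.argmax(support))]

# e.g. autocorrelation, cepstrum and AMDF estimates for one frame (Hz)
print(combine_pitch([120.5, 121.0, 242.0], weights=[1.0, 0.8, 0.6]))
```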
Journal of Healthcare Engineering | 2018
András Bánhalmi; János Borbás; Márta Fidrich; Vilmos Bilicki; Zoltán Gingl; László Rudas
Background: Heart rate variability (HRV) provides information about the activity of the autonomic nervous system. Because of the small amount of data collected, the importance of HRV has not yet been proven in clinical practice. Smartphone applications leveraging photoplethysmography (PPG), combined with some medical knowledge, could provide the means to collect population-level data. Objective: To assess the capabilities of our smartphone application, we compared PPG-based pulse rate variability (PRV) with ECG-based HRV. As a baseline, we also compared the differences among ECG channels. Method: We took fifty parallel measurements with an iPhone 6 at a 240 Hz sampling frequency and a Cardiax PC-ECG device. The correspondence between the PRV and HRV indices was investigated using correlation, linear regression, and Bland-Altman analysis. Results: The PPG accuracy was high: the deviation between PPG and ECG is comparable to that between ECG channels. The mean deviations between PPG-ECG and between the two ECG channels were RR: 0.01 ms and 0.06 ms, SDNN: 0.78 ms and 0.46 ms, RMSSD: 1.79 ms and 1.21 ms, and pNN50: 2.43% and 1.63%. Conclusions: Our iPhone application yielded good results for PPG-based PRV indices compared to ECG-based HRV indices and to the differences among ECG channels. We plan to extend our results on the PPG-ECG correspondence with a deeper analysis of the different ECG channels.
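For reference, the time-domain indices reported above (mean RR, SDNN, RMSSD, pNN50) can be computed from a sequence of RR or pulse-to-pulse intervals as in this sketch; the interval values below are made up.

```python
# Sketch: standard time-domain HRV/PRV indices from intervals given in milliseconds.
import numpy as np

def hrv_indices(rr_ms):
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    return {
        "mean_RR": rr.mean(),                              # ms
        "SDNN": rr.std(ddof=1),                            # ms
        "RMSSD": np.sqrt(np.mean(diff ** 2)),              # ms
        "pNN50": 100.0 * np.mean(np.abs(diff) > 50.0),     # %
    }

rr_example = [812, 798, 805, 840, 795, 810, 870, 802]      # hypothetical intervals
print(hrv_indices(rr_example))
```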
Engineering of Computer Based Systems | 2013
András Bánhalmi; Dénes Paczolay; Ádám Zoltán Végh; Gábor Antal; Vilmos Bilicki
A frequent problem in system development is that various systems with similar functionality have to be integrated. System integration generally means accessing, sharing and exchanging data among applications. This data sharing can be achieved by implementing interfaces for communication and data exchange, and services for all the applications to be integrated. However, this solution requires a large amount of human resources for software development and refactoring. Here, a novel framework is proposed for data integration that makes the process much simpler by generating all the source code needed for querying the data of an integrated system. The proposed framework is based on two ontology mediators, namely a local ontology generated for the data source to be integrated, and a global or central ontology for the common query interface. This way, the integration process is reduced to finding, automatically or semi-automatically, the mapping between the concepts of the local and global ontologies. Once this semantic matching is performed, all the predefined SPARQL queries of the global ontology are rewritten automatically into queries against the data sources to be integrated. Hence, the differences between the systems to be integrated appear only in the DAO (Data Access Object) layer, which is generated according to the ontology alignments. The proposed integration technique needs practically no program coding (only the visual concept matching has to be done semi-automatically), so the productivity of integration software development can be improved significantly. To demonstrate our concept, the framework is applied to real-world integration tasks. In this paper, the framework, the methodology, our experiences and some measurements relating to productivity are presented and discussed.
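A toy illustration of the mapping-driven rewriting step: once the concept alignment between the global and local ontologies is known, a predefined global SPARQL query can be rewritten in terms of the local ontology. The prefixes, concept names, and the naive string substitution below are assumptions for illustration only; the actual framework works on ontology alignments and generates the DAO layer from them.

```python
# Toy sketch: rewrite a global-ontology SPARQL query using a concept mapping.
GLOBAL_TO_LOCAL = {
    "global:Patient": "local:Person",
    "global:hasDiagnosis": "local:diagnosedWith",
}

GLOBAL_QUERY = """
SELECT ?p ?d WHERE {
  ?p a global:Patient .
  ?p global:hasDiagnosis ?d .
}
"""

def rewrite(query, mapping):
    # Replace each global concept with its aligned local counterpart.
    for global_term, local_term in mapping.items():
        query = query.replace(global_term, local_term)
    return query

print(rewrite(GLOBAL_QUERY, GLOBAL_TO_LOCAL))
```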
Text, Speech and Dialogue | 2007
András Bánhalmi; Róbert Busa-Fekete; András Kocsor
When training speaker-independent HMM-based acoustic models, a large amount of manually transcribed acoustic training data must be available from many different speakers. These training databases show great variation in speaker pitch, articulation and speaking rate. In practice, the speaker-independent models are used for bootstrapping the speaker-dependent models built by speaker adaptation methods. Thus the performance of the adaptation methods is strongly influenced by the performance of the speaker-independent model and by the accuracy of the automatic segmentation, which also depends on the base model. The performance of the speaker-independent models can also vary a great deal across test speakers. Here our goal is to reduce this performance variability by raising the performance for speakers with low scores, at the price of allowing a small drop in the highest scores. For this purpose we propose a new method for the automatic retraining of speaker-independent HMMs.
Text, Speech and Dialogue | 2007
Dénes Paczolay; András Bánhalmi; András Kocsor
Speaker normalization techniques are widely used to improve the accuracy of speaker-independent speech recognition. One of the most popular groups of such methods is Vocal Tract Length Normalization (VTLN). These methods try to reduce the inter-speaker variability by transforming the input feature vectors into a more compact domain, to achieve better separation between the phonetic classes. Among others, two algorithms are commonly applied: the Maximum Likelihood criterion-based and the Linear Discriminant criterion-based normalization algorithms. Here we propose the use of the Springy Discriminant criterion for the normalization task. In addition, we propose a method for determining the VTLN parameter that is based on pitch estimation. In the experiments this proves to be an efficient and swift way to initialize the normalization parameters for training, and to estimate them for the voice samples of new test speakers.
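As a rough illustration, the sketch below applies the standard piecewise-linear VTLN frequency warp with a warp factor initialised from the speaker's mean pitch. The pitch-to-warp-factor rule and all constants are assumptions for illustration, not the estimation method proposed in the paper.

```python
# Sketch: piecewise-linear VTLN warp with a pitch-based warp-factor initialisation.
import numpy as np

def warp_frequency(f, alpha, f_nyquist=8000.0, f_cut=0.85 * 8000.0):
    """Scale frequencies by alpha below f_cut, then map linearly up to Nyquist."""
    f = np.asarray(f, dtype=float)
    return np.where(
        f <= f_cut,
        alpha * f,
        alpha * f_cut + (f_nyquist - alpha * f_cut) * (f - f_cut) / (f_nyquist - f_cut),
    )

def alpha_from_pitch(mean_f0, ref_f0=150.0, sensitivity=0.3):
    """Hypothetical rule: speakers with higher mean pitch get a larger warp factor."""
    return float(np.clip(1.0 + sensitivity * (mean_f0 - ref_f0) / ref_f0, 0.8, 1.2))

alpha = alpha_from_pitch(mean_f0=210.0)
print(alpha, warp_frequency([500.0, 3000.0, 7500.0], alpha))
```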
Text, Speech and Dialogue | 2007
András Kocsor; Róbert Busa-Fekete; András Bánhalmi
It is quite common to use feature extraction methods prior to classification. Here we deal with three algorithms that define uncorrelated features. The first one is the so-called whitening method, which transforms the data so that the covariance matrix becomes an identity matrix. The second method, the well-known Fast Independent Component Analysis (FastICA), searches for orthogonal directions along which the value of the non-Gaussianity measure is large in the whitened data space. The third one, Whitening-based Springy Discriminant Analysis (WSDA), is a novel combination of methods which provides orthogonal directions for better class separation. We compare the effects of the above methods on a real-time vowel classification task. Based on the results we conclude that the WSDA transformation is especially suitable for this task.
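A short sketch of the shared whitening step, assuming toy data: the data are transformed so that their covariance matrix becomes the identity (one standard ZCA-style recipe is shown here).

```python
# Sketch: whiten data via the eigendecomposition of its covariance matrix.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 4)) @ rng.normal(size=(4, 4))    # correlated toy data

Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T   # ZCA whitening matrix
X_white = Xc @ W

# After whitening, the covariance is the identity matrix (up to numerical error).
print(np.allclose(np.cov(X_white, rowvar=False), np.eye(4), atol=1e-8))
```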