Jaromir Tovarek
Technical University of Ostrava
Publications
Featured research published by Jaromir Tovarek.
The Scientific World Journal | 2015
Pavol Partila; Miroslav Voznak; Jaromir Tovarek
This paper discusses the impact of the classification method and feature selection on speech emotion recognition accuracy. Selecting the right parameters in combination with the classifier is an important part of reducing the computational complexity of the system. This step is especially necessary for systems that will be deployed in real-time applications. Speech emotion recognition systems are being developed and improved because of their wide usability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. The classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured with respect to the selection of prosodic, spectral, and voice quality features. The purpose was to find an optimal combination of methods and feature groups for stress detection in human speech. The research contribution lies in the design of a speech emotion recognition system with regard to its accuracy and efficiency.
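As a rough illustration of how one of the named classifiers can be combined with a chosen feature group, the following minimal sketch implements k-nearest neighbours restricted to a subset of feature columns. The function name `knn_predict`, the `feature_idx` parameter, and the toy values in the usage note are assumptions made for illustration; they are not taken from the paper or from the Berlin database.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3, feature_idx=None):
    """Classify `query` by majority vote among its k nearest training samples.

    train       -- list of (feature_vector, label) pairs
    query       -- feature vector to classify
    feature_idx -- indices of the features to use (hypothetical stand-in for
                   choosing a prosodic, spectral, or voice quality group);
                   None means all features
    """
    idx = list(range(len(query))) if feature_idx is None else list(feature_idx)

    def distance(vector):
        # Euclidean distance computed only over the selected features
        return math.sqrt(sum((vector[i] - query[i]) ** 2 for i in idx))

    nearest = sorted(train, key=lambda sample: distance(sample[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, with toy two-dimensional samples labelled "neutral" and "stress", calling `knn_predict(train, query, k=3, feature_idx=[0])` classifies the query using only the first feature, which is one simple way to compare feature groups under the same classifier.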
Proceedings of SPIE | 2015
Miralem Mehic; Pavol Partila; Jaromir Tovarek; Miroslav Voznak
It is well known that Quantum Key Distribution (QKD) can be used with the highest level of security for distribution of the secret key, which is then used for symmetric encryption. B92 is one of the oldest QKD protocols. It uses only two non-orthogonal states, each one coding for one bit value. It is much faster and simpler than its predecessors, but has an idealized maximum efficiency of 25% over the quantum channel. B92 consists of several phases in which the initial key is significantly reduced: secret key exchange, extraction of the raw key (sifting), error rate estimation, key reconciliation, and privacy amplification. QKD communication is performed over two channels: the quantum channel and the classical public channel. In order to prevent a man-in-the-middle attack and modification of messages on the public channel, the exchanged values must be authenticated. We used Wegman-Carter authentication because it describes an upper bound for the required symmetric authentication key. We explain the reduction of the initial key in each of the QKD phases.
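The 25% figure can be illustrated with a small simulation of the sifting phase only. This is a sketch under stated assumptions: an ideal, noiseless channel with no eavesdropper, and one common encoding convention (bit 0 sent as |0>, bit 1 sent as |+>). The function name `b92_sift` and its parameters are illustrative and are not taken from the paper.

```python
import random


def b92_sift(n_pulses, seed=0):
    """Simulate ideal B92 sifting and return (alice_key, bob_key).

    Assumed convention: Alice sends |0> for bit 0 and |+> for bit 1.
    Bob measures in a randomly chosen basis, "Z" or "X".
    """
    rng = random.Random(seed)
    alice_key, bob_key = [], []
    for _ in range(n_pulses):
        a_bit = rng.randint(0, 1)
        bob_basis = rng.choice("ZX")

        # If the state is an eigenstate of Bob's basis (|0> in Z, |+> in X),
        # the outcome is always the one compatible with both states:
        # the result is inconclusive and the pulse is discarded.
        if (a_bit == 0 and bob_basis == "Z") or (a_bit == 1 and bob_basis == "X"):
            continue

        # Otherwise each outcome occurs with probability 1/2; only the
        # outcome orthogonal to the *other* state is conclusive.
        if rng.random() < 0.5:
            continue

        # Conclusive result: seeing |1> in Z rules out |0>, so the bit is 1;
        # seeing |-> in X rules out |+>, so the bit is 0.
        b_bit = 1 if bob_basis == "Z" else 0
        alice_key.append(a_bit)
        bob_key.append(b_bit)
    return alice_key, bob_key
```

Under these assumptions the sifted keys agree bit for bit, and their length approaches one quarter of the number of pulses sent. The later phases (error rate estimation, reconciliation, privacy amplification, and authentication) shorten the key further and are not modelled here.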
Proceedings of SPIE | 2016
Miralem Mehic; Peppino Fazio; Miroslav Voznak; Pavol Partila; Dan Komosny; Jaromir Tovarek; Z. Chmelikova
A mobile ad hoc network is a collection of mobile nodes which communicate without a fixed backbone or centralized infrastructure. Due to the frequent mobility of nodes, routes connecting two distant nodes may change. Therefore, it is not possible to establish a priori fixed paths for message delivery through the network. Because of its importance, routing is the most studied problem in mobile ad hoc networks. In addition, if Quality of Service (QoS) is demanded, one must guarantee the QoS not only over a single hop but over an entire wireless multi-hop path, which may not be a trivial task. In turn, this requires the propagation of QoS information within the network. The key to the support of QoS reporting is QoS routing, which provides path QoS information at each source. To support QoS for real-time traffic, one needs to know not only the minimum delay on the path to the destination but also the bandwidth available on it. Therefore, throughput, end-to-end delay, and routing overhead are the traditional performance metrics used to evaluate the performance of a routing protocol. To obtain additional information about the link, most link-quality metrics are based on calculating the loss probabilities of links by broadcasting probe packets. In this paper, we address the problem of including multiple routing metrics in existing routing packets that are broadcast through the network. We evaluate the efficiency of this approach with a modified version of the DSDV routing protocol in the ns-3 simulator.
Proceedings of SPIE | 2015
Martin Mikulec; Miroslav Voznak; Marcel Fajkus; Pavol Partila; Jaromir Tovarek; Z. Chmelikova
The paper focuses on building an ad hoc GSM network based on open source software and low-cost hardware. The resulting Base Transceiver Station (BTS) can be deployed and put into operation in a few minutes in a required area to ensure private communication between connected GSM mobile terminals. Convergence between the BTS and other networks is possible through an IP network. The paper tries to define connection parameters that provide sufficient quality of voice service between the GSM network and the IP Multimedia Subsystem (IMS). The paper presents practical results of voice call quality measurement between users inside the BTS mobile network and users inside the IMS network. The calls are simulated by a low-cost embedded solution for speech quality measurement in GSM networks. This tool is under development in our laboratory and allows automatic speech quality measurement of any GSM or UMTS mobile network. The Perceptual Evaluation of Speech Quality method is used to obtain final comparable results. The communication between the BTS and connected networks has to be secured against interception by a third party. The influence of the securing method on quality of service is presented in detail. Apart from the quality of service measurement section, the paper describes the technical requirements for successful interconnection between BTS and IMS networks. The authentication, authorization, and accounting methods for roaming between the BTS and the IMS are presented as well.
Proceedings of SPIE | 2014
Pavol Partila; Miroslav Voznak; Tomáš Peterek; Marek Penhaker; V. Novak; Jaromir Tovarek; Miralem Mehic; L. Vojtech
Emotional states of humans and their impact on physiological and neurological characteristics are discussed in this paper. This problem has been the focus of many research teams. Nowadays, it is necessary to increase the accuracy of methods for obtaining information about correlations between emotional state and physiological changes. To be able to record these changes, we focused on two major emotional states. The studied subjects were psychologically stimulated first to a neutral, calm state and then to a stress state. Electrocardiography, electroencephalography, and blood pressure represented the neurological and physiological samples that were collected during the patients' stimulated conditions. Speech activity was recorded while the patient read a selected text. Features were extracted by speech processing operations. A classifier based on a Gaussian Mixture Model was trained and tested using Mel-Frequency Cepstral Coefficients extracted from the patients' speech. All measurements were performed in an electromagnetic compatibility chamber. The article discusses a method for determining the influence of a stressed emotional state on a person and the resulting physiological and neurological changes.
International Conference on Multimedia Communications | 2017
Filip Rezac; Miroslav Voznak; Jaromir Tovarek; Jerry Chun-Wei Lin
The system whose description and implementation are presented in this article is intended to extend existing emergency-service solutions for distributing voice messages with pre-recorded content, in order to inform target participants about critical events in their surroundings using multimedia tools based on the Session Initiation Protocol. Compared with other forms of distributed notification, transmitting information via a voice channel has the advantage that the target user is forced to pick up the call and listen to the message, so the information cannot be ignored or overlooked. Another benefit is more accurate addressing of target groups, and ultimately it can ensure the re-delivery of messages to users who have not heard them. The authors present their own method of sending pre-recorded messages, which is then tested to determine the performance capabilities of the distribution network. A practical implementation then demonstrates the advantages and use of the proposed solution in a simulated environment.
Proceedings of SPIE | 2017
Jaromir Tovarek; Pavol Partila
This article discusses speaker identification for improving the security of communication between law enforcement units. The main task of this research was to develop a text-independent speaker identification system which can be used for real-time recognition. This system is designed for open-set identification, which means that the unknown speaker can be anyone. The communication itself is secured, but the authorization of the communicating parties has to be checked. We have to decide whether the unknown speaker is authorized for the given action. The calls are recorded by an IP telephony server and the recordings are then evaluated using classification. If the system determines that the speaker is not authorized, it sends a warning message to the administrator. This message can indicate, for example, a stolen phone or another unusual situation. The administrator then performs the appropriate actions. Our proposed system uses a multilayer neural network for classification, consisting of three layers (input layer, hidden layer, and output layer). The number of neurons in the input layer corresponds to the length of the speech feature vector. The output layer then represents the classified speakers. The artificial neural network classifies the speech signal frame by frame, but the final decision is made over the complete recording. This rule substantially increases the accuracy of the classification. The input data for the neural network are thirteen Mel-frequency cepstral coefficients, which describe the behavior of the vocal tract. These parameters are the most commonly used for speaker recognition. Parameters for training, testing, and validation were extracted from recordings of authorized users. Recording conditions for the training data correspond to the real traffic of the system (sampling frequency, bit rate). The main benefit of the research is the developed text-independent speaker identification system, which is applied to secure communication between law enforcement units.
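The step from frame-level classification to a decision over the complete recording can be sketched as follows. The abstract does not state how the frames are combined, so this sketch assumes a simple majority vote, and it assumes that an open-set rejection can be modelled by a minimum vote share. The function name `identify_speaker` and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def identify_speaker(frame_labels, min_share=0.6):
    """Combine per-frame speaker labels into one decision for a recording.

    frame_labels -- speaker label predicted for each frame of the recording
    min_share    -- assumed threshold: the winning speaker must collect at
                    least this fraction of frames, otherwise the speaker is
                    treated as unknown (open-set rejection)

    Returns the winning label, or None for an unknown speaker.
    """
    if not frame_labels:
        return None
    label, count = Counter(frame_labels).most_common(1)[0]
    if count / len(frame_labels) >= min_share:
        return label
    return None
```

A recording whose frames mostly agree on one authorized speaker yields that speaker, while a recording with scattered frame labels yields `None`, which in the described system would correspond to sending a warning to the administrator.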
Proceedings of SPIE | 2016
Pavol Partila; Jaromir Tovarek; Miroslav Voznak
This paper presents a method for detecting speech under stress using Self-Organizing Maps (SOM). Most people who are exposed to stressful situations cannot respond adequately to stimuli. The army, police, and fire departments make up the largest part of the environments typified by an increased number of stressful situations. Personnel in action are directed by the control center. Control commands should be adapted to the psychological state of the person in action. It is known that psychological changes are also reflected physiologically in the human body, which consequently affects speech. Therefore, it is clear that a system for recognizing stress in speech is required in the security forces. One possible classifier, popular for its flexibility, is the self-organizing map, which is one type of artificial neural network. Flexibility means that the classifier is independent of the character of the input data. This feature is suitable for speech processing. Human stress can be seen as a kind of emotional state. Mel-frequency cepstral coefficients, LPC coefficients, and prosody features were selected as input data because of their sensitivity to emotional changes. The parameters were calculated on speech recordings that can be divided into two classes, namely stress state recordings and normal state recordings. The benefit of the experiment is a method using a SOM classifier for stressed speech detection. The results showed the advantage of this method, which is input data flexibility.
Proceedings of SPIE | 2016
Jaromir Tovarek; Pavol Partila; Jan Rozhon; Miroslav Voznak; Jan Skapa; Dominik Uhrin; Z. Chmelikova
This article discusses the impact of multilayer neural network parameters on speaker identification. The main task of speaker identification is to find a specific person in a known set of speakers. This means that the voice of an unknown speaker (the wanted person) belongs to a group of reference speakers from the voice database. One of the requirements was to develop a text-independent system, which means classifying the wanted person regardless of content and language. A multilayer neural network was used for speaker identification in this research. An artificial neural network (ANN) requires parameters to be set, such as the activation function of the neurons, the steepness of the activation functions, the learning rate, the maximum number of iterations, and the number of neurons in the hidden and output layers. ANN accuracy and validation time are directly influenced by the parameter settings. Different tasks require different settings. Identification accuracy and ANN validation time were evaluated with the same input data but different parameter settings. The goal was to find the neural network parameters giving the highest precision and the shortest validation time. The input data of the neural networks are Mel-frequency cepstral coefficients (MFCC). These parameters describe the properties of the vocal tract. Audio samples were recorded for all speakers in a laboratory environment. The data set was split into training, testing, and validation subsets of 70%, 15%, and 15%, respectively. The result of the research described in this article is a set of different parameter settings for the multilayer neural network for four speakers.
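The 70/15/15 partition can be sketched as below. The abstract gives only the proportions, so the random shuffling, the fixed seed, and the function name `split_70_15_15` are assumptions made for this illustration.

```python
import random


def split_70_15_15(samples, seed=0):
    """Shuffle samples and split them into training, testing, and validation sets.

    Integer arithmetic is used for the subset sizes; any remainder left by
    rounding down goes to the validation set.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)

    n_train = len(items) * 70 // 100
    n_test = len(items) * 15 // 100

    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]
    return train, test, validation
```

The three subsets are disjoint and together cover every sample exactly once, so the same input data can be reused across runs with different network parameter settings by keeping the seed fixed.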
Proceedings of SPIE | 2016
Dominik Uhrin; Z. Chmelikova; Jaromir Tovarek; Pavol Partila; Miroslav Voznak
This article describes a system for evaluating the credibility of recordings with emotional character. The sound recordings form a Czech-language database for training and testing speech emotion recognition systems. These systems are designed to detect human emotions from the voice. Knowledge of a person's emotional state is useful in the security forces and in emergency call services. A person in action (a soldier, police officer, or firefighter) is often exposed to stress. Information about the emotional state, obtained from the voice, helps the dispatcher adapt control commands for the intervention procedure. Call agents of an emergency call service must recognize the mental state of the caller in order to adjust the mood of the conversation. In this case, the evaluation of the psychological state is the key factor for a successful intervention. A quality database of sound recordings is essential for creating such systems. Quality databases exist, such as the Berlin Database of Emotional Speech or Humaine. These databases were created by actors in an audio studio, which means that the recordings contain simulated emotions, not real ones. Our research aims to create a database of Czech emotional recordings of real human speech. Collecting sound samples for the database is only one of the tasks. Another, no less important, is to evaluate the significance of the recordings from the perspective of emotional states. The design of a methodology for evaluating the credibility of emotional recordings is described in this article. The results describe the advantages and applicability of the developed method.