Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Michal Hradis is active.

Publication


Featured research published by Michal Hradis.


Eye Tracking Research & Applications | 2012

What do you want to do next: a novel approach for intent prediction in gaze-based interaction

Roman Bednarik; Hana Vrzakova; Michal Hradis

Interaction intent prediction and the Midas touch have been a longstanding challenge for eye-tracking researchers and users of gaze-based interaction. Inspired by machine learning approaches in biometric person authentication, we developed and tested an offline framework for task-independent prediction of interaction intents. We describe the principles of the method, the features extracted, the normalization methods, and the evaluation metrics. We systematically evaluated the proposed approach on an example dataset of gaze-augmented problem-solving sessions. We present results for three normalization methods, different feature sets, and the fusion of multiple feature types. Our results show that an accuracy of up to 76% can be achieved, with an Area Under the Curve of around 80%. We discuss the possibility of applying the results to an online system capable of interaction intent prediction.
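
The abstract does not include code; as a rough, hypothetical sketch of the kind of offline pipeline it outlines, the snippet below aggregates gaze segments into feature vectors, applies z-score normalization, and evaluates a binary intent classifier with ROC-AUC. The specific features, the logistic-regression classifier, and the cross-validation setup are illustrative assumptions, not the authors' exact configuration.

```python
# Hypothetical sketch of offline intent prediction from gaze features.
# The feature set, normalization, and classifier are illustrative
# assumptions, not the configuration used in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def gaze_features(fix_durations_ms, saccade_amplitudes_deg, pupil_sizes):
    """Aggregate one gaze segment into a fixed-length feature vector."""
    return np.array([
        np.mean(fix_durations_ms), np.std(fix_durations_ms),
        np.mean(saccade_amplitudes_deg), np.std(saccade_amplitudes_deg),
        np.mean(pupil_sizes), np.std(pupil_sizes),
    ])


def intent_auc(X, y):
    """Cross-validated ROC-AUC for binary intent labels (1 = interaction intent)."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, scores)
```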


British Machine Vision Conference | 2015

Convolutional Neural Networks for Direct Text Deblurring

Michal Hradis; Jan Kotera; Pavel Zemcik; Filip Sroubek

In this work we address the problem of blind deconvolution and denoising. We focus on the restoration of text documents and we show that this type of highly structured data can be successfully restored by a convolutional neural network. The networks are trained to reconstruct high-quality images directly from blurry inputs without assuming any specific blur and noise models. We demonstrate the performance of the convolutional networks on a large set of text documents and on a combination of realistic defocus and camera-shake blur kernels. On this artificial data, the convolutional networks significantly outperform existing blind deconvolution methods, including those optimized for text, in terms of image quality and OCR accuracy. In fact, the networks outperform even state-of-the-art non-blind methods for anything but the lowest noise levels. The approach is validated on real photos taken by various devices.
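
As an illustration of the direct blurry-to-sharp mapping described above, here is a small PyTorch sketch of a fully convolutional network trained with an L2 reconstruction loss on (blurry, sharp) patch pairs. The layer widths, kernel sizes, and loss are assumptions for illustration; the paper's actual architecture is not reproduced here.

```python
# Hypothetical "blurry patch in, sharp patch out" network; layer widths and
# kernel sizes are illustrative, not the architecture from the paper.
import torch
import torch.nn as nn


class TextDeblurCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4),   # grayscale input
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),   # restored image
        )

    def forward(self, blurry):
        return self.net(blurry)


def train_step(model, optimizer, blurry, sharp):
    """One optimization step on a batch of (blurry, sharp) patch pairs."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(blurry), sharp)
    loss.backward()
    optimizer.step()
    return loss.item()
```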


International Conference on Computer Vision | 2008

Local Rank Patterns - Novel Features for Rapid Object Detection

Michal Hradis; Adam Herout; Pavel Zemcik

This paper presents Local Rank Patterns (LRP), novel features for rapid object detection in images that build on the existing Local Rank Differences (LRD) features. The performance of the novel features is thoroughly tested on a frontal face detection task and compared to that of the LRD and the traditionally used Haar-like features. The results show that LRP surpass the LRD and the Haar-like features in detection precision and in the average number of features needed for classification. Considering recent successful and efficient implementations of LRD on CPU, GPU and FPGA, the results suggest that LRP are a good choice for object detection and could replace the Haar-like features in some applications in the future.
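
The abstract does not define LRP formally. As a schematic of the underlying local-rank idea only, the sketch below ranks the mean intensities of a 3x3 grid of blocks sampled around an image position; rank-based features such as LRD and LRP are built from quantities of this kind. The grid layout and block size are assumptions, not the paper's exact definition.

```python
# Schematic of rank-based local features (not the exact LRP/LRD definition):
# rank of each cell within a 3x3 grid of block means around a position.
import numpy as np


def block_means(image, x, y, block=2):
    """Mean intensities of a 3x3 grid of block-by-block cells centred at (x, y).

    Assumes (x, y) lies far enough from the image border for all cells to fit.
    """
    grid = np.empty((3, 3))
    for gy in range(3):
        for gx in range(3):
            x0 = x + (gx - 1) * block
            y0 = y + (gy - 1) * block
            grid[gy, gx] = image[y0:y0 + block, x0:x0 + block].mean()
    return grid.ravel()


def local_ranks(image, x, y, block=2):
    """Rank (0..8) of every cell of the grid within its neighborhood."""
    values = block_means(image, x, y, block)
    return values.argsort().argsort()
```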


Proceedings of the 4th Workshop on Eye Gaze in Intelligent Human Machine Interaction | 2012

Gaze and conversational engagement in multiparty video conversation: an annotation scheme and classification of high and low levels of engagement

Roman Bednarik; Shahram Eivazi; Michal Hradis

When using a multiparty video-mediated system, interacting participants assume a range of roles and exhibit behaviors according to how engaged in the communication they are. In this paper we focus on the estimation of conversational engagement from the gaze signal. In particular, we present an annotation scheme for conversational engagement and a statistical analysis of gaze behavior across varying levels of engagement, and we classify vectors of computed eye-tracking measures. The results show that in 74% of cases the level of engagement can be correctly classified as either high or low. In addition, we describe the nuances of gaze during distinct levels of engagement.


European Conference on Interactive TV | 2012

A comparative study on distant free-hand pointing

Ondrej Polacek; Martin Klima; Adam J. Sporka; Pavel Zak; Michal Hradis; Pavel Zemcik; Vaclav Prochazka

In this paper we present a comparative study of distant free-hand pointing and an absolute remote pointing device. Unimanual and bimanual interaction were tested, as well as a static reference system (spatial coordinates are fixed in the space in front of the TV) and a novel body-aligned reference system (coordinates are bound to the current position of the user). We conducted a point-and-click experiment with 12 participants. We identified the preferences of left- and right-handed users in terms of hand choice and preferred spatial areas of interaction. In bimanual interaction, the users relied more on the dominant hand, switching hands only when necessary. Even though the remote pointing device was faster than free-hand pointing, it was less accepted, probably due to its low precision.


ACM Multimedia | 2016

Multimodal Emotion Recognition for AVEC 2016 Challenge

Filip Povolny; Pavel Matejka; Michal Hradis; Anna Popková; Lubomir Otrusina; Pavel Smrz; Ian Wood; Cecile Robin; Lori Lamel

This paper describes a system for emotion recognition and its application to the dataset from the AV+EC 2016 Emotion Recognition Challenge. The system was submitted to the AV+EC 2016 evaluation and makes use of all three modalities (audio, video, and physiological data). Our work primarily focused on features derived from audio. The original audio features were complemented with bottleneck features and with text-based emotion recognition, which transcribes the audio with an automatic speech recognition system and applies resources such as word embedding models and sentiment lexicons. Our multimodal fusion reached CCC = 0.855 on the development set for arousal and 0.713 for valence; on the test set, the CCC is 0.719 for arousal and 0.596 for valence.
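
The CCC figures quoted above are values of the Concordance Correlation Coefficient used as the AVEC evaluation metric. For reference, a straightforward NumPy implementation of the standard formula is sketched below; this is the generic metric, not code from the described system.

```python
# Concordance Correlation Coefficient (CCC), the metric quoted in the abstract:
# CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
import numpy as np


def ccc(predictions, labels):
    """CCC between predicted and gold continuous emotion traces."""
    x = np.asarray(predictions, dtype=float)
    y = np.asarray(labels, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mean_x) * (y - mean_y))
    return 2.0 * cov_xy / (x.var() + y.var() + (mean_x - mean_y) ** 2)
```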


International Conference on Image Processing | 2016

CNN for license plate motion deblurring

Pavel Svoboda; Michal Hradis; Lukas Marsik; Pavel Zemcik

In this work we explore the previously proposed approach of direct blind deconvolution and denoising with convolutional neural networks (CNN) in a situation where the blur kernels are partially constrained. We focus on blurred images from a real-life traffic surveillance system, on which we, for the first time, demonstrate that neural networks trained on artificial data provide superior reconstruction quality on real images compared to traditional blind deconvolution methods. The training data is easy to obtain by blurring sharp photos from a target system with a very rough approximation of the expected blur kernels, thereby allowing custom CNNs to be trained for a specific application (image content and blur range). Additionally, we evaluate the behavior and limits of the CNNs with respect to blur direction range and length.
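
To illustrate the training-data generation step described above (blurring sharp photos with a very rough approximation of the expected blur kernels), here is a hedged NumPy/SciPy sketch that builds linear motion-blur kernels of random length and direction, convolves them with sharp images, and adds mild noise. The kernel model, parameter ranges, and noise level are assumptions, not the values used for the traffic-surveillance system.

```python
# Hypothetical generation of (blurred, sharp) training pairs with rough linear
# motion-blur kernels; length/angle ranges and noise level are assumptions.
import numpy as np
from scipy.ndimage import convolve, rotate


def motion_blur_kernel(length, angle_deg):
    """Approximate linear motion blur: a horizontal line rotated by angle_deg."""
    kernel = np.zeros((length, length))
    kernel[length // 2, :] = 1.0
    kernel = rotate(kernel, angle_deg, reshape=False, order=1)
    return kernel / kernel.sum()


def make_training_pair(sharp, rng, max_length=15, noise_sigma=0.01):
    """Blur a sharp image (float array in [0, 1]) with a random kernel.

    `rng` is a numpy.random.Generator, e.g. np.random.default_rng().
    """
    length = int(rng.integers(5, max_length + 1))
    angle = rng.uniform(0.0, 180.0)
    blurred = convolve(sharp, motion_blur_kernel(length, angle), mode="reflect")
    blurred += rng.normal(0.0, noise_sigma, size=sharp.shape)
    return np.clip(blurred, 0.0, 1.0), sharp
```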


Proceedings of the International Workshop on TRECVID Video Summarization | 2007

Video summarization at Brno University of Technology

Vítězslav Beran; Michal Hradis; Adam Herout; Stanislav Sumec; Igor Potucek; Pavel Zemcik; Josef Mlích; Aleš Láník; Petr Chmelař

This paper describes the video summarization system built for the TRECVID 2007 evaluation by the Brno team. Motivations for the system design and its overall structure are described, followed by a more detailed description of the critical parts of the system, namely feature extraction and clustering of frames (shots, sub-shots) in the time domain. Many ideas were not included in the system because of time constraints; those considered promising are stated and briefly described as possible future work. The video summarization results presented in this paper are promising and warrant further investigation, especially since not all of the candidate features and processing methods were implemented in the evaluated system.
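
As a hedged sketch of the "feature extraction and clustering of frames in the time domain" step, the snippet below describes each frame by a simple color histogram and starts a new temporal segment whenever consecutive frame descriptors differ by more than a threshold. Both the descriptor and the threshold are illustrative assumptions, not the components of the evaluated system.

```python
# Illustrative temporal segmentation of frames: a new segment starts when
# consecutive frame descriptors differ by more than a threshold.  The
# color-histogram descriptor and threshold value are assumptions.
import numpy as np


def frame_feature(frame_rgb, bins=8):
    """Normalized per-channel color histogram as a simple frame descriptor."""
    hists = [np.histogram(frame_rgb[..., c], bins=bins, range=(0, 255))[0]
             for c in range(3)]
    feature = np.concatenate(hists).astype(float)
    return feature / feature.sum()


def temporal_segments(features, threshold=0.2):
    """Indices where a new segment starts, based on L1 distance of neighbors."""
    boundaries = [0]
    for i in range(1, len(features)):
        if np.abs(features[i] - features[i - 1]).sum() > threshold:
            boundaries.append(i)
    return boundaries
```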


Eye Tracking Research & Applications | 2012

Voice activity detection from gaze in video mediated communication

Michal Hradis; Shahram Eivazi; Roman Bednarik

This paper discusses estimation of the active speaker in multi-party video-mediated communication from the gaze data of one of the participants. In the explored setting, we predict the voice activity of participants in one room based on gaze recordings of a single participant in another room. The two rooms were connected by high-definition, low-delay audio and video links, and the participants engaged in activities ranging from casual discussion to simple problem-solving games. We treat the task as a classification problem. We evaluate several types of features and parameter settings in the context of a Support Vector Machine classification framework. The results show that, using the proposed approach, the vocal activity of a speaker can be correctly predicted 89% of the time for which gaze data are available.
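
Since the abstract names the Support Vector Machine framework, a minimal scikit-learn sketch of the classification step is given below. The gaze feature vectors (one per time window) and the binary voice-activity labels are assumed to be available, and the kernel and parameter choices are illustrative rather than the settings evaluated in the paper.

```python
# Minimal sketch of the SVM classification step; X holds per-window gaze
# feature vectors, y binary voice-activity labels.  SVM parameters are
# illustrative assumptions, not the evaluated settings.
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def voice_activity_accuracy(X, y):
    """Cross-validated accuracy of predicting voice activity from gaze features."""
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
```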


Advanced Concepts for Intelligent Vision Systems | 2008

Local Rank Differences Image Feature Implemented on GPU

Lukas Polok; Adam Herout; Pavel Zemcik; Michal Hradis; Roman Juránek; Radovan Jošth

A currently popular trend in object detection and pattern recognition is the use of statistical classifiers, namely AdaBoost and its modifications. The speed of these classifiers largely depends on the low-level image features they use: both on the amount of information a feature provides and on the cost of evaluating it. Local Rank Differences (LRD) is an image feature that is an alternative to the commonly used Haar wavelets. It is suitable for implementation in programmable (FPGA) or specialized (ASIC) hardware, but, as this paper shows, it also performs very well on graphics hardware (GPU). The paper discusses the LRD features and their properties, describes an experimental implementation of LRD on graphics hardware, presents empirical performance measurements compared to alternative approaches, offers notes on the practical usage of LRD, and proposes directions for future work.
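
The abstract ties detector speed to how quickly individual image features can be evaluated. As a generic, hedged sketch of how per-window feature responses feed an AdaBoost-style ensemble with early rejection, see below; the weak-learner form, weights, and rejection threshold are assumptions and do not reproduce the authors' detector.

```python
# Generic boosted-detector evaluation over precomputed feature responses with
# early rejection; weak-learner form and thresholds are illustrative only.

def evaluate_window(feature_responses, stumps, reject_threshold=-1.0):
    """Sum weighted decision stumps; reject early if the running score drops
    below reject_threshold (a soft-cascade-style speedup).

    `stumps` is a list of (feature_index, split, alpha) tuples.
    """
    score = 0.0
    for feature_index, split, alpha in stumps:
        vote = 1.0 if feature_responses[feature_index] > split else -1.0
        score += alpha * vote
        if score < reject_threshold:
            return False, score  # early rejection keeps evaluation fast
    return score > 0.0, score
```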

Collaboration


Dive into Michal Hradis's collaborations.

Top Co-Authors

Pavel Zemcik (Brno University of Technology)
Adam Herout (Brno University of Technology)
Roman Juránek (Brno University of Technology)
Ivo Reznícek (Brno University of Technology)
Radovan Jošth (Brno University of Technology)
Vítězslav Beran (Brno University of Technology)
Martin Musil (Brno University of Technology)
Petr Musil (Brno University of Technology)
Roman Bednarik (University of Eastern Finland)
Aleš Láník (Brno University of Technology)