Kai Essig
Bielefeld University
Publications
Featured research published by Kai Essig.
IEEE Transactions on Image Processing | 2000
Chiou-Shann Fuh; Shun-Wen Cho; Kai Essig
In this work, we propose a content-based image retrieval system based on the new idea of combining color segmentation with relationship trees and a corresponding tree-matching method. During segmentation, we retain the hierarchical relationships of the regions in an image. Using the relationships and features of the regions, we can represent the desired objects in images more accurately. In retrieval, we compare not only region features but also region relationships.
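The abstract gives no implementation details; purely as a sketch of the idea, a region hierarchy with a recursive similarity that compares both region features and parent-child relationships (all names, weights, and the greedy matching strategy are hypothetical) might look like this:

```python
from dataclasses import dataclass, field

@dataclass
class Region:
    """A segmented region with a color feature and nested child regions."""
    color: tuple          # e.g. mean (R, G, B) of the region
    children: list = field(default_factory=list)

def color_sim(a: Region, b: Region) -> float:
    """Similarity of two region features in [0, 1] (1 = identical color)."""
    d = sum((x - y) ** 2 for x, y in zip(a.color, b.color)) ** 0.5
    return 1.0 - min(d / 441.673, 1.0)   # 441.673 = max possible RGB distance

def tree_sim(a: Region, b: Region) -> float:
    """Compare region features AND parent-child relationships: each child
    of `a` is greedily matched to its most similar child of `b`."""
    score = color_sim(a, b)
    if a.children and b.children:
        matched = [max(tree_sim(ca, cb) for cb in b.children)
                   for ca in a.children]
        score = 0.5 * score + 0.5 * sum(matched) / len(matched)
    return score
```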
International Journal of Parallel, Emergent and Distributed Systems | 2006
Kai Essig; Marc Pomplun; Helge Ritter
Using eye tracking for the investigation of visual attention has become increasingly popular during the last few decades. Nevertheless, only a small number of eye tracking studies have employed 3D displays, although such displays would closely resemble our natural visual environment. Besides higher cost and effort for the experimental setup, the main reason for the avoidance of 3D displays is the problem of computing a subject's current 3D gaze position based on the measured binocular gaze angles. The geometrical approaches to this problem that have been studied so far involved substantial error in the measurement of 3D gaze trajectories. In order to tackle this problem, we developed an anaglyph-based 3D calibration procedure and used a well-suited type of artificial neural network, a parametrized self-organizing map (PSOM), to estimate the 3D gaze point from a subject's binocular eye-position data. We report an experiment in which the accuracy of the PSOM gaze-point estimation is compared to a geometrical solution. The results show that the neural network approach produces more accurate results than the geometrical method, especially for the depth axis and for distant stimuli.
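The geometrical solution that serves as the baseline here is, in essence, a triangulation of the two eye rays. A minimal sketch of such a geometric estimate (not the PSOM itself, whose details are in the paper) computes the 3D gaze point as the midpoint of the shortest segment between the generally skew left and right gaze rays:

```python
import numpy as np

def gaze_point_3d(p_left, d_left, p_right, d_right):
    """Geometric 3D gaze estimate: midpoint of the shortest segment between
    the (generally skew) left and right eye rays. p_* are eye positions,
    d_* are gaze direction vectors (coordinate frame is up to the caller)."""
    p1, d1 = np.asarray(p_left, float), np.asarray(d_left, float)
    p2, d2 = np.asarray(p_right, float), np.asarray(d_right, float)
    w0 = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b            # near zero if the rays are parallel
    if abs(denom) < 1e-12:
        t1, t2 = 0.0, e / c          # fallback for (near-)parallel rays
    else:
        t1 = (b * e - c * d) / denom
        t2 = (a * e - b * d) / denom
    return ((p1 + t1 * d1) + (p2 + t2 * d2)) / 2.0
```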
Frontiers in Human Neuroscience | 2014
Thomas Schack; Kai Essig; Cornelia Frank; Dirk Koester
Research in sports, dance and rehabilitation has shown that basic action concepts (BACs) are fundamental building blocks of mental action representations. BACs are based on chunked body postures related to common functions for realizing action goals. In this paper, we outline issues in research methodology and an experimental method, the structural dimensional analysis of mental representation (SDA-M), to assess action-relevant representational structures that reflect the organization of BACs. The SDA-M reveals a strong relationship between cognitive representation and performance if complex actions are performed. We show how the SDA-M can improve motor imagery training and how it contributes to our understanding of coaching processes. The SDA-M capitalizes on the objective measurement of individual mental movement representations before training and the integration of these results into the motor imagery training. Such motor imagery training based on mental representations (MTMR) has been applied successfully in professional sports such as golf, volleyball, gymnastics, windsurfing, and recently in the rehabilitation of patients who have suffered a stroke.
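The SDA-M algebra itself is described in the cited methodological work; as an illustrative stand-in only, its final step (grouping BACs by pairwise distance) resembles a standard hierarchical cluster analysis. All names and distances below are made up:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical pairwise distances between 4 basic action concepts (BACs),
# as they might result from the SDA-M splitting procedure (illustrative only).
bacs = ["arm swing", "weight shift", "ball contact", "follow-through"]
dist = np.array([[0.0, 0.3, 0.9, 0.8],
                 [0.3, 0.0, 0.8, 0.9],
                 [0.9, 0.8, 0.0, 0.2],
                 [0.8, 0.9, 0.2, 0.0]])

# Average-linkage hierarchical clustering over the condensed distance matrix.
tree = linkage(squareform(dist), method="average")
labels = fcluster(tree, t=0.5, criterion="distance")
for name, lab in zip(bacs, labels):
    print(f"{name}: cluster {lab}")
```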
eye tracking research & application | 2012
Kai Essig; Daniel Dornbusch; Daniel Prinzhorn; Helge Ritter; Jonathan Maycock; Thomas Schack
We implemented a system, called the VICON-EyeTracking Visualizer, that combines mobile eye tracking data with motion capture data to calculate and visualize the 3D gaze vector within the motion capture co-ordinate system. To ensure that both devices were temporally synchronized, we used software we had previously developed. By placing reflective markers on objects in the scene, their positions become known; spatially synchronizing the eye tracker and the motion capture system then allows us to automatically compute how often and where fixations occur, thus overcoming the time-consuming and error-prone traditional manual annotation process. We evaluated our approach by comparing its outcome for a simple looking task and a more complex grasping task against the average results produced by manual annotation. Preliminary data reveal that the program differed from the average manual annotation results by only approximately 3 percent in the looking task with regard to the number of fixations and the cumulative fixation duration on each point in the scene. For the more complex grasping task, the results depend on object size: for larger objects there was good agreement (differences of less than 16 percent, or 950 ms), but this degraded for smaller objects, where more saccades land near object boundaries. The advantages of our approach are easy user calibration, unrestricted body movements (due to the mobile eye-tracking system), and compatibility with any wearable eye tracker and marker-based motion tracking system. Extending existing approaches, our system can also monitor fixations on moving objects. The automatic analysis of gaze and movement data in complex 3D scenes can be applied to a variety of research domains, e.g., human-computer interaction, virtual reality, or grasping and gesture research.
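The paper does not spell out the fixation-to-object assignment; one plausible sketch, assuming each object is reduced to a single marker-derived 3D position, assigns a fixation to the object whose position lies closest to the gaze ray (the object names and the distance threshold are illustrative):

```python
import numpy as np

def fixated_object(gaze_origin, gaze_dir, objects, max_dist=0.05):
    """Assign a fixation to the object whose marker-derived position lies
    closest to the gaze ray, if within `max_dist` metres. `objects` maps
    object names to 3D positions in the motion capture coordinate system."""
    o = np.asarray(gaze_origin, float)
    d = np.asarray(gaze_dir, float)
    d = d / np.linalg.norm(d)
    best_name, best_dist = None, max_dist
    for name, pos in objects.items():
        v = np.asarray(pos, float) - o
        t = max(v @ d, 0.0)                  # project object onto the ray
        dist = np.linalg.norm(v - t * d)     # perpendicular distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name

# Example: gaze straight ahead from the origin, two tracked objects.
print(fixated_object([0, 0, 0], [0, 0, 1],
                     {"cup": [0.01, 0.0, 0.8], "ball": [0.4, 0.1, 0.9]}))
```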
Frontiers in Psychology | 2011
Pia Knoeferle; Maria Nella Carminati; Dato Abashidze; Kai Essig
Eye-tracking findings suggest that people prefer to ground their spoken language comprehension in recently seen events rather than anticipated future events: when the verb in NP1-VERB-ADV-NP2 sentences was referentially ambiguous between a recently depicted and an equally plausible future clipart action, listeners fixated the target of the recent action more often at the verb than the object that had not yet been acted upon. We examined whether this inspection preference generalizes to real-world events, and whether it is modulated by how often people see recent and future events acted out. In a first eye-tracking study, the experimenter performed an action (e.g., sugaring pancakes), and then a spoken sentence either referred to that action or to an equally plausible future action (e.g., sugaring strawberries). At the verb, people more often inspected the pancakes (the recent target) than the strawberries (the future target), thus replicating the recent-event preference with these real-world actions. Adverb tense, indicating a future versus past event, had no effect on participants' visual attention. In a second study we increased the frequency of future actions such that participants saw future and recent actions equally often. During the verb people mostly inspected the recent action target, but subsequently they began to rely on tense, and anticipated the future target more often for future than for past tense adverbs. A corpus study showed that the verbs and adverbs indicating past versus future actions were equally frequent, suggesting that long-term frequency biases did not cause the recent-event preference. Thus, (a) recent real-world actions can rapidly influence comprehension (as indexed by eye gaze to objects), and (b) people prefer to first inspect a recent action target rather than an object that will soon be acted upon, even when past and future actions occur with equal frequency. A simple frequency-of-experience account cannot accommodate these findings.
PLOS ONE | 2015
Heiko Lex; Kai Essig; Andreas Knoblauch; Thomas Schack
Two core elements for the coordination of different actions in sport are tactical information and knowledge about tactical situations. The current study describes two experiments on the memory structure and cognitive processing of tactical information. Experiment 1 investigated the storage and structuring of team-specific tactics in humans' long-term memory with regard to different expertise levels. Experiment 2 investigated tactical decision-making skills and the corresponding gaze behavior by presenting participants identical match situations in a reaction time task. The results showed that more experienced soccer players, in contrast to less experienced soccer players, possess a functionally organized cognitive representation of team-specific tactics in soccer. Moreover, the more experienced soccer players reacted faster in tactical decisions because they needed fewer fixations of similar duration than less experienced soccer players. Combined, these experiments offer evidence that a functionally organized memory structure leads to a reaction-time and perceptual advantage in tactical decision-making in soccer. The discussion emphasizes theoretical and applied implications of these results.
Pervasive Technologies Related to Assistive Environments | 2017
Jonas Blattgerste; Benjamin Strenge; Patrick Renner; Thies Pfeiffer; Kai Essig
Augmented Reality (AR) is gaining increased attention as a means of providing assistance for different human activities. The suitability of AR depends not only on the respective task, but also, to a high degree, on the respective device. In a standardized assembly task, we tested AR-based in-situ assistance against conventional pictorial instructions using a smartphone, Microsoft HoloLens and Epson Moverio BT-200 smart glasses, as well as paper-based instructions. Participants solved the task fastest using the paper instructions, but made fewer errors with AR assistance on the Microsoft HoloLens smart glasses than with any other system. Methodologically, we propose operational definitions of time segments and other optimizations for the standardized benchmarking of AR assembly instructions.
PLOS ONE | 2016
Andrea Finke; Kai Essig; Giuseppe Marchioro; Helge Ritter
The co-registration of eye tracking and electroencephalography provides a holistic measure of ongoing cognitive processes. Recently, fixation-related potentials have been introduced to quantify the neural activity in such bi-modal recordings. Fixation-related potentials are time-locked to fixation onsets, just like event-related potentials are locked to stimulus onsets. Compared to existing electroencephalography-based brain-machine interfaces that depend on visual stimuli, fixation-related potentials have the advantages that they can be used in free, unconstrained viewing conditions and can also be classified on a single-trial level. Thus, fixation-related potentials have the potential to allow for conceptually different brain-machine interfaces that directly interpret cortical activity related to the visual processing of specific objects. However, existing research has investigated fixation-related potentials only with very restricted and highly unnatural stimuli in simple search tasks while participants' body movements were restricted. We present a study where we relieved many of these restrictions while retaining some control by using a gaze-contingent visual search task. In our study, participants had to find a target object out of 12 complex and everyday objects presented on a screen while the electrical activity of the brain and eye movements were recorded simultaneously. Our results show that our proposed method for the classification of fixation-related potentials can clearly discriminate between fixations on relevant, non-relevant and background areas. Furthermore, we show that our classification approach generalizes not only to different test sets from the same participant, but also across participants. These results promise to open novel avenues for exploiting fixation-related potentials in electroencephalography-based brain-machine interfaces and thus provide a novel means for intuitive human-machine interaction.
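The abstract does not name the classification method; a common choice for single-trial ERP/FRP classification is shrinkage LDA on epochs time-locked to fixation onsets. A hedged sketch with entirely synthetic data (channel count, sampling rate, epoch window, and class labels are illustrative, not taken from the study):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def extract_epochs(eeg, fixation_onsets, sfreq, tmin=0.0, tmax=0.3):
    """Cut EEG (channels x samples) into epochs time-locked to fixation
    onsets (in seconds), analogous to stimulus-locked ERP epoching."""
    n0, n1 = int(tmin * sfreq), int(tmax * sfreq)
    return np.stack([eeg[:, int(t * sfreq) + n0 : int(t * sfreq) + n1]
                     for t in fixation_onsets])

# Synthetic stand-in data: 32-channel EEG, 100 fixations, 3 classes
# (relevant / non-relevant / background), all randomly generated here.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((32, 120 * 100))         # 120 s at 100 Hz
onsets = np.sort(rng.uniform(1, 118, 100))
y = rng.integers(0, 3, 100)

X = extract_epochs(eeg, onsets, sfreq=100).reshape(100, -1)
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
clf.fit(X[:80], y[:80])
print("held-out accuracy:", clf.score(X[80:], y[80:]))  # chance on noise
```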
International Conference on Engineering Applications of Neural Networks | 2013
André Frank Krause; Kai Essig; Martina Piefke; Thomas Schack
While the No-Prop (no backpropagation) algorithm uses the delta rule to train the output layer of a feed-forward network, No-Prop-fast employs fast linear regression learning using the Wiener-Hopf solution. Learning speeds ten times faster can be achieved on large datasets like the MNIST benchmark, compared to one of the fastest backpropagation algorithms known. Additionally, the plain feed-forward network No-Prop-fast can distinguish gaze movements on cartoons with and without text, as well as age-specific attention shifts between text and picture areas, with minimal pre-processing.
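Assuming No-Prop's usual setup of a fixed random hidden layer with only the output weights trained, a minimal sketch of the closed-form (Wiener-Hopf, i.e. regularized least-squares) variant follows; it is not the authors' exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_noprop_fast(X, Y, n_hidden=50, reg=1e-3):
    """Feed-forward net with a FIXED random hidden layer; only the output
    weights are learned, in closed form via regularized least squares
    (the Wiener-Hopf solution) rather than iterative delta-rule updates."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append bias input
    W_in = rng.standard_normal((Xb.shape[1], n_hidden))
    H = np.tanh(Xb @ W_in)                           # hidden activations
    # W_out = (H'H + reg*I)^-1 H'Y  (normal equations, solved stably)
    W_out = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
    return W_in, W_out

def predict(X, W_in, W_out):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.tanh(Xb @ W_in) @ W_out

# Toy usage: XOR with one-hot targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
Y = np.eye(2)[[0, 1, 1, 0]]
W_in, W_out = train_noprop_fast(X, Y)
print(predict(X, W_in, W_out).argmax(axis=1))        # expect [0 1 1 0]
```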
Storage and Retrieval for Image and Video Databases | 1999
Chiou-Yann Tsai; Arbee L. P. Chen; Kai Essig
The amount of pictorial data grows enormously with the expansion of the WWW. Given the large number of images, it is very important for users to be able to retrieve desired images via an efficient and effective mechanism. In this paper we propose two efficient approaches to facilitate image retrieval using a simple method to represent the image content. Each image is partitioned into m × n equal-sized sub-images (blocks). A color that covers a sufficient number of pixels in a block is extracted to represent its content. In the first approach, the image content is represented by the extracted colors of the blocks, and the spatial information of the images is considered in image retrieval. In the second approach, the colors of the blocks in an image are used to extract objects; a block-level process is proposed to perform the region extraction, and the spatial information of the regions is not considered in image retrieval. Our experiments show that these two block-based approaches can speed up image retrieval. Moreover, the two approaches are effective for different requirements of image similarity. Users can choose the proper approach to process their queries based on their similarity requirements.
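As a sketch of the block-based color signature (the paper's exact color quantization and pixel-share threshold are not reproduced; the parameter values below are illustrative):

```python
import numpy as np

def block_color_signature(img, m=4, n=4, levels=4, min_share=0.2):
    """Partition an RGB image (H x W x 3, uint8) into m x n equal-sized
    blocks and, per block, keep every quantized color that covers at
    least `min_share` of the block's pixels. All parameters illustrative."""
    h, w = img.shape[0] // m, img.shape[1] // n
    quant = (img // (256 // levels)).astype(np.uint8)   # coarse color bins
    signature = {}
    for i in range(m):
        for j in range(n):
            block = quant[i * h:(i + 1) * h, j * w:(j + 1) * w]
            colors, counts = np.unique(block.reshape(-1, 3), axis=0,
                                       return_counts=True)
            keep = counts / counts.sum() >= min_share
            signature[(i, j)] = [tuple(c) for c in colors[keep]]
    return signature

# Example: signature of a random 64x64 image; real use would compare
# signatures of a query image against those stored for the database.
print(block_color_signature(np.random.default_rng(0)
                            .integers(0, 256, (64, 64, 3), dtype=np.uint8)))
```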