Network


Latest external collaboration at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Gábor Sörös is active.

Publication


Featured research published by Gábor Sörös.


user interface software and technology | 2014

In-air gestures around unmodified mobile devices

Jie Song; Gábor Sörös; Fabrizio Pece; Sean Ryan Fanello; Shahram Izadi; Cem Keskin; Otmar Hilliges

We present a novel machine-learning-based algorithm extending the interaction space around mobile devices. The technique uses only the RGB camera now commonplace on off-the-shelf mobile devices. Our algorithm robustly recognizes a wide range of in-air gestures, supporting user variation and varying lighting conditions. We demonstrate that our algorithm runs in real time on unmodified mobile devices, including resource-constrained smartphones and smartwatches. Our goal is not to replace the touchscreen as the primary input device, but rather to augment and enrich the existing interaction vocabulary using gestures. While touch input works well for many scenarios, we demonstrate numerous interaction tasks such as mode switches, application and task management, menu selection and certain types of navigation, where such input can be either complemented or better served by in-air gestures. This removes screen real-estate issues on small touchscreens, and allows input to be expanded to the 3D space around the device. We present results for recognition accuracy (93% test and 98% train), impact of memory footprint and other model parameters. Finally, we report results from preliminary user evaluations, discuss advantages and limitations and conclude with directions for future work.
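As a rough illustration of the per-frame classify-then-smooth structure such a recognizer can take, here is a minimal nearest-centroid sketch (the feature vectors, classes, and smoothing window are invented for illustration; the paper's actual classifier is a learned model over RGB frames):

```python
import numpy as np

def train_centroids(features, labels):
    """Average the feature vectors of each gesture class."""
    classes = np.unique(labels)
    return {c: features[labels == c].mean(axis=0) for c in classes}

def classify_frame(centroids, frame_feature):
    """Assign a frame to the gesture class with the nearest centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(centroids[c] - frame_feature))

def smooth_predictions(preds, window=3):
    """Majority vote over a sliding window to suppress per-frame jitter."""
    out = []
    for i in range(len(preds)):
        win = preds[max(0, i - window + 1):i + 1]
        out.append(max(set(win), key=win.count))
    return out
```

The temporal majority vote stands in for whatever smoothing a real pipeline would apply before triggering an interaction.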


human factors in computing systems | 2015

Joint Estimation of 3D Hand Position and Gestures from Monocular Video for Mobile Interaction

Jie Song; Fabrizio Pece; Gábor Sörös; Marion Koelle; Otmar Hilliges

We present a machine learning technique to recognize gestures and estimate metric depth of hands for 3D interaction, relying only on monocular RGB video input. We aim to enable spatial interaction with small, body-worn devices where rich 3D input is desired but the usage of conventional depth sensors is prohibitive due to their power consumption and size. We propose a hybrid classification-regression approach to learn and predict a mapping of RGB colors to absolute, metric depth in real time. We also classify distinct hand gestures, allowing for a variety of 3D interactions. We demonstrate our technique with three mobile interaction scenarios and evaluate the method quantitatively and qualitatively.
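The hybrid classification-regression idea — first pick a coarse bin, then refine with a per-bin regressor — can be sketched on a toy scalar feature (the feature, bin layout, and linear models are illustrative assumptions, not the authors' RGB-to-depth model):

```python
import numpy as np

def fit_hybrid(x, depth, n_bins=2):
    """Fit one linear regressor per coarse bin of a feature x in [0, 1]."""
    bins = np.minimum((x * n_bins).astype(int), n_bins - 1)
    models = []
    for b in range(n_bins):
        xs, ds = x[bins == b], depth[bins == b]
        slope, intercept = np.polyfit(xs, ds, 1)   # least-squares line fit
        models.append((slope, intercept))
    return models

def predict_hybrid(models, x, n_bins=2):
    b = min(int(x * n_bins), n_bins - 1)   # classification step: pick coarse bin
    slope, intercept = models[b]           # regression step: refine within the bin
    return slope * x + intercept
```

The point of the split is that the classifier only has to get the coarse range right, while each regressor works on a narrower, easier sub-problem.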


ubiquitous computing | 2013

Device recognition for intuitive interaction with the web of things

Simon Mayer; Markus Schalch; Marian George; Gábor Sörös

Supporting human users when interacting with smart devices is important to drive the successful adoption of the Internet of Things in people's homes and at their workplaces. In this poster contribution, we present a system that helps users control Web-enabled smart things in their environment. Our approach involves a handheld interaction device that recognizes smart things in its view using state-of-the-art visual object recognition techniques. It then augments the camera feed with appropriate interaction primitives such as knobs or buttons for control, and can also display measured values, for instance, when recognizing a sensor. The interaction primitives are generated from user interface descriptions that are embedded in the Web representations of the smart things. Our prototype implementation achieves frame rates that allow for interactive use of the system by human users, and indeed proved to facilitate the interaction with smart things in a demonstration testbed in our research group.
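A hypothetical sketch of that pipeline: a recognizer names a smart thing, its Web representation supplies a UI description, and the client turns that into on-screen interaction primitives (all identifiers and fields below are invented for illustration, not the system's actual format):

```python
# Stand-in for UI descriptions embedded in the things' Web representations.
UI_DESCRIPTIONS = {
    "lamp":   [{"type": "knob", "property": "brightness"}],
    "sensor": [{"type": "readout", "property": "temperature"}],
}

def primitives_for(recognized_thing):
    """Look up the recognized thing's UI description and build primitives."""
    desc = UI_DESCRIPTIONS.get(recognized_thing, [])
    return [f"{d['type']}:{d['property']}" for d in desc]
```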


mobile and ubiquitous multimedia | 2013

Blur-resistant joint 1D and 2D barcode localization for smartphones

Gábor Sörös; Christian Flörkemeier

With the proliferation of built-in cameras, barcode scanning on smartphones has become widespread in both consumer and enterprise domains. To avoid making the user precisely align the barcode at a dedicated position and angle in the camera image, barcode localization algorithms are necessary that quickly scan the image for possible barcode locations and pass those to the actual barcode decoder. In this paper, we present a barcode localization approach that is orientation, scale, and symbology (1D and 2D) invariant and shows better blur invariance than existing approaches while it operates in real time on a smartphone. Previous approaches focused on selected aspects such as orientation invariance and speed for 1D codes or scale invariance for 2D codes. Our combined method relies on the structure matrix and the saturation from the HSV color system. The comparison with three other real-time barcode localization algorithms shows that our approach outperforms the state of the art with respect to symbology and blur invariance at the expense of a reduced speed.
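The two cues the abstract combines can be sketched with NumPy: a per-pixel structure matrix whose eigenvalue anisotropy separates 1D stripes from isotropic 2D texture, and HSV saturation, which is low on black-and-white codes. The formulas are standard; the specific smoothing, scales, and thresholds of the published algorithm are omitted:

```python
import numpy as np

def structure_matrix(gray):
    """Per-pixel 2x2 structure matrix entries from image gradients."""
    gy, gx = np.gradient(gray.astype(float))   # gradients along rows, cols
    return gx * gx, gx * gy, gy * gy           # Jxx, Jxy, Jyy

def coherence(jxx, jxy, jyy, eps=1e-9):
    """Eigenvalue anisotropy: ~1 for 1D stripe patterns, ~0 for isotropic texture."""
    tr = jxx + jyy
    diff = np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2)  # lambda_max - lambda_min
    return diff / (tr + eps)

def saturation(rgb):
    """HSV saturation; barcodes are near black-and-white, i.e. low saturation."""
    mx = rgb.max(axis=-1).astype(float)
    mn = rgb.min(axis=-1).astype(float)
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-9), 0.0)
```

In practice the structure matrix entries would be box-filtered over a neighborhood before computing coherence, so that the anisotropy describes a region rather than a single pixel.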


international conference on computer vision | 2015

Fine-Grained Product Class Recognition for Assisted Shopping

Marian George; Dejan Mircic; Gábor Sörös; Christian Floerkemeier; Friedemann Mattern

Assistive solutions for a better shopping experience can improve people's quality of life, in particular that of visually impaired shoppers. We present a system that visually recognizes the fine-grained product classes of items on a shopping list, in images of shelves taken with a smartphone in a grocery store. Our system consists of three components: (a) We automatically recognize useful text on product packaging, e.g., product name and brand, and build a mapping of words to product classes based on the large-scale GroceryProducts dataset. When the user populates the shopping list, we automatically infer the product class of each entered word. (b) We perform fine-grained product class recognition when the user is facing a shelf. We discover discriminative patches on product packaging to differentiate between visually similar product classes and to increase the robustness against continuous changes in product design. (c) We continuously improve the recognition accuracy through active learning. Our experiments show the robustness of the proposed method against cross-domain challenges, and the scalability to an increasing number of products with minimal re-training.
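Component (a), mapping shopping-list words to product classes, can be sketched as an inverted index from packaging words to classes (the word lists below are invented; the paper builds this mapping from the GroceryProducts dataset):

```python
from collections import defaultdict

# Hypothetical packaging vocabulary per product class.
PACKAGING_WORDS = {
    "cornflakes": ["corn", "flakes", "cereal"],
    "espresso":   ["coffee", "espresso", "roast"],
}

def build_index(packaging_words):
    """Invert class -> words into word -> set of candidate classes."""
    index = defaultdict(set)
    for cls, words in packaging_words.items():
        for w in words:
            index[w].add(cls)
    return index

def classes_for_entry(index, entry):
    """Infer candidate product classes for one shopping-list word."""
    return sorted(index.get(entry.lower(), set()))
```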


international conference on computer graphics and interactive techniques | 2013

Cyclo: a personal bike coach through the glass

Gábor Sörös; Florian Daiber; Tomer Weller

We present Cyclo, our prototype of a personal assistant for bike training using Google Glass. We describe our requirement study with 35 users and our design process for developing a novel application for Glass. Our hands-free user interface is potentially more convenient to use than traditional speedometers, and it provides instant performance feedback and context-aware notifications overlaid on the biker's view.


international symposium on wearable computers | 2015

Fast blur removal for wearable QR code scanners

Gábor Sörös; Stephan Semmler; Luc Humair; Otmar Hilliges

We present a fast restoration-recognition algorithm for scanning motion-blurred QR codes on handheld and wearable devices. We blindly estimate the blur from the salient edges of the code in an iterative optimization scheme, alternating between image sharpening, blur estimation, and decoding. The restored image is constrained to exploit the properties of QR codes, which ensures fast convergence. The checksum of the code allows early termination when the code first becomes readable and precludes false positive detections. General blur removal algorithms perform poorly in restoring visual codes and are slow even on high-performance PCs. The proposed algorithm achieves good reconstruction quality on QR codes and outperforms existing methods in terms of speed. We present PC and Android implementations of a complete QR scanner and evaluate the algorithm on synthetic and real test images. Our work marks a promising step towards enterprise-grade scan performance on wearable devices.
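The alternating restore-decode loop can be sketched structurally as follows; the sharpening and the decoder below are simplified placeholders (unsharp masking and a toy parity "checksum"), not the paper's blind blur estimation or a real QR decoder:

```python
import numpy as np

def unsharp(img, amount=1.0):
    """Cheap sharpening: img + amount * (img - local 3x3 mean)."""
    k = np.ones((3, 3)) / 9.0
    pad = np.pad(img, 1, mode="edge")
    local = sum(pad[i:i + img.shape[0], j:j + img.shape[1]] * k[i, j]
                for i in range(3) for j in range(3))
    return np.clip(img + amount * (img - local), 0, 255)

def try_decode(img, checksum):
    """Mock decoder: binarize and verify a parity 'checksum' of the bits."""
    bits = (img > 127).astype(int)
    return bits if bits.sum() % 2 == checksum else None

def restore_and_decode(blurred, checksum, max_iters=10):
    """Alternate decode attempts with sharpening; stop as soon as the code reads."""
    img = blurred.astype(float)
    for _ in range(max_iters):
        result = try_decode(img, checksum)
        if result is not None:      # checksum valid: terminate early
            return result
        img = unsharp(img)          # otherwise restore further and retry
    return None
```

The checksum-gated early exit is what keeps such a loop fast: restoration stops at the first readable image rather than running to full convergence.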


international symposium on applied machine intelligence and informatics | 2009

A cognitive robot supervision system

Gábor Sörös; Barna Reskó; Bjørn Solvang; Péter Baranyi

This paper applies new cognitive infocommunication channels in human-machine interaction to develop a new paradigm of robot teaching and supervision. The robot is considered as an unskilled worker who is strong and capable of precise manufacturing. It has a special kind of intelligence but is handicapped in a certain sense, which requires it to be supervised. If people can learn how to communicate with this "new worker", they gain a new, capable "colleague". The goal is that the boss is able to give the daily task to a robot in a similar way as he/she gives jobs to human workers, for example using CAD documentation, gestures and some verbal explanation. This paper presents an industrial robot supervision system inspired by research results of cognitive infocommunication. The operator can steer the remote manipulator by certain gestures using a motion capture suit as an input device. Every gesture has its own meaning, which corresponds to a specific movement of the robot. The manipulator interprets and executes the instructions invoking its on-board artificial intelligence, while feedback through a 3D visualization unit closes the supervisory loop. The system was designed to be independent of the geographical distance between the user and the manipulated environment, allowing control loops to be established across countries and continents. Successful results have been achieved between Norway, France and Hungary.


international conference on mobile systems, applications, and services | 2016

FOCUS: Robust Visual Codes for Everyone

Frederik Hermans; Liam McNamara; Gábor Sörös; Christian Rohner; Thiemo Voigt; Edith C.-H. Ngai

Visual codes are used to embed digital data in physical objects, or they are shown in video sequences to transfer data over screen/camera links. Existing codes either carry limited data to make them robust against a range of channel conditions (e.g., low camera quality or long distances), or they support a high data capacity but only work over a narrow range of channel conditions. We present FOCUS, a new code design that does not require this explicit trade-off between code capacity and the reader's channel quality. Instead, FOCUS builds on concepts from OFDM to encode data at different levels of spatial detail. This enables each reader to decode as much data from a code as its channel quality allows. We build a prototype of FOCUS devices and evaluate it experimentally. Our results show that FOCUS gracefully adapts to the reader's channel, and that it provides a significant performance improvement over recently proposed designs, including Strata and PixNet.
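The OFDM-inspired core idea — placing data on different spatial frequencies so that a poor channel still recovers the low-frequency symbols — can be illustrated by modulating the signs of a few 2D-DFT coefficients (the frequency choices and signaling scheme below are invented for illustration, not the FOCUS design):

```python
import numpy as np

# Carrier frequencies ordered from low (robust) to high (fragile).
FREQS = [(1, 0), (0, 1), (3, 0), (0, 3), (6, 0), (0, 6)]

def encode(bits, size=32):
    """Modulate one bit onto the sign of each carrier coefficient."""
    spec = np.zeros((size, size), dtype=complex)
    for bit, (u, v) in zip(bits, FREQS):
        amp = 1.0 if bit else -1.0
        spec[u, v] = amp
        spec[-u % size, -v % size] = amp   # Hermitian pair -> real image
    return np.fft.ifft2(spec).real

def decode(img):
    """Read the bits back from the signs of the carrier coefficients."""
    spec = np.fft.fft2(img)
    return [int(spec[u, v].real > 0) for (u, v) in FREQS]
```

Under a blurry channel the high-frequency carriers attenuate first, so a reader can stop decoding at whatever level of detail its channel still supports.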


international conference on acoustics, speech, and signal processing | 2014

GPU-accelerated joint 1D and 2D barcode localization on smartphones

Gábor Sörös

The built-in cameras and powerful processors have turned smartphones into ubiquitous barcode scanners. In smartphone-based barcode scanning, barcode localization is an important preprocessing step that quickly scans the entire camera image and passes barcode candidates to the actual decoder. This paper presents the implementation steps of a robust joint 1D and 2D barcode localization algorithm on the mobile GPU. The barcode probability maps are derived from the structure matrix and the color of the individual pixels. The different steps of the localization algorithm are formulated as OpenGL ES 2.0 fragment shaders, and both 1D and 2D barcode saliency maps are computed directly on the graphics hardware. The presented method can detect barcodes at various scales and orientations at 6 frames per second in HD-resolution images on current-generation smartphones.

Collaboration


Dive into Gábor Sörös's collaboration.

Top Co-Authors


Fabrizio Pece

University College London
