
Publication


Featured research published by Srinath Sridhar.


international conference on computer vision | 2013

Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data

Srinath Sridhar; Antti Oulasvirta; Christian Theobalt

Tracking the articulated 3D motion of the hand has important applications, for example, in human-computer interaction and teleoperation. We present a novel method that can capture a broad range of articulated hand motions at interactive rates. Our hybrid approach combines, in a voting scheme, a discriminative, part-based pose retrieval method with a generative pose estimation method based on local optimization. Color information from a multi-view RGB camera setup, along with a person-specific hand model, is used by the generative method to find the pose that best explains the observed images. In parallel, our discriminative pose estimation method uses fingertips detected on depth data to estimate a complete or partial pose of the hand by adopting a part-based pose retrieval strategy. This part-based strategy helps reduce the search space drastically in comparison to a global pose retrieval strategy. Quantitative results show that our method achieves state-of-the-art accuracy on challenging sequences and near-real-time performance of 10 fps on a desktop computer.
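The voting idea can be illustrated with a toy fusion step. The fixed blend weight, the score normalization, and the function names below are assumptions for illustration, not the paper's actual scheme:

```python
import numpy as np

def fuse_pose_hypotheses(generative_pose, retrieved_poses, retrieval_scores,
                         blend=0.5):
    """Blend a generative estimate with part-based retrieval candidates.

    generative_pose  : (D,) pose vector from local optimization
    retrieved_poses  : (K, D) candidate poses from part-based retrieval
    retrieval_scores : (K,) confidences, e.g. fingertip match quality
    """
    w = np.asarray(retrieval_scores, dtype=float)
    w /= w.sum()
    retrieval_vote = w @ np.asarray(retrieved_poses)   # weighted average candidate
    # Fixed blend between the two hypotheses; a real system would choose the
    # weight per frame from the respective model-fitting errors.
    return blend * np.asarray(generative_pose) + (1.0 - blend) * retrieval_vote
```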


computer vision and pattern recognition | 2015

Fast and robust hand tracking using detection-guided optimization

Srinath Sridhar; Franziska Mueller; Antti Oulasvirta; Christian Theobalt

Markerless tracking of hands and fingers is a promising enabler for human-computer interaction. However, adoption has been limited because of tracking inaccuracies, incomplete coverage of motions, low framerate, complex camera setups, and high computational requirements. In this paper, we present a fast method for accurately tracking rapid and complex articulations of the hand using a single depth camera. Our algorithm uses a novel detection-guided optimization strategy that increases the robustness and speed of pose estimation. In the detection step, a randomized decision forest classifies pixels into parts of the hand. In the optimization step, a novel objective function combines the detected part labels and a Gaussian mixture representation of the depth to estimate a pose that best fits the depth. Our approach requires comparatively few computational resources, which makes it extremely fast (50 fps without GPU support). The approach also supports varying static or moving camera-to-scene arrangements. We show the benefits of our method by evaluating on public datasets and comparing against previous work.
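A minimal sketch of what a detection-guided objective of this kind might look like, assuming the hand model exposes one Gaussian per part; the interface, the isotropic covariances, and the label weight are illustrative assumptions rather than the paper's energy:

```python
import numpy as np

def detection_guided_energy(depth_points, point_labels, hand_gaussians,
                            label_weight=0.5):
    """Toy objective: reward depth points explained by the hand model, with an
    extra bonus when a point's detected part label matches the model part.

    depth_points   : (N, 3) back-projected depth pixels
    point_labels   : (N,) part id per pixel from the decision forest
    hand_gaussians : iterable of (mean(3,), sigma, part_id) for the posed model
    """
    energy = 0.0
    for mean, sigma, part_id in hand_gaussians:
        d2 = np.sum((depth_points - mean) ** 2, axis=1)
        resp = np.exp(-d2 / (2.0 * sigma ** 2))                    # soft point-to-part assignment
        energy -= resp.sum()                                       # depth-fit term
        energy -= label_weight * resp[point_labels == part_id].sum()  # label-consistency term
    return energy
```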


international conference on computer graphics and interactive techniques | 2017

VNect: real-time 3D human pose estimation with a single RGB camera

Dushyant Mehta; Srinath Sridhar; Oleksandr Sotnychenko; Helge Rhodin; Mohammad Shafiei; Hans-Peter Seidel; Weipeng Xu; Dan Casas; Christian Theobalt

We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control---thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our methods accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e., it works for outdoor scenes, community videos, and low quality commodity RGB cameras.
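One way to picture the kinematic-fitting stage is as a least-squares problem over joint angles that balances the CNN's 2D and 3D joint predictions with temporal smoothness. The sketch below assumes hypothetical forward_kinematics and project callables and made-up term weights; it is not VNect's actual energy:

```python
import numpy as np
from scipy.optimize import least_squares

def fit_skeleton(theta0, joints_2d, joints_3d_rel, forward_kinematics, project,
                 prev_theta=None, w_2d=1.0, w_3d=1.0, w_smooth=0.1):
    """Fit joint angles to per-frame CNN joint predictions (illustrative only).

    forward_kinematics(theta) -> (J, 3) joint positions in camera space
    project(points_3d)        -> (J, 2) image-plane projections
    joints_2d                 : (J, 2) CNN-predicted 2D joints
    joints_3d_rel             : (J, 3) CNN-predicted root-relative 3D joints
    """
    def residuals(theta):
        p3d = forward_kinematics(theta)
        terms = [w_2d * (project(p3d) - joints_2d).ravel(),
                 w_3d * ((p3d - p3d[0]) - joints_3d_rel).ravel()]
        if prev_theta is not None:                     # temporal smoothness prior
            terms.append(w_smooth * (theta - prev_theta))
        return np.concatenate(terms)

    return least_squares(residuals, theta0).x
```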


human factors in computing systems | 2015

Investigating the Dexterity of Multi-Finger Input for Mid-Air Text Entry

Srinath Sridhar; Anna Maria Feit; Christian Theobalt; Antti Oulasvirta

This paper investigates an emerging input method enabled by progress in hand tracking: input by free motion of fingers. The method is expressive, potentially fast, and usable across many settings as it does not insist on physical contact or visual feedback. Our goal is to inform the design of high-performance input methods by providing detailed analysis of the performance and anatomical characteristics of finger motion. We conducted an experiment using a commercially available sensor to report on the speed, accuracy, individuation, movement ranges, and individual differences of each finger. Findings show differences of up to 50% in movement times and provide indices quantifying the individuation of single fingers. We apply our findings to text entry by computational optimization of multi-finger gestures in mid-air. To this end, we define a novel objective function that considers performance, anatomical factors, and learnability. First investigations of one optimization case show entry rates of 22 words per minute (WPM). We conclude with a critical discussion of the limitations posed by human factors and performance characteristics of existing markerless hand trackers.
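As a rough sketch of how an objective over letter-to-gesture assignments might weigh performance, anatomical factors, and learnability; the term definitions, dictionaries, and weights here are placeholders, not the paper's formulation:

```python
def mapping_cost(assignment, move_time, strain, familiarity,
                 w_perf=1.0, w_anat=0.5, w_learn=0.5):
    """Score a letter-to-gesture mapping as a weighted sum (lower is better).

    assignment  : dict letter -> gesture id
    move_time   : dict gesture id -> expected movement time in seconds
    strain      : dict gesture id -> anatomical discomfort penalty
    familiarity : dict letter -> how intuitive the assigned gesture is (0..1)
    """
    perf = sum(move_time[g] for g in assignment.values())     # performance term
    anat = sum(strain[g] for g in assignment.values())        # anatomical term
    learn = sum(1.0 - familiarity[l] for l in assignment)     # learnability term
    return w_perf * perf + w_anat * anat + w_learn * learn
```

An optimizer (e.g. simulated annealing over candidate assignments) would then search for the mapping minimizing this cost.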


international conference on 3d vision | 2014

Real-Time Hand Tracking Using a Sum of Anisotropic Gaussians Model

Srinath Sridhar; Helge Rhodin; Hans-Peter Seidel; Antti Oulasvirta; Christian Theobalt

Real-time marker-less hand tracking is of increasing importance in human-computer interaction. Robust and accurate tracking of arbitrary hand motion is a challenging problem due to the many degrees of freedom, frequent self-occlusions, fast motions, and uniform skin color. In this paper, we propose a new approach that tracks the full skeleton motion of the hand from multiple RGB cameras in real-time. The main contributions include a new generative tracking method which employs an implicit hand shape representation based on a Sum of Anisotropic Gaussians (SAG), and a pose fitting energy that is smooth and analytically differentiable, making fast gradient-based pose optimization possible. This shape representation, together with a full perspective projection model, enables more accurate hand modeling than a related baseline method from the literature. Our method achieves better accuracy than previous methods and runs at 25 fps. We show these improvements both qualitatively and quantitatively on publicly available datasets.
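The key property of such a representation is smoothness: a fitting score built from sums of exponentials is differentiable in the Gaussian parameters, which is what permits gradient-based optimization. A minimal sketch, with invented function names and an unnormalized score that is not the paper's energy:

```python
import numpy as np

def anisotropic_gaussian(points, mean, cov):
    """Evaluate an unnormalized anisotropic 3D Gaussian at a set of points."""
    diff = points - mean
    inv_cov = np.linalg.inv(cov)
    return np.exp(-0.5 * np.einsum('ni,ij,nj->n', diff, inv_cov, diff))

def sag_fit_score(model_gaussians, observed_points):
    """Toy smooth fitting score: each observed 3D point contributes according
    to how well it falls inside some Gaussian of the posed hand model."""
    score = 0.0
    for mean, cov in model_gaussians:
        score += anisotropic_gaussian(observed_points, mean, cov).sum()
    return score
```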


international symposium on mixed and augmented reality | 2013

User-centered perspectives for automotive augmented reality

Victor Ng-Thow-Hing; Karlin Bark; Lee Beckwith; Cuong Tran; Rishabh Bhandari; Srinath Sridhar

Augmented reality (AR) in automobiles has the potential to significantly alter the driver's user experience. Prototypes developed in academia and industry demonstrate a range of applications from advanced driver assist systems to location-based information services. A user-centered process for creating and evaluating designs for AR displays in automobiles helps to explore what collaborative role AR should serve between the technologies of the automobile and the driver. In particular, we consider the nature of this role along three important perspectives: understanding human perception, understanding distraction and understanding human behavior. We argue that AR applications should focus solely on tasks that involve the immediate local driving environment, and not secondary task spaces, to minimize driver distraction. Consistent depth cues should be supported by the technology to aid proper distance judgement. Driving aids supporting situation awareness should be designed with knowledge of current and future states of road users, while focusing on specific problems. Designs must also take into account behavioral phenomena such as risk compensation, inattentional blindness and an over-reliance on augmented technology in driving decisions.


european conference on computer vision | 2016

Real-Time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

Srinath Sridhar; Franziska Mueller; Michael Zollhöfer; Dan Casas; Antti Oulasvirta; Christian Theobalt

Real-time simultaneous tracking of hands manipulating and interacting with external objects has many potential applications in augmented reality, tangible computing, and wearable computing. However, due to difficult occlusions, fast motions, and uniform hand appearance, jointly tracking hand and object pose is more challenging than tracking either of the two separately. Many previous approaches resort to complex multi-camera setups to remedy the occlusion problem and often employ expensive segmentation and optimization steps, which makes real-time tracking impossible. In this paper, we propose a real-time solution that uses a single commodity RGB-D camera. The core of our approach is a 3D articulated Gaussian mixture alignment strategy tailored to hand-object tracking that allows fast pose optimization. The alignment energy uses novel regularizers to address occlusions and hand-object contacts. For added robustness, we guide the optimization with discriminative part classification of the hand and segmentation of the object. We conducted extensive experiments on several existing datasets and introduce a new annotated hand-object dataset. Quantitative and qualitative results show the key advantages of our method: speed, accuracy, and robustness.
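To give a flavor of what a contact regularizer in such an energy could do, here is a toy term that pulls near-contact fingertips onto the object surface; the threshold, the surface sampling, and the quadratic penalty are assumptions, not the paper's regularizer:

```python
import numpy as np

def contact_regularizer(fingertips, object_points, contact_dist=0.01):
    """Toy contact prior: when a fingertip is very close to the object surface,
    penalize the remaining gap so the optimizer snaps it onto the surface;
    fingertips far from the object are left untouched. Distances in meters.

    fingertips    : (F, 3) fingertip positions of the posed hand model
    object_points : (M, 3) sampled points on the tracked object surface
    """
    penalty = 0.0
    for tip in fingertips:
        gap = np.min(np.linalg.norm(object_points - tip, axis=1))
        if gap < contact_dist:
            penalty += gap ** 2
    return penalty
```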


international conference on computer vision | 2017

Real-Time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

Franziska Mueller; Dushyant Mehta; Oleksandr Sotnychenko; Srinath Sridhar; Dan Casas; Christian Theobalt

We present an approach for real-time, robust and accurate hand pose estimation from moving egocentric RGB-D cameras in cluttered real environments. Existing methods typically fail for hand-object interactions in cluttered scenes imaged from egocentric viewpoints—common for virtual or augmented reality applications. Our approach uses two subsequently applied Convolutional Neural Networks (CNNs) to localize the hand and regress 3D joint locations. Hand localization is achieved by using a CNN to estimate the 2D position of the hand center in the input, even in the presence of clutter and occlusions. The localized hand position, together with the corresponding input depth value, is used to generate a normalized cropped image that is fed into a second CNN to regress relative 3D hand joint locations in real time. For added accuracy, robustness and temporal stability, we refine the pose estimates using a kinematic pose tracking energy. To train the CNNs, we introduce a new photorealistic dataset that uses a merged reality approach to capture and synthesize large amounts of annotated data of natural hand interaction in cluttered scenes. Through quantitative and qualitative evaluation, we show that our method is robust to self-occlusion and occlusions by objects, particularly in moving egocentric perspectives.
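The two-stage structure can be summarized as localize, crop-and-normalize, then regress. The sketch below assumes hypothetical locate_cnn and regress_cnn callables and a made-up normalization range; it omits intrinsics-based lifting of the root joint and is not the paper's implementation:

```python
import numpy as np

def two_stage_hand_pose(depth_image, locate_cnn, regress_cnn,
                        crop_size=128, depth_range=0.15):
    """Illustrative two-stage pipeline: localize the hand, crop, regress joints.

    locate_cnn(depth_image) -> (u, v) estimated 2D hand-center pixel
    regress_cnn(crop)       -> (J, 3) root-relative 3D joint positions
    """
    u, v = locate_cnn(depth_image)
    z = depth_image[v, u]                                  # depth at the hand center
    half = crop_size // 2
    crop = depth_image[v - half:v + half, u - half:u + half].astype(float)
    crop = np.clip((crop - z) / depth_range, -1.0, 1.0)    # normalized, hand-centered crop
    joints_rel = regress_cnn(crop)                         # root-relative 3D joints
    return (u, v, z), joints_rel                           # lift the root with camera intrinsics
```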


human factors in computing systems | 2017

WatchSense: On- and Above-Skin Input Sensing through a Wearable Depth Sensor

Srinath Sridhar; Anders Markussen; Antti Oulasvirta; Christian Theobalt; Sebastian Boring

This paper contributes a novel sensing approach to support on- and above-skin finger input for interaction on the move. WatchSense uses a depth sensor embedded in a wearable device to expand the input space to neighboring areas of skin and the space above it. Our approach addresses challenging camera-based tracking conditions, such as oblique viewing angles and occlusions. It can accurately detect fingertips, their locations, and whether they are touching the skin or hovering above it. It extends previous work that supported either mid-air or multitouch input by simultaneously supporting both. We demonstrate feasibility with a compact, wearable prototype attached to a user's forearm (simulating an integrated depth sensor). Our prototype---which runs in real-time on consumer mobile devices---enables a 3D input space on the back of the hand. We evaluated the accuracy and robustness of the approach in a user study. We also show how WatchSense increases the expressiveness of input by interweaving mid-air and multitouch for several interactive applications.
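Distinguishing touch from hover ultimately reduces to comparing fingertip depth against the depth of the skin beneath it. A minimal sketch of such a decision, with made-up threshold values and no relation to WatchSense's actual classifier:

```python
def classify_finger_state(fingertip_depth_mm, skin_depth_mm,
                          touch_threshold_mm=5.0, hover_threshold_mm=40.0):
    """Toy touch/hover decision from a depth sensor above the back of the hand.

    fingertip_depth_mm : sensor-to-fingertip distance
    skin_depth_mm      : sensor-to-skin distance directly below the fingertip
    """
    height = skin_depth_mm - fingertip_depth_mm    # finger height above the skin
    if height <= touch_threshold_mm:
        return "touch"
    if height <= hover_threshold_mm:
        return "hover"
    return "out_of_range"
```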


international symposium on mixed and augmented reality | 2012

Generation of virtual display surfaces for in-vehicle contextual augmented reality

Srinath Sridhar; Victor Ng-Thow-Hing

In-vehicle contextual augmented reality (I-CAR) has the potential to provide novel visual feedback to drivers for an enhanced driving experience. To enable I-CAR, we present a parametrized road trench model (RTM) for dynamically extracting display surfaces from the driver's point of view that is adaptable to constantly changing road curvature and intersections. We use computer vision algorithms to analyze and extract road features that are used to estimate the parameters of the RTM. GPS coordinates are used to quickly compute lighting parameters for shading and shadows. Novel driver-based applications that use the RTM are presented.
