Ankur Agarwal | Researchain

Publication

Featured research published by Ankur Agarwal.

IEEE International Workshop on Horizontal Interactive Human Computer Systems | 2007

C-Slate: A Multi-Touch and Object Recognition System for Remote Collaboration using Horizontal Surfaces

Shahram Izadi; Ankur Agarwal; Antonio Criminisi; John Winn; Andrew Blake; Andrew W. Fitzgibbon

We introduce C-Slate, a new vision-based system that uses stereo cameras above a commercially available tablet technology to support remote collaboration. The horizontally mounted tablet provides the user with high-resolution stylus input, which is augmented by multi-touch interaction and recognition of untagged everyday physical objects using new stereo vision and machine learning techniques. This provides a novel interactive tabletop arrangement, capable of supporting a variety of fluid multi-touch interactions, including symmetric and asymmetric bimanual input, coupled with the potential for incorporating tangible objects into the user interface. When used in a remote context, these features are combined with the ability to see visual representations of remote users' hands and remote physical objects placed on top of the surface. This combination of bimanual and tangible interaction and sharing of remote gestures and physical objects provides a new way to collaborate remotely, complementing existing channels such as audio and video conferencing.

This paper presents the design of two tabletop file system interfaces: OnTop, a novel associative access approach to file system interaction, where users navigate multiple file systems by selecting focus files; and the Browser, a hierarchical interface that is based upon the same mental model as conventional desktop file system access. We report a qualitative study with ten users to explore both approaches. OnTop was found to better facilitate collaboration on file access and use, while the more familiar hierarchical model of the Browser was found to be more natural on very early use and has a clear role, particularly in cases where the associative approach fails.

IEEE International Workshop on Horizontal Interactive Human Computer Systems | 2007

High Precision Multi-touch Sensing on Surfaces using Overhead Cameras

Ankur Agarwal; Shahram Izadi; Manmohan Chandraker; Andrew Blake

We present a method to enable multi-touch interactions on an arbitrary flat surface using a pair of cameras mounted above the surface. Current systems in this domain mostly make use of special touch-sensitive hardware, require cameras to be mounted behind the display, or are based on infrared sensors used in various configurations. The very few that use ordinary cameras mounted overhead for touch detection fail to do so accurately, because it is difficult to compute the proximity of fingertips to the surface with a precision that would match the behaviour of a truly touch-sensitive surface. This paper describes a novel computer vision algorithm that can robustly identify fingertips and detect touch with a precision of a few millimetres above the surface. The algorithm relies on machine learning methods and a geometric finger model to achieve the required precision, and can be trained to work in different physical settings. We provide a quantitative evaluation of the method and demonstrate its use for gesture-based interactions with ordinary tablet displays, both in single-user and remote collaboration scenarios.

International Journal of Computer Vision | 2008

Multilevel Image Coding with Hyperfeatures

Ankur Agarwal; Bill Triggs

Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant with good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics over scales larger than the local input patches. We present a multilevel visual representation that remedies this. The starting point is the notion that to detect object parts in images, in practice it often suffices to detect co-occurrences of more local object fragments. This can be formalized by coding image patches against a codebook of known fragments or a more general statistical model and locally histogramming the resulting labels to capture their co-occurrence statistics. Local patch descriptors are converted into somewhat less local histograms over label occurrences. The histograms are themselves local descriptor vectors so the process can be iterated to code ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We call these higher-level descriptors “hyperfeatures”. We formulate the hyperfeature model and study its performance under several different image coding methods including k-means based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
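
The code-then-histogram iteration described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes hard vector quantization against a codebook that is already given (the paper also studies Gaussian Mixtures and Latent Dirichlet Allocation), and the names `nearest_code`, `hyperfeature_level`, and the `window` parameter are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(v: Vector, codebook: Sequence[Vector]) -> int:
    """Hard vector quantization: index of the closest codebook centre."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(v, codebook[k])),
    )


def hyperfeature_level(
    grid: Sequence[Sequence[Vector]],
    codebook: Sequence[Vector],
    window: int = 2,
) -> List[List[List[float]]]:
    """Code every descriptor in a 2D grid, then histogram the labels locally.

    Each output cell is a normalized label histogram over a
    window x window neighbourhood, so the output is itself a (smaller)
    grid of descriptor vectors that can be fed to the next level.
    """
    labels = [[nearest_code(v, codebook) for v in row] for row in grid]
    rows, cols = len(labels), len(labels[0])
    out = []
    for r in range(rows - window + 1):
        out_row = []
        for c in range(cols - window + 1):
            hist = [0.0] * len(codebook)
            for dr in range(window):
                for dc in range(window):
                    hist[labels[r + dr][c + dc]] += 1.0
            out_row.append([h / (window * window) for h in hist])
        out.append(out_row)
    return out


# Toy 3x3 grid of 1-D "patch descriptors" and two hand-picked codebooks
# (illustrative values only, not data from the paper).
patches = [
    [[0.0], [0.0], [1.0]],
    [[0.0], [0.0], [1.0]],
    [[1.0], [1.0], [1.0]],
]
level1 = hyperfeature_level(patches, codebook=[[0.0], [1.0]])
level2 = hyperfeature_level(level1, codebook=[[1.0, 0.0], [0.4, 0.6]])
```

Because each level returns a grid of histogram vectors, the same function can be applied again with a codebook defined over the previous level's histograms, which is the sense in which the process "can be iterated".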

Computer Vision and Pattern Recognition | 2007

Incorporating On-demand Stereo for Real Time Recognition

Thomas Deselaers; Antonio Criminisi; John Winn; Ankur Agarwal

A new method for localising and recognising hand poses and objects in real time is presented. This problem is important in vision-driven applications where it is natural for a user to combine hand gestures and real objects when interacting with a machine. Examples include using a real eraser to remove words from a document displayed on an electronic surface. In this paper the task of simultaneously recognising object classes, hand gestures and detecting touch events is cast as a single classification problem. A random forest algorithm is employed which adaptively selects and combines a minimal set of appearance, shape and stereo features to achieve maximum class discrimination for a given image. This minimal set leads to both efficiency at run time and good generalisation. Unlike previous stereo work, which explicitly constructs disparity maps, here the stereo matching costs are used directly as a visual cue and are computed only on demand, i.e. only for pixels where they are necessary for recognition. This leads to improved efficiency. The proposed method is assessed on a database of a variety of objects and hand poses selected for interacting on a flat surface in an office environment.
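
The idea of computing stereo matching costs only where a classifier asks for them can be illustrated with a lazily evaluated cost lookup. This is a sketch under stated assumptions, not the paper's method: the absolute-difference cost, the class name `OnDemandStereoCost`, and the `evaluations` counter are all hypothetical, and the random forest that would issue the queries is omitted.

```python
class OnDemandStereoCost:
    """Lazily computes per-pixel stereo matching costs and caches them.

    No disparity map is built up front; a cost is evaluated only when
    some feature test queries it (assumed cost: absolute intensity
    difference between a left pixel and the shifted right pixel).
    """

    def __init__(self, left, right):
        self.left = left          # left image as a list of rows
        self.right = right        # right image as a list of rows
        self._cache = {}
        self.evaluations = 0      # how many costs were actually computed

    def cost(self, row, col, disparity):
        key = (row, col, disparity)
        if key not in self._cache:
            self.evaluations += 1
            shifted = col - disparity
            if 0 <= shifted < len(self.right[row]):
                value = abs(self.left[row][col] - self.right[row][shifted])
            else:
                value = float("inf")  # no valid correspondence at this shift
            self._cache[key] = value
        return self._cache[key]


# Toy one-row images (illustrative values only).
stereo = OnDemandStereoCost(left=[[10, 20, 30, 40]], right=[[30, 40, 50, 60]])
first = stereo.cost(0, 2, 2)   # computed on first request
again = stereo.cost(0, 2, 2)   # served from the cache
```

In a forest-style classifier, each split node would call `cost` only for the pixel and disparity it tests, so pixels that are never reached by a stereo test incur no matching work at all.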

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2010

Dense Stereo Matching over the Panum Band

Ankur Agarwal; Andrew Blake

Stereo matching algorithms conventionally match over a range of disparities sufficient to encompass all visible 3D scene points. Human vision, however, works over a narrow band of disparities, Panum's fusional band, whose typical range may be as little as 1/20 of the full range of disparities for visible points. Only points inside the band are fused visually; the remainder of points are seen diplopically. A probabilistic approach is presented for dense stereo matching under the Panum band restriction. It is shown that existing dense stereo algorithms are inadequate in this problem setting and the main problem is segmentation, marking the image into the areas that fall inside the band. An approximation is derived that makes up for missing out-of-band information with a “proxy” based on image autocorrelation. It is shown that the Panum Proxy algorithm achieves accuracy close to what can be obtained when the full disparity band is available, and with gains of between one and two orders of magnitude in computation time. There are also substantial gains in computation space. Panum band processing is also demonstrated in an active stereopsis framework.
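
To make the band restriction concrete, the following sketch matches one scanline over a narrow disparity band only and leaves pixels without a good in-band match unlabelled. It is deliberately naive and is not the Panum Proxy algorithm: it uses winner-take-all matching with a fixed cost threshold in place of the paper's probabilistic model and autocorrelation-based proxy. The function name `match_in_band` and the parameters `d_lo`, `d_hi`, and `max_cost` are hypothetical.

```python
def match_in_band(left_row, right_row, d_lo, d_hi, max_cost):
    """Winner-take-all disparity per pixel, searched only in [d_lo, d_hi].

    Returns a list with one entry per left pixel: the best in-band
    disparity, or None when no in-band candidate has cost below
    max_cost (a crude stand-in for "outside the band").
    """
    result = []
    for x, value in enumerate(left_row):
        best_d, best_cost = None, max_cost
        for d in range(d_lo, d_hi + 1):
            xr = x - d
            if 0 <= xr < len(right_row):
                cost = abs(value - right_row[xr])
                if cost < best_cost:
                    best_d, best_cost = d, cost
        result.append(best_d)
    return result


# Toy scanlines (illustrative values only): the middle pixels match at
# disparity 1, the first has no valid in-band candidate, and the last
# matches nothing within the band.
band_labels = match_in_band(
    left_row=[0, 10, 20, 99],
    right_row=[10, 20, 30, 40],
    d_lo=1,
    d_hi=2,
    max_cost=5,
)
```

The inner loop visits only `d_hi - d_lo + 1` candidates per pixel, which is where a band-restricted search saves work relative to a full disparity range. The thresholding step is the weak point of this naive version, and it corresponds to the segmentation problem the abstract identifies as the main difficulty.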

Computer Vision and Pattern Recognition | 2006

The Panum Proxy Algorithm for Dense Stereo Matching over a Volume of Interest

Ankur Agarwal; Andrew Blake

Stereo matching algorithms conventionally match over a range of disparities sufficient to encompass all visible 3D scene points. Human vision, however, does not do this. It works over a narrow band of disparities, Panum’s fusional band, whose typical range may be as little as 1/20 of the full range of disparities for visible points. Points inside the band are fused visually and the remainder of points are seen as diplopic, that is, with double vision. The Panum band restriction is also important in machine vision, both with active (pan/tilt) cameras, and with high-resolution cameras and digital pan/tilt.

Archive | 2007