
Publications


Featured research published by Dushyant Rao.


International Conference on Robotics and Automation | 2017

Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks

Martin Engelcke; Dushyant Rao; Dominic Zeng Wang; Chi Hay Tong; Ingmar Posner

This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs). In particular, this is achieved by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input. To this end, we examine the trade-off between accuracy and speed for different architectures and additionally propose to use an L1 penalty on the filter activations to further encourage sparsity in the intermediate representations. To the best of our knowledge, this is the first work to propose sparse convolutional layers and L1 regularisation for efficient large-scale processing of 3D data. We demonstrate the efficacy of our approach on the KITTI object detection benchmark and show that Vote3Deep models with as few as three layers outperform the previous state of the art in both laser and laser-vision based approaches by margins of up to 40% while remaining highly competitive in terms of processing time.
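
The L1 activation penalty described in the abstract can be added to any standard training loss. Below is a minimal sketch of that idea only, not the authors' implementation: the network, layer sizes, the names TinySparse3DNet and loss_with_activation_l1, and the l1_weight value are all assumptions for illustration. It penalises the mean absolute value of intermediate ReLU activations of a small 3D CNN to encourage sparse intermediate representations.

```python
import torch
import torch.nn as nn

class TinySparse3DNet(nn.Module):
    """Toy 3D CNN that also returns its intermediate activations."""
    def __init__(self, in_channels=1, num_classes=2):
        super().__init__()
        self.conv1 = nn.Conv3d(in_channels, 8, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(8, 16, kernel_size=3, padding=1)
        self.head = nn.Conv3d(16, num_classes, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        a1 = self.relu(self.conv1(x))
        a2 = self.relu(self.conv2(a1))
        # Return activations so the training loop can penalise their L1 norm.
        return self.head(a2), (a1, a2)

def loss_with_activation_l1(logits, target, activations, l1_weight=1e-4):
    """Task loss plus an L1 penalty on intermediate activations."""
    task_loss = nn.functional.cross_entropy(logits, target)
    l1_term = sum(a.abs().mean() for a in activations)
    return task_loss + l1_weight * l1_term

# Example usage with random data (shapes are illustrative only).
net = TinySparse3DNet()
logits, acts = net(torch.randn(2, 1, 16, 16, 16))
loss = loss_with_activation_l1(logits, torch.randint(0, 2, (2, 16, 16, 16)), acts)
```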


International Conference on Intelligent Robots and Systems | 2012

CurveSLAM: An approach for vision-based navigation without point features

Dushyant Rao; Soon-Jo Chung; Seth Hutchinson

Existing approaches to visual Simultaneous Localization and Mapping (SLAM) typically utilize points as visual feature primitives to represent landmarks in the environment. Since these techniques mostly use image points from a standard feature point detector, they do not explicitly map objects or regions of interest. Our work is motivated by the need for different SLAM techniques in path and riverine settings, where feature points can be scarce or may not adequately represent the environment. Accordingly, the proposed approach uses cubic Bézier curves as stereo vision primitives and offers a novel SLAM formulation to update the curve parameters and vehicle pose. This method eliminates the need for point-based stereo matching, with an optimization procedure to directly extract the curve information in the world frame from noisy edge measurements. Further, the proposed algorithm enables navigation with fewer feature states than most point-based techniques, and is able to produce a map which only provides detail in key areas. Results in simulation and with vision data validate that the proposed method can be effective in estimating the 6DOF pose of the stereo camera, and can produce structured, uncluttered maps.
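
For context on the curve primitive itself, the sketch below evaluates a cubic Bézier curve from four control points with numpy. It is purely illustrative: the function name and control-point values are assumptions, and this is not the paper's SLAM formulation or curve-extraction procedure.

```python
import numpy as np

def cubic_bezier(control_points, t):
    """Evaluate a cubic Bezier curve at parameters t in [0, 1].

    control_points: (4, d) array of control points.
    t: (n,) array of curve parameters.
    Returns an (n, d) array of points on the curve.
    """
    p0, p1, p2, p3 = control_points
    t = np.asarray(t)[:, None]
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# Example: a planar curve sampled at five parameter values.
ctrl = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]])
print(cubic_bezier(ctrl, np.linspace(0.0, 1.0, 5)))
```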


Field and Service Robotics | 2015

Hierarchical Classification in AUV Imagery

Michael Bewley; Navid Nourani-Vatani; Dushyant Rao; Bertrand Douillard; Oscar Pizarro; Stefan B. Williams

In recent years, Autonomous Underwater Vehicles (AUVs) have been used extensively to gather imagery and other environmental data for ocean monitoring. Processing of this vast amount of collected imagery to label content is difficult, expensive and time consuming. Because of this, typically only a small subset of images are labelled, and only at a small number of points. In order to make full use of the raw data returned from the AUV, this labelling process needs to be automated. In this work the single-species classification problem of [1] is extended to a multi-species classification problem following a taxonomical hierarchy. We demonstrate the application of techniques used in areas such as computer vision, text classification and medical diagnosis to the supervised hierarchical classification of benthic images. After making a comparison to flat multi-class classification, we also discuss critical aspects such as training topology and various prediction and scoring methodologies. An interesting aspect of the presented work is that the ground truth labels are sparse and incomplete, i.e. not all labels go to the leaf node, which brings with it other interesting challenges. We find that the best classification results are obtained using Local Binary Patterns (LBP), training a network of binary classifiers with probabilistic output, and applying “one-vs-rest” classification at each level of the hierarchy for prediction. This work presents a working solution that allows AUV images to be automatically labelled with the most appropriate node in a hierarchy of 19 biological groupings and morphologies. The result is that the output of the AUV system can include a semantic map using the taxonomy prescribed by marine scientists. This has the potential to not only reduce the manual labelling workload, but also to reduce the current dependence that marine scientists have on extrapolating information from a relatively small number of sparsely labelled points.
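
The top-down "one-vs-rest" prediction step can be pictured with a short sketch. The code below is an assumed simplification, not the authors' pipeline: the node names, the toy two-level taxonomy and the ConstantClassifier stand-in (in practice a probabilistic classifier trained on LBP features) are all invented for illustration.

```python
import numpy as np

def predict_hierarchy(x, node, classifiers, children):
    """Descend the label hierarchy, picking the most probable child at each level."""
    path = [node]
    while node in children:
        clf = classifiers.get(node)
        if clf is None:          # sparse labels: no classifier trained at this node
            break
        probs = np.asarray(clf.predict_proba([x])[0])
        node = children[node][int(probs.argmax())]
        path.append(node)
    return path

class ConstantClassifier:
    """Stand-in for a trained probabilistic classifier (e.g. on LBP features)."""
    def __init__(self, probs):
        self._probs = probs
    def predict_proba(self, X):
        return [self._probs for _ in X]

# Toy two-level taxonomy; the node names are invented.
children = {"root": ["biota", "substrate"], "biota": ["coral", "algae"]}
classifiers = {"root": ConstantClassifier([0.8, 0.2]),
               "biota": ConstantClassifier([0.3, 0.7])}
print(predict_hierarchy(None, "root", classifiers, children))
# -> ['root', 'biota', 'algae']
```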


International Conference on Robotics and Automation | 2014

Multimodal learning for autonomous underwater vehicles from visual and bathymetric data

Dushyant Rao; Mark De Deuge; Navid Nourani-Vatani; Bertrand Douillard; Stefan B. Williams; Oscar Pizarro

Autonomous Underwater Vehicles (AUVs) gather large volumes of visual imagery, which can help monitor marine ecosystems and plan future surveys. One key task in marine ecology is benthic habitat mapping, the classification of large regions of the ocean floor into broad habitat categories. Since visual data only covers a small fraction of the ocean floor, traditional habitat mapping is performed using shipborne acoustic multi-beam data, with visual data as ground truth. However, given the high resolution and rich textural cues in visual data, an ideal approach should explicitly utilise visual features in the classification process. To this end, we propose a multimodal model which utilises visual data and shipborne multi-beam bathymetry to perform both classification and sampling tasks. Our algorithm learns the relationship between both modalities, but is also effective when visual data is missing. Our results suggest that by performing multimodal learning, classification performance is improved in scenarios where visual data is unavailable, such as the habitat mapping scenario. We also demonstrate empirically that the model is able to perform generative tasks, producing plausible samples from the underlying data-generating distribution.
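
A minimal sketch of the general idea of a shared representation across modalities is given below. It is an illustrative stand-in, not the paper's model: the toy autoencoder, its dimensions and the names MultimodalAutoencoder, vis_dim and bathy_dim are assumptions. It shows how a joint latent code can be formed from whichever modalities are present, so that visual features can still be predicted from bathymetry alone.

```python
import torch
import torch.nn as nn

class MultimodalAutoencoder(nn.Module):
    """Toy shared-latent model over visual and bathymetric feature vectors."""
    def __init__(self, vis_dim=64, bathy_dim=16, latent_dim=32):
        super().__init__()
        self.enc_vis = nn.Linear(vis_dim, latent_dim)
        self.enc_bathy = nn.Linear(bathy_dim, latent_dim)
        self.dec_vis = nn.Linear(latent_dim, vis_dim)
        self.dec_bathy = nn.Linear(latent_dim, bathy_dim)

    def forward(self, vis=None, bathy=None):
        # Fuse whichever modalities are present into a single latent code.
        parts = []
        if vis is not None:
            parts.append(self.enc_vis(vis))
        if bathy is not None:
            parts.append(self.enc_bathy(bathy))
        z = torch.tanh(torch.stack(parts).mean(dim=0))
        return self.dec_vis(z), self.dec_bathy(z)

# Inference with the visual modality missing: predict visual features from bathymetry.
model = MultimodalAutoencoder()
vis_hat, _ = model(bathy=torch.randn(8, 16))
```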


AIAA Infotech at Aerospace Conference and Exhibit | 2011

Monocular Vision based Navigation in GPS-Denied Riverine Environments

Junho Yang; Dushyant Rao; Soon-Jo Chung; Seth Hutchinson

This paper presents a new method to estimate the range and bearing of landmarks and solve the simultaneous localization and mapping (SLAM) problem. The proposed ranging and SLAM algorithms have application to a micro aerial vehicle (MAV) flying through riverine environments which occasionally involve heavy foliage and forest canopy. Monocular vision navigation has merits in MAV applications since it is lightweight and provides abundant visual cues of the environment in comparison to other ranging methods. In this paper, we suggest a monocular vision strategy incorporating image segmentation and epipolar geometry to extend the capability of the ranging method to unknown outdoor environments. The validity of our proposed method is verified through experiments in a river-like environment.
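
As background on the epipolar-geometry building block referred to above (and not the paper's ranging method itself), the sketch below performs linear two-view triangulation of a landmark from normalised image coordinates. The camera poses and the example point are invented for illustration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """DLT triangulation. P1, P2: 3x4 projection matrices; x1, x2: (u, v)."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]          # homogeneous -> Euclidean

# Two identity-intrinsics cameras separated by a 1 m baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = np.array([0.5, 0.2, 4.0, 1.0])
x1 = (P1 @ point)[:2] / (P1 @ point)[2]
x2 = (P2 @ point)[:2] / (P2 @ point)[2]
print(triangulate(P1, P2, x1, x2))   # recovers ~ [0.5, 0.2, 4.0]
```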


The International Journal of Robotics Research | 2017

Large-scale cost function learning for path planning using deep inverse reinforcement learning

Markus Wulfmeier; Dushyant Rao; Dominic Zeng Wang; Peter Ondruska; Ingmar Posner

We present an approach for learning spatial traversability maps for driving in complex, urban environments, based on an extensive dataset demonstrating the driving behaviour of human experts. The direct end-to-end mapping from raw input data to cost bypasses the effort of manually designing parts of the pipeline, exploits a large number of data samples, and can additionally be framed to refine handcrafted cost maps built from manually engineered features. To achieve this, we introduce a maximum-entropy-based, non-linear inverse reinforcement learning (IRL) framework which exploits the capacity of fully convolutional neural networks (FCNs) to represent the cost model underlying driving behaviours. This high-capacity, deep, parametric approach scales successfully to more complex environments and driving behaviours, while its run time at deployment is independent of training dataset size. After benchmarking against state-of-the-art IRL approaches, we demonstrate scalability and performance on an ambitious dataset collected over the course of one year, comprising more than 25,000 demonstration trajectories extracted from over 120 km of urban driving. We evaluate the resulting cost representations by showing their advantages over a carefully hand-designed cost map, and further demonstrate robustness to systematic errors by learning accurate representations even in the presence of calibration perturbations. Importantly, we show that a manually designed cost map can be refined to handle corner cases that are scarcely seen in the environment, such as stairs, slopes and underpasses, by further incorporating human priors into the training framework.
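
In maximum-entropy IRL, the gradient of the objective with respect to the predicted cost is simply the difference between expert and learner state-visitation frequencies, backpropagated through the network that predicts the cost map. The sketch below illustrates only that update rule, not the paper's implementation: the tiny cost_net, the feature shapes, the random expert visitations and the crude softmax stand-in for the learner's visitation frequencies (which would normally come from soft value iteration) are all assumptions.

```python
import torch
import torch.nn as nn

cost_net = nn.Sequential(                  # toy fully convolutional cost model
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),
)

features = torch.randn(1, 3, 64, 64)       # per-cell input features (assumed shape)
expert_svf = torch.rand(1, 1, 64, 64)      # expert state-visitation counts (assumed)
cost_map = cost_net(features)

# Placeholder for the learner's expected state-visitation frequencies, which would
# normally be computed by soft value iteration under the current cost map.
learner_svf = torch.softmax(-cost_map.detach().flatten(), dim=0).view_as(cost_map)

# MaxEnt IRL: d(negative log-likelihood)/d(cost) = expert_svf - learner_svf.
# Injecting that gradient through a surrogate loss lets autograd handle the rest.
surrogate = ((expert_svf - learner_svf) * cost_map).sum()
surrogate.backward()     # gradients w.r.t. cost_net parameters are now populated
```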


International Conference on Robotics and Automation | 2016

Multimodal information-theoretic measures for autonomous exploration

Dushyant Rao; Asher Bender; Stefan B. Williams; Oscar Pizarro

Autonomous underwater vehicles (AUVs) are widely used to perform information gathering missions in unseen environments. Given the sheer size of the ocean environment, and the time and energy constraints of an AUV, it is important to consider the potential utility of candidate missions when performing survey planning. In this paper, we utilise a multimodal learning approach to capture the relationship between in-situ visual observations and shipborne bathymetry (ocean depth) data that are freely available a priori. We then derive information-theoretic measures under this model that predict the amount of visual information gain at an unobserved location based on the bathymetric features. Unlike previous approaches, these measures consider the value of additional visual features, rather than just the habitat labels obtained. Experimental results with a toy dataset and real marine data demonstrate that the approach can be used to predict the true utility of unexplored areas.
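
For intuition only, the sketch below scores candidate survey sites by the Shannon entropy of the habitat-class distribution predicted from bathymetry. This is a crude label-only proxy of the kind the paper improves upon (its measures also account for the visual features themselves); the probability values and site names are invented.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a discrete distribution, in nats."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return float(-(p * np.log(p)).sum())

# p(habitat class | bathymetric features) for three candidate sites (assumed values).
candidate_predictions = {
    "site_a": [0.70, 0.20, 0.10],   # fairly certain -> low expected gain
    "site_b": [0.34, 0.33, 0.33],   # uncertain -> high expected gain
    "site_c": [0.55, 0.30, 0.15],
}
ranked = sorted(candidate_predictions,
                key=lambda s: entropy(candidate_predictions[s]), reverse=True)
print(ranked)    # sites in decreasing order of this utility proxy
```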


The International Journal of Robotics Research | 2018

Deep tracking in the wild: End-to-end tracking using recurrent neural networks

Julie Dequaire; Peter Ondruska; Dushyant Rao; Dominic Zeng Wang; Ingmar Posner

This paper presents a novel approach for tracking static and dynamic objects for an autonomous vehicle operating in complex urban environments. Whereas traditional approaches for tracking often feature numerous hand-engineered stages, this method is learned end-to-end and can directly predict a fully unoccluded occupancy grid from raw laser input. We employ a recurrent neural network to capture the state and evolution of the environment, and train the model in an entirely unsupervised manner. In doing so, our setting is comparable to model-free multi-object tracking, although we do not explicitly perform the underlying data-association process. Further, we demonstrate that the representation learned for the tracking task can be leveraged via inductive transfer to train an object detector in a data-efficient manner. We motivate a number of architectural features and show the positive contribution of dilated convolutions and of dynamic and static memory units to the task of tracking and classifying complex dynamic scenes through full occlusion. Our experimental results illustrate the ability of the model to track cars, buses, pedestrians, and cyclists from both moving and stationary platforms. Further, we compare and contrast the approach with a more traditional model-free multi-object tracking pipeline, demonstrating that it can more accurately predict future states of objects from current inputs.
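
The core mechanism, a recurrent model that carries a hidden memory of the scene and emits an occupancy estimate at every step, can be sketched compactly. The code below is an assumed toy architecture, not the paper's network: the single recurrent convolutional cell, the hidden size and the class name TinyRecurrentOccupancy are illustrative choices.

```python
import torch
import torch.nn as nn

class TinyRecurrentOccupancy(nn.Module):
    """Tiny recurrent conv cell: hidden scene memory -> per-step occupancy estimate."""
    def __init__(self, hidden=8):
        super().__init__()
        self.update = nn.Conv2d(1 + hidden, hidden, 3, padding=1)
        self.out = nn.Conv2d(hidden, 1, 1)

    def forward(self, frames):
        # frames: (T, 1, H, W) sequence of partially observed occupancy grids.
        h = torch.zeros(1, self.update.out_channels, *frames.shape[-2:])
        preds = []
        for t in range(frames.shape[0]):
            h = torch.tanh(self.update(torch.cat([frames[t:t + 1], h], dim=1)))
            preds.append(torch.sigmoid(self.out(h)))   # unoccluded occupancy estimate
        return torch.cat(preds, dim=0)

model = TinyRecurrentOccupancy()
occupancy = model(torch.rand(5, 1, 32, 32))   # 5-step sequence -> 5 predicted grids
```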


The International Journal of Robotics Research | 2018

Learn from experience: Probabilistic prediction of perception performance to avoid failure

Corina Gurău; Dushyant Rao; Chi Hay Tong; Ingmar Posner

Despite significant advances in machine learning and perception over the past few decades, perception algorithms can still be unreliable when deployed in challenging time-varying environments. When these systems are used for autonomous decision-making, such as in self-driving vehicles, the impact of their mistakes can be catastrophic. As such, it is important to characterize the performance of the system and predict when and where it may fail in order to take appropriate action. While similar in spirit to the idea of introspection, this work introduces a new paradigm for predicting the likely performance of a robot’s perception system based on past experience in the same workspace. In particular, we propose two models that probabilistically predict perception performance from observations gathered over time. While both approaches are place-specific, the second approach additionally considers appearance similarity when incorporating past observations. We evaluate our method in a classical decision-making scenario in which the robot must choose when and where to drive autonomously in 60 km of driving data from an urban environment. Results demonstrate that both approaches lead to fewer false decisions (in terms of incorrectly offering or denying autonomy) for two different detector models, and show that leveraging visual appearance within a state-of-the-art navigation framework increases the accuracy of our performance predictions.
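
A place-specific performance model can be pictured with a very small sketch. The code below is an assumed simplification, not either of the paper's two models: it keeps a Beta-Bernoulli estimate of perception success per place, updated from past drives and queried before offering autonomy; the place identifier and prior values are invented.

```python
class PlacePerformanceModel:
    """Per-place Beta-Bernoulli estimate of perception success probability."""
    def __init__(self, alpha=1.0, beta=1.0):
        self.counts = {}                       # place id -> [successes, failures]
        self.alpha, self.beta = alpha, beta    # Beta prior pseudo-counts

    def update(self, place, success):
        s, f = self.counts.setdefault(place, [0, 0])
        self.counts[place] = [s + int(success), f + int(not success)]

    def p_success(self, place):
        s, f = self.counts.get(place, [0, 0])
        return (s + self.alpha) / (s + f + self.alpha + self.beta)

model = PlacePerformanceModel()
for outcome in [True, True, False, True]:      # past detector outcomes at one place
    model.update("junction_12", outcome)
print(model.p_success("junction_12"))          # posterior mean, ~0.67
```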


The International Journal of Robotics Research | 2017

Multimodal learning and inference from visual and remotely sensed data

Dushyant Rao; Mark De Deuge; Navid Nourani-Vatani; Stefan B. Williams; Oscar Pizarro

Autonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the environment and plan future autonomous missions. In this paper, we introduce two multimodal learning algorithms to model the relationship between visual images taken by an autonomous underwater vehicle during a survey and remotely sensed acoustic bathymetry (ocean depth) data that is available prior to the survey. We present a multi-layer architecture to capture the joint distribution between the bathymetry and visual modalities. We then propose an extension based on gated feature learning models, which allows the model to cluster the input data in an unsupervised fashion and predict visual image features using just the ocean depth information. Our experiments demonstrate that multimodal learning improves semantic classification accuracy regardless of which modalities are available at classification time, allows for unsupervised clustering of either or both modalities, and can facilitate mission planning by enabling class-based or image-based queries.
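
One of the uses the abstract mentions, unsupervised clustering of the learned representation, can be illustrated briefly. The sketch below is illustrative only: it clusters random stand-in latent codes with k-means from scikit-learn, whereas in practice the codes would come from the learned multimodal model.

```python
import numpy as np
from sklearn.cluster import KMeans

latent_codes = np.random.rand(200, 32)        # stand-in joint latent representations
clusters = KMeans(n_clusters=5, n_init=10).fit_predict(latent_codes)
print(np.bincount(clusters))                  # number of samples per cluster
```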
