Publications


Featured research published by Vinay Bettadapura.


International Symposium on Wearable Computers | 2015

Predicting daily activities from egocentric images using deep learning

Daniel Castro; Steven Hickson; Vinay Bettadapura; Edison Thomaz; Gregory D. Abowd; Henrik I. Christensen; Irfan A. Essa

We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as time and day of week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person's activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
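
The late-fusion idea can be illustrated with a small sketch, assuming the image CNN and a separate classifier trained on contextual features (time of day, day of week) each already output per-class probabilities; the weighting and the 19-way random scores below are illustrative stand-ins, not the authors' trained models.

```python
import numpy as np

def late_fusion(cnn_probs: np.ndarray,
                context_probs: np.ndarray,
                alpha: float = 0.7) -> int:
    """Fuse per-class probabilities from an image CNN with probabilities from
    a classifier built on contextual features, then pick the activity class.
    alpha weights the image stream; (1 - alpha) weights the context stream."""
    fused = alpha * cnn_probs + (1.0 - alpha) * context_probs
    return int(np.argmax(fused))

# Illustrative 19-class example with random scores (not real model outputs).
rng = np.random.default_rng(0)
cnn = rng.random(19); cnn /= cnn.sum()
ctx = rng.random(19); ctx /= ctx.sum()
print("predicted activity class:", late_fusion(cnn, ctx))
```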


Workshop on Applications of Computer Vision | 2015

Leveraging Context to Support Automated Food Recognition in Restaurants

Vinay Bettadapura; Edison Thomaz; Aman Parnami; Gregory D. Abowd; Irfan A. Essa

The pervasiveness of mobile cameras has resulted in a dramatic increase in food photos, which are pictures reflecting what people eat. In this paper, we study how taking pictures of what we eat in restaurants can be used to automate food journaling. We propose to leverage the context of where the picture was taken, together with additional information about the restaurant available online, coupled with state-of-the-art computer vision techniques to recognize the food being consumed. To this end, we demonstrate image-based recognition of foods eaten in restaurants by training a classifier with images from restaurants' online menu databases. We evaluate the performance of our system in unconstrained, real-world settings with food images taken in 10 restaurants across 5 different types of food (American, Indian, Italian, Mexican and Thai).
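
As a rough sketch of how restaurant context can constrain recognition, the snippet below restricts the label space to dishes found on the restaurant's online menu before picking the top visual score; the dish names, scores and menu are hypothetical, not the paper's data or code.

```python
def recognize_dish(image_scores: dict, menu: set) -> str:
    """Keep only dishes that appear on the menu of the restaurant where the
    photo was taken (looked up from the geotag), then return the dish with
    the highest visual classifier score."""
    candidates = {dish: s for dish, s in image_scores.items() if dish in menu}
    if not candidates:                 # fall back to the unconstrained label set
        candidates = image_scores
    return max(candidates, key=candidates.get)

# Hypothetical visual scores and scraped menu, for illustration only.
scores = {"pad thai": 0.31, "lasagna": 0.42, "green curry": 0.27}
print(recognize_dish(scores, menu={"pad thai", "green curry"}))
```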


Computer Vision and Pattern Recognition | 2013

Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition

Vinay Bettadapura; Grant Schindler; Thomas Ploetz; Irfan A. Essa

We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that is inherent in activity streams. In addition, we also propose the use of randomly sampled regular expressions to discover and encode patterns in activities. We demonstrate the effectiveness of our approach in experimental evaluations where we successfully recognize activities and detect anomalies in four complex datasets.
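
One simple way to retain temporal ordering that a plain BoW histogram discards is to append n-gram counts over the discretized activity stream; the sketch below shows that augmentation only and does not include the paper's randomly sampled regular expressions. The event words are made up.

```python
from collections import Counter
from itertools import islice

def augmented_bow(word_sequence, n=2):
    """Bag-of-Words histogram augmented with n-gram counts so that local
    temporal ordering of the discretized activity stream is preserved."""
    hist = Counter(word_sequence)                        # standard BoW counts
    ngrams = zip(*(islice(word_sequence, i, None) for i in range(n)))
    hist.update("_".join(gram) for gram in ngrams)       # temporal n-grams
    return hist

print(augmented_bow(["open", "pour", "stir", "pour", "stir"]))
```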


Workshop on Applications of Computer Vision | 2015

Egocentric Field-of-View Localization Using First-Person Point-of-View Devices

Vinay Bettadapura; Irfan A. Essa; Caroline Pantofaru

We present a technique that uses images, videos and sensor data taken from first-person point-of-view devices to perform egocentric field-of-view (FOV) localization. We define egocentric FOV localization as capturing the visual information from a person's field-of-view in a given environment and transferring this information onto a reference corpus of images and videos of the same space, hence determining what a person is attending to. Our method matches images and video taken from the first-person perspective with the reference corpus and refines the results using the person's head orientation information obtained from the device sensors. We demonstrate single- and multi-user egocentric FOV localization in different indoor and outdoor environments with applications in augmented reality, event understanding and studying social interactions.
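
A minimal sketch of the matching-plus-refinement idea, assuming global image descriptors for the query frame and the reference corpus are already computed and that each reference view has a known compass bearing; this is illustrative and not the paper's pipeline.

```python
import numpy as np

def localize_fov(query_desc, query_heading_deg,
                 ref_descs, ref_headings_deg, max_heading_diff=45.0):
    """Match a first-person image descriptor against a reference corpus by
    cosine similarity, then discard reference views whose bearing disagrees
    with the wearer's head orientation from the device sensors."""
    sims = ref_descs @ query_desc / (
        np.linalg.norm(ref_descs, axis=1) * np.linalg.norm(query_desc) + 1e-9)
    diff = np.abs((ref_headings_deg - query_heading_deg + 180) % 360 - 180)
    sims[diff > max_heading_diff] = -np.inf
    return int(np.argmax(sims))

# Synthetic corpus: 100 reference views with random descriptors and bearings.
rng = np.random.default_rng(1)
refs, headings = rng.random((100, 128)), rng.uniform(0, 360, 100)
print("best reference view:", localize_fov(refs[7], headings[7], refs, headings))
```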


Ubiquitous Computing | 2012

Recognizing water-based activities in the home through infrastructure-mediated sensing

Edison Thomaz; Vinay Bettadapura; Gabriel Reyes; Megha Sandesh; Grant Schindler; Thomas Plötz; Gregory D. Abowd; Irfan A. Essa

Activity recognition in the home has long been recognized as the foundation for many desirable applications in fields such as home automation, sustainability, and healthcare. However, building a practical home activity monitoring system remains a challenge. Striking a balance between cost, privacy, ease of installation and scalability continues to be an elusive goal. In this paper, we explore infrastructure-mediated sensing combined with a vector space model learning approach as the basis of an activity recognition system for the home. We examine the performance of our single-sensor water-based system in recognizing eleven high-level activities in the kitchen and bathroom, such as cooking and shaving. Results from two studies show that our system can estimate activities with an overall accuracy of 82.69% for one individual and 70.11% for a group of 23 participants. As far as we know, our work is the first to employ infrastructure-mediated sensing for inferring high-level human activities in a home setting.
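
The vector space model mentioned above can be sketched as nearest-neighbor matching over histograms of discretized sensor events; the event vocabulary and labeled examples below are hypothetical stand-ins, not the study's sensor data.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse event-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify_activity(event_words, labeled_examples):
    """Return the label of the most similar labeled example in the vector
    space of discretized water-sensor events."""
    query = Counter(event_words)
    return max(labeled_examples,
               key=lambda ex: cosine(query, Counter(ex[0])))[1]

examples = [(["long_flow", "long_flow", "short_burst"], "cooking"),
            (["short_burst", "short_burst", "short_burst"], "shaving")]
print(classify_activity(["short_burst", "short_burst"], examples))
```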


Computer Assisted Radiology and Surgery | 2016

Automated video-based assessment of surgical skills for training and evaluation in medical schools

Aneeq Zia; Yachna Sharma; Vinay Bettadapura; Eric L. Sarin; Thomas Ploetz; Mark A. Clements; Irfan A. Essa

Purpose: Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. For each surgical trainee, a supervisor has to observe the trainee in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches, however, are still incredibly time-consuming and involve human bias. In this paper, we present an automated system for surgical skills assessment by analyzing video data of surgical activities. Method: We compare different techniques for video-based surgical skill evaluation. We use techniques that capture the motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis. Results: We were able to classify surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill-relevant information in surgical videos. Conclusion: Our evaluations show that frequency features perform better than motion texture features, which in turn perform better than symbol-/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity, as demonstrated by our results on two challenging video datasets.
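
Once per-video motion features are extracted by any of the three families above, the skill-level prediction itself reduces to standard supervised classification; the snippet below is a sketch with synthetic features and an off-the-shelf SVM, so the reported accuracy is meaningless and the setup is not the paper's protocol.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 30 videos, 20-dim motion features, 3 skill levels.
rng = np.random.default_rng(42)
X = rng.random((30, 20))
y = np.repeat([0, 1, 2], 10)          # 0 = novice, 1 = intermediate, 2 = expert

# Cross-validated skill classification with an RBF-kernel SVM.
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```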


Medical Image Computing and Computer Assisted Intervention | 2015

Automated Assessment of Surgical Skills Using Frequency Analysis

Aneeq Zia; Yachna Sharma; Vinay Bettadapura; Eric L. Sarin; Mark A. Clements; Irfan A. Essa

We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. Video analysis techniques for extracting motion quality via frequency coefficients are introduced. The framework is tested on videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between different skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
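
A minimal sketch of frequency-domain features from a motion time series, assuming a 1-D motion signal (for example, frame-to-frame displacement of tracked tool motion) is already available; the signal below is synthetic and the feature sizes are illustrative, not the paper's exact pipeline.

```python
import numpy as np
from scipy.fft import fft, dct

def frequency_features(signal: np.ndarray, k: int = 16) -> np.ndarray:
    """Concatenate the first k DFT magnitudes and the first k DCT coefficients
    of a motion time series into a compact frequency-domain descriptor."""
    dft_mag = np.abs(fft(signal))[:k]
    dct_coef = dct(signal, norm="ortho")[:k]
    return np.concatenate([dft_mag, dct_coef])

# Synthetic motion signal: a 3 Hz oscillation plus noise.
t = np.linspace(0, 4, 512)
motion = np.sin(2 * np.pi * 3 * t) + 0.05 * np.random.randn(t.size)
print(frequency_features(motion, k=4))
```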


Workshop on Applications of Computer Vision | 2016

Discovering picturesque highlights from egocentric vacation videos

Vinay Bettadapura; Daniel Castro; Irfan A. Essa

We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset which includes 26.5 hours of videos taken over a 14-day vacation that spans many famous tourist destinations, and also provide results from a user study to assess our results.
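
The aesthetic cues named above (color vibrancy, symmetry) can be approximated with very simple image statistics; the sketch below scores and ranks frames that way, with made-up weights and random stand-in frames rather than the paper's techniques or data.

```python
import numpy as np

def vibrancy(frame_rgb: np.ndarray) -> float:
    """Mean colour saturation of an RGB frame with values in [0, 1]."""
    mx, mn = frame_rgb.max(axis=2), frame_rgb.min(axis=2)
    return float(np.where(mx > 0, (mx - mn) / (mx + 1e-9), 0.0).mean())

def symmetry(frame_rgb: np.ndarray) -> float:
    """One minus the mean absolute difference between a frame and its mirror."""
    return 1.0 - float(np.abs(frame_rgb - frame_rgb[:, ::-1]).mean())

def rank_frames(frames):
    """Rank frames best-to-worst by a weighted aesthetic score."""
    scores = [0.6 * vibrancy(f) + 0.4 * symmetry(f) for f in frames]
    return np.argsort(scores)[::-1]

frames = [np.random.rand(120, 160, 3) for _ in range(5)]   # stand-in frames
print("frames ranked best to worst:", rank_frames(frames))
```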


Computer Assisted Radiology and Surgery | 2018

Video and accelerometer-based motion analysis for automated surgical skills assessment

Aneeq Zia; Yachna Sharma; Vinay Bettadapura; Eric L. Sarin; Irfan A. Essa

Purpose: Basic surgical skills of suturing and knot tying are an essential part of medical training. Having an automated system for surgical skills assessment could help save experts' time and improve training efficiency. There have been some recent attempts at automated surgical skills assessment using either video analysis or acceleration data. In this paper, we present a novel approach for automated assessment of OSATS-like surgical skills and provide an analysis of different features on multi-modal data (video and accelerometer data). Methods: We conduct a large study for basic surgical skill assessment on a dataset that contained video and accelerometer data for suturing and knot-tying tasks. We introduce "entropy-based" features, approximate entropy and cross-approximate entropy, which quantify the amount of predictability and regularity of fluctuations in time series data. The proposed features are compared to existing methods of Sequential Motion Texture, Discrete Cosine Transform and Discrete Fourier Transform for surgical skills assessment. Results: We report the average performance of different features across all applicable OSATS-like criteria for suturing and knot-tying tasks. Our analysis shows that the proposed entropy-based features outperform previous state-of-the-art methods using video data, achieving average classification accuracies of 95.1% and 92.2% for suturing and knot tying, respectively. For accelerometer data, our method performs better for suturing, achieving 86.8% average accuracy. We also show that fusion of video and acceleration features can improve overall performance for skill assessment. Conclusion: Automated surgical skills assessment can be achieved with high accuracy using the proposed entropy features. Such a system can significantly improve the efficiency of surgical training in medical schools and teaching hospitals.
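
Approximate entropy, one of the "entropy-based" features mentioned above, has a standard textbook definition; the sketch below implements that definition on a 1-D signal and is meant only to illustrate why a regular motion trace scores lower than an irregular one, not to reproduce the paper's feature pipeline.

```python
import numpy as np

def approx_entropy(u, m: int = 2, r: float = 0.2) -> float:
    """Approximate entropy (ApEn) of a 1-D time series; lower values indicate
    more regular, predictable fluctuations. r is given as a fraction of the
    signal's standard deviation."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    r = r * u.std()

    def phi(m):
        # Embed the series into overlapping windows of length m.
        x = np.array([u[i:i + m] for i in range(N - m + 1)])
        # Chebyshev distance between every pair of windows.
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        return np.log((d <= r).mean(axis=1)).mean()

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(0)
print("regular signal  :", approx_entropy(np.sin(np.linspace(0, 20, 300))))
print("irregular signal:", approx_entropy(rng.random(300)))
```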


arXiv: Computer Vision and Pattern Recognition | 2012

Face Expression Recognition and Analysis: The State of the Art

Vinay Bettadapura

Collaboration


Dive into Vinay Bettadapura's collaborations.

Top Co-Authors

Irfan A. Essa, Georgia Institute of Technology
Yachna Sharma, Georgia Institute of Technology
Aneeq Zia, Georgia Institute of Technology
Edison Thomaz, Georgia Institute of Technology
Gregory D. Abowd, Georgia Institute of Technology
Daniel Castro, Georgia Institute of Technology
Grant Schindler, Georgia Institute of Technology
Mark A. Clements, Georgia Institute of Technology
Thomas Plötz, Georgia Institute of Technology