Publication


Featured research published by Amir Tamrakar.


Computer Vision and Pattern Recognition | 2012

Evaluation of low-level features and their combinations for complex event detection in open source videos

Amir Tamrakar; Saad Ali; Qian Yu; Jingen Liu; Omar Javed; Ajay Divakaran; Hui Cheng; Harpreet S. Sawhney

Low-level appearance as well as spatio-temporal features, appropriately quantized and aggregated into Bag-of-Words (BoW) descriptors, have been shown to be effective in many detection and recognition tasks. However, their efficacy for complex event recognition in unconstrained videos has not been systematically evaluated. In this paper, we use the NIST TRECVID Multimedia Event Detection (MED11 [1]) open source dataset, containing annotated data for 15 high-level events, as the standardized test bed for evaluating the low-level features. This dataset contains a large number of user-generated video clips. We consider 7 different low-level features, both static and dynamic, using BoW descriptors within an SVM approach for event detection. We present performance results on the 15 MED11 events for each of the features as well as their combinations using a number of early and late fusion strategies, and discuss their strengths and limitations.
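As a rough illustration of the pipeline the abstract describes (quantize local descriptors against a learned codebook, aggregate into BoW histograms, classify with an SVM, fuse features early or late), here is a minimal sketch. The descriptor dimensions, codebook size, and the duplicated second feature channel are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook and aggregate
    them into an L1-normalized Bag-of-Words histogram."""
    words = codebook.predict(descriptors)   # nearest visual word per descriptor
    hist, _ = np.histogram(words, bins=np.arange(codebook.n_clusters + 1))
    return hist / max(hist.sum(), 1)

# Toy stand-ins: one (n_i x 64) array of local descriptors per video.
rng = np.random.default_rng(0)
videos = [rng.normal(size=(200, 64)) for _ in range(40)]
labels = rng.integers(0, 2, size=40)        # event present / absent

codebook = KMeans(n_clusters=100, n_init=5, random_state=0)
codebook.fit(np.vstack(videos))             # learn the visual vocabulary

X = np.array([bow_histogram(v, codebook) for v in videos])

# Early fusion: concatenate histograms from two feature channels
# (the second channel is duplicated here purely as a placeholder).
X_early = np.hstack([X, X])
early_clf = SVC(kernel="rbf", probability=True).fit(X_early, labels)

# Late fusion: average the per-feature classifier scores instead.
clf_a = SVC(probability=True).fit(X, labels)
clf_b = SVC(probability=True).fit(X, labels)
late_scores = 0.5 * clf_a.predict_proba(X)[:, 1] + 0.5 * clf_b.predict_proba(X)[:, 1]
```

The trade-off the paper evaluates is visible even in this toy form: early fusion lets one classifier model feature interactions, while late fusion keeps per-feature classifiers independent and combines only their scores.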


Workshop on Applications of Computer Vision | 2013

Video event recognition using concept attributes

Jingen Liu; Qian Yu; Omar Javed; Saad Ali; Amir Tamrakar; Ajay Divakaran; Hui Cheng; Harpreet S. Sawhney

We propose to use action, scene and object concepts as semantic attributes for classification of video events in in-the-wild content, such as YouTube videos. We model events using a variety of complementary semantic attribute features developed in a semantic concept space. Our contribution is to systematically demonstrate the advantages of this concept-based event representation (CBER) in applications of video event classification and understanding. Specifically, CBER has better generalization capability, which enables it to recognize events from only a few training examples. In addition, CBER makes it possible to recognize a novel event without any training examples (i.e., zero-shot learning). We further show that our proposed enhanced event model improves zero-shot learning. Furthermore, CBER provides a straightforward way to perform event recounting/understanding. We use the TRECVID Multimedia Event Detection (MED11) open source event definitions and datasets as our test bed and show results on over 1400 hours of videos.
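A hedged sketch of the core idea: each video becomes a vector of concept detector scores, and a novel event can be recognized zero-shot by comparing that vector against a hand-written concept definition of the event. The concept names, event definitions, and scores below are invented for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical concept bank: action / scene / object detectors.
CONCEPTS = ["jumping", "running", "kitchen", "outdoor", "bicycle", "cake"]

def concept_signature(detector_scores):
    """Map a video into the semantic concept space: one score per concept."""
    v = np.asarray(detector_scores, dtype=float)
    return v / (np.linalg.norm(v) + 1e-9)

# Zero-shot: describe a novel event by the concepts it implies, no training clips.
event_definition = {
    "birthday_party": {"kitchen": 1.0, "cake": 1.0},
    "bike_trick":     {"outdoor": 1.0, "bicycle": 1.0, "jumping": 0.5},
}

def zero_shot_score(video_scores, event):
    """Cosine similarity between a video's concept signature and an event prototype."""
    proto = np.array([event_definition[event].get(c, 0.0) for c in CONCEPTS])
    proto /= np.linalg.norm(proto) + 1e-9
    return float(concept_signature(video_scores) @ proto)

video = [0.1, 0.0, 0.7, 0.1, 0.0, 0.9]   # mock detector outputs for one clip
print(max(event_definition, key=lambda e: zero_shot_score(video, e)))
```

Because the representation is built from named concepts, the same signature that classifies the event also supports recounting: the highest-weighted concepts explain why the clip matched.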


Affective Computing and Intelligent Interaction | 2015

The Tower Game Dataset: A multimodal dataset for analyzing social interaction predicates

David A. Salter; Amir Tamrakar; Behjat Siddiquie; Mohamed R. Amer; Ajay Divakaran; Brian Lande; Darius Mehri

We introduce the Tower Game Dataset for computational modeling of social interaction predicates. Existing research in affective computing has focused primarily on recognizing the emotional and mental state of a human based on external behaviors. Recent research in the social science community argues that engaged and sustained social interactions require the participants to jointly coordinate their verbal and non-verbal behaviors. With this as our guiding principle, we collected the Tower Game Dataset, consisting of multimodal recordings of two players participating in a tower-building game, communicating and collaborating with each other in the process. The format of the game was specifically chosen because it elicits spontaneous communication from the participants through social interaction predicates such as joint attention and entrainment. The dataset will be made public, and we believe it will foster new research in the area of computational social interaction modeling.
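As a toy illustration of one of the named predicates, the sketch below scores entrainment as the peak lagged correlation between two players' motion-energy time series. The signals and lag window are assumptions for illustration, not the dataset's actual annotation protocol.

```python
import numpy as np

def entrainment_score(sig_a, sig_b, max_lag=30):
    """Proxy for entrainment: peak normalized cross-correlation between two
    players' motion-energy time series, searched over a small lag window."""
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-9)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-9)

    def lagged_corr(lag):
        x, y = (a[lag:], b[:len(b) - lag]) if lag >= 0 else (a[:len(a) + lag], b[-lag:])
        n = min(len(x), len(y))
        return float(np.dot(x[:n], y[:n]) / n)

    return max(lagged_corr(l) for l in range(-max_lag, max_lag + 1))

t = np.linspace(0, 10, 300)
player_a = np.sin(t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
player_b = np.sin(t - 0.3)                      # follows player A with a short delay
print(round(entrainment_score(player_a, player_b), 2))   # near 1.0 -> entrained
```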


International Journal of Computer Vision | 2018

Deep Multimodal Fusion: A Hybrid Approach

Mohamed R. Amer; Timothy J. Shields; Behjat Siddiquie; Amir Tamrakar; Ajay Divakaran; Sek M. Chai

We propose a novel hybrid model that exploits the strength of discriminative classifiers along with the representational power of generative models. Our focus is on detecting multimodal events in time-varying sequences as well as generating missing data in any of the modalities. Discriminative classifiers have been shown to achieve higher performance than the corresponding generative likelihood-based classifiers. On the other hand, generative models learn a rich informative space that allows for data generation and joint feature representation, which discriminative models lack. We propose a new model that jointly optimizes the representation space using a hybrid energy function. We employ a model based on Restricted Boltzmann Machines (RBMs) to learn a shared representation across multiple modalities with time-varying data. The Conditional RBM (CRBM) is an extension of the RBM that accounts for short-term temporal phenomena. Our hybrid model augments CRBMs with a discriminative component for classification. To this end, we propose a novel Multimodal Discriminative CRBM (MMDCRBM) model. First, we train the MMDCRBM using labeled data, training each modality and then a fusion layer. Second, we exploit the generative capability of the MMDCRBM by activating the trained model to generate the lower-level data corresponding to the specific label that most closely matches the actual input data. We evaluate our approach on the ChaLearn dataset (audio-mocap), the Tower Game dataset (mocap-mocap), and three multimodal toy datasets. We report classification accuracy, generation accuracy, and localization accuracy, and demonstrate superiority over state-of-the-art methods.
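A minimal sketch of the hybrid idea, under heavy simplification: a shared hidden layer scores (input, label) pairs by free energy, and classification picks the label with the lowest free energy, in the spirit of discriminative RBMs. This is not the MMDCRBM itself; temporal conditioning, the fusion layer, and training are all omitted, and the dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

class HybridRBM:
    """Much-simplified stand-in for a hybrid generative/discriminative model:
    the same weights define an energy over (input, label) pairs, so they serve
    both roles. Biases on the visible and label units are omitted for brevity."""

    def __init__(self, n_vis, n_hid, n_cls):
        self.W = 0.01 * rng.normal(size=(n_vis, n_hid))  # visible-hidden weights
        self.U = 0.01 * rng.normal(size=(n_cls, n_hid))  # label-hidden weights
        self.b = np.zeros(n_hid)                         # hidden biases

    def free_energy(self, v, y):
        """F(v, y): lower free energy means a more probable (v, y) pair."""
        act = v @ self.W + self.U[y] + self.b
        return -np.logaddexp(0.0, act).sum()             # -sum softplus(act)

    def predict(self, v):
        """Discriminative rule: p(y | v) is proportional to exp(-F(v, y))."""
        n_cls = self.U.shape[0]
        return int(np.argmin([self.free_energy(v, y) for y in range(n_cls)]))

model = HybridRBM(n_vis=8, n_hid=16, n_cls=3)
v = rng.normal(size=8)    # e.g., concatenated audio + mocap features for one frame
print(model.predict(v))   # untrained, so the label is arbitrary here
```

The generative half of the paper's model goes the other way: clamp a label, then sample the visible units to reconstruct the missing modality, which the sketch above does not implement.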


Learning Analytics and Knowledge | 2016

Multimodal analytics to study collaborative problem solving in pair programming

Shuchi Grover; Marie A. Bienkowski; Amir Tamrakar; Behjat Siddiquie; David A. Salter; Ajay Divakaran

Collaborative problem solving (CPS) is seen as a key skill in K-12 education, in computer science as well as other subjects. Efforts to introduce children to computing rely on pair programming as a way of having young learners engage in CPS. Characteristics of quality collaboration are joint exploring or understanding, joint representation, and joint execution. We present a data-driven approach to assessing and elucidating collaboration through modeling of multimodal student behavior and performance data.
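The abstract describes a data-driven model over fused multimodal behavior features; below is a hedged sketch of what such a classifier could look like. The feature names and the three-way label scheme are invented for illustration, not the study's instrumentation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical per-segment features fused across modalities:
# [speech_overlap, gaze_to_partner, gesture_rate, keystroke_rate, shared_edits]
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = rng.integers(0, 3, size=120)   # 0: joint exploring, 1: joint representation,
                                   # 2: joint execution (invented label scheme)

clf = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))
clf.fit(X[:100], y[:100])
print(clf.score(X[100:], y[100:]))  # accuracy on a held-out toy split
```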


International Joint Conference on Artificial Intelligence | 2018

Aesop: A Visual Storytelling Platform for Conversational AI

Tim Meo; Aswin Raghavan; David A. Salter; Alex Tozzo; Amir Tamrakar; Mohamed R. Amer

We present a new collaborative visual storytelling platform, Aesop, for direction and animation. Aesop consists of a language parser, human gesture monitoring, composition graphs, a dialogue state manager, and interactive 3D animation software. Aesop thus enables 3D spatial and temporal reasoning, both of which are essential for storytelling. Our key innovation is to enable conversational AI using both verbal and non-verbal communication, which opens up research in language, vision, and planning.
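The paper's component list does not spell out the composition-graph internals; as a purely hypothetical sketch, a parsed command could update spatial relations in a scene graph that the animation layer then reads. Every name below is invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One entity in a composition graph: a scene object plus spatial relations."""
    name: str
    position: tuple = (0.0, 0.0, 0.0)
    relations: dict = field(default_factory=dict)   # e.g., {"on_top_of": "table"}

class CompositionGraph:
    """Toy stand-in for Aesop-style spatial state: verbal commands (and, in the
    real system, gestures) mutate a graph that the 3D renderer reads."""

    def __init__(self):
        self.nodes = {}

    def add(self, name, **kw):
        self.nodes[name] = Node(name, **kw)

    def apply_command(self, subject, relation, obj):
        """e.g., 'put the cup on the table' -> ('cup', 'on_top_of', 'table')"""
        self.nodes[subject].relations[relation] = obj
        # A real system would also solve for a concrete 3D placement here.

scene = CompositionGraph()
scene.add("table")
scene.add("cup")
scene.apply_command("cup", "on_top_of", "table")
print(scene.nodes["cup"].relations)
```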


Archive | 2010

Method for computing food volume in a method for analyzing food

Amir Tamrakar; Harpreet S. Sawhney; Qian Yu; Ajay Divakaran


Proc. of NIST TRECVID Workshop, Gaithersburg, USA | 2012

SRI-Sarnoff AURORA System at TRECVID 2012: Multimedia Event Detection and Recounting

Hui Cheng; Jingen Liu; Saad Ali; Omar Javed; Qian Yu; Amir Tamrakar; Ajay Divakaran; Harpreet S. Sawhney; R. Manmatha; James Allan; Alexander G. Hauptmann; Mubarak Shah; Subhabrata Bhattacharya; Afshin Dehghan; Gerald Friedland; Benjamin Elizalde; Trevor Darrell; Michael J. Witbrock; Jon Curtis


Archive | 2014

Real-Time Object Detection, Tracking and Occlusion Reasoning

Ajay Divakaran; Qian Yu; Amir Tamrakar; Harpreet S. Sawhney; Jiejie Zhu; Omar Javed; Jingen Liu; Hui Cheng; Jayakrishnan Eledath


Archive | 2013

Classification, Search, and Retrieval of Complex Video Events

Hui Cheng; Harpreet S. Sawhney; Ajay Divakaran; Qian Yu; Jingen Liu; Amir Tamrakar; Saad Ali; Omar Javed
