Abhijit Mishra

Indian Institute of Technology Bombay

Publication

Featured research published by Abhijit Mishra.

meeting of the association for computational linguistics | 2014

Measuring Sentiment Annotation Complexity of Text

Aditya Joshi; Abhijit Mishra; Nivvedan Senthamilselvan; Pushpak Bhattacharyya

The effort required for a human annotator to detect sentiment is not uniform for all texts, irrespective of his/her expertise. We aim to predict a score that quantifies this effort, using linguistic properties of the text. Our proposed metric is called Sentiment Annotation Complexity (SAC). As for training data, since any direct judgment of complexity by a human annotator is fraught with subjectivity, we rely on cognitive evidence from eye-tracking. The sentences in our dataset are labeled with SAC scores derived from eye-fixation duration. Using linguistic features and annotated SACs, we train a regressor that predicts the SAC with a best mean error rate of 22.02% for five-fold cross-validation. We also study the correlation between a human annotator’s perception of complexity and a machine’s confidence in polarity determination. The merit of our work lies in (a) deciding the sentiment annotation cost in, for example, a crowdsourcing setting, (b) choosing the right classifier for sentiment prediction.
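
The evaluation described above (a regressor scored by mean error under five-fold cross-validation) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the regressor, the features, or the exact error formula, so the sketch uses a single-feature least-squares fit, a mean absolute percentage error, and made-up toy data. The names `fit_line`, `mean_percentage_error`, and `k_fold_error` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def mean_percentage_error(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes non-zero targets)."""
    return 100.0 * mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def k_fold_error(xs, ys, k=5):
    """Average held-out percentage error over k contiguous folds."""
    n = len(xs)
    bounds = [round(i * n / k) for i in range(k + 1)]
    errors = []
    for i in range(k):
        lo, hi = bounds[i], bounds[i + 1]
        # Train on everything outside the current fold.
        train_x, train_y = xs[:lo] + xs[hi:], ys[:lo] + ys[hi:]
        a, b = fit_line(train_x, train_y)
        preds = [a * x + b for x in xs[lo:hi]]
        errors.append(mean_percentage_error(ys[lo:hi], preds))
    return mean(errors)


# Toy data (an assumption, not from the paper): sentence length as the only
# feature, with a made-up complexity score that grows linearly with length.
lengths = [5, 8, 12, 15, 20, 22, 27, 30, 34, 40]
scores = [1.0 + 0.2 * n for n in lengths]
cv_error = k_fold_error(lengths, scores, k=5)
```

Because the toy scores are an exact linear function of the feature, the cross-validated error here is essentially zero; real eye-tracking-derived scores would not behave this way.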

workshop on statistical machine translation | 2014

The IIT Bombay Hindi-English Translation System at WMT 2014

Piyush Dungarwal; Rajen Chatterjee; Abhijit Mishra; Anoop Kunchukuttan; Ritesh M. Shah; Pushpak Bhattacharyya

In this paper, we describe our English-Hindi and Hindi-English statistical systems submitted to the WMT14 shared task. The core components of our translation systems are phrase-based (Hindi-English) and factored (English-Hindi) SMT systems. We show that the use of number, case and Tree Adjoining Grammar information as factors helps to improve English-Hindi translation, primarily by generating morphological inflections correctly. We show improvements to the translation systems using pre-processing and post-processing components. To overcome the structural divergence between English and Hindi, we preorder the source-side sentence to conform to the target language word order. Since the parallel corpus is limited, many words are not translated. We translate out-of-vocabulary words and transliterate named entities in a post-processing stage. We also investigate ranking of translations from multiple systems to select the best translation.

meeting of the association for computational linguistics | 2014

A cognitive study of subjectivity extraction in sentiment annotation

Abhijit Mishra; Aditya Joshi; Pushpak Bhattacharyya

Existing sentiment analysers are weak AI systems: they try to capture the functionality of the human sentiment detection faculty, without worrying about how such a faculty is realized in the hardware of the human. These analysers are agnostic of the actual cognitive processes involved. This, however, does not deliver when applications demand an order-of-magnitude improvement in accuracy, as well as insight into the characteristics of the sentiment detection process. In this paper, we present a cognitive study of sentiment detection from the perspective of strong AI. We study the sentiment detection process of a set of human “sentiment readers”. Using eye-tracking, we show that on the way to sentiment detection, humans first extract subjectivity. They focus attention on a subset of sentences before arriving at the overall sentiment. This they do either through “anticipation”, where sentences are skipped during the first pass of reading, or through “homing”, where a subset of the sentences is read over multiple passes, or through both. “Homing” behaviour is also observed at the sub-sentence level in complex sentiment phenomena like sarcasm.

conference on computational natural language learning | 2016

Leveraging Cognitive Features for Sentiment Analysis

Abhijit Mishra; Diptesh Kanojia; Seema Nagar; Kuntal Dey; Pushpak Bhattacharyya

Sentiments expressed in user-generated short text and sentences are nuanced by subtleties at lexical, syntactic, semantic and pragmatic levels. To address this, we propose to augment traditional features used for sentiment analysis and sarcasm detection, with cognitive features derived from the eye-movement patterns of readers. Statistical classification using our enhanced feature set improves the performance (F-score) of polarity detection by a maximum of 3.7% and 9.3% on two datasets, over the systems that use only traditional features. We perform feature significance analysis, and experiment on a held-out dataset, showing that cognitive features indeed empower sentiment analyzers to handle complex constructs.

meeting of the association for computational linguistics | 2017

Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network

Abhijit Mishra; Kuntal Dey; Pushpak Bhattacharyya

Cognitive NLP systems, i.e., NLP systems that make use of behavioral data, augment traditional text-based features with cognitive features extracted from eye-movement patterns, EEG signals, brain-imaging etc. Such extraction of features is typically manual. We contend that manual extraction of features may not be the best way to tackle text subtleties that characteristically prevail in complex classification tasks like Sentiment Analysis and Sarcasm Detection, and that even the extraction and choice of features should be delegated to the learning system. We introduce a framework to automatically extract cognitive features from the eye-movement/gaze data of human readers reading the text and use them as features along with textual features for the tasks of sentiment polarity and sarcasm detection. Our proposed framework is based on a Convolutional Neural Network (CNN). The CNN learns features from both gaze and text and uses them to classify the input text. We test our technique on published sentiment and sarcasm labeled datasets, enriched with gaze information, to show that using a combination of automatically learned text and gaze features often yields better classification performance over (i) CNN-based systems that rely on text input alone and (ii) existing systems that rely on handcrafted gaze and textual features.

meeting of the association for computational linguistics | 2016

Harnessing Cognitive Features for Sarcasm Detection

Abhijit Mishra; Diptesh Kanojia; Seema Nagar; Kuntal Dey; Pushpak Bhattacharyya

In this paper, we propose a novel mechanism for enriching the feature vector, for the task of sarcasm detection, with cognitive features extracted from eye-movement patterns of human readers. Sarcasm detection has been a challenging research problem, and its importance for NLP applications such as review summarization, dialog systems and sentiment analysis is well recognized. Sarcasm can often be traced to incongruity that becomes apparent as the full sentence unfolds. This presence of incongruity, implicit or explicit, affects the way readers' eyes move through the text. We observe the difference in the behaviour of the eye while reading sarcastic and non-sarcastic sentences. Motivated by this observation, we augment traditional linguistic and stylistic features for sarcasm detection with the cognitive features obtained from readers' eye-movement data. We perform statistical classification using the enhanced feature set so obtained. The augmented cognitive features improve sarcasm detection by 3.7% (in terms of F-score), over the performance of the best reported system.

Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning | 2016

Leveraging Annotators' Gaze Behaviour for Coreference Resolution

Joe Cheri; Abhijit Mishra; Pushpak Bhattacharyya

This paper aims at utilizing cognitive information obtained from the eye-movement behavior of annotators for automatic coreference resolution. We first record the eye-movement behavior of multiple annotators resolving coreferences in 22 documents selected from the MUC dataset. By inspecting the gaze-regression profiles of our participants, we observe how regressive saccades account for the selection of potential antecedents for a certain anaphoric mention. Based on this observation, we then propose a heuristic to utilize gaze data to prune mention pairs in the mention-pair model, a popular paradigm for automatic coreference resolution. Consistent improvement in accuracy across several classifiers is observed with our heuristic, demonstrating why cognitive data can be useful for a difficult task like coreference resolution.

international joint conference on natural language processing | 2015

A Computational Approach to Automatic Prediction of Drunk-Texting

Aditya Joshi; Abhijit Mishra; Balamurali Ar; Pushpak Bhattacharyya; Mark James Carman

Alcohol abuse may lead to unsociable behavior such as crime, drunk driving, or privacy leaks. We introduce automatic drunk-texting prediction as the task of identifying whether a text was written when under the influence of alcohol. We experiment with tweets labeled using hashtags as distant supervision. Our classifiers use a set of N-gram and stylistic features to detect drunk tweets. Our observations present the first quantitative evidence that text contains signals that can be exploited to detect drunk-texting.
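
As a rough illustration of what "N-gram and stylistic features" for a tweet might look like, here is a minimal sketch. The specific features are assumptions chosen for illustration (word unigrams and bigrams, capital-letter ratio, elongated words, token count); the abstract does not list the paper's actual feature set, and `extract_features` is a hypothetical helper, not the authors' code.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Map a tweet to a feature dict of n-gram counts and stylistic cues."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    feats = Counter()

    # N-gram features: counts of word unigrams and bigrams.
    for n in (1, 2):
        for gram in ngrams(tokens, n):
            feats[f"ng{n}={gram}"] += 1

    # Stylistic features (illustrative assumptions, not the paper's list).
    letters = [c for c in text if c.isalpha()]
    feats["style:capital_ratio"] = (
        sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    )
    # Words with a character repeated three or more times, e.g. "sooo".
    feats["style:elongated_words"] = sum(
        1 for t in tokens if re.search(r"(.)\1\1", t)
    )
    feats["style:token_count"] = len(tokens)
    return dict(feats)


features = extract_features("Sooo HAPPY right now")
```

A feature dict of this shape could then be fed to any standard classifier. Under the distant-supervision setup the abstract mentions, the label-bearing hashtags would presumably need to be removed from the text before feature extraction so the classifier cannot simply read the label off the input.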

language resources and evaluation | 2014