Swapnil Hingmire
Tata Consultancy Services
Publication
Featured research published by Swapnil Hingmire.
international acm sigir conference on research and development in information retrieval | 2014
Swapnil Hingmire; Sutanu Chakraborti
Supervised text classifiers require extensive human expertise and labeling effort. In this paper, we propose a weakly supervised text classification algorithm based on the labeling of Latent Dirichlet Allocation (LDA) topics. Our algorithm relies on the generative property of LDA. We ask an annotator to assign one or more class labels to each topic, based on its most probable words. We then classify a document based on its posterior topic proportions and the class labels of the topics. We also enhance our approach by incorporating domain knowledge in the form of labeled words. We evaluate our approach on four real-world text classification datasets. The results show that our approach is more accurate than semi-supervised techniques from previous work. A central contribution of this work is an approach that delivers effectiveness comparable to state-of-the-art supervised techniques in hard-to-classify domains, with very low overhead in terms of manual knowledge engineering.
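The classification step described in this abstract can be illustrated with a minimal sketch. It assumes that a document's posterior topic proportions are already available and that a topic carrying several class labels splits its weight evenly among them; that splitting rule, the function name `classify_by_topic_labels`, and the toy data are illustrative assumptions, not the authors' published procedure.

```python
from collections import defaultdict


def classify_by_topic_labels(topic_proportions, topic_labels):
    """Score classes by summing the topic mass assigned to each class.

    topic_proportions: list of floats, one posterior weight per topic.
    topic_labels: dict mapping topic index -> list of class labels
                  chosen by the annotator (hypothetical data layout).
    """
    scores = defaultdict(float)
    for topic_id, weight in enumerate(topic_proportions):
        labels = topic_labels.get(topic_id, [])
        for label in labels:
            # Assumption: a multi-label topic shares its weight equally.
            scores[label] += weight / len(labels)
    if not scores:
        raise ValueError("no labeled topic has any weight for this document")
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Toy example: three topics, the third labeled with both classes.
theta = [0.5, 0.3, 0.2]
labels = {0: ["sports"], 1: ["politics"], 2: ["sports", "politics"]}
predicted, class_scores = classify_by_topic_labels(theta, labels)
```

In practice the topic proportions would come from an LDA implementation's inference step; the sketch only shows how topic-level labels can be turned into a document-level decision.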
meeting of the association for computational linguistics | 2014
Swapnil Hingmire; Sutanu Chakraborti
Supervised text classification algorithms require a large number of documents labeled by humans, which involves a labor-intensive and time-consuming process. In this paper, we propose a weakly supervised algorithm in which supervision comes in the form of labeling of Latent Dirichlet Allocation (LDA) topics. We then use this weak supervision to “sprinkle” artificial words into the training documents, so that the topics identified follow the underlying class structure of the corpus, based on higher-order word associations. We evaluate this approach to improving text classification performance on three real-world datasets.
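As a rough illustration of the “sprinkling” idea, the sketch below appends an artificial class-marker token to a tokenized training document. The marker format, the `copies` parameter, and the function name `sprinkle` are hypothetical choices made for this example; the abstract does not specify how many artificial words are added or how they are named.

```python
def sprinkle(doc_tokens, class_label, copies=1):
    """Return a copy of the document with artificial class words appended.

    doc_tokens: list of word tokens for one training document.
    class_label: class inferred for the document from its labeled topics.
    copies: how many marker tokens to add (an assumed tuning knob).
    """
    marker = f"__class_{class_label}__"  # hypothetical marker format
    return list(doc_tokens) + [marker] * copies


# Toy example: a two-word document sprinkled with two marker tokens.
sprinkled = sprinkle(["goal", "match"], "sports", copies=2)
```

A topic model retrained on the sprinkled documents sees the marker co-occurring with the words of each class, which is one way such artificial words could pull topics toward the class structure.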
forum for information retrieval evaluation | 2016
Rajiv Srivastava; Swapnil Hingmire; Girish Keshav Palshikar; Saheb Chaurasia; Arati M. Dixit
The effectiveness of recommendation systems is improving with the incorporation of richer context. Frequentist recommendation methods such as Markov models cannot efficiently use context and preference sequences over items at the same time because of state space explosion. At the other end, abstractionist models such as Matrix Factorization, where each item or user is represented as a set of abstract features, are difficult to explain. An example is recommending Web Based Trainings (WBTs) to employees, similar to Massive Open Online Courses (MOOCs), where using sequence information in the recommendation of WBTs would help users gradually build expertise in their area of interest. For training recommendation, it is important to identify the expertise level held in a technical area, which represents a state, and the possible sequences of trainings, which represent transitions, in terms of real-world entities such as trainings and associated features. Alternatively, the model can estimate expertise as a mixture over a tractable set of latent interests in terms of trainings completed and contextual features such as the training sequences, keywords and user profile. To the best of our knowledge, state-of-the-art recommendation methods do not consider both explicit context and sequence information in a single model. In this paper, we propose a Context and Sequence Aware Recommendation System (CSRS) based on a latent topic modelling framework, identifying topic memberships for items, contextual features, and user interests. We demonstrate the benefits of incorporating both context and sequence of items for recommendation on three real-world datasets.
text, speech and dialogue | 2018
Ajay Gupta; Devendra Verma; Sachin Pawar; Sangameshwar Patil; Swapnil Hingmire; Girish Keshav Palshikar; Pushpak Bhattacharyya
Legal court judgements have multiple participants (e.g., judge, complainant, petitioner, lawyer). They may be referred to in multiple ways; e.g., the same person may be referred to as lawyer, counsel, learned counsel, or advocate, as well as by his/her proper name. For any analysis of legal texts, it is important to resolve such multiple mentions, which are coreferences of the same participant. In this paper, we propose a supervised approach to this challenging task. To avoid human annotation effort for Legal domain data, we exploit the ACE 2005 dataset by mapping its entities to participants in the Legal domain. We use a basic Transfer Learning paradigm by training classification models on general-purpose text (news in the ACE 2005 data) and applying them to Legal domain text. We evaluate our approach on a sample annotated test dataset in the Legal domain and demonstrate that it outperforms state-of-the-art baselines.
international acm sigir conference on research and development in information retrieval | 2013
Swapnil Hingmire; Sandeep Chougule; Girish Keshav Palshikar; Sutanu Chakraborti
conference of the european chapter of the association for computational linguistics | 2017
Nitin Ramrakhiyani; Sachin Pawar; Swapnil Hingmire; Girish Keshav Palshikar
Archive | 2014
Rajiv Radheyshyam Srivastava; Girish Keshav Palshikar; Sangameshwar Patil; Pragati Hiralal Dungarwal; Abhay Sodani; Sachin Pawar; Savita Suhas Bhat; Swapnil Hingmire
meeting of the association for computational linguistics | 2018
Sangameshwar Patil; Sachin Pawar; Swapnil Hingmire; Girish Keshav Palshikar; Vasudeva Varma; Pushpak Bhattacharyya
the florida ai research society | 2017
K. V. S. Dileep; Swapnil Hingmire; Sutanu Chakraborti
international conference on knowledge capture | 2017
Swapnil Hingmire; Sutanu Chakraborti; Girish Keshav Palshikar; Abhay Sodani