Sukomal Pal
Indian School of Mines
Publication
Featured research published by Sukomal Pal.
Information Processing and Management | 2016
Manajit Chakraborty; Sukomal Pal; Rahul Pramanik; C. Ravindranath Chowdary
Highlights: Social networking and instant multimedia communication are integral to online existence. Spamming is a new menace in messaging, blogs, video sites, internet telephony, etc. The article surveys recent developments in social spam detection and mitigation. A qualitative comparison of different models and their performance is presented. A roadmap for devising newer anti-spam techniques in the future is provided.
Abstract: Spam in recent years has pervaded all forms of digital communication. The increase in the user base of social platforms like Facebook, Twitter, YouTube, etc., has opened new avenues for spammers. The liberty to contribute content freely has encouraged spammers to exploit social platforms for their own benefit. E-mail and web search engines, being the early victims of spam, have attracted serious attention from information scientists for quite some time. A substantial amount of research has been directed at combating spam on these two platforms. Social networks, being quite different in nature from the earlier two, have different kinds of spam, and spam-fighting techniques from those domains seldom work. Moreover, due to the continuous and rapid evolution of social media, spam itself evolves very fast, posing a great challenge to the community. Although the area of social spam is relatively new, there have been a number of attempts in the recent past and many more are certain to come in the near future. This paper surveys the recent developments in the area of social spam detection and mitigation, its theoretical models and applications, along with their qualitative comparison. We present the state of the art and attempt to identify challenges to be addressed, as the nature and content of spam are bound to get more complicated.
ACM Transactions on Asian Language Information Processing | 2010
Prasenjit Majumder; Mandar Mitra; Dipasree Pal; Ayan Bandyopadhyay; Samaresh Maiti; Sukomal Pal; Deboshree Modak; Sucharita Sanyal
The aim of the Forum for Information Retrieval Evaluation (FIRE) is to create an evaluation framework in the spirit of TREC (Text REtrieval Conference), CLEF (Cross-Language Evaluation Forum), and NTCIR (NII Test Collection for IR Systems), for Indian language Information Retrieval. The first evaluation exercise conducted by FIRE was completed in 2008. This article describes the test collections used at FIRE 2008, summarizes the approaches adopted by various participants, discusses the limitations of the datasets, and outlines the tasks planned for the next iteration of FIRE.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2008
Prasenjit Majumder; Mandar Mitra; Dipasree Pal; Ayan Bandyopadhyay; Samaresh Maiti; Sukanya Mitra; Aparajita Sen; Sukomal Pal
The aim of the Forum for Information Retrieval Evaluation (FIRE) is to create a Cranfield-like evaluation framework in the spirit of TREC, CLEF and NTCIR, for Indian Language Information Retrieval. For the first year, six Indian languages have been selected: Bengali, Hindi, Marathi, Punjabi, Tamil, and Telugu. This poster describes the tasks as well as the document and topic collections that are to be used at the FIRE workshop.
SIGKDD Explorations | 2005
Sukomal Pal; Aditya Bagchi
Traditionally, support is considered to be the standard measure for frequent itemset generation in association rule mining. This paper provides a new measure called togetherness, where dissociation among items is also considered as a parameter in the frequent itemset generation process. Results of performance analysis show that association against dissociation is a more pragmatic approach and discovers truly associated candidate itemsets. The second part of the paper extends this togetherness measure to the domain of variable thresholds. Here, like variable minimum support, a variable minimum togetherness is proposed, where the minimum value decreases as the itemset size increases. A simple and pragmatic process is described, which can be easily implemented. It also gives users ample control. The necessary changes and extensions to the existing algorithms have been made to establish the concepts. Here as well, results of performance analysis justify the approach.
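As a rough illustration of the idea in this abstract, the sketch below contrasts support with a togetherness-style score. The abstract does not give the formula, so the definition used here (transactions containing all items of the itemset, divided by transactions containing at least one of them) is an assumption, as are the function names and the geometric decay used for the variable threshold.

```python
def support(itemset, transactions):
    """Fraction of all transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def togetherness(itemset, transactions):
    """ASSUMED definition, not taken from the paper: transactions containing
    all items, divided by transactions containing at least one item.
    Transactions where only some of the items occur count as dissociation."""
    items = set(itemset)
    together = sum(1 for t in transactions if items <= set(t))
    touched = sum(1 for t in transactions if items & set(t))
    return together / touched if touched else 0.0


def min_togetherness(size, base=0.6, decay=0.8):
    """Hypothetical variable threshold that decreases as itemset size grows."""
    return base * decay ** (size - 1)


transactions = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}, {"c"}]
pair = {"a", "b"}
is_frequent = togetherness(pair, transactions) >= min_togetherness(len(pair))
```

With these toy transactions the pair appears together in two of the four transactions that mention either item, so the score reflects dissociation that plain support ignores.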
INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval | 2010
Debasis Ganguly; Johannes Leveling; Gareth J. F. Jones; Sauparna Palchowdhury; Sukomal Pal; Mandar Mitra
We describe the participation of Dublin City University (DCU) and Indian Statistical Institute (ISI) in INEX 2010 for the Ad-hoc and Data Centric tracks. The main contributions of this paper are: i) a simplified version of the Hierarchical Language Model (HLM), which scores XML elements with a combined probability of generating the given query from the element itself and from the top-level article node, is shown to outperform the baselines of LM and VSM scoring of XML elements; ii) the Expectation Maximization (EM) feedback in LM is shown to be the most effective on the domain-specific collection of IMDB; iii) automated removal of sentences indicating aspects of irrelevance from the narratives of INEX ad hoc topics is shown to improve retrieval effectiveness.
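The element scoring described in contribution (i) can be pictured as a mixture of language models. The following is a minimal sketch under stated assumptions, not the authors' implementation: the linear interpolation, the added collection-level smoothing term, the weights, and all function names are hypothetical.

```python
import math
from collections import Counter


def term_prob(term, counts, total):
    """Maximum-likelihood probability of a term in a bag of tokens."""
    return counts.get(term, 0) / total if total else 0.0


def score_element(query, element, article, collection,
                  w_element=0.5, w_article=0.3):
    """Log-probability of generating the query from a mixture of the
    element, its top-level article, and the collection (assumed weights)."""
    w_collection = 1.0 - w_element - w_article
    elem_counts = Counter(element)
    art_counts = Counter(article)
    coll_counts = Counter(collection)
    score = 0.0
    for term in query:
        p = (w_element * term_prob(term, elem_counts, len(element))
             + w_article * term_prob(term, art_counts, len(article))
             + w_collection * term_prob(term, coll_counts, len(collection)))
        if p == 0.0:
            return float("-inf")  # term unseen at every level
        score += math.log(p)
    return score


article = "xml retrieval with language models for focused search".split()
element_a = "xml retrieval".split()
element_b = "focused search".split()
collection = article + "other words about movies".split()

score_a = score_element(["xml"], element_a, article, collection)
score_b = score_element(["xml"], element_b, article, collection)
```

In this toy case both elements inherit some probability mass for "xml" from their shared article, but the element that actually contains the term ranks higher.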
International Conference on Mining Intelligence and Knowledge Exploration | 2015
Sudha Shanker Prasad; Jitendra Kumar; Dinesh Kumar Prabhakar; Sukomal Pal
This paper describes the system we used for the Shared Task on Sentiment Analysis in Indian Languages (SAIL) Tweets at MIKE-2015. Twitter is one of the most popular platforms that allow users to share their opinions in the form of tweets. Since it restricts users to 140 characters, tweets are very short, which makes the opinions and sentiments they carry hard to analyze. We use a Twitter training dataset in the Indian language Hindi and apply data mining approaches to analyze the sentiments. We used the state-of-the-art data mining tool Weka to automatically classify the sentiment of Hindi tweets as positive, negative or neutral.
Artificial Intelligence Review | 2017
Ambedkar Kanapala; Sukomal Pal; Rajendra Pamula
The enormous amount of online information available in the legal domain has made legal text processing an important area of research. In this paper, we attempt to survey different text summarization techniques that have been developed in the recent past. We put special emphasis on the issue of legal text summarization, as it is one of the most important areas in the legal domain. We start with a general introduction to text summarization, briefly touch on the recent advances in single and multi-document summarization, and then delve into extraction-based legal text summarization. We discuss different datasets and metrics used in summarization and compare the performance of different approaches, first in general and then with a focus on legal text. We also mention highlights of different summarization techniques. We briefly cover a few software tools used in legal text summarization. We finally conclude with some future research directions.
Focused Access to XML Documents | 2008
Sukomal Pal; Mandar Mitra
This paper describes the work that we did at Indian Statistical Institute towards XML retrieval for INEX 2007. As a continuation of our INEX 2006 work, we applied the Vector Space Model and enhanced our text retrieval system (SMART) to retrieve XML elements against the INEX Adhoc queries. Like last year, we considered Content-Only (CO) queries and submitted two runs for the FOCUSED sub-task. The baseline run does retrieval at the document level; for the second run, we submitted our first attempt at element-level retrieval. This run uses a very naive approach and performs poorly, but the relative performance of the baseline run was fairly encouraging. After the official submissions, we conducted a few more experiments involving both document-level and element-level retrieval. These additional runs yield some improvements in retrieval effectiveness. We report the results of those runs in this paper. Though our document-level runs are promising, the element-level runs are still far from satisfactory. Our next step will be to explore ways to improve element-level retrieval.
Proceedings of the Second ACM IKDD Conference on Data Sciences | 2015
Rahul Pramanik; Sukomal Pal; Manajit Chakraborty
In information retrieval, keyword-based queries often fail to capture the actual information need, especially when the need is very specific. Using natural language, however, a user can clearly tell what she wants (positive part) and what she does not (negative parts). We propose techniques for automatic removal of negative parts and for query augmentation with judicious term inclusion-exclusion from negative parts. Experiments conducted on standard datasets like TREC, ROBUST, and WT10G demonstrate that the proposed techniques yield substantial performance gains, which are often statistically significant.
Forum for Information Retrieval Evaluation | 2014
Ambedkar Kanapala; Sukomal Pal
With the proliferation of online discussion forums, legal data on the Web is increasing. A number of online sites provide platforms for discussion, counselling and assistance pertaining to legal problems, where a lay person can ask questions and/or seek assistance and volunteers share their views and expert opinions. Although these forums can provide legal help at a rudimentary level, an increasing number of users consult them, and the forums often complement their legal information needs. The lack of an easy natural language search facility in these forums, however, prevents a novice user from quickly retrieving answers to similar questions asked in the past. As in any other empirical discipline, measurable technological progress in legal IR requires a quantitative evaluation framework that provides a standard, well-defined experimental setup, benchmark datasets and evaluation metrics. The aim of building this test collection is to provide a credible testbed for legal IR from online discussion forums. The data has been collected by crawling a number of free online legal discussion forums such as lawguru.com and legalservice.co.in, covering different types of legal cases such as criminal law, consumer law, constitutional law, etc.
Collaboration
Dhirubhai Ambani Institute of Information and Communication Technology