Publication


Featured research published by Sivaji Bandyopadhyay.


IEEE Intelligent Systems | 2013

Enhanced SenticNet with Affective Labels for Concept-Based Opinion Mining

Soujanya Poria; Alexander F. Gelbukh; Amir Hussain; Newton Howard; Dipankar Das; Sivaji Bandyopadhyay

SenticNet 1.0 is one of the most widely used, publicly available resources for concept-based opinion mining. The presented methodology enriches SenticNet concepts with affective information by assigning an emotion label.


Meeting of the Association for Computational Linguistics | 2006

A Modified Joint Source-Channel Model for Transliteration

Asif Ekbal; Sudip Kumar Naskar; Sivaji Bandyopadhyay

Most machine transliteration systems transliterate out-of-vocabulary (OOV) words through intermediate phonemic mapping. A framework is presented that allows direct orthographical mapping between two languages of different origins that employ different alphabet sets. A modified joint source-channel model, along with a number of alternatives, is proposed. Aligned transliteration units and their context are automatically derived from a bilingual training corpus to generate the collocational statistics. The transliteration units in Bengali words take the pattern C+M, where C represents a vowel, a consonant, or a conjunct and M represents the vowel modifier or matra. The English transliteration units are of the form C*V*, where C represents a consonant and V represents a vowel. A Bengali-English machine transliteration system has been developed based on the proposed models. The system has been trained to transliterate person names from Bengali to English. It uses linguistic knowledge of possible conjuncts and diphthongs in Bengali and their equivalents in English. In evaluation, the modified joint source-channel model performs best, with a Word Agreement Ratio of 69.3% and a Transliteration Unit Agreement Ratio of 89.8%.
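The C*V* pattern for English transliteration units can be illustrated with a short regular-expression sketch. This is not the authors' implementation: the function name `english_units`, the choice to treat "y" as a consonant, and the handling of a trailing consonant run as its own unit are all assumptions made for illustration.

```python
import re

VOWELS = "aeiou"  # assumption: 'y' is treated as a consonant

# One unit = an optional consonant run followed by a vowel run (C*V*).
# A trailing consonant run with no vowel becomes its own unit (assumption).
_UNIT = re.compile(rf"[^{VOWELS}]*[{VOWELS}]+|[^{VOWELS}]+")


def english_units(name: str) -> list[str]:
    """Split a single alphabetic English word into C*V*-style units."""
    return _UNIT.findall(name.lower())


print(english_units("sivaji"))
print(english_units("kolkata"))
```

Under these assumptions, "sivaji" splits into three units and "kolkata" into "ko", "lka", "ta". Aligning such units with Bengali C+M units is the part that needs the bilingual corpus and is not shown here.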


International Conference on Data Mining | 2012

Enriching SenticNet Polarity Scores through Semi-Supervised Fuzzy Clustering

Soujanya Poria; Alexander F. Gelbukh; Erik Cambria; Dipankar Das; Sivaji Bandyopadhyay

SenticNet 1.0 is one of the most widely used freely available resources for concept-level opinion mining, containing about 5,700 common sense concepts and their corresponding polarity scores. Specific affective information associated with such concepts, however, is often desirable for tasks such as emotion recognition. In this work, we propose a method for assigning emotion labels to SenticNet concepts based on a semi-supervised classifier trained on WordNet-Affect emotion lists with features extracted from various lexical resources.


Meeting of the Association for Computational Linguistics | 2009

Word to Sentence Level Emotion Tagging for Bengali Blogs

Dipankar Das; Sivaji Bandyopadhyay

In this paper, emotion analysis of blog texts is carried out for a less privileged language, Bengali. Ekman's six basic emotion types have been selected for reliable, semi-automatic word-level annotation. An automatic classifier has been applied to recognize the six basic emotion types for the words in a sentence. Applying different scoring strategies to identify the sentence-level emotion tag from the acquired word-level emotion constituents has produced satisfactory performance.


International Conference on Information Technology | 2008

Part of Speech Tagging in Bengali Using Support Vector Machine

Asif Ekbal; Sivaji Bandyopadhyay

Part of speech (POS) tagging is the task of labeling each word in a sentence with its appropriate syntactic category, called its part of speech. POS tagging is a very important preprocessing task for language processing activities. This paper reports on POS tagging for Bengali using a support vector machine (SVM). The POS tagger has been developed using a tagset of 26 POS tags defined for the Indian languages. The system makes use of contextual information about the words along with a variety of features that are helpful in predicting the various POS classes. The POS tagger has been trained and tested with 72,341 and 20K wordforms, respectively. Experimental results show the effectiveness of the proposed SVM-based POS tagger, with an accuracy of 86.84%. Results show that the lexicon, the named entity recognizer, and different word suffixes are effective in handling the unknown-word problem and improve the accuracy of the POS tagger significantly. Comparative evaluation results demonstrate that this SVM-based system outperforms three existing systems based on the hidden Markov model (HMM), maximum entropy (ME), and conditional random field (CRF).


Polibits | 2008

Web-based Bengali News Corpus for Lexicon Development and POS Tagging

Asif Ekbal; Sivaji Bandyopadhyay

Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) applications. The rapid development of these resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. We have used a Bengali news corpus developed from the web archive of a widely read Bengali newspaper. The corpus contains approximately 34 million wordforms. This corpus is used for lexicon development without employing extensive knowledge of the language. We have developed POS taggers using a Hidden Markov Model (HMM) and a Support Vector Machine (SVM). The lexicon contains around 128 thousand entries, and a manual check yields an accuracy of 79.6%. The POS taggers were initially developed for Bengali and showed accuracies of 85.56% and 91.23% for HMM and SVM, respectively. Based on the Bengali news corpus, we identify various word-level orthographic features to use in the POS taggers. The lexicon and a Named Entity Recognition (NER) system, developed using this corpus, are also used in POS tagging. The POS taggers are then evaluated with Hindi and Telugu data. Evaluation results demonstrate that SVM performs better than HMM for all three Indian languages.


Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009) | 2009

Voted NER System using Appropriate Unlabeled Data

Asif Ekbal; Sivaji Bandyopadhyay

This paper reports a voted Named Entity Recognition (NER) system that uses appropriate unlabeled data. The proposed method is based on Maximum Entropy (ME), Conditional Random Field (CRF), and Support Vector Machine (SVM) classifiers and has been tested for Bengali. The system makes use of language-independent features, in the form of contextual and orthographic word-level features, along with language-dependent features extracted from the Part of Speech (POS) tagger and gazetteers. Context patterns generated from the unlabeled data using an active learning method have been used as features in each of the classifiers. A semi-supervised method has been used to describe the measures to automatically select effective documents and sentences from unlabeled data. Finally, the models have been combined into a final system by a weighted voting technique. Experimental results show the effectiveness of the proposed approach, with overall Recall, Precision, and F-Score values of 93.81%, 92.18%, and 92.98%, respectively. We have shown how the language-dependent features can improve system performance.
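Weighted voting over several classifiers can be sketched as follows. The abstract does not say how the weights are chosen, so the weights, tag names, and the function name `weighted_vote` below are hypothetical and serve only to show the mechanism.

```python
from collections import defaultdict


def weighted_vote(predictions: dict[str, str], weights: dict[str, float]) -> str:
    """Return the tag with the largest total weight across classifiers."""
    totals: dict[str, float] = defaultdict(float)
    for model, tag in predictions.items():
        totals[tag] += weights[model]
    # On a tie, max() keeps the tag that was seen first.
    return max(totals, key=totals.get)


# Hypothetical per-classifier weights and per-token predictions.
weights = {"ME": 0.9, "CRF": 0.5, "SVM": 0.5}
predictions = {"ME": "PER", "CRF": "LOC", "SVM": "LOC"}
print(weighted_vote(predictions, weights))
```

Here two lower-weighted classifiers that agree outvote a single higher-weighted one, which is the behavior that distinguishes weighted voting from simply trusting the best individual model.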


Mexican International Conference on Artificial Intelligence | 2012

Fuzzy clustering for semi-supervised learning - case study: construction of an emotion lexicon

Soujanya Poria; Alexander F. Gelbukh; Dipankar Das; Sivaji Bandyopadhyay

We consider the task of semi-supervised classification: extending category labels from a small dataset of labeled examples to a much larger set. We show that, at least on our case study task, unsupervised fuzzy clustering of the unlabeled examples helps in obtaining the hard clusters. Namely, we used the membership values obtained with fuzzy clustering as additional features for hard clustering. We also used these membership values to reduce the confusion set for the hard clustering. As a case study, we applied the proposed method to the task of constructing a large emotion lexicon by extending the emotion labels from the WordNet Affect lexicon using various features of words. Some of the features were extracted from the emotional statements of the freely available ISEAR dataset; other features were WordNet distance and the similarity measured via the polarity scores in the SenticNet resource. The proposed method classified words by emotion labels with high accuracy.
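The idea of using fuzzy membership values as additional features can be sketched with the standard fuzzy c-means membership formula. This is an illustration under stated assumptions, not the paper's pipeline: the cluster centers are taken as given (a full fuzzy c-means run would update them iteratively), the fuzzifier `m` defaults to 2, and the names `fuzzy_memberships` and `augment` are hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hit = dists.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def augment(point, centers):
    """Append the membership values to the original feature vector."""
    return list(point) + fuzzy_memberships(point, centers)


centers = [(0.0, 0.0), (4.0, 0.0)]  # assumed, fixed cluster centers
print(augment((1.0, 0.0), centers))
```

The augmented vector can then be passed to any hard classifier or clusterer; low-membership clusters could also be dropped to shrink the set of candidate labels, in the spirit of the confusion-set reduction the abstract mentions.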


Pattern Recognition and Machine Intelligence | 2007

A hidden Markov model based named entity recognition system: Bengali and Hindi as case studies

Asif Ekbal; Sivaji Bandyopadhyay

Named Entity Recognition (NER) has an important role in almost all Natural Language Processing (NLP) application areas, including information retrieval, machine translation, question answering, and automatic summarization. This paper reports on the development of a statistical Hidden Markov Model (HMM) based NER system. The system is initially developed for Bengali using a tagged Bengali news corpus, developed from the archive of a leading Bengali newspaper available on the web. The system is trained with a training corpus of 150,000 wordforms, initially tagged with an HMM-based part of speech (POS) tagger. Evaluation results of the 10-fold cross validation test yield average Recall, Precision, and F-Score values of 90.2%, 79.48%, and 84.5%, respectively. This HMM-based NER system is then trained and tested on Hindi data to show its language-independent abilities. Experimental results of the 10-fold cross validation test have demonstrated average Recall, Precision, and F-Score values of 82.5%, 74.6%, and 78.35%, respectively, with 27,151 Hindi wordforms.
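The relationship between the reported Recall, Precision, and F-Score values can be checked with a few lines of code. The sketch assumes the balanced F-score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall figures taken from the abstract above.
print(round(f_score(79.48, 90.2), 2))  # Bengali
print(round(f_score(74.6, 82.5), 2))   # Hindi
```

Under that assumption, both reported F-Scores (84.5% and 78.35%) are reproduced to within rounding.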


Archive | 2017

A Practical Guide to Sentiment Analysis

Erik Cambria; Dipankar Das; Sivaji Bandyopadhyay; Antonio Feraco

Sentiment analysis research began long ago and has recently become one of the most in-demand research topics. Research on sentiment analysis in natural language texts and other media is gaining ground rapidly. To date, however, no concise set of factors has been defined that really affects how writers' sentiment, i.e., broadly, human sentiment, is expressed, perceived, recognized, processed, and interpreted in natural languages. The existing reported solutions and available systems are still far from perfect or fail to satisfy end users. The reasons may be that dozens of conceptual rules govern sentiment and that there are possibly unlimited clues that can convey these concepts from realization to practical implementation. Therefore, the main aim of this book is to provide a feasible research platform for ambitious researchers working toward practical solutions that will benefit society, business, and future research.

Collaboration


Dive into Sivaji Bandyopadhyay's collaborations.

Top Co-Authors

Dipankar Das, Nanyang Technological University

Asif Ekbal, Indian Institute of Technology Patna

Alexander F. Gelbukh, Instituto Politécnico Nacional

Partha Pakray, National Institute of Technology