Sujoy Chatterjee

Kalyani Government Engineering College

Publication

Featured research published by Sujoy Chatterjee.

Information Sciences | 2017

Judgment analysis of crowdsourced opinions using biclustering

Sujoy Chatterjee; Malay Bhattacharyya

The problem of deriving final judgment from crowdsourced opinions is addressed with an unsupervised approach. Biclustering is shown to be useful for identifying the annotators crucial for a judgment. We establish that a suitable fraction of the entire dataset is sufficient for appropriate judgment analysis. As the proposed method does not work over the entire data, it becomes useful for big data analysis.

Annotation by crowd workers serving online has been gaining focus in recent years in diverse fields due to its distributed power of problem solving. Distributing the labeling task among a large set of workers (who may be experts or non-experts) and obtaining the final consensus is a popular way of performing large-scale annotation in a limited time. Collection of multiple annotations can be effective for annotation of large-scale datasets for applications like natural language processing, image processing, etc. However, as the crowd workers are not necessarily experts, their opinions might not be accurate enough. This causes problems in deriving the final aggregated judgment. Again, majority voting (MV) is not suitable for such problems because the number of annotators is limited and they have multiple options to choose from. This might cause too many conflicts among the opinions provided. Additionally, there might exist annotators who randomly try to annotate (provide spam opinions for) too many questions to maximize their payment. This can introduce noise while deriving the final judgment. In this paper, we address the problem of crowd judgment analysis in an unsupervised way and a biclustering-based approach is proposed to obtain the judgments appropriately. The effectiveness of this approach is demonstrated on four publicly available small-scale Amazon Mechanical Turk datasets, along with a large-scale CrowdFlower dataset. We also compare the algorithm with MV and some other existing algorithms. In most of the cases the proposed approach is competitively better than others. But most importantly, it does not use the entire dataset for deriving the judgment.
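
For readers unfamiliar with the majority voting (MV) baseline mentioned in this abstract, the following is a minimal sketch of it, not the biclustering method the paper proposes. The input layout (a dictionary of questions mapping annotators to labels), the function name `majority_vote`, and the tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(responses):
    """Return the most frequent label per question.

    responses: dict mapping question id -> dict of annotator id -> label.
    Ties are broken by taking the smallest label, which is an arbitrary
    choice made for this sketch.
    """
    judgments = {}
    for question, answers in responses.items():
        counts = Counter(answers.values())
        top = max(counts.values())
        # Collect every label that reaches the highest count.
        winners = sorted(label for label, c in counts.items() if c == top)
        judgments[question] = winners[0]
    return judgments


# Hypothetical toy data: three annotators, multiple options per question.
toy = {
    "q1": {"a1": "cat", "a2": "cat", "a3": "dog"},
    "q2": {"a1": "dog", "a2": "bird", "a3": "cat"},  # three-way conflict
}
result = majority_vote(toy)
```

The second question shows the weakness the abstract points to: with few annotators and several options, every label can receive one vote, so the outcome depends only on the tie-breaking rule.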

Proceedings of the Second ACM IKDD Conference on Data Sciences | 2015

A biclustering approach for crowd judgment analysis

Sujoy Chatterjee; Malay Bhattacharyya

Collection of multiple annotations from the crowd workers is useful for diverse applications. In this paper, the problem of obtaining the final judgment from such crowd-based annotations has been addressed in an unsupervised way using a biclustering-based approach. Results on multiple datasets show that the proposed approach is competitively better than others, without even using the entire dataset.

Human Factors in Computing Systems | 2017

A Probabilistic Approach to Group Decision Making

Sujoy Chatterjee; Malay Bhattacharyya

Large-scale judgment analysis from multiple opinions is a challenging job in terms of the time and cost involved. Over the last few years, with the popularity of crowd-powered models, decision making is being done efficiently by using the knowledge of the crowd. In management science, a closely related class of problems, popularly known as group decision making, is often addressed. Unfortunately, the majority of the algorithms developed for this purpose work for binary or multiple opinions without taking care of the semantic meaning of the options. Moreover, group decision making considers a feedback set comprising a range of continuous values, unlike the judgment analysis problem. In this paper, we address this problem, hereafter termed multi-opinion group decision making, with a probabilistic approach taking into account the annotator accuracy, annotator bias and question difficulty. The effectiveness of the approach is demonstrated by applying it to a benchmark group decision making dataset.

Computational Intelligence | 2017

Genetic Algorithm-Based Matrix Factorization for Missing Value Prediction

Sujoy Chatterjee; Anirban Mukhopadhyay

Sparsity is a major problem in areas like data mining and pattern recognition. In recommender systems, predictions based on only a few observations tend to ignore the inherent latent features of the user corresponding to the item. Similarly, in different crowdsourcing-based opinion aggregation models, there is a minimal chance of obtaining opinions from all the crowd workers. This sparsity problem also has an extensive effect on predicting the actual rating of a particular item, due to limited and incomplete observations. To deal with this issue, in this article, a genetic algorithm based matrix factorization technique is proposed to estimate the missing entries in the response matrix that contains workers’ responses over some questions. We have created three synthetic datasets and used one real-life dataset to show the efficacy of the proposed method over the other state-of-the-art approaches.
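
The abstract does not give the details of the algorithm, so the following is only a rough sketch of the general idea of evolving a low-rank factorization with a genetic algorithm. Everything in it is an assumption made for illustration: the chromosome encoding (the factors W and H flattened into one list), the fitness (RMSE over observed entries), one-point crossover, Gaussian mutation, and all parameter values.

```python
import random


def masked_rmse(R, W, H):
    """RMSE between R and W @ H over observed (non-None) entries only.
    Assumes R has at least one observed entry."""
    total, count = 0.0, 0
    for i, row in enumerate(R):
        for j, value in enumerate(row):
            if value is None:
                continue
            pred = sum(W[i][k] * H[k][j] for k in range(len(H)))
            total += (value - pred) ** 2
            count += 1
    return (total / count) ** 0.5


def unpack(chrom, n, m, k):
    """Split a flat chromosome into W (n x k) and H (k x m)."""
    W = [chrom[i * k:(i + 1) * k] for i in range(n)]
    off = n * k
    H = [chrom[off + r * m:off + (r + 1) * m] for r in range(k)]
    return W, H


def ga_factorize(R, k=2, pop_size=40, generations=200, sigma=0.1, seed=0):
    """Evolve factors for R; returns ((W, H), best-fitness history)."""
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    length = n * k + k * m

    def fitness(chrom):
        W, H = unpack(chrom, n, m, k)
        return masked_rmse(R, W, H)

    pop = [[rng.random() for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        elite = pop[:pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutate each gene with probability 0.2 by Gaussian noise.
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.2 else g
                     for g in child]
            children.append(child)
        pop = elite + children
    pop.sort(key=fitness)
    history.append(fitness(pop[0]))
    return unpack(pop[0], n, m, k), history


# Hypothetical response matrix; None marks a missing response.
R = [[5, 3, None], [4, None, 1], [None, 1, 5]]
(W, H), history = ga_factorize(R, generations=50)
# The missing entry R[0][2] is estimated from the evolved factors.
estimate = sum(W[0][k] * H[k][2] for k in range(len(H)))
```

Because the better half of the population is carried over unchanged, the best fitness recorded in `history` never gets worse from one generation to the next.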

Information Sciences | 2017

Dependent judgment analysis

Sujoy Chatterjee; Anirban Mukhopadhyay; Malay Bhattacharyya

We introduce a novel crowd judgment analysis problem where the crowd opinions are dependent on each other. We introduce a novel method to compute the final aggregated judgments from multiple dependent opinions. In this method, the expertise of an annotator is quantified in terms of confidence gap and weights are allotted based on that. We provide a comprehensive evaluation of the proposed method by applying it to two datasets.

Annotation of large-scale datasets can promisingly be done by crowd workers in a time and cost effective way. A major challenge in this area is how we aggregate the opinions received from multiple workers to derive the final judgment. Most of the crowd opinion aggregation models known so far deal with independent opinions, where the crowd workers provide their opinions unanimously and these are not visible to everyone. In real life, there are applications where an annotator can see others' opinions. This incurs a higher chance of getting biased by the other opinions. This paper addresses a new problem, hereafter termed dependent judgment analysis, and proposes a method to derive the final judgment from a given set of independent and dependent opinions. Here, a Markov chain based aggregation method is used to handle the opinions of the crowd workers for finding a consensus. We study the performance of the proposed method on a synthetic dataset and a real-life dataset published in recent times. The proposed method is applied to these two datasets to find the aggregated judgment. The efficacy of our proposed method is shown by comparing it with majority voting.

Archive | 2019

An Evolutionary Matrix Factorization Approach for Missing Value Prediction

Sujoy Chatterjee; Anirban Mukhopadhyay

Sparseness of data is a common problem in many fields such as data mining and pattern recognition. During the last decade, collecting opinions from people has been established as a useful tool for solving different real-life problems. In crowdsourcing systems, prediction based on very few observations leads to complete disregard for the inherent latent features of the crowd workers corresponding to the items. Similarly in bioinformatics, sparsity has a major negative impact on finding relevant genes from gene expression data. Although this problem has been studied over the last decade, the different proposed approaches have both benefits and pitfalls. In this article, we have proposed a genetic algorithm-based matrix factorization technique to estimate the missing entries in the rating matrix of recommender systems. We have created four synthetic datasets and used two real-life gene expression datasets to show the efficacy of the proposed method in comparison with the other state-of-the-art approaches.

Knowledge Based Systems | 2018

A Weighted Rank aggregation approach towards crowd opinion analysis

Sujoy Chatterjee; Anirban Mukhopadhyay; Malay Bhattacharyya

In crowd opinion aggregation models, the expertise of annotators plays an important role in deriving the appropriate judgment. In most aggregation methods, annotators’ accuracy and bias are considered as two important features, and the priority of annotators is assigned based on these. But instead of relying upon these limited features, the quality of annotators can be suitably exploited using rank-based features to further improve the prediction. Basically, the annotators are ranked according to various features, and multiple separate rankings are produced from them. These rankings, if properly weighted, can lead to a better final aggregated ranking. In this paper, we have developed a novel weighted rank aggregation approach and applied it to three artificially generated ranking datasets with varying noise. Moreover, the comparative effectiveness of the proposed method is demonstrated by applying it to three Amazon Mechanical Turk datasets.
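
To make the idea of combining several weighted rankings concrete, here is a small sketch using a weighted Borda count. This is a standard textbook technique swapped in for illustration; the abstract does not state which aggregation rule or weighting scheme the paper uses. The function name `weighted_borda`, the weights, and the toy rankings are all hypothetical.

```python
def weighted_borda(rankings, weights):
    """Aggregate full rankings of the same items with a weighted Borda count.

    rankings: list of lists, each ordering the same items from best to worst.
    weights:  one non-negative weight per ranking.
    Ties in the final score are broken alphabetically (arbitrary choice).
    """
    items = rankings[0]
    n = len(items)
    scores = {item: 0.0 for item in items}
    for ranking, weight in zip(rankings, weights):
        for position, item in enumerate(ranking):
            # The top item earns n - 1 points, the last item earns 0.
            scores[item] += weight * (n - 1 - position)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings of three annotators under three different features.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
equal = weighted_borda(rankings, [1, 1, 1])
skewed = weighted_borda(rankings, [5, 1, 1])
```

With equal weights annotator "b" comes first, while trusting the first ranking five times as much moves "a" to the top, which shows how the choice of weights changes the aggregated ranking.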

Archive | 2016

Dynamic Congestion Analysis for Better Traffic Management Using Social Media

Sujoy Chatterjee; Sankar Kumar Mridha; Sourav Bhattacharyya; Swapan Shakhari; Malay Bhattacharyya

Social media has emerged as an imperative tool for addressing many real-life problems in an innovative way in recent years. Traffic management is a demanding problem for any populous city in the world. In the current paper, we explore how the dynamic data from social media can be employed for continuous traffic monitoring of cities in a better way. To accomplish this, congestion analysis and clustering of congested areas are performed. With the term congestion, we denote co-gatherings in an area for two different occasions within a defined time interval. While doing so, we introduce a novel measure for quantifying the congestion of different areas in a city. Subsequently, association rule mining is applied to find out the association between congested roads. To our surprise, we observe a major impact of various gatherings on the disorder of traffic control in many cities. With additional analyses, we gain some new insights about the overall status of traffic quality in India from the temporal analysis of data.
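
The abstract mentions association rule mining between congested roads without giving details. As a hedged illustration of the general technique only, the sketch below derives pairwise rules from support and confidence. The transaction layout (each transaction is the set of roads congested in one time window), the thresholds, and the road names are assumptions, not the paper's data or measure.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (x, y, support, confidence) meaning 'x implies y'.

    support    = fraction of transactions containing both x and y
    confidence = count(x and y) / count(x)
    """
    n = len(transactions)
    item_count, pair_count = {}, {}
    for transaction in transactions:
        items = set(transaction)
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical windows: roads A, B, C observed as congested together.
windows = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B"]]
rules = pair_rules(windows)
```

In this toy data, road C is congested only when road A is, so the rule from C to A has confidence 1.0, while the reverse rule from A to C is weaker.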

Procedia Technology | 2013