Arindam Pal | Researchain

Publication

Featured research published by Arindam Pal.

advanced information networking and applications | 2014

Analyzing Cascading Failures in Smart Grids under Random and Targeted Attacks

Sushmita Ruj; Arindam Pal

We model smart grids as complex interdependent networks, and study targeted attacks on smart grids for the first time. A smart grid consists of two networks: the power network and the communication network, interconnected by edges. Failures (attacks) in one network trigger failures in the other network and propagate in cascades across the networks. Such cascading failures can result in disintegration of either (or both) of the networks. Earlier works considered only random failures. In practical situations, an attacker is more likely to compromise nodes selectively. We study cascading failures in smart grids where an attacker selectively compromises nodes with probabilities proportional to their degrees, so that high-degree nodes are compromised with higher probability. We mathematically analyze the sizes of the giant components of the networks under targeted attacks, and compare the results with the corresponding sizes under random attacks. We show that networks disintegrate faster under targeted attacks than under random attacks. A targeted attack on a small fraction of high-degree nodes disintegrates one or both of the networks, whereas both networks still contain giant components after a random attack on the same fraction of nodes.
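The difference between the two attack models can be sketched on a single network. The following is a minimal illustration only: it does not model the interdependent power and communication networks or the cascade analyzed in the paper, and the function names and the star-graph example are assumptions made here for illustration.

```python
import random
from collections import deque

def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def targeted_attack(adj, count, rng):
    """Pick up to `count` distinct nodes, each draw weighted by node degree."""
    candidates = list(adj)
    weights = [len(adj[u]) for u in candidates]
    chosen = set()
    while len(chosen) < count and sum(weights) > 0:
        i = rng.choices(range(len(candidates)), weights=weights)[0]
        chosen.add(candidates[i])
        weights[i] = 0  # sample without replacement
    return chosen

def random_attack(adj, count, rng):
    """Pick `count` distinct nodes uniformly at random."""
    return set(rng.sample(list(adj), count))

# Hypothetical example: a star with hub 0 and nine leaves.
star = {0: set(range(1, 10)), **{i: {0} for i in range(1, 10)}}
rng = random.Random(0)
print(largest_component_size(star, targeted_attack(star, 1, rng)))
print(largest_component_size(star, random_attack(star, 1, rng)))
```

In the star example, the hub holds half of the total degree, so a single degree-weighted draw removes it with probability 1/2, while a uniform draw removes it with probability 1/10. Removing the hub leaves only isolated leaves.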

theory and applications of models of computation | 2013

k-means++ under Approximation Stability

Manu Agarwal; Ragesh Jaiswal; Arindam Pal

Lloyd’s algorithm, also known as the k-means algorithm, is one of the most popular algorithms for solving the k-means clustering problem in practice. However, it does not give any performance guarantees. This means that there are datasets on which this algorithm can behave very badly. One reason for poor performance on certain datasets is bad initialization. The following simple sampling-based seeding algorithm tends to fix this problem: pick the first center randomly from among the given points and then for i ≥ 2, pick a point to be the i-th center with probability proportional to the squared distance of this point from the nearest of the previously chosen centers. This algorithm is more popularly known as the k-means++ seeding algorithm and is known to exhibit some nice properties. These have been studied in a number of previous works [AV07, AJM09, ADK09, BR11]. The algorithm tends to perform well when the optimal clusters are separated in some sense. This is because the algorithm gives preference to further away points when picking centers. Ostrovsky et al. [ORSS06] discuss one such separation condition on the data. Jaiswal and Garg [JG12] show that if the dataset satisfies the separation condition of [ORSS06], then the sampling algorithm gives a constant approximation with probability Ω(1/k). Another separation condition that is strictly weaker than [ORSS06] is the approximation stability condition discussed by Balcan et al. [BBG09]. In this work, we show that the sampling algorithm gives a constant approximation with probability Ω(1/k) if the dataset satisfies the separation condition of [BBG09] and the optimal clusters are not too small. We give a negative result for datasets that have small optimal clusters.
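The seeding procedure described above can be sketched as follows. This is a minimal sketch assuming points are tuples of numbers and distances are Euclidean; the function names are chosen here for illustration.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seed(points, k, rng=None):
    """Choose k initial centers by sampling proportionally to squared distance."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[j] = squared distance from points[j] to its nearest chosen center
    d2 = [squared_distance(p, centers[0]) for p in points]
    for _ in range(1, k):
        if sum(d2) == 0:
            # every point coincides with a chosen center; fall back to uniform
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2)[0])
        # refresh nearest-center distances with the newly added center
        d2 = [min(d, squared_distance(p, centers[-1])) for d, p in zip(d2, points)]
    return centers

points = [(0, 0), (0, 0), (10, 10), (10, 10)]
print(kmeanspp_seed(points, 2, random.Random(0)))
```

Points that coincide with an already chosen center get weight zero, which is why the two centers in the example always land on different locations.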

international conference on communications | 2015

CITEX: A new citation index to measure the relative importance of authors and papers in scientific publications

Arindam Pal; Sushmita Ruj

Evaluating the performance of researchers and measuring the impact of papers written by scientists is the main objective of citation analysis. Various indices and metrics have been proposed for this. In this paper, we propose a new citation index CITEX, which gives normalized scores to authors and papers to determine their rankings. To the best of our knowledge, this is the first citation index which simultaneously assigns scores to both authors and papers. Using these scores, we can get an objective measure of the reputation of an author and the impact of a paper. We model this problem as an iterative computation on a publication graph, whose vertices are authors and papers, and whose edges indicate which author has written which paper. We prove that this iterative computation converges in the limit, by using a powerful theorem from linear algebra. We run this algorithm on several examples, and find that the author and paper scores match closely with what is suggested by our intuition. The algorithm is theoretically sound and runs very fast in practice. We compare this index with several existing metrics and find that CITEX gives far more accurate scores compared to the traditional metrics.
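The abstract does not state the CITEX update rule, so the following is not CITEX itself. It is a generic mutual-reinforcement iteration on an author-paper graph, given only to illustrate what an iterative computation with normalized scores for both kinds of vertices can look like. The function name, the update rule, and the example data are all assumptions, and every author is assumed to have at least one paper.

```python
def mutual_scores(authorship, iterations=100, tol=1e-12):
    """authorship maps each author to the list of papers they wrote.

    Returns (author_score, paper_score), each normalized to sum to 1.
    """
    papers = sorted({p for ps in authorship.values() for p in ps})
    author_score = {a: 1.0 for a in authorship}
    paper_score = {p: 1.0 for p in papers}
    for _ in range(iterations):
        # a paper collects the scores of its authors
        new_paper = {p: 0.0 for p in papers}
        for a, ps in authorship.items():
            for p in ps:
                new_paper[p] += author_score[a]
        total = sum(new_paper.values())
        new_paper = {p: s / total for p, s in new_paper.items()}
        # an author collects the scores of their papers
        new_author = {a: sum(new_paper[p] for p in ps)
                      for a, ps in authorship.items()}
        total = sum(new_author.values())
        new_author = {a: s / total for a, s in new_author.items()}
        change = max(abs(new_author[a] - author_score[a]) for a in authorship)
        author_score, paper_score = new_author, new_paper
        if change < tol:  # stop once the scores have stabilized
            break
    return author_score, paper_score

authors, papers = mutual_scores({"A": ["p1", "p2"], "B": ["p2"]})
print(authors, papers)
```

In this toy example author A, who wrote both papers, ends up with a higher score than B, and the co-authored paper p2 scores higher than p1.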

foundations of software technology and theoretical computer science | 2012

Approximation Algorithms for the Unsplittable Flow Problem on Paths and Trees

Khaled M. Elbassioni; Naveen Garg; Divya Gupta; Amit Kumar; Vishal Narula; Arindam Pal

We study the Unsplittable Flow Problem (UFP) and related variants, namely UFP with Bag Constraints and UFP with Rounds, on paths and trees. We provide improved constant factor approximation algorithms for all these problems under the no bottleneck assumption (NBA), which says that the maximum demand for any source-sink pair is at most the minimum capacity of any edge. We obtain these improved results by expressing a feasible solution to a natural LP relaxation of the UFP as a near-convex combination of feasible integral solutions.
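The no bottleneck assumption and the capacity constraint of UFP on a path can be checked with a few lines of code. This is a minimal sketch under an encoding assumed here: the path has vertices 0..m, edge e joins vertices e and e+1, and each request is a hypothetical tuple (source, sink, demand) with source < sink. It only checks feasibility and does not implement the approximation algorithms of the paper.

```python
def satisfies_nba(capacities, requests):
    """True if the largest demand is at most the smallest edge capacity."""
    return max(d for _, _, d in requests) <= min(capacities)

def is_feasible(capacities, requests, selected):
    """True if routing the selected requests respects every edge capacity."""
    load = [0] * len(capacities)
    for i in selected:
        s, t, d = requests[i]
        for e in range(s, t):  # request i uses edges s, s+1, ..., t-1
            load[e] += d
    return all(l <= c for l, c in zip(load, capacities))

capacities = [4, 3, 5]
requests = [(0, 2, 2), (1, 3, 2), (0, 3, 3)]
print(satisfies_nba(capacities, requests))
print(is_feasible(capacities, requests, {0, 1}))
print(is_feasible(capacities, requests, {2}))
```

In the example, requests 0 and 1 together put a load of 4 on the middle edge, which has capacity 3, so they cannot both be routed.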

international conference on intelligent transportation systems | 2014

Historical Data Based Real Time Prediction of Vehicle Arrival Time

Santa Maiti; Arpan Pal; Arindam Pal; Tanushyam Chattopadhyay; Arijit Mukherjee

In recent times, many industries provide transportation facilities for their employees from scheduled pick-up and drop points. To reduce long waiting times, it is important to accurately predict vehicle arrival in real time. This paper proposes a simple, lightweight yet powerful historical-data-based vehicle arrival time prediction model. Unlike previous work, the proposed model uses very limited input features, namely vehicle trajectory and timestamp, considering the scarcity and unavailability of data in developing countries regarding traffic congestion, weather, scheduled arrival time, leg time, dwell time, etc. The proposed model is evaluated against standard Artificial Neural Network (ANN) and Support Vector Machine (SVM) regression models using real bus data from an industry campus at Siruseri, Chennai, collected over a period of four months. The results show that the proposed model predicts approximately two and a half times faster than the ANN model and approximately two times faster than the SVM model, while achieving accuracy (75.56%) comparable to that of the ANN model (76%) and the SVM model (71.3%). Hence, the proposed historical-data-based model is capable of supporting a real-time system by balancing the trade-off between prediction time and prediction accuracy.

advanced information networking and applications | 2016

Preferential Attachment Model with Degree Bound and Its Application to Key Predistribution in WSN

Sushmita Ruj; Arindam Pal

Preferential attachment models have been widely studied in complex networks, because they can explain the formation of many networks such as social networks, citation networks, power grids, and biological networks, to name a few. Motivated by the application of key predistribution in wireless sensor networks (WSN), we initiate the study of preferential attachment with degree bound. Our paper makes two contributions to two different areas. The first is a contribution to the study of complex networks. We propose a preferential attachment model with degree bound for the first time. In the normal preferential attachment model, the degree distribution follows a power law, with many nodes of low degree and a few nodes of high degree. In our scheme, the nodes can have a maximum degree dmax, where dmax is an integer chosen according to the application. The second is in the security of wireless sensor networks. We propose a new key predistribution scheme based on the above model. The important features of this model are that the network is fully connected, it has fewer keys, a larger giant component, and a lower average path length compared with traditional key predistribution schemes, and comparable resilience to random node attacks. We argue that in many networks, such as key predistribution and the Internet of Things, having nodes of very high degree will be a bottleneck in communication. Thus, studying the preferential attachment model with degree bound will open up new directions in the study of complex networks, and will have many applications in real-world scenarios.
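A degree-bounded variant of preferential attachment can be sketched as below. The abstract does not give the exact growth rule, so this is one plausible reading stated as an assumption: each new node attaches to m existing nodes, chosen with probability proportional to degree among the nodes whose degree is still below dmax. The function name, the initial clique, and the parameter m are illustrative choices, and the sketch assumes m <= dmax and n > m.

```python
import random

def bounded_preferential_attachment(n, m, dmax, rng=None):
    """Grow a graph on n nodes where no node exceeds degree dmax."""
    rng = rng or random.Random()
    # start from a clique on m + 1 nodes, so every node has degree m
    adj = {u: set(range(m + 1)) - {u} for u in range(m + 1)}
    for new in range(m + 1, n):
        adj[new] = set()
        for _ in range(m):
            # only nodes below the degree bound may receive a new edge
            eligible = [u for u in adj
                        if u != new and u not in adj[new] and len(adj[u]) < dmax]
            if not eligible:
                break
            weights = [len(adj[u]) for u in eligible]
            if sum(weights) == 0:
                weights = None  # all eligible nodes isolated: choose uniformly
            target = rng.choices(eligible, weights=weights)[0]
            adj[new].add(target)
            adj[target].add(new)
    return adj

graph = bounded_preferential_attachment(200, 2, 5, random.Random(1))
print(max(len(neighbours) for neighbours in graph.values()))
```

Compared with unbounded preferential attachment, the only change is the filter on eligible targets, which cuts off the high-degree tail of the degree distribution at dmax.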

Theoretical Computer Science | 2015

k-means++ under approximation stability

Manu Agarwal; Ragesh Jaiswal; Arindam Pal

One of the most popular algorithms for finding centers for initializing Lloyd's heuristic is the k-means++ seeding algorithm. The algorithm is a simple sampling procedure that can be described as follows: the algorithm picks the first center randomly from among the given points and then for i = 2, 3, …, k, picks a point to be the i-th center with probability proportional to the squared Euclidean distance of this point to the nearest center out of the (i − 1) previously chosen centers. The k-means++ seeding algorithm is known to exhibit nice properties. It has been noticed that this seeding algorithm tends to perform well when the optimal clusters are separated in some sense. Intuitively, this is because the algorithm gives preference to further away points when picking centers. One separation condition that has been studied in the past was due to Ostrovsky et al. [9]. Jaiswal and Garg [8] showed that if any dataset satisfies the separation condition of [9], then this sampling algorithm gives a constant approximation with probability Ω(1/k) on this dataset. Another separation condition that is strictly weaker than [9] is the approximation stability condition studied by Balcan et al. [5]. In this work, we show that the sampling algorithm gives a constant approximation with probability Ω(1/k) on any dataset that satisfies the separation condition of [5] and whose optimal k clusters are not too small. We give a negative result for datasets that have small optimal clusters.

Journal of Combinatorial Optimization | 2018

Improved algorithms for the evacuation route planning problem

Gopinath Mishra; Subhra Mazumdar; Arindam Pal

Emergency evacuation is the process of moving people away from the threat or actual occurrence of hazards such as natural disasters, terrorist attacks, fires and bombs. In this paper, we focus on evacuation from a building, but the ideas can be applied to city and region evacuation. We define the problem and show how it can be modeled using graphs. The resulting optimization problem can be formulated as an Integer Linear Program. Though this can be solved exactly, the approach does not scale well to graphs with thousands of nodes and several hundred thousand edges, which makes it impractical for large graphs. First, we study a special case of this problem, where there is only a single source and a single sink. For this case, we give an improved algorithm, Single Source Single Sink Evacuation Route Planner, whose evacuation time is always at most that of the well-known Capacity Constrained Route Planner (CCRP) algorithm, and whose running time is strictly less than that of CCRP. We prove this mathematically and give supporting results from extensive experiments. We also study a randomized behavior model of people and prove some interesting results. We design the Multiple Sources Multiple Sinks Evacuation Route Planner (MSEP) algorithm to extend this to multiple sources and multiple sinks. We propose a randomized behavior model for MSEP and give a probabilistic analysis using Chernoff bounds.

acm ieee joint conference on digital libraries | 2017

Understanding the impact of early citers on long-term scientific impact

Mayank Singh; Ajay Jaiswal; Priya Shree; Arindam Pal; Animesh Mukherjee; Pawan Goyal

This paper explores an interesting new dimension to the challenging problem of predicting long-term scientific impact (LTSI), usually measured by the number of citations accumulated by a paper in the long term. It is well known that early citations (within 1-2 years after publication) acquired by a paper positively affect its LTSI. However, there is no work that investigates whether the set of authors who bring in these early citations to a paper also affects its LTSI. In this paper, we demonstrate for the first time the impact of these authors, whom we call early citers (EC), on the LTSI of a paper. Note that this study of the complex dynamics of EC introduces a brand new paradigm in citation behavior analysis. Using a massive computer science bibliographic dataset, we identify two distinct categories of EC: we call those authors who have a high overall publication/citation count in the dataset influential and the rest of the authors non-influential. We investigate three characteristic properties of EC and present an extensive analysis of how each category correlates with LTSI in terms of these properties. In contrast to popular perception, we find that influential EC negatively affect LTSI, possibly owing to attention stealing. To motivate this, we present several representative examples from the dataset. A closer inspection of the collaboration network reveals that this stealing effect is more pronounced if an EC is nearer to the authors of the paper being investigated. As an intuitive use case, we show that incorporating EC properties in state-of-the-art supervised citation prediction models leads to high performance margins. Finally, we present an online portal to visualize EC statistics along with the prediction results for a given query paper. The portal is accessible online at: http://www.cnergres.iitkgp.ac.in/earlyciters/. To facilitate reproducible research, we make all the code and the processed dataset available in the public domain.

Proceedings of the 10th Annual ACM India Compute Conference on | 2017

Measuring Similarity among Legal Court Case Documents

Arpan Mandal; Raktim Chaki; Sarbajit Saha; Kripabandhu Ghosh; Arindam Pal; Saptarshi Ghosh

Computing the similarity between two legal documents is an important challenge in the Legal Information Retrieval domain. Efficient calculation of this similarity has useful applications in various tasks such as identifying relevant prior cases for a given case document. Prior works have proposed network-based and text-based methods for measuring similarity between legal documents. However, there are certain limitations in the prior methods. Network-based measures are not always meaningfully applicable since legal citation networks are usually very sparse. On the other hand, only primitive text-based similarity measures, such as TF-IDF based approaches, have been tried to date. In this work, we focus on improving text-based methodologies for computing the similarity between two legal documents. In addition to TF-IDF based measures, we use advanced similarity measures (such as topic modeling) and neural network models (such as word embeddings and document embeddings). We perform extensive experiments on a large dataset of Indian Supreme Court cases, and compare various methodologies for measuring the textual similarity of legal documents. Our experiments show that embedding-based approaches perform better than other approaches. We also demonstrate that the proposed embedding-based methodologies significantly outperform a baseline hybrid methodology involving both network-based and text-based similarity.
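The TF-IDF baseline mentioned above can be sketched as cosine similarity between TF-IDF vectors. This is a minimal sketch assuming whitespace tokenization and a smoothed IDF weight; the function names and the three short example texts are invented here for illustration and are not taken from the paper's dataset or pipeline.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(docs)
    # document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

docs = [
    "the court dismissed the appeal",
    "the appeal was dismissed by the court",
    "tax assessment for the year",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

The first two texts share most of their terms and score much higher than the first and third, which share only the word "the". Embedding-based measures replace these sparse term vectors with dense learned vectors but can reuse the same cosine comparison.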
