Arnab Bhattacharya

Indian Institute of Technology Kanpur

Publications

Featured research published by Arnab Bhattacharya.

International World Wide Web Conferences | 2011

Finding the bias and prestige of nodes in networks based on trust scores

Abhinav Mishra; Arnab Bhattacharya

Many real-life graphs such as social networks and peer-to-peer networks capture the relationships among the nodes by using trust scores to label the edges. Important uses of such networks include trust prediction and finding the most reliable or trusted node in a local subgraph. For many of these applications, it is crucial to assess the prestige and bias of a node. The bias of a node denotes its propensity to trust/mistrust its neighbours and is closely related to truthfulness. If a node trusts all its neighbours, its recommendation of another node as trustworthy is less reliable; the underlying idea is that the recommendation of a highly biased node should weigh less. In this paper, we propose an algorithm to compute the bias and prestige of nodes in networks where the edge weight denotes the trust score. Unlike most other graph-based algorithms, our method works even when the edge weights are not necessarily positive. The algorithm is iterative and runs in O(km) time, where k is the number of iterations and m is the total number of edges in the network. The algorithm exhibits several other desirable properties. It converges to a unique value very quickly. Also, the error in bias and prestige values at any particular iteration is bounded. Further, experiments show that our model conforms well to social theories such as the balance theory (enemy of a friend is an enemy, etc.).
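
To make the iterative O(km) idea concrete, here is a minimal Python sketch of a mutually recursive bias/prestige iteration over signed trust edges. The update rules, the function name `bias_and_prestige`, and the assumption that weights lie in [-1, 1] are illustrative choices and are not taken from the abstract; the paper's exact definitions may differ. Each iteration makes two passes over the edge list, so k iterations cost O(km).

```python
def bias_and_prestige(edges, iterations=20):
    """Illustrative bias/prestige iteration (assumed update rules).

    edges: list of (source, target, weight) with weight assumed in [-1, 1].
    Returns two dicts: bias per node and prestige per node.
    """
    nodes = {u for u, _, _ in edges} | {v for _, v, _ in edges}
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for u, v, _ in edges:
        out_deg[u] += 1
        in_deg[v] += 1

    bias = {n: 0.0 for n in nodes}
    prestige = {n: 0.0 for n in nodes}
    for _ in range(iterations):
        # Pass 1: prestige is the average incoming trust, where an opinion
        # that agrees in sign with its giver's bias is discounted.
        new_prestige = {n: 0.0 for n in nodes}
        for u, v, w in edges:
            discount = max(0.0, bias[u] * w)
            new_prestige[v] += w * (1.0 - discount) / in_deg[v]
        # Pass 2: bias is how far a node's outgoing scores sit above or
        # below what its neighbours were just estimated to deserve.
        new_bias = {n: 0.0 for n in nodes}
        for u, v, w in edges:
            new_bias[u] += (w - new_prestige[v]) / (2.0 * out_deg[u])
        bias, prestige = new_bias, new_prestige
    return bias, prestige
```

Because negative weights only change the sign of the terms, the sketch runs unchanged on networks that mix trust and distrust edges.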

International Conference on Data Engineering | 2011

A continuous query system for dynamic route planning

Nirmesh Malviya; Samuel Madden; Arnab Bhattacharya

In this paper, we address the problem of answering continuous route planning queries over a road network, in the presence of updates to the delay (cost) estimates of links. A simple approach to this problem would be to recompute the best path for all queries on arrival of every delay update. However, such a naive approach scales poorly when there are many users who have requested routes in the system. Instead, we propose two new classes of approximate techniques, K-paths and proximity measures, to substantially speed up processing of the set of designated routes specified by continuous route planning queries in the face of incoming traffic delay updates. Our techniques work by pre-computing likely good paths and by avoiding complete recalculations on every delay update, instead only sending the user new routes when delays change significantly. Based on an experimental evaluation with 7,000 drives from real taxi cabs, we found that the routes delivered by our techniques are within 5% of the best shortest path and have run times an order of magnitude or less compared to a naive approach.

Journal of Biological Chemistry | 2008

FTDP-17 mutations in Tau alter the regulation of microtubule dynamics: an "alternative core" model for normal and pathological Tau action.

Adria C. LeBoeuf; Sasha F. Levy; Michelle Gaylord; Arnab Bhattacharya; Ambuj K. Singh; Mary Ann Jordan; Leslie Wilson; Stuart C. Feinstein

Mutations affecting either the structure or regulation of the microtubule-associated protein Tau cause neuronal cell death and dementia. However, the molecular mechanisms mediating these deleterious effects remain unclear. Among the most characterized activities of Tau is the ability to regulate microtubule dynamics, known to be essential for proper cell function and viability. Here we have tested the hypothesis that Tau mutations causing neurodegeneration also alter the ability of Tau to regulate the dynamic instability behaviors of microtubules. Using in vitro microtubule dynamics assays to assess average microtubule growth rates, microtubule growth rate distributions, and catastrophe frequencies, we found that all tested mutants possessing amino acid substitutions or deletions mapping to either the repeat or interrepeat regions of Tau do indeed compromise its ability to regulate microtubule dynamics. Further mutational analyses suggest a novel mechanism of Tau regulatory action based on an “alternative core” of microtubule binding and regulatory activities composed of two repeats and the interrepeat between them. In this model, the interrepeat serves as the primary regulator of microtubule dynamics, whereas the flanking repeats serve as tethers to properly position the interrepeat on the microtubule. Importantly, since there are multiple interrepeats on each Tau molecule, there are also multiple cores on each Tau molecule, each with distinct mechanistic capabilities, thereby providing significant regulatory potential. Taken together, the data are consistent with a microtubule misregulation mechanism for Tau-mediated neuronal cell death and provide a novel mechanistic model for normal and pathological Tau action.

Extending Database Technology | 2006

Indexing spatially sensitive distance measures using multi-resolution lower bounds

Vebjorn Ljosa; Arnab Bhattacharya; Ambuj K. Singh

Comparison of images requires a distance metric that is sensitive to the spatial location of objects and features. Such sensitive distance measures can, however, be computationally infeasible due to the high dimensionality of feature spaces coupled with the need to model the spatial structure of the images. We present a novel multi-resolution approach to indexing spatially sensitive distance measures. We derive practical lower bounds for the earth mover’s distance (EMD). Multiple levels of lower bounds, one for each resolution of the index structure, are incorporated into algorithms for answering range queries and k-NN queries, both by sequential scan and using an M-tree index structure. Experiments show that using the lower bounds reduces the running time of similarity queries by a factor of up to 36 compared to a sequential scan without lower bounds. Computing separately for each dimension of the feature vector yields a speedup of ~14. By combining the two techniques, similarity queries can be answered more than 500 times faster.

Pacific Symposium on Biocomputing | 2003

Progress: simultaneous searching of protein databases by sequence and structure.

Arnab Bhattacharya; Tolga Can; Tamer Kahveci; Ambuj K. Singh; Yuan-Fang Wang

We consider the problem of similarity searches on protein databases based on both sequence and structure information simultaneously. Our program extracts feature vectors from both the sequence and structure components of the proteins. These feature vectors are then combined and indexed using a novel multi-dimensional index structure. For a given query, we employ this index structure to find candidate matches from the database. We develop a new method for computing the statistical significance of these candidates. The candidates with high significance are then aligned to the query protein using the Smith-Waterman technique to find the optimal alignment. The experimental results show that our method can classify up to 97% of the superfamilies and up to 100% of the classes correctly according to the SCOP classification. Our method is up to 37 times faster than CTSS, a recent structure search technique, combined with the Smith-Waterman technique for sequences.

Database and Expert Systems Applications | 2009

On Low Distortion Embeddings of Statistical Distance Measures into Low Dimensional Spaces

Arnab Bhattacharya; Purushottam Kar; Manjish Pal

In this paper, we investigate various statistical distance measures from the point of view of discovering low distortion embeddings into low dimensional spaces. More specifically, we consider the Mahalanobis distance measure, the Bhattacharyya class of divergences and the Kullback-Leibler divergence. We present a dimensionality reduction method based on the Johnson-Lindenstrauss Lemma for the Mahalanobis measure that achieves arbitrarily low distortion. By using the Johnson-Lindenstrauss Lemma again, we further demonstrate that the Bhattacharyya distance admits dimensionality reduction with arbitrarily low additive error. We also examine the question of embeddability into metric spaces for these distance measures due to the availability of efficient indexing schemes on metric spaces. We provide explicit constructions of point sets under the Bhattacharyya and the Kullback-Leibler divergences whose embeddings into any metric space incur arbitrarily large distortions. To the best of our knowledge, this is the first investigation into these distance measures from the point of view of dimensionality reduction and embeddability into metric spaces.
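
As a rough illustration of how a Johnson-Lindenstrauss style random projection can be applied to the Mahalanobis measure, the Python sketch below assumes the Mahalanobis matrix is given through a factor F (so the distance between x and y is the Euclidean norm of F(x - y)) and then applies a scaled Gaussian random matrix. This factor-then-project route and all names (`jl_matrix`, `embed`, `mahalanobis`) are assumptions made for illustration; the abstract does not spell out the construction used in the paper.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def norm(vec):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in vec))


def jl_matrix(k, d, rng):
    """k x d matrix of N(0, 1/k) entries, the standard Gaussian JL projection."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def mahalanobis(x, y, factor):
    """Mahalanobis distance, assuming the matrix is factor^T * factor."""
    diff = [a - b for a, b in zip(x, y)]
    return norm(matvec(factor, diff))


def embed(x, factor, projection):
    """Map x to k dimensions; Euclidean distances between embedded points
    approximate Mahalanobis distances between the original points."""
    return matvec(projection, matvec(factor, x))
```

Since both steps are linear, the Euclidean distance between `embed(x, ...)` and `embed(y, ...)` equals the norm of the projected vector F(x - y), which the lemma keeps close to the original norm with high probability when k is large enough.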

International Conference on Computer Communications | 2015

Trajectory aware macro-cell planning for mobile users

Shubhadip Mitra; Sayan Ranu; Vinay Kolar; Aditya Telang; Arnab Bhattacharya; Ravi Kokku; Sriram Raghavan

We address the problem of efficient user-mobility driven macro-cell planning in cellular networks. As cellular networks embrace heterogeneous technologies (including long range 3G/4G and short range WiFi, Femto-cells, etc.), most traffic generated by static users gets absorbed by the short-range technologies, thereby increasingly leaving mobile user traffic to macro-cells. To this end, we consider a novel approach that factors in the trajectories of mobile users as well as the impact of city geographies and their associated road networks for macro-cell planning. Given a budget k of base-stations that can be upgraded, our approach selects a deployment that improves the largest number of user trajectories. The generic formulation incorporates the notion of quality of service of a user trajectory as a parameter to allow different application-specific requirements and operator choices. We show that the proposed trajectory utility maximization problem is NP-hard, and design multiple heuristics. We evaluate our algorithms with real and synthetic datasets emulating different city geographies to demonstrate their efficacy. For instance, with an upgrade budget k of 20%, our algorithms perform 3-8 times better in improving the user quality of service on trajectories when compared to greedy location-based base-station upgrades.

International Conference on Management of Data | 2014

Mining statistically significant connected subgraphs in vertex labeled graphs

Akhil Arora; Mayank Sachan; Arnab Bhattacharya

The steady growth of graph data in various applications has resulted in wide-spread research in finding significant sub-structures in a graph. In this paper, we address the problem of finding statistically significant connected subgraphs where the nodes of the graph are labeled. The labels may be either discrete where they assume values from a pre-defined set, or continuous where they assume values from a real domain and can be multi-dimensional. We motivate the problem citing applications in spatial co-location rule mining and outlier detection. We use the chi-square statistic as a measure for quantifying the statistical significance. Since the number of connected subgraphs in a general graph is exponential, the naive algorithm is impractical. We introduce the notion of contracting edges that merge vertices together to form a super-graph. We show that if the graph is dense enough to start with, the number of super-vertices is quite low, and therefore, running the naive algorithm on the super-graph is feasible. If the graph is not dense, we provide an algorithm to reduce the number of super-vertices further, thereby providing a trade-off between accuracy and time. Empirically, the chi-square value obtained by this reduction is always within 96% of the optimal value, while the time spent is only a fraction of that for the optimal. In addition, we also show that our algorithm is scalable and it significantly enhances the ability to analyze real datasets.

Very Large Data Bases | 2012

Mining statistically significant substrings using the chi-square statistic

Mayank Sachan; Arnab Bhattacharya

The problem of identification of statistically significant patterns in a sequence of data has been applied to many domains such as intrusion detection systems, financial models, web-click records, automated monitoring systems, computational biology, cryptology, and text analysis. An observed pattern of events is deemed to be statistically significant if it is unlikely to have occurred due to randomness or chance alone. We use the chi-square statistic as a quantitative measure of statistical significance. Given a string of characters generated from a memoryless Bernoulli model, the problem is to identify the substring for which the empirical distribution of single letters deviates the most from the distribution expected from the generative Bernoulli model. This deviation is captured using the chi-square measure. The most significant substring (MSS) of a string is thus defined as the substring having the highest chi-square value. To date, to the best of our knowledge, there does not exist any algorithm to find the MSS in better than O(n^2) time, where n denotes the length of the string. In this paper, we propose an algorithm to find the most significant substring, whose running time is O(n^(3/2)) with high probability. We also study some variants of this problem such as finding the top-t set, finding all substrings having chi-square greater than a fixed threshold and finding the MSS among substrings greater than a given length. We experimentally demonstrate the asymptotic behavior of the MSS on varying the string size and alphabet size. We also describe some applications of our algorithm on cryptology and real world data from finance and sports. Finally, we compare our technique with the existing heuristics for finding the MSS.
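
For reference, the quadratic baseline mentioned above can be sketched in a few lines of Python. This is the naive enumeration of all substrings, not the faster algorithm proposed in the paper. The function name and the `probs` argument (the per-character probabilities of the assumed memoryless model) are illustrative; every character of the string is assumed to appear in `probs`. For a substring of length l with character counts Y_c, the statistic is the sum over characters of (Y_c - l*p_c)^2 / (l*p_c).

```python
def most_significant_substring(s, probs):
    """Naive MSS search: try every substring, keep the highest chi-square.

    s: input string; probs: dict mapping each character to its model
    probability. Returns (chi_square, start, end) so that s[start:end]
    is the first substring attaining the maximum.
    Time: O(n^2 * alphabet size).
    """
    best = (0.0, 0, 0)
    n = len(s)
    for start in range(n):
        counts = dict.fromkeys(probs, 0)  # running counts for s[start:end]
        for end in range(start + 1, n + 1):
            counts[s[end - 1]] += 1
            length = end - start
            chi2 = sum(
                (counts[c] - length * p) ** 2 / (length * p)
                for c, p in probs.items()
            )
            if chi2 > best[0]:
                best = (chi2, start, end)
    return best
```

With a fair two-letter model, a call such as `most_significant_substring("abaaaab", {"a": 0.5, "b": 0.5})` picks out the run `aaaa`, whose counts deviate most from the expected even split.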

Database and Expert Systems Applications | 2011

Caching stars in the sky: a semantic caching approach to accelerate skyline queries

Arnab Bhattacharya; B. Palvali Teja; Sourav Dutta

Although multi-criteria decision making has emerged with the advent of skyline queries, processing such queries for high dimensional datasets remains a time consuming task. Real-time applications are thus infeasible, especially for non-indexed skyline techniques where the datasets arrive online. In this paper, we propose a caching mechanism that uses the semantics of previous skyline queries to improve the processing time of a new query. In addition to exact queries, such special semantics allow accelerating related queries. We achieve this by generating partial results guaranteed to be in the skyline sets. We also propose an index structure that organizes the cached queries efficiently. Experiments show the efficiency and scalability of our proposed methods.
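
As background for the abstract above, the Python sketch below shows a naive skyline computation over a chosen subset of dimensions, plus the simplest possible cache: one that only answers exact repeats of an earlier query. It assumes smaller values are preferred in every dimension. The class `SkylineCache` and its methods are hypothetical names, and the sketch deliberately does not reproduce the paper's semantic reuse across related queries or its index structure.

```python
def dominates(p, q, dims):
    """True if p is no worse than q on every queried dimension and strictly
    better on at least one (smaller is assumed to be better)."""
    return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)


class SkylineCache:
    """Exact-match cache in front of a quadratic skyline computation."""

    def __init__(self, points):
        self.points = points
        self.cache = {}   # frozenset of dimensions -> skyline result
        self.hits = 0

    def query(self, dims):
        key = frozenset(dims)
        if key in self.cache:
            # Same set of dimensions asked before: reuse the stored result.
            self.hits += 1
            return self.cache[key]
        ordered = sorted(key)
        result = [
            p for p in self.points
            if not any(dominates(q, p, ordered) for q in self.points)
        ]
        self.cache[key] = result
        return result
```

Keying the cache on the set of dimensions means that queries listing the same dimensions in a different order share one entry.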