Vinay Deolalikar | Researchain

Publication

Featured research published by Vinay Deolalikar.

embedded and ubiquitous computing | 2006

Perturbative time and frequency allocations for RFID reader networks

Vinay Deolalikar; Malena Mesarina; John Recker; Salil Pradhan

RFID reader networks often have to operate in frequency- and time-constrained regimes. One approach to allocating frequency and time to the various readers in such regimes is to perturb the network slightly so as to ease the constraints. We investigate how to perform these perturbations in a manner that is profitable from a time and frequency allocation point of view.

Journal of Logic and Computation | 2005

P ≠ NP ∩ co-NP for Infinite Time Turing Machines

Vinay Deolalikar; Joel David Hamkins; Ralf-Dieter Schindler

Extending results of Schindler, Hamkins and Welch, we establish in the context of infinite time Turing machines that P is properly contained in NP ∩ co-NP. For higher analogues of these classes, we exhibit positive and negative results.

IEEE Transactions on Neural Networks | 2002

A two-layer paradigm capable of forming arbitrary decision regions in input space

Vinay Deolalikar

It is well known that a two-layer perceptron network with threshold neurons is incapable of forming arbitrary decision regions in input space, while a three-layer perceptron has that capability. The effect of replacing the output neuron in a two-layer perceptron with a bithreshold element is studied. The limitations of this modified two-layer perceptron are observed. Results on the separating capabilities of a pair of parallel hyperplanes are obtained. Based on these, a new two-layer neural paradigm based on increasing the dimensionality of the output of the first layer is proposed and is shown to be capable of forming any arbitrary decision region in input space. Then a type of logic called bithreshold logic, based on the bithreshold neuron transfer function, is studied. Results on the limits of switching function realizability using bithreshold gates are obtained.

Journal of Number Theory | 2002

Determining irreducibility and ramification groups for an additive extension of the rational function field

Vinay Deolalikar

Let K be a subfield of F_p, not necessarily proper, and a(T) be an additive polynomial defined over K. Suppose that f(x)∈K[x] and consider the polynomial a(T)−f(x) over K(x). We provide a method to (i) determine its irreducibility and (ii) compute the ramification groups in the resulting additive extension of K(x). In several cases, our method is easier to use and provides more information than the Artin–Schreier theorem. As an application of this method, we study families of additive extensions of K(x) obtained using symmetric polynomials. We prove irreducibility and determine their ramification groups, genera, and number of rational places. We show that these families contain examples of function fields with many rational places.

embedded and ubiquitous computing | 2005

Optimal scheduling for networks of RFID readers

Vinay Deolalikar; John Recker; Malena Mesarina; Salil Pradhan

Devising switching schemes for networks of colliding and correlated RFID readers is a core challenge in the deployment of RFID networks. We derive optimal scheduling schemes for readers in RFID networks in four cases of practical importance. Most other cases can be reduced to a combination of these basic cases.

international conference on big data | 2015

Big data gathering and mining pipelines for CRM using open-source

Kang Li; Vinay Deolalikar; Neeraj Pradhan

Customer Relationship Management (CRM) is currently the fastest growing sector of enterprise software, estimated to increase to $36.5B worldwide by 2017. CRM technologies increasingly use data mining primitives across multiple applications. At the same time, the growth of big data has led to the evolution of an open source big data software stack (primarily powered by Apache software) that rivals traditional enterprise database (RDBMS) stacks. New technologies such as Kafka, Storm, and HBase have significantly enriched this open source stack, alongside more established technologies such as Hadoop MapReduce and Mahout. Today, enterprises have a choice to make regarding which stack they will choose to power their big data applications. However, there are no published studies in the literature on enterprise big data pipelines built using open source components supporting CRM. Specific questions that enterprises have include: How is the data processed and analyzed in such pipelines? What are the building blocks of such pipelines? How long does each step of this processing take? In this work, we answer these questions for a large-scale (serving over 100M customers) industrial CRM pipeline that incorporates data mining and serves several applications. Our pipeline has, broadly, two parts. The first is a data gathering part that uses Kafka, Storm, and HBase. The second is a data mining part that uses Mahout and Hadoop MapReduce. We also provide timings for common tasks in the second part such as data preprocessing for machine learning, clustering, reservoir sampling, and frequent itemset extraction.

international conference on big data | 2016

Extensive large-scale study of error surfaces in sampling-based distinct value estimators for databases

Vinay Deolalikar; Hernan Laffitte

The problem of distinct value estimation has many applications. Being a critical component of query optimizers in databases, it also has high commercial impact. Many distinct value estimators have been proposed, using various statistical approaches. However, characterizing the errors incurred by these estimators is an open problem: existing analytical approaches are not powerful enough, and extensive empirical studies at large scale do not exist. We conduct an extensive large-scale empirical study of 11 distinct value estimators from four different approaches to the problem over families of Zipfian distributions whose parameters model real-world applications. Our study is the first that scales to the billion-row sizes at which today's large commercial databases have to operate. This allows us to characterize the error that is encountered in real-world applications of distinct value estimation. By mining the generated data, we show that estimator error depends on a key latent parameter, the average uniform class size, that has not been studied previously. This parameter also allows us to unearth error patterns that were previously unknown. Importantly, ours is the first approach that provides a framework for visualizing the error patterns in distinct value estimation, facilitating discussion of this problem in enterprise settings. Our characterization of errors can be used for several problems in distinct value estimation, such as the design of hybrid estimators. This work is aimed at the practitioner and the researcher alike, and addresses questions frequently asked by both audiences.

international conference on big data | 2014

Query revision during cluster based search on large unstructured corpora

Vinay Deolalikar

We investigate a frequently occurring issue in search (retrieval) in the age of big unstructured data. Searches conducted on large unstructured corpora result in long results lists. Such results lists are often clustered and reranked for ease of navigation. Should a query be revised during time-critical examinations of such long cluster-based reranked lists? This question arises naturally during early stages of commercially important applications of IR such as eDiscovery, but has not yet been given any research attention. Four factors compound the difficulty of this question in the setting of eDiscovery: (a) the query sources (the technical experts) are different from the legal staff that are actually executing the query and using the retrieval system, (b) the retrieved lists for each query tend to be very long, (c) the user might be accessing these retrieved results through a clustering interface, and (d) all decisions must be transparent and easy to explain due to the litigious nature of the application. Analogous difficulties arise in other applications involving search over large unstructured corpora. We introduce a framework to help users make the decision of “whether to revise.” Our framework consists of two components. First, we introduce a “limited view,” which is a summary of a long cluster-based reranked list. This is the first input to the user. Second, we construct query predictors for this limited view, and provide their prediction as a second input to the user. This prediction is used to corroborate the inspection of the summary limited view. The proposed combination of a limited view and query performance prediction can assist search staff in determining whether to pursue an expensive query revision or not, as well as save precious time by precluding inspections of lists with very few relevant documents during the early stages of commercially important applications such as eDiscovery.

international conference on big data | 2014

Topological models of document-query sets in retrieval for Enterprise Information Management

Vinay Deolalikar

The tasks, challenges, and techniques of Information Retrieval (IR) should reflect the structure of the underlying document-query sets, and the needs of the domain. Are document-query sets obtained from the enterprise domain fundamentally different from standard research corpora gathered from the web? In order to identify, understand, and characterize such structural differences, we build a framework using point set topology to analyze document-query sets. Our framework tailors topological notions such as subbasis, cover, and compactness toward IR. Unlike previous topological approaches, we use the reverse of the relevance map to topologize the set of queries, not the set of documents. We show that the topological approach exposes sharp differences between enterprise and web-collected standard research document-query sets. These differences readily motivate research into new retrieval tasks that are of commercial importance in Enterprise Information Management (EIM).

international conference on big data | 2014

Feature selection for text clustering in limited memory using Monte Carlo wrapper

Vinay Deolalikar
