Publication


Featured research published by Amit Sasturkar.


IEEE Computer Security Foundations Symposium | 2006

Policy analysis for administrative role based access control

Amit Sasturkar; Ping Yang; Scott D. Stoller; C. R. Ramakrishnan

Role-based access control (RBAC) is a widely used model for expressing access control policies. In large organizations, the RBAC policy may be collectively managed by many administrators. Administrative RBAC (ARBAC) is a model for expressing the authority of administrators, thereby specifying how an organization's RBAC policy may change. Changes by one administrator may interact in unintended ways with changes by other administrators. Consequently, the effect of an ARBAC policy is hard to understand by simple inspection. In this paper, we consider the problem of analyzing ARBAC policies, in particular to determine reachability properties (e.g., whether a user can eventually be assigned to a role by a group of administrators) and availability properties (e.g., whether a user cannot be removed from a role by a group of administrators) implied by a policy. We first establish the connection between security policy analysis and planning in artificial intelligence. Based partly on this connection, we show that reachability analysis for ARBAC is PSPACE-complete. We also give algorithms and complexity results for reachability and related analysis problems for several categories of ARBAC policies, defined by simple restrictions on the policy language.
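The reachability question amounts to a search over the role sets a user can come to hold. Below is a minimal sketch of that view, assuming a deliberately simplified ARBAC model in which can_assign rules carry only positive precondition roles; the rule sets, role names, and breadth-first formulation are illustrative and are not the paper's algorithms.

```python
from collections import deque

# Hypothetical simplified ARBAC policy: each can_assign rule is
# (admin_role, precondition_roles, target_role); each can_revoke rule
# is (admin_role, target_role).
CAN_ASSIGN = [
    ("HR",  frozenset({"Employee"}),             "Engineer"),
    ("HR",  frozenset({"Engineer"}),             "TeamLead"),
    ("CTO", frozenset({"Engineer", "TeamLead"}), "Manager"),
]
CAN_REVOKE = [("HR", "Engineer"), ("CTO", "TeamLead")]

def reachable(initial_roles, goal_role, admin_roles):
    """Breadth-first search over the user's possible role sets: can some
    sequence of actions by the given administrators eventually place the
    user in goal_role?"""
    start = frozenset(initial_roles)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if goal_role in state:
            return True
        successors = []
        for admin, pre, target in CAN_ASSIGN:
            if admin in admin_roles and pre <= state:
                successors.append(state | {target})
        for admin, target in CAN_REVOKE:
            if admin in admin_roles and target in state:
                successors.append(state - {target})
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(reachable({"Employee"}, "Manager", admin_roles={"HR", "CTO"}))  # True
```

The explored state space is exponential in the number of roles, which is consistent with the PSPACE-completeness result the paper proves for the general case.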


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2005

Automated type-based analysis of data races and atomicity

Amit Sasturkar; Rahul Agarwal; Liqiang Wang; Scott D. Stoller

Concurrent programs are notorious for containing errors that are difficult to reproduce and diagnose at run-time. This motivated the development of type systems that statically ensure the absence of some common kinds of concurrent programming errors including data races and atomicity violations. A method is atomic if every execution of the concurrent program is equivalent to an execution in which the atomic method is executed without being interleaved with other concurrently executed methods. Atomicity is a common correctness requirement in concurrent programs; atomicity violations may indicate incorrect synchronization. This paper presents Extended Parameterized Atomic Java (EPAJ), a type system for specifying and verifying atomicity in Java programs. EPAJ combines Flanagan and Qadeer's atomicity types [11] with a new and significantly more expressive type system for analyzing data races, called Extended Parameterized Race-Free Java (EPRFJ), allowing a more accurate analysis of atomicity. The paper also presents a type discovery algorithm to automatically obtain EPRFJ types, and a static interprocedural type inference algorithm that, given EPRFJ types, infers atomicity types. These algorithms can be incorporated into testing and debugging tools, benefiting users who know nothing about type systems. We report our experience with a prototype implementation.
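Atomicity types of this kind rest on Lipton's theory of reduction: lock acquires are right-movers, releases are left-movers, race-free accesses are both-movers, and a method is atomic if its actions match right-movers, then at most one non-mover, then left-movers. A minimal sketch of that pattern check, assuming the mover classification of each action is already given (in the paper it is derived from the race-freedom types):

```python
def is_atomic(actions):
    """actions: mover classes for one method execution -- 'R' (right-
    mover, e.g., lock acquire), 'L' (left-mover, e.g., lock release),
    'B' (both-mover, race-free access), 'N' (non-mover, racy access).
    The method is atomic iff the sequence matches R* N? L*, with
    both-movers allowed anywhere."""
    phase = "pre"                  # still before the commit point
    for a in actions:
        if a == "B":
            continue               # both-movers never block reduction
        if phase == "pre":
            if a == "R":
                continue
            phase = "post"         # the first 'N' or 'L' is the commit
            if a == "N":
                continue           # a single non-mover is allowed
            # an 'L' here is also fine: it simply ends the pre phase
        elif a != "L":
            return False           # an 'R' or a second 'N' after commit
    return True

# acquire, race-free read, one racy write, race-free read, release:
print(is_atomic(["R", "B", "N", "B", "L"]))   # True
# two racy accesses cannot be serialized:
print(is_atomic(["N", "B", "N"]))             # False
```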


Automated Software Engineering | 2005

Optimized run-time race detection and atomicity checking using partially discovered types

Rahul Agarwal; Amit Sasturkar; Liqiang Wang; Scott D. Stoller

Concurrent programs are notorious for containing errors that are difficult to reproduce and diagnose. Two common kinds of concurrency errors are data races and atomicity violations (informally, atomicity means that executing methods concurrently is equivalent to executing them serially). Several static and dynamic (run-time) analysis techniques exist to detect potential races and atomicity violations. Run-time checking may miss errors in unexecuted code and incurs significant run-time overhead. On the other hand, run-time checking generally produces fewer false alarms than static analysis; this is a significant practical advantage, since diagnosing all of the warnings from static analysis of large codebases may be prohibitively expensive. This paper explores the use of static analysis to significantly decrease the overhead of run-time checking. Our approach is based on a type system for analyzing data races and atomicity. A type discovery algorithm is used to obtain types for as much of the program as possible (complete type inference for this type system is NP-hard, and parts of the program might be untypable). Warnings from the typechecker are used to identify parts of the program from which run-time checking can safely be omitted. The approach is completely automatic, scalable to very large programs, and significantly reduces the overhead of run-time checking for data races and atomicity violations.
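A minimal sketch of this division of labor, assuming the static phase emits the set of fields it could not verify and that a run-time monitor tracks each thread's held locks (the held_locks helper is a hypothetical placeholder); the lockset refinement shown is Eraser-style, not the paper's exact checker:

```python
import threading

UNVERIFIED = {"Account.balance"}   # assumed output of the typechecker
candidate_locks = {}               # field -> candidate protecting locks
state_lock = threading.Lock()      # protects the checker's own state

def held_locks():
    # Hypothetical placeholder: a real monitor would record, per thread,
    # the locks currently held via instrumented acquire/release.
    return getattr(threading.current_thread(), "held", set())

def on_access(field):
    """Instrumentation callback, emitted only for accesses to fields the
    typechecker could not prove race-free; verified fields pay nothing."""
    if field not in UNVERIFIED:
        return
    with state_lock:
        held = held_locks()
        if field not in candidate_locks:
            candidate_locks[field] = set(held)   # initial candidate set
        else:
            candidate_locks[field] &= held       # Eraser-style refinement
        if not candidate_locks[field]:
            print(f"potential data race on {field}")

# Simulated accesses under different lock sets:
threading.current_thread().held = {"lock_a", "lock_b"}
on_access("Account.balance")       # candidates = {lock_a, lock_b}
threading.current_thread().held = {"lock_b"}
on_access("Account.balance")       # refined to {lock_b}: still consistent
```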


Web Search and Data Mining | 2010

Learning URL patterns for webpage de-duplication

Hema Swetha Koppula; Krishna P. Leela; Amit Agarwal; Krishna Prasad Chitrapura; Sachin Garg; Amit Sasturkar

The presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we present a set of techniques to mine rules from URLs and utilize these rules for de-duplication using just URL strings without fetching the content explicitly. Our technique is composed of mining the crawl logs and utilizing clusters of similar pages to extract transformation rules, which are used to normalize URLs belonging to each cluster. Preserving each mined rule for de-duplication is not efficient due to the large number of such rules. We present a machine learning technique to generalize the set of rules, which reduces the resource footprint to be usable at web-scale. The rule extraction techniques are robust against web-site specific URL conventions. We compare the precision and scalability of our approach with recent efforts in using URLs for de-duplication. Experimental results demonstrate that our approach achieves twice the reduction in duplicates with only half as many rules as the most recent previous approach. Scalability of the framework is demonstrated by performing a large scale evaluation on a set of 3 billion URLs, implemented using the MapReduce framework.
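A minimal sketch of one rule-mining step consistent with the abstract: given a cluster of URLs known from crawl logs to serve the same content, query parameters whose values vary within the cluster cannot affect content, so a normalization rule may drop them. The tokenization and parameter-dropping rule form are illustrative simplifications of the paper's rule language:

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def mine_irrelevant_params(cluster):
    """Query parameters whose values differ across duplicate URLs do not
    affect content, so a rule may drop them."""
    param_values = {}
    for url in cluster:
        for k, v in parse_qsl(urlparse(url).query):
            param_values.setdefault(k, set()).add(v)
    return {k for k, vs in param_values.items() if len(vs) > 1}

def normalize(url, irrelevant):
    p = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(p.query) if k not in irrelevant]
    return urlunparse(p._replace(query=urlencode(sorted(kept))))

cluster = [
    "http://example.com/story?id=7&sessionid=aaa",
    "http://example.com/story?id=7&sessionid=bbb",
]
rule = mine_irrelevant_params(cluster)   # {'sessionid'}
print(normalize(cluster[0], rule))       # http://example.com/story?id=7
```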


Knowledge Discovery and Data Mining | 2008

De-duping URLs via rewrite rules

Anirban Dasgupta; Ravi Kumar; Amit Sasturkar

A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal functions of a search engine, including crawling, indexing, ranking, and presentation, are adversely impacted by the presence of duplicate URLs. Traditionally, the de-duping problem has been addressed by fetching and examining the content of the URL; our approach here is different. Given a set of URLs partitioned into equivalence classes based on the content (URLs in the same equivalence class have similar content), we address the problem of mining this set and learning URL rewrite rules that transform all URLs of an equivalence class to the same canonical form. These rewrite rules can then be applied to eliminate duplicates among URLs that are encountered for the first time during crawling, even without fetching their content. In order to express such transformation rules, we propose a simple framework that is general enough to capture the most common URL rewrite patterns occurring on the web; in particular, it encapsulates the DUST (Different URLs with similar text) framework [5]. We provide an efficient algorithm for mining and learning URL rewrite rules and show that under mild assumptions, it is complete, i.e., our algorithm learns every URL rewrite rule that is correct, for an appropriate notion of correctness. We demonstrate the expressiveness of our framework and the effectiveness of our algorithm by performing a variety of extensive large-scale experiments.
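A minimal sketch of how learned rules could then be applied at crawl time, mapping never-before-seen URLs to a canonical key without fetching content; representing each rule as a regex substitution is an illustrative stand-in for the paper's rewrite-rule framework:

```python
import re

# Assumed output of the offline rule-learning phase.
REWRITE_RULES = [
    (re.compile(r"[?&]sessionid=[^&]*"), ""),    # drop session ids
    (re.compile(r"^http://www\."), "http://"),   # strip 'www.'
    (re.compile(r"/index\.html$"), "/"),         # default page
]

def canonical(url):
    """Apply every rewrite rule in order to reach a canonical form."""
    for pattern, replacement in REWRITE_RULES:
        url = pattern.sub(replacement, url)
    return url

seen = set()
for url in ["http://www.example.com/a/index.html",
            "http://example.com/a/?sessionid=xyz"]:
    key = canonical(url)               # both map to http://example.com/a/
    if key in seen:
        print("duplicate, skip crawling:", url)
    seen.add(key)
```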


Symposium on Code Generation and Optimization | 2004

Static identification of delinquent loads

Vlad-Mihai Panait; Amit Sasturkar; Weng-Fai Wong

The effective use of processor caches is crucial to the performance of applications. It has been shown that cache misses are not evenly distributed throughout a program. In applications running on RISC-style processors, a small number of delinquent load instructions are responsible for most of the cache misses. Identification of delinquent loads is the key to the success of many cache optimization and prefetching techniques. We propose a method for identifying delinquent loads that can be implemented at compile time. Our experiments over eighteen benchmarks from the SPEC suite show that our proposed scheme is stable across benchmarks, inputs, and cache structures, identifying on average 10% of the loads in the tested benchmarks that together account for over 90% of all data cache misses. As far as we know, this is the first time a technique for static delinquent load identification with such a level of precision and coverage has been reported. While comparable techniques can also identify load instructions that cover 90% of all data cache misses, they do so by selecting over 50% of all load instructions in the code, resulting in a high number of false positives. If basic block profiling is used in conjunction with our heuristic, then our results show that it is possible to pin down just 1.3% of the load instructions that account for 82% of all data cache misses.
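A minimal sketch of the flavor of such a compile-time heuristic: score each load by static features that correlate with cache misses and flag only the highest scorers. The features, weights, and threshold here are invented for illustration and are not the paper's scheme:

```python
from dataclasses import dataclass

@dataclass
class Load:
    name: str
    in_loop: bool           # executed inside a loop body
    pointer_chasing: bool   # loaded value feeds the next load's address
    irregular_stride: bool  # address not an affine function of the index

def delinquency_score(load):
    """Hypothetical feature weights; a real heuristic would tune these."""
    score = 0
    if load.in_loop:          score += 1
    if load.pointer_chasing:  score += 2   # linked structures miss often
    if load.irregular_stride: score += 1
    return score

loads = [
    Load("array_sum",  in_loop=True,  pointer_chasing=False, irregular_stride=False),
    Load("list_walk",  in_loop=True,  pointer_chasing=True,  irregular_stride=True),
    Load("init_const", in_loop=False, pointer_chasing=False, irregular_stride=False),
]
# Flag only the small fraction of loads with the highest scores.
flagged = [l.name for l in loads if delinquency_score(l) >= 3]
print(flagged)   # ['list_walk']
```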


Theoretical Computer Science | 2011

Policy analysis for Administrative Role-Based Access Control

Amit Sasturkar; Ping Yang; Scott D. Stoller; C. R. Ramakrishnan

Role-Based Access Control (RBAC) is a widely used model for expressing access control policies. In large organizations, the RBAC policy may be collectively managed by many administrators. Administrative RBAC (ARBAC) models express the authority of administrators, thereby specifying how an organization's RBAC policy may change. Changes by one administrator may interact in unintended ways with changes by other administrators. Consequently, the effect of an ARBAC policy is hard to understand by simple inspection. In this paper, we consider the problem of analyzing ARBAC policies. Specifically, we consider reachability properties (e.g., whether a user can eventually be assigned to a role by a group of administrators), availability properties (e.g., whether a user cannot be removed from a role by a group of administrators), containment properties (e.g., every member of one role is also a member of another role), and information flow properties satisfied by a policy. We show that reachability analysis for ARBAC is PSPACE-complete. We also give algorithms and complexity results for reachability and related analysis problems for several categories of ARBAC policies, defined by simple restrictions on the policy language. Some of these results are based on the connection we establish between security policy analysis and planning problems in Artificial Intelligence.
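A minimal sketch of the planning connection the abstract mentions, assuming each can_assign/can_revoke rule is encoded as a STRIPS-style action with preconditions and add/delete effects over the proposition "user is in role r"; the naive depth-bounded search stands in for a real planner, and all names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    precond: frozenset   # roles the user must already hold
    add: frozenset       # roles the action grants
    delete: frozenset    # roles the action revokes

def to_actions(can_assign, can_revoke):
    acts = [Action(f"assign_{t}", frozenset(pre), frozenset({t}), frozenset())
            for pre, t in can_assign]
    acts += [Action(f"revoke_{t}", frozenset(), frozenset(), frozenset({t}))
             for t in can_revoke]
    return acts

def plan(state, goal, actions, depth=8):
    """Naive depth-bounded forward search standing in for a planner."""
    if goal <= state:
        return []
    if depth == 0:
        return None
    for a in actions:
        if not (a.precond <= state):
            continue
        nxt = (state | a.add) - a.delete
        if nxt == state:
            continue                 # no progress; avoid trivial cycles
        rest = plan(nxt, goal, actions, depth - 1)
        if rest is not None:
            return [a.name] + rest
    return None

actions = to_actions(
    can_assign=[({"Employee"}, "Engineer"), ({"Engineer"}, "TeamLead")],
    can_revoke=["Engineer"],
)
print(plan(frozenset({"Employee"}), frozenset({"TeamLead"}), actions))
# -> ['assign_Engineer', 'assign_TeamLead']
```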


Conference on Information and Knowledge Management | 2009

URL normalization for de-duplication of web pages

Amit Agarwal; Hema Swetha Koppula; Krishna P. Leela; Krishna Prasad Chitrapura; Sachin Garg; Pavan Kumar Gm; Chittaranjan Haty; Anirban Roy; Amit Sasturkar

The presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we present a set of techniques to mine rules from URLs and utilize these learned rules for de-duplication using just URL strings without fetching the content explicitly. Our technique is composed of mining the crawl logs and utilizing clusters of similar pages to extract specific rules from URLs belonging to each cluster. Preserving each mined rule for de-duplication is not efficient due to the large number of specific rules. We present a machine learning technique to generalize the set of rules, which reduces the resource footprint to be usable at web-scale. The rule extraction techniques are robust against web-site specific URL conventions. We demonstrate the effectiveness of our techniques through experimental evaluation.
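A minimal sketch of the generalization step this abstract highlights: collapsing many specific rules into fewer generic ones by wildcarding the positions where they disagree. The paper uses a machine learning technique for this; the greedy pairwise merge below, with its diff threshold, is only illustrative:

```python
WILDCARD = "*"

def generalize(rule_a, rule_b, max_diffs=2):
    """Wildcard the positions where two equal-length token patterns
    disagree, but refuse to merge patterns that differ too much."""
    if len(rule_a) != len(rule_b):
        return None
    diffs = sum(a != b for a, b in zip(rule_a, rule_b))
    if diffs > max_diffs:
        return None
    return tuple(a if a == b else WILDCARD for a, b in zip(rule_a, rule_b))

def compress(rules):
    """Greedily merge mergeable pairs until a fixed point; each merge
    replaces two rules with one pattern covering both."""
    rules = set(rules)
    changed = True
    while changed:
        changed = False
        for a in list(rules):
            if a not in rules:
                continue
            for b in list(rules):
                if b == a or b not in rules:
                    continue
                g = generalize(a, b)
                if g is None:
                    continue
                rules.discard(a)
                rules.discard(b)
                rules.add(g)
                a = g            # keep merging from the generalized rule
                changed = True
    return rules

specific = {
    ("news", "2009", "01", "story.html"),
    ("news", "2009", "02", "story.html"),
    ("news", "2008", "03", "story.html"),
}
print(compress(specific))   # {('news', '*', '*', 'story.html')}
```

The diff threshold guards against overgeneralizing unrelated rules into an all-wildcard pattern, a trade-off any rule-generalization scheme has to make between footprint and precision.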


Archive | 2007

Techniques for detecting duplicate web pages

Amit Sasturkar; Rajat Ahuja; Shanmugasundaram Ravikumar; Vladimir Ofitserov


Archive | 2007

Systems and methods of universal resource locator normalization

Anirban Dasgupta; Amit Sasturkar; Shanmugasundaram Ravikumar; Rajat Ahuja
