Gregory Buehrer
Ohio State University
Publications
Featured research published by Gregory Buehrer.
Proceedings of the 5th International Workshop on Software Engineering and Middleware | 2005
Gregory Buehrer; Bruce W. Weide; Paolo A. G. Sivilotti
An SQL injection attack targets interactive web applications that employ database services. Such applications accept user input, such as form fields, and then include this input in database requests, typically SQL statements. In SQL injection, the attacker provides user input that results in a different database request than was intended by the application programmer. That is, the interpretation of the user input as part of a larger SQL statement results in an SQL statement of a different form than originally intended. We describe a technique to prevent this kind of manipulation and hence eliminate SQL injection vulnerabilities. The technique is based on comparing, at run time, the parse tree of the SQL statement before inclusion of user input with that resulting after inclusion of input. Our solution is efficient, adding about 3 ms of overhead to database query costs. In addition, it is easily adopted by application programmers, having the same syntactic structure as current popular record set retrieval methods. For empirical analysis, we provide a case study of our solution in J2EE. We implement our solution in a simple static Java class, and show its effectiveness and scalability.
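To make the parse-structure comparison idea concrete, here is a minimal, hedged sketch in Java: a toy tokenizer stands in for a real SQL parser, and the guard compares the structure of the query built with a harmless placeholder against the structure built with the actual user input. The class and method names (ParseStructureGuard, structure, isSafe) are illustrative; the paper's actual solution is packaged as a drop-in replacement for standard record set retrieval calls.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of parse-structure comparison for SQL injection detection.
// A toy tokenizer stands in for a real SQL parser: literals collapse to a
// single LITERAL token, so two queries compare equal only if the user input
// behaved as a pure literal and did not change the statement's shape.
public class ParseStructureGuard {

    static List<String> structure(String sql) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            if (Character.isWhitespace(c)) { i++; }
            else if (c == '\'') {                      // quoted string literal
                i++;
                while (i < sql.length() && sql.charAt(i) != '\'') i++;
                i++;
                tokens.add("LITERAL");
            } else if (Character.isDigit(c)) {         // numeric literal
                while (i < sql.length() && Character.isDigit(sql.charAt(i))) i++;
                tokens.add("LITERAL");
            } else if (Character.isLetter(c)) {        // keyword or identifier
                int start = i;
                while (i < sql.length() && (Character.isLetterOrDigit(sql.charAt(i)) || sql.charAt(i) == '_')) i++;
                tokens.add(sql.substring(start, i).toUpperCase());
            } else {                                   // operator or punctuation
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    // Build the query twice: once with a placeholder, once with the user input.
    // Equal token structures mean the input did not alter the query's form.
    static boolean isSafe(String template, String userInput) {
        String guarded = template.replace("?", "'x'");
        String actual  = template.replace("?", "'" + userInput + "'");
        return structure(guarded).equals(structure(actual));
    }

    public static void main(String[] args) {
        String template = "SELECT * FROM users WHERE name = ?";
        System.out.println(isSafe(template, "alice"));          // true: same structure
        System.out.println(isSafe(template, "x' OR '1'='1"));   // false: structure changed
    }
}
```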
Web Search and Data Mining | 2008
Gregory Buehrer; Kumar Chellapilla
A link server is a system designed to support efficient implementations of graph computations on the web graph. In this work, we present a compression scheme for the web graph specifically designed to accommodate community queries and other random access algorithms on link servers. We use a frequent pattern mining approach to extract meaningful connectivity formations. Our Virtual Node Miner achieves graph compression without sacrificing random access by generating virtual nodes from frequent itemsets in vertex adjacency lists. The mining phase guarantees scalability by bounding the pattern mining complexity to O(E log E). We facilitate global mining, relaxing the requirement for the graph to be sorted by URL, enabling discovery of both inter-domain and intra-domain patterns. As a consequence, the approach allows incremental graph updates. Further, it not only facilitates but can also expedite graph computations such as PageRank and local random walks by implementing them directly on the compressed graph. We demonstrate the effectiveness of the proposed approach on several publicly available large web graph data sets. Experimental results indicate that the proposed algorithm achieves a 10- to 15-fold compression on most real-world web graph data sets.
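The following sketch, assuming a toy Java representation of adjacency lists, illustrates the virtual-node idea behind the compression scheme: a destination set shared by many vertices is factored out into a new virtual node, and each source keeps a single edge to it. For brevity the sketch mines only frequent destination pairs; the pattern size, id scheme, and method names are illustrative choices, not the paper's actual mining procedure.

```java
import java.util.*;

// Toy illustration of virtual-node compression: a destination pair that
// recurs across many adjacency lists is stored once, on a virtual node,
// and the original sources keep a single edge to that virtual node.
public class VirtualNodeSketch {
    public static void main(String[] args) {
        // adjacency lists of a tiny directed graph
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        adj.put(0, new TreeSet<>(Arrays.asList(10, 11, 12)));
        adj.put(1, new TreeSet<>(Arrays.asList(10, 11, 13)));
        adj.put(2, new TreeSet<>(Arrays.asList(10, 11)));
        int minSupport = 3;

        // 1. count how often each destination pair co-occurs in an adjacency list
        Map<List<Integer>, Integer> pairCount = new HashMap<>();
        for (Set<Integer> outs : adj.values()) {
            List<Integer> dsts = new ArrayList<>(outs);
            for (int i = 0; i < dsts.size(); i++)
                for (int j = i + 1; j < dsts.size(); j++)
                    pairCount.merge(Arrays.asList(dsts.get(i), dsts.get(j)), 1, Integer::sum);
        }

        // 2. take the most frequent pair above minSupport
        List<Integer> best = null;
        int bestCount = 0;
        for (Map.Entry<List<Integer>, Integer> e : pairCount.entrySet())
            if (e.getValue() >= minSupport && e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }

        // 3. rewire: every list containing the pattern points to one virtual node instead
        if (best != null) {
            int vnode = 1000;                          // ids >= 1000 denote virtual nodes
            for (Set<Integer> outs : adj.values())
                if (outs.containsAll(best)) {
                    outs.removeAll(best);
                    outs.add(vnode);
                }
            adj.put(vnode, new TreeSet<>(best));       // the virtual node owns the shared edges
        }
        System.out.println(adj);                       // {10, 11} now stored once, referenced thrice
    }
}
```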
Very Large Data Bases | 2007
Amol Ghoting; Gregory Buehrer; Srinivasan Parthasarathy; Daehyun Kim; Anthony D. Nguyen; Yen-Kuang Chen; Pradeep Dubey
Algorithms are typically designed to exploit the current state of the art in processor technology. However, as processor technology evolves, these algorithms are often unable to derive the maximum achievable performance on modern architectures. In this paper, we examine the performance of frequent pattern mining algorithms on a modern processor. A detailed performance study reveals that even the best frequent pattern mining implementations, with highly efficient memory managers, still grossly under-utilize a modern processor. The primary performance bottlenecks are poor data locality and low instruction level parallelism (ILP). We propose a cache-conscious prefix tree to address this problem. The resulting tree improves spatial locality and also enhances the benefits from hardware cache line prefetching. Furthermore, the design of this data structure allows the use of path tiling, a novel tiling strategy, to improve temporal locality. The result is an overall speedup of up to 3.2 when compared with state-of-the-art implementations. We then show how these algorithms can be improved further by realizing a non-naive thread-based decomposition that targets simultaneous multi-threaded (SMT) processors. A key aspect of this decomposition is to ensure cache re-use between threads that are co-scheduled at a fine granularity. This optimization affords an additional speedup of 50%, resulting in an overall speedup of up to 4.8. The proposed optimizations also provide performance improvements on SMPs, and will most likely be beneficial on emerging processors.
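As a rough illustration of the cache-conscious layout idea (not the paper's actual data structure), the sketch below stores a prefix tree as parallel primitive arrays in one flat allocation, so walking a parent chain touches a compact region of memory rather than pointer-chased heap nodes. The field names and the linear-scan insertion are simplifications made for brevity.

```java
import java.util.*;

// Structure-of-arrays prefix tree: nodes live in flat int arrays instead of
// scattered objects, improving spatial locality and prefetch friendliness.
public class CacheConsciousPrefixTree {
    int[] item;      // item id stored at each node
    int[] parent;    // index of the parent node, -1 for the root
    int[] count;     // support count of the prefix ending here
    int size = 0;

    CacheConsciousPrefixTree(int capacity) {
        item = new int[capacity];
        parent = new int[capacity];
        count = new int[capacity];
        item[0] = -1; parent[0] = -1; count[0] = 0; size = 1;   // node 0 is the root
    }

    // Insert one (already ordered) transaction; children are found with a
    // linear scan, which keeps the sketch short at the cost of efficiency.
    void insert(int[] transaction) {
        int cur = 0;
        for (int it : transaction) {
            int child = -1;
            for (int n = 0; n < size; n++)
                if (parent[n] == cur && item[n] == it) { child = n; break; }
            if (child == -1) {                 // append a fresh node at the end of the arrays
                child = size++;
                item[child] = it; parent[child] = cur; count[child] = 0;
            }
            count[child]++;
            cur = child;
        }
    }

    // Walk a node's path to the root; with the array layout this stays in a
    // small region of memory instead of chasing heap pointers.
    List<Integer> pathToRoot(int node) {
        List<Integer> path = new ArrayList<>();
        for (int n = node; n != 0; n = parent[n]) path.add(item[n]);
        return path;
    }

    public static void main(String[] args) {
        CacheConsciousPrefixTree tree = new CacheConsciousPrefixTree(100);
        tree.insert(new int[]{1, 2, 3});
        tree.insert(new int[]{1, 2, 4});
        System.out.println(tree.pathToRoot(tree.size - 1));   // [4, 2, 1]
    }
}
```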
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2007
Gregory Buehrer; Srinivasan Parthasarathy; Shirish Tatikonda; Tahsin M. Kurç; Joel H. Saltz
We present a strategy for mining frequent item sets from terabyte-scale data sets on cluster systems. The algorithm embraces the holistic notion of architecture-conscious data mining, taking into account the capabilities of the processor, the memory hierarchy and the available network interconnects. Optimizations have been designed for lowering communication costs using compressed data structures and a succinct encoding. Optimizations for improving cache, memory and I/O utilization using pruning and tiling techniques, and smart data placement strategies are also employed. We leverage the extended memory space and computational resources of a distributed message-passing cluster to design a scalable solution, where each node can extend its metastructures beyond main memory by leveraging 64-bit architecture support. Our solution strategy is presented in the context of FPGrowth, a well-studied and rather efficient frequent pattern mining algorithm. Results demonstrate that the proposed strategy results in near-linear scaleup on up to 48 nodes.
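A minimal sketch of the kind of data-distribution step such a cluster strategy relies on is shown below: items are partitioned across nodes, and each transaction is projected and routed to the nodes owning its items so every node can mine its share independently. Plain in-memory lists stand in for the cluster's message passing, and the modulo ownership rule is an illustrative assumption rather than the paper's placement strategy.

```java
import java.util.*;

// Toy data-distribution step for cluster-scale frequent pattern mining:
// each transaction is projected per item and routed to the node that owns
// that item, so nodes can later grow their local structures independently.
public class ClusterPartitionSketch {
    public static void main(String[] args) {
        int numNodes = 3;
        List<int[]> transactions = Arrays.asList(
            new int[]{1, 2, 5}, new int[]{2, 3, 5}, new int[]{1, 3});

        // each "node" receives the projected transactions for the items it owns
        List<List<int[]>> nodeInbox = new ArrayList<>();
        for (int n = 0; n < numNodes; n++) nodeInbox.add(new ArrayList<>());

        for (int[] t : transactions) {
            int[] sorted = t.clone();
            Arrays.sort(sorted);                      // canonical item order
            for (int i = 0; i < sorted.length; i++) {
                int owner = sorted[i] % numNodes;     // toy ownership rule: item id mod node count
                // projected prefix ending at item i: all the owner needs for this item
                nodeInbox.get(owner).add(Arrays.copyOfRange(sorted, 0, i + 1));
            }
        }
        for (int n = 0; n < numNodes; n++)
            System.out.println("node " + n + " receives "
                    + nodeInbox.get(n).size() + " projected transactions");
    }
}
```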
International Conference on Data Mining | 2006
Gregory Buehrer; Srinivasan Parthasarathy; Yen-Kuang Chen
Mining graph data is an increasingly popular challenge, which has practical applications in many areas, including molecular substructure discovery, Web link analysis, fraud detection, and social network analysis. The problem statement is to enumerate all subgraphs occurring in at least σ graphs of a database, where σ is a user-specified parameter. Chip multiprocessors (CMPs) provide true parallel processing, and are expected to become the de facto standard for commodity computing. In this work, building on the state of the art, we propose an efficient approach to parallelize such algorithms for CMPs. We show that an algorithm which adapts its behavior based on the runtime state of the system can improve system utilization and lower execution times. Most notably, we incorporate dynamic state management to allow memory consumption to vary based on availability. We evaluate our techniques on current-day shared memory systems (SMPs) and expect similar performance for CMPs. We demonstrate excellent speedup, 27-fold on 32 processors, for several real-world datasets. Additionally, we show our dynamic techniques afford this scalability while consuming up to 35% less memory than static techniques.
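The sketch below, under stated assumptions, illustrates the flavor of runtime-adaptive task parallelism described here: extension tasks live in a shared queue, idle workers pull whatever is available, and a task publishes its children to the queue only when other workers look starved, otherwise expanding them locally. The placeholder extend step and the splitting threshold are illustrative; the paper's dynamic state management, including memory adaptation, is considerably richer.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy adaptive task-parallel enumeration: share work only when the queue
// looks empty, otherwise keep extending candidates locally on this thread.
public class AdaptiveTaskSketch {
    static final ConcurrentLinkedQueue<int[]> queue = new ConcurrentLinkedQueue<>();
    static final AtomicInteger pending = new AtomicInteger();   // submitted but unfinished tasks
    static final AtomicInteger explored = new AtomicInteger();

    static void submit(int[] pattern) { pending.incrementAndGet(); queue.add(pattern); }

    // Placeholder for pattern extension; a real miner would enumerate
    // candidate subgraphs and count their support here.
    static void extend(int[] pattern) {
        explored.incrementAndGet();
        if (pattern.length >= 5) return;                        // bounded toy search space
        for (int next = 0; next < 2; next++) {
            int[] child = java.util.Arrays.copyOf(pattern, pattern.length + 1);
            child[pattern.length] = next;
            if (queue.size() < 2) submit(child);                // other workers look idle: share work
            else extend(child);                                 // queue is full enough: stay local
        }
    }

    public static void main(String[] args) throws InterruptedException {
        submit(new int[]{0});                                   // seed pattern
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int w = 0; w < 4; w++)
            pool.submit(() -> {
                while (pending.get() > 0) {                     // run until all published tasks finish
                    int[] task = queue.poll();
                    if (task == null) { Thread.onSpinWait(); continue; }
                    extend(task);
                    pending.decrementAndGet();
                }
            });
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("patterns explored: " + explored.get());
    }
}
```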
Knowledge Discovery and Data Mining | 2006
Gregory Buehrer; Srinivasan Parthasarathy; Amol Ghoting
In this work we focus on the problem of frequent itemset mining on large, out-of-core data sets. After presenting a characterization of existing out-of-core frequent itemset mining algorithms and their drawbacks, we introduce our efficient, highly scalable solution. Presented in the context of the FPGrowth algorithm, our technique involves several novel I/O-conscious optimizations, such as approximate hash-based sorting and blocking, and leverages recent architectural advancements in commodity computers, such as 64-bit processing. We evaluate the proposed optimizations on truly large data sets, up to 75 GB, and show they yield greater than a 400-fold execution time improvement. Finally, we discuss the impact of this research in the context of other pattern mining challenges, such as sequence mining and graph mining.
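To give a flavor of the blocking idea, here is a minimal sketch in which transactions are hashed on their most frequent item into a small number of blocks, so transactions likely to share prefixes land together and each block can be mined within main memory. In-memory lists stand in for on-disk block files, and the leading-item hash is an illustrative stand-in for the paper's approximate hash-based sorting.

```java
import java.util.*;

// Toy I/O-conscious blocking: a first pass counts item frequencies, a second
// pass routes each transaction to a block keyed by its most frequent item,
// approximating a sort by leading item so blocks can be mined independently.
public class BlockingSketch {
    public static void main(String[] args) {
        List<int[]> transactions = Arrays.asList(
            new int[]{1, 2, 3}, new int[]{1, 4}, new int[]{2, 3, 5}, new int[]{4, 5});
        int numBlocks = 2;

        // global frequency counts (first scan of the data)
        Map<Integer, Integer> freq = new HashMap<>();
        for (int[] t : transactions)
            for (int item : t) freq.merge(item, 1, Integer::sum);

        // second scan: hash each transaction on its most frequent item
        List<List<int[]>> blocks = new ArrayList<>();
        for (int b = 0; b < numBlocks; b++) blocks.add(new ArrayList<>());
        for (int[] t : transactions) {
            int lead = t[0];
            for (int item : t) if (freq.get(item) > freq.get(lead)) lead = item;
            blocks.get(Math.floorMod(Integer.hashCode(lead), numBlocks)).add(t);
        }
        for (int b = 0; b < numBlocks; b++)
            System.out.println("block " + b + ": " + blocks.get(b).size() + " transactions");
    }
}
```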
Adversarial Information Retrieval on the Web | 2008
Gregory Buehrer; Jack W. Stokes; Kumar Chellapilla
As web search providers seek to improve both relevance and response times, they are challenged by the ever-increasing tax of automated search query traffic. Third-party systems interact with search engines for a variety of reasons, such as monitoring a website's rank, augmenting online games, or possibly to maliciously alter click-through rates. In this paper, we investigate automated traffic in the query stream of a large search engine provider. We define automated traffic as any search query not generated by a human in real time. We first provide examples of different categories of query logs generated by bots. We then develop many different features that distinguish between queries generated by people searching for information, and those generated by automated processes. We categorize these features into two classes: either an interpretation of the physical model of human interactions, or as behavioral patterns of automated interactions. We believe these features form a basis for a production-level query stream classifier.
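A hedged sketch of computing such behavioral features from a query log follows: query rate, click rate, and the number of distinct queries per user, with a crude physical-plausibility threshold. The record layout, the feature set, and the thresholds are illustrative assumptions, not the paper's classifier.

```java
import java.util.*;

// Toy per-user feature extraction over a query log, separating a physically
// implausible query rate with no clicks from ordinary human browsing.
public class QueryTrafficFeatures {
    record QueryEvent(String userId, long timestampMs, String query, boolean clicked) {}

    public static void main(String[] args) {
        List<QueryEvent> log = Arrays.asList(
            new QueryEvent("u1", 0, "weather", true),
            new QueryEvent("u1", 60_000, "news", true),
            new QueryEvent("bot", 0, "site rank 1", false),
            new QueryEvent("bot", 100, "site rank 2", false),
            new QueryEvent("bot", 200, "site rank 3", false));

        Map<String, List<QueryEvent>> byUser = new HashMap<>();
        for (QueryEvent e : log) byUser.computeIfAbsent(e.userId(), k -> new ArrayList<>()).add(e);

        for (Map.Entry<String, List<QueryEvent>> u : byUser.entrySet()) {
            List<QueryEvent> evs = u.getValue();
            long span = evs.get(evs.size() - 1).timestampMs() - evs.get(0).timestampMs();
            double perMinute = evs.size() / Math.max(span / 60_000.0, 1.0 / 60);   // query rate
            double clickRate = evs.stream().filter(QueryEvent::clicked).count() / (double) evs.size();
            long distinct = evs.stream().map(QueryEvent::query).distinct().count();
            // a physically implausible query rate with no clicks suggests automation
            boolean looksAutomated = perMinute > 60 && clickRate == 0.0;
            System.out.printf("%s: %.1f q/min, click rate %.2f, %d distinct, automated=%b%n",
                    u.getKey(), perMinute, clickRate, distinct, looksAutomated);
        }
    }
}
```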
Data Management on New Hardware | 2005
Amol Ghoting; Gregory Buehrer; Srinivasan Parthasarathy; Daehyun Kim; Anthony D. Nguyen; Yen-Kuang Chen; Pradeep Dubey
In this paper, we characterize the performance and memory access behavior of several data mining algorithms. Specifically, we consider algorithms for frequent itemset mining, sequence mining, graph mining, clustering, outlier detection, and decision tree induction. Our study reveals that data mining algorithms are compute and memory intensive. Furthermore, some algorithms have poor spatial locality, while most algorithms have poor temporal locality. Hardware prefetching helps the algorithms with good spatial locality, but most algorithms are unable to leverage simultaneous multithreading because of their memory intensive nature. Consequently, all these algorithms grossly under-utilize a modern day processor. Using the knowledge gleaned in this investigation, we briefly show how we improve the performance of a frequent itemset mining algorithm, FPGrowth, on a modern processor. Our study suggests that a specialized memory system with several thread contexts per processor is needed to allow these algorithms to scale on future microprocessors.
International Conference on Supercomputing | 2008
Gregory Buehrer; Srinivasan Parthasarathy; Matthew Goyder
The STI Cell Broadband Engine architecture represents an interesting design point along the spectrum of chipsets with multiple processing elements. In this article we investigate key mining tasks such as clustering, classification, anomaly detection and PageRank on the Cell along the axes of performance, programming complexity and algorithm design. As part of our comparative analysis we juxtapose these algorithms with similar ones implemented on state-of-the-art uniprocessor and multicore architectures. For workloads that are more floating-point intensive, and where data is accessed in a streaming fashion, the Cell processor is up to seven times faster than competing technologies, when the underlying algorithm uses the hardware efficiently. However, when required to write in a non-streaming fashion, as with PageRank, the processor is up to twenty times slower than competing processors. An outcome of our benchmark study, beyond the results on these particular algorithms, is that we answer several higher-level questions, which are designed to provide a fast and reliable estimate to application designers for how well other workloads will scale on the Cell.
International Conference on Data Engineering | 2015
Gregory Buehrer; Roberto L. de Oliveira; David Fuhry; Srinivasan Parthasarathy
Extracting interesting patterns from large data stores efficiently is a challenging problem in many domains. In the data mining literature, pattern frequency has often been touted as a proxy for interestingness and has been leveraged as a pruning criterion to realize scalable solutions. However, while there exist many frequent pattern algorithms in the literature, all scale exponentially in the worst case, restricting their utility on very large data sets. Furthermore, as we theoretically argue in this article, the problem is very hard to approximate within a reasonable factor by a polynomial-time algorithm. As a counterpoint to this theoretical result, we present a practical algorithm called Localized Approximate Miner (LAM) that scales linearithmically with the input data. Instead of fully exploring the top of the search lattice to a user-defined point, as traditional mining algorithms do, we instead explore different parts of the complete lattice, efficiently. The key to this efficient exploration is the reliance on min-wise independent permutations to collect the data into highly similar subsets of a partition. It is straightforward to implement and scales to very large data sets. We illustrate its utility on a range of data sets, and demonstrate that the algorithm finds more patterns of higher utility in much less time than several state-of-the-art algorithms. Moreover, we realize a natural multi-level parallelization of LAM that further reduces runtimes by up to 193-fold when leveraging 256 CMP cores spanning 32 machines.
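The grouping step that this description rests on can be sketched with MinHash-style signatures: a few random hash functions stand in for min-wise independent permutations, each transaction records the minimum hashed item id per function, and transactions with identical signatures fall into the same partition, which can then be mined independently. The number of hash functions, the specific hash family, and the omission of the actual mining step are all simplifications of LAM.

```java
import java.util.*;

// Toy MinHash partitioning: transactions whose item sets are similar have a
// high probability of agreeing on the minimum hashed item per hash function,
// so they tend to collide into the same partition for local mining.
public class MinHashPartitionSketch {
    public static void main(String[] args) {
        List<int[]> transactions = Arrays.asList(
            new int[]{1, 2, 3, 4}, new int[]{1, 2, 3, 5}, new int[]{7, 8, 9});
        int numHashes = 2;
        Random rnd = new Random(42);
        int prime = 2_147_483_647;                       // large prime for the hash family
        int[] a = new int[numHashes], b = new int[numHashes];
        for (int h = 0; h < numHashes; h++) { a[h] = rnd.nextInt(prime - 1) + 1; b[h] = rnd.nextInt(prime); }

        Map<List<Integer>, List<int[]>> partitions = new HashMap<>();
        for (int[] t : transactions) {
            List<Integer> signature = new ArrayList<>();
            for (int h = 0; h < numHashes; h++) {
                long min = Long.MAX_VALUE;
                for (int item : t)                        // minimum over the permuted item ids
                    min = Math.min(min, ((long) a[h] * item + b[h]) % prime);
                signature.add((int) min);
            }
            partitions.computeIfAbsent(signature, k -> new ArrayList<>()).add(t);
        }
        // identical signatures share a partition; similar sets collide with high probability
        partitions.forEach((sig, ts) -> System.out.println(sig + " -> " + ts.size() + " transactions"));
    }
}
```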