Kanat Tangwongsan | Researchain

Publication

Featured research published by Kanat Tangwongsan.

ACM Symposium on Parallel Algorithms and Architectures | 2012

Brief announcement: the problem based benchmark suite

Julian Shun; Guy E. Blelloch; Jeremy T. Fineman; Phillip B. Gibbons; Aapo Kyrola; Harsha Vardhan Simhadri; Kanat Tangwongsan

This announcement describes the problem based benchmark suite (PBBS). PBBS is a set of benchmarks designed for comparing parallel algorithmic approaches, parallel programming language styles, and machine architectures across a broad set of problems. Each benchmark is defined concretely in terms of a problem specification and a set of input distributions. No requirements are made in terms of algorithmic approach, programming language, or machine architecture. The goal of the benchmarks is not only to compare runtimes, but also to be able to compare code and other aspects of an implementation (e.g., portability, robustness, determinism, and generality). As such the code for an implementation of a benchmark is as important as its runtime, and the public PBBS repository will include both code and performance results. The benchmarks are designed to make it easy for others to try their own implementations, or to add new benchmark problems. Each benchmark problem includes the problem specification, the specification of input and output file formats, default input generators, test codes that check the correctness of the output for a given input, driver code that can be linked with implementations, a baseline sequential implementation, a baseline multicore implementation, and scripts for running timings (and checks) and outputting the results in a standard format. The current suite includes the following problems: integer sort, comparison sort, remove duplicates, dictionary, breadth first search, spanning forest, minimum spanning forest, maximal independent set, maximal matching, K-nearest neighbors, Delaunay triangulation, convex hull, suffix arrays, n-body, and ray casting. For each problem, we report the performance of our baseline multicore implementation on a 40-core machine.

Very Large Data Bases | 2013

Counting and sampling triangles from a graph stream

Aduri Pavan; Kanat Tangwongsan; Srikanta Tirthapura; Kun-Lung Wu

This paper presents a new space-efficient algorithm for counting and sampling triangles (and, more generally, constant-sized cliques) in a massive graph whose edges arrive as a stream. Compared to prior work, our algorithm yields significant improvements in the space and time complexity for these fundamental problems. Our algorithm is simple to implement and has very good practical performance on large graphs.

ACM Transactions on Programming Languages and Systems | 2009

An experimental analysis of self-adjusting computation

Umut A. Acar; Guy E. Blelloch; Matthias Blume; Robert Harper; Kanat Tangwongsan

Recent work on adaptive functional programming (AFP) developed techniques for writing programs that can respond to modifications to their data by performing change propagation. To achieve this, executions of programs are represented with dynamic dependence graphs (DDGs) that record data dependences and control dependences in a way that a change-propagation algorithm can update the computation as if the program were run from scratch, by re-executing only the parts of the computation affected by the changes. Since change propagation only re-executes parts of the computation, it can respond to certain incremental modifications asymptotically faster than recomputing from scratch, potentially offering significant speedups. Such asymptotic speedups, however, are rare: for many computations and modifications, change propagation is no faster than recomputing from scratch. In this article, we realize a duality between dynamic dependence graphs and memoization, and combine them to give a change-propagation algorithm that can dramatically increase computation reuse. The key idea is to use DDGs to identify and re-execute the parts of the computation that are affected by modifications, while using memoization to identify the parts of the computation that remain unaffected by the changes. We refer to this approach as self-adjusting computation. Since DDGs are imperative, but (traditional) memoization requires purely functional computation, reusing computation correctly via memoization becomes a challenge. We overcome this challenge with a technique for remembering and reusing not just the results of function calls (as in conventional memoization), but their executions represented with DDGs. We show that the proposed approach is realistic by describing a library for self-adjusting computation, presenting efficient algorithms for realizing the library, and describing and evaluating an implementation. Our experimental evaluation with a variety of applications, ranging from simple list primitives to more sophisticated computational geometry algorithms, shows that the approach is effective in practice: compared to recomputing from scratch, self-adjusting programs respond to small modifications to their data orders of magnitude faster.

International Conference on Data Engineering | 2015

Multicore triangle computations without tuning

Julian Shun; Kanat Tangwongsan

Triangle counting and enumeration has emerged as a basic tool in large-scale network analysis, fueling the development of algorithms that scale to massive graphs. Most of the existing algorithms, however, are designed for the distributed-memory setting or the external-memory setting, and cannot take full advantage of a multicore machine, whose capacity has grown to accommodate even the largest of real-world graphs. This paper describes the design and implementation of simple and fast multicore parallel algorithms for exact, as well as approximate, triangle counting and other triangle computations that scale to billions of nodes and edges. Our algorithms are provably cache-friendly, easy to implement in a language that supports dynamic parallelism, such as Cilk Plus or OpenMP, and do not require parameter tuning. On a 40-core machine with two-way hyper-threading, our parallel exact global and local triangle counting algorithms obtain speedups of 17-50x on a set of real-world and synthetic graphs, and are faster than previous parallel exact triangle counting algorithms. We can compute the exact triangle count of the Yahoo Web graph (over 6 billion edges) in under 1.5 minutes. In addition, for approximate triangle counting, we are able to approximate the count for the Yahoo graph to within 99.6% accuracy in under 10 seconds, and for a given accuracy we are much faster than existing parallel approximate triangle counting implementations.
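To make the exact-counting problem concrete, here is a minimal sequential sketch in Python. It is not the multicore algorithm from the paper: it uses the standard idea of ranking vertices by degree and intersecting the higher-ranked neighbor sets so that each triangle is counted exactly once. The function name `count_triangles` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in a simple undirected graph given as (u, v) pairs.

    Illustrative sequential sketch only; not the paper's parallel algorithm.
    """
    # Build undirected adjacency sets, ignoring self-loops and duplicate edges.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, id); ties are broken by the vertex id.
    order = sorted(adj, key=lambda v: (len(adj[v]), v))
    rank = {v: i for i, v in enumerate(order)}

    # Keep, for each vertex, only the neighbors with a higher rank.
    higher = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    # A triangle a < b < c (by rank) is found once: at edge (a, b) with c
    # in the intersection of the two higher-ranked neighbor sets.
    total = 0
    for u in higher:
        for v in higher[u]:
            total += len(higher[u] & higher[v])
    return total
```

For example, `count_triangles([(0, 1), (1, 2), (2, 0), (2, 3)])` finds the single triangle on vertices 0, 1, and 2. The two nested loops are independent across `u`, which is the kind of structure a language with dynamic parallelism could exploit.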

ACM Symposium on Parallel Algorithms and Architectures | 2011

Linear-work greedy parallel approximate set cover and variants

Guy E. Blelloch; Richard Peng; Kanat Tangwongsan

We present parallel greedy approximation algorithms for set cover and related problems. These algorithms build on an algorithm for solving a graph problem we formulate and study called Maximal Nearly Independent Set (MaNIS), a graph abstraction of a key component in existing work on parallel set cover. We derive a randomized algorithm for MaNIS that has O(m) work and O(log^2 m) depth on input with m edges. Using MaNIS, we obtain RNC algorithms that yield a (1 + ε)H_n-approximation for set cover, a (1 - 1/e - ε)-approximation for max cover, and a (4 + ε)-approximation for min-sum set cover, all in linear work; and an O(log* n)-approximation for asymmetric k-center for k ≤ log^{O(1)} n and a (1.861 + ε)-approximation for metric facility location, both in essentially the same work bounds as their sequential counterparts.
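For orientation, the sequential counterpart that the parallel algorithms are measured against can be sketched as follows. This is the classical greedy set cover heuristic, which achieves an H_n-approximation; it is not MaNIS and not the paper's parallel algorithm. The function name `greedy_set_cover` and the input representation (a universe plus a list of Python sets) are assumptions made for illustration.

```python
def greedy_set_cover(universe, subsets):
    """Classical sequential greedy set cover (illustrative sketch).

    Repeatedly picks the subset covering the most still-uncovered elements.
    Returns the indices of the chosen subsets, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with the largest gain; first one wins on ties.
        best = max(
            range(len(subsets)),
            key=lambda i: len(uncovered & subsets[i]),
            default=None,
        )
        gain = uncovered & subsets[best] if best is not None else set()
        if not gain:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen
```

For example, with universe `{1, 2, 3, 4, 5}` and subsets `[{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]`, the sketch picks subset 0 first (three new elements) and then subset 3 (two new elements). The strictly sequential choice of one best subset per round is what makes the heuristic hard to parallelize directly.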

Electronic Notes in Theoretical Computer Science | 2006

A Library for Self-Adjusting Computation

Umut A. Acar; Guy E. Blelloch; Matthias Blume; Robert Harper; Kanat Tangwongsan

We present a Standard ML library for writing programs that automatically adjust to changes to their data. The library combines modifiable references and memoization to achieve efficient updates. We describe an implementation of the library and apply it to the problem of maintaining the convex hull of a dynamically changing set of points. Our experiments show that the overhead of the library is small, and that self-adjusting programs can adjust to small changes three orders of magnitude faster than recomputing from scratch. The implementation relies on invariants that could be enforced by a modal type system. We show, using an existing language, abstract interfaces for modifiable references and for memoization that ensure the same safety properties without the use of modal types. The interface for memoization, however, does not scale well, suggesting a language-based approach to be preferable after all.

Very Large Data Bases | 2015

General incremental sliding-window aggregation

Kanat Tangwongsan; Martin Hirzel; Scott Schneider; Kun-Lung Wu

Stream processing is gaining importance as more data becomes available in the form of continuous streams and companies compete to promptly extract insights from them. In such applications, sliding-window aggregation is a central operator, and incremental aggregation helps avoid the performance penalty of re-aggregating from scratch for each window change. This paper presents Reactive Aggregator (RA), a new framework for incremental sliding-window aggregation. RA is general in that it does not require aggregation functions to be invertible or commutative, and it does not require windows to be FIFO. We implemented RA as a drop-in replacement for the Aggregate operator of a commercial streaming engine. Given m updates on a window of size n, RA has an algorithmic complexity of O(m + m log (n/m)), rivaling the best prior algorithms for any m. Furthermore, RA's implementation minimizes overheads from allocation and pointer traversals by using a single flat array.
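The idea of aggregating without an inverse can be illustrated with a small sketch: keep partial aggregates in a complete binary tree stored in one flat array, so that inserting or evicting an element only recomputes the aggregates on one leaf-to-root path. This is a simplified, assumed design, not RA's actual implementation; it handles one update at a time in O(log n) and does not reproduce the O(m + m log (n/m)) batch bound. The class name `FlatWindowAggregator` and its methods are hypothetical.

```python
class FlatWindowAggregator:
    """FIFO window aggregate over an associative operator (illustrative sketch).

    `combine` must be associative with `identity` as its neutral element;
    it need not be invertible or commutative.
    """

    def __init__(self, capacity, combine, identity):
        size = 1
        while size < capacity:  # round leaf count up to a power of two
            size *= 2
        self.capacity, self.size = capacity, size
        self.combine, self.identity = combine, identity
        # tree[1] is the root; leaves occupy tree[size : 2 * size].
        self.tree = [identity] * (2 * size)
        self.front = 0  # leaf slot of the oldest element
        self.count = 0  # number of elements currently in the window

    def _set(self, slot, value):
        # Overwrite one leaf, then recompute aggregates up to the root.
        i = slot + self.size
        self.tree[i] = value
        i //= 2
        while i >= 1:
            self.tree[i] = self.combine(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def insert(self, value):
        if self.count == self.capacity:
            raise OverflowError("window is full")
        self._set((self.front + self.count) % self.size, value)
        self.count += 1

    def evict(self):
        if self.count == 0:
            raise IndexError("window is empty")
        self._set(self.front, self.identity)
        self.front = (self.front + 1) % self.size
        self.count -= 1

    def _range(self, lo, hi):
        # Ordered aggregate of leaf slots [lo, hi), preserving left-to-right order.
        left, right = self.identity, self.identity
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                left = self.combine(left, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                right = self.combine(self.tree[hi], right)
            lo //= 2
            hi //= 2
        return self.combine(left, right)

    def query(self):
        end = self.front + self.count
        if end <= self.size:
            return self._range(self.front, end)
        # The window wraps around the array: older part first, then newer part.
        return self.combine(
            self._range(self.front, self.size), self._range(0, end - self.size)
        )
```

As a usage example, `FlatWindowAggregator(3, max, float("-inf"))` maintains a running maximum, which has no inverse, and passing string concatenation as `combine` with `""` as `identity` exercises a non-commutative operator.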

Conference on Information and Knowledge Management | 2013

Parallel triangle counting in massive streaming graphs

Kanat Tangwongsan; Aduri Pavan; Srikanta Tirthapura

The number of triangles in a graph is a fundamental metric widely used in social network analysis, link classification and recommendation, and more. In these applications, modern graphs of interest tend to be both large and dynamic. This paper presents the design and implementation of a fast parallel algorithm for estimating the number of triangles in a massive undirected graph whose edges arrive as a stream. Our algorithm is designed for shared-memory multicore machines and can make efficient use of parallelism and the memory hierarchy. We provide theoretical guarantees on performance and accuracy, and our experiments on real-world datasets show accurate results and substantial speedups compared to an optimized sequential implementation.

European Symposium on Algorithms | 2008

Robust Kinetic Convex Hulls in 3D

Umut A. Acar; Guy E. Blelloch; Kanat Tangwongsan; Duru Türkoğlu

Kinetic data structures provide a framework for computing combinatorial properties of continuously moving objects. Although kinetic data structures for many problems have been proposed, some difficulties remain in devising and implementing them, especially robustly. One set of difficulties stems from the required update mechanisms used for processing certificate failures: devising efficient update mechanisms can be difficult, especially for sophisticated problems such as those in 3D. Another set of difficulties arises due to the strong assumption in the framework that the update mechanism is invoked with a single event. This assumption requires ordering the events precisely, which is generally expensive. This assumption also makes it difficult to deal with simultaneous events that arise due to degeneracies or due to intrinsic properties of the kinetized algorithms. In this paper, we apply advances on self-adjusting computation to provide a robust motion simulation technique that combines kinetic event-based scheduling and the classic idea of fixed-time sampling. The idea is to divide time into a lattice of fixed-size intervals, and process events at the resolution of an interval. We apply the approach to the problem of kinetic maintenance of convex hulls in 3D, a problem that has been open since the 1990s. We evaluate the effectiveness of the proposal experimentally. Using the approach, we are able to run simulations consisting of tens of thousands of points robustly and efficiently.

ACM Symposium on Parallel Algorithms and Architectures | 2010

Parallel approximation algorithms for facility-location problems

Guy E. Blelloch; Kanat Tangwongsan

This paper presents the design and analysis of parallel approximation algorithms for facility-location problems, including NC and RNC algorithms for (metric) facility location, k-center, k-median, and k-means. These problems have received considerable attention during the past decades from the approximation algorithms community, which primarily concentrates on improving the approximation guarantees. In this paper, we ask: Is it possible to parallelize some of the beautiful results from the sequential setting? Our starting point is a small, but diverse, subset of results in approximation algorithms for facility-location problems, with a primary goal of developing techniques for devising their efficient parallel counterparts. We focus on giving algorithms with low depth, near work efficiency (compared to the sequential versions), and low cache complexity.