Satish Puri
Georgia State University
Publication
Featured research published by Satish Puri.
international parallel and distributed processing symposium | 2012
Dinesh Agarwal; Satish Puri; Xi He; Sushil K. Prasad
GIS polygon-based (also known as vector-based) spatial data overlay computation is much more complex than raster data computation. Processing of polygonal spatial data files has been a long-standing research question in the GIS community due to the irregular and data-intensive nature of the underlying computation. The state-of-the-art software for overlay computation in the GIS community is still desktop-based. We present a cluster-based distributed solution for end-to-end polygon overlay processing, modeled after our Windows Azure cloud-based Crayons system [1]. We present the details of porting the Crayons system to an MPI-based Linux cluster and show the improvements made by employing efficient data structures such as R-trees. We report on performance and show the scalability of our system, along with its remaining bottlenecks. Our experimental results show an absolute speedup of 15x for end-to-end overlay computation employing up to 80 cores.
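To illustrate the filter-and-refine pattern such a pipeline relies on, here is a minimal, self-contained Python sketch: a bounding-box (MBR) filter prunes cross-layer polygon pairs before the expensive clipping step. The polygon representation and the brute-force scan are illustrative assumptions, not the Crayons/MPI implementation; an R-tree, as used in the paper, replaces the scan with logarithmic-time queries.

```python
# Sketch of the filter-and-refine pattern used in polygon overlay:
# an MBR filter prunes cross-layer pairs before the expensive clipping
# step. Polygons here are lists of (x, y) tuples; illustrative only.

def mbr(poly):
    """Minimum bounding rectangle of a polygon: (xmin, ymin, xmax, ymax)."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return (min(xs), min(ys), max(xs), max(ys))

def mbrs_overlap(a, b):
    """True if two MBRs intersect."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def candidate_pairs(layer1, layer2):
    """Filter step: keep only cross-layer pairs whose MBRs overlap.
    An R-tree replaces this O(n*m) scan with O(log n) queries."""
    boxes2 = [mbr(p) for p in layer2]
    for i, p1 in enumerate(layer1):
        b1 = mbr(p1)
        for j, b2 in enumerate(boxes2):
            if mbrs_overlap(b1, b2):
                yield i, j  # refinement (actual clipping) runs only on these

layer1 = [[(0, 0), (2, 0), (2, 2), (0, 2)]]
layer2 = [[(1, 1), (3, 1), (3, 3)], [(10, 10), (11, 10), (11, 11)]]
print(list(candidate_pairs(layer1, layer2)))  # [(0, 0)]
```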
ieee international symposium on parallel & distributed processing, workshops and phd forum | 2013
Satish Puri; Dinesh Agarwal; Xi He; Sushil K. Prasad
Polygon overlay is one of the complex operations in computational geometry. It is applied in many fields, such as Geographic Information Systems (GIS), computer graphics, and VLSI CAD. Sequential algorithms for this problem abound in the literature, but distributed algorithms, especially for the MapReduce platform, are lacking. In GIS, spatial data files tend to be large (in the GBs), and the underlying overlay computation is highly irregular and compute intensive. The MapReduce paradigm is now standard in industry and academia for processing large-scale data. Motivated by the MapReduce programming model, we revisit the distributed polygon overlay problem and its implementation on the MapReduce platform. Our algorithms are geared towards maximizing local processing and minimizing the communication overhead inherent in the shuffle and sort phases of MapReduce. We have experimented with two data sets and achieved up to 22x speedup with dataset 1 using 64 CPU cores.
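A toy, single-process sketch of this kind of map/reduce decomposition (the grid cell size, key scheme, and helpers are illustrative assumptions, not the paper's implementation): the map phase keys each polygon's MBR by the grid cells it touches, so the reduce phase can join the two layers cell by cell with purely local work.

```python
# Toy single-process sketch of MapReduce-style overlay decomposition:
# map polygons (represented here by their MBRs) to grid cells, then
# reduce each cell independently so overlay work stays local.
from collections import defaultdict

CELL = 10.0  # assumed grid cell size

def cells_for(box):
    """Grid cells (as (col, row) keys) covered by an MBR."""
    x0, y0, x1, y1 = box
    for cx in range(int(x0 // CELL), int(x1 // CELL) + 1):
        for cy in range(int(y0 // CELL), int(y1 // CELL) + 1):
            yield (cx, cy)

def map_phase(layer_id, boxes):
    """Emit (cell, (layer_id, box)) pairs -- the shuffle key is the cell."""
    for box in boxes:
        for cell in cells_for(box):
            yield cell, (layer_id, box)

def reduce_phase(values):
    """Join the two layers within one cell; real code would clip polygons here."""
    a = [b for lid, b in values if lid == 1]
    b = [b for lid, b in values if lid == 2]
    return [(ba, bb) for ba in a for bb in b
            if not (ba[2] < bb[0] or bb[2] < ba[0] or ba[3] < bb[1] or bb[3] < ba[1])]

# simulate the shuffle-and-sort phase with a dictionary
groups = defaultdict(list)
for layer_id, layer in [(1, [(0, 0, 4, 4)]), (2, [(2, 2, 6, 6), (50, 50, 52, 52)])]:
    for cell, val in map_phase(layer_id, layer):
        groups[cell].append(val)
for cell, vals in groups.items():
    pairs = reduce_phase(vals)
    if pairs:
        print(cell, pairs)
```

One design point the sketch glosses over: a pair spanning several cells is emitted in each of them, so a real system deduplicates, e.g., by reporting a pair only in one designated reference cell.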
Sigspatial Special | 2015
Sushil K. Prasad; Michael McDermott; Satish Puri; Dhara Shah; Danial Aghajarian; Shashi Shekhar; Xun Zhou
We summarize the need for, and present our vision of, accelerating geo-spatial computations and analytics using a combination of shared- and distributed-memory parallel platforms, with general-purpose Graphics Processing Units (GPUs), whose hundreds to thousands of processing cores on a single chip form a key architecture to parallelize over. A GPU can yield one to two orders of magnitude of speedup and will become increasingly affordable and energy efficient due to mass marketing for gaming. We also survey the current landscape of representative geo-spatial problems and their parallel, GPU-based solutions.
international workshop on analytics for big geospatial data | 2013
Sushil K. Prasad; Shashi Shekhar; Michael McDermott; Xun Zhou; Michael R. Evans; Satish Puri
For scalable solutions to GIS computations, it is imperative that the modern hybrid architecture comprising a CPU-GPU pair be exploited fully. The existing parallel algorithms and data structures port reasonably well to multi-core CPUs, but poorly to GPGPUs because of the latter's atypical fine-grained, single-instruction multiple-thread (SIMT) architecture, extreme memory hierarchy and coalesced-access requirements, and delicate CPU-GPU coordination. Recently, our parallelization of the state-of-the-art interesting-sequence discovery algorithms calculated one-dimensional interesting intervals over an image representing the normalized difference vegetation indices of Africa within 31 ms on an nVidia GTX 480. To our knowledge, this paper reports the first parallelization of these algorithms. This allowed us to process 612 images, representing biweekly data from July 1981 through December 2006, within 22 seconds. We were also able to pipe the output to a display in almost real time, which would interest climate scientists. We have also undertaken the parallelization of two key tree-based data structures, namely the R-tree and the heap, and have employed the parallel R-tree in a polygon overlay system. These data structure parallelizations are hard because the underlying tree topology and the fine-grained computation lead to frequent accesses to such data structures, severely stifling parallel efficiency.
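The abstract does not define its interestingness criterion, so as a loosely analogous stand-in, the sketch below finds maximal runs where a 1-D signal stays below a threshold. This is only meant to show the kind of independent, per-row interval work that parallelizes well on GPUs, not the published algorithm.

```python
# Illustrative stand-in for 1-D "interesting interval" discovery: find
# maximal runs where a signal (e.g., one NDVI row) stays below a
# threshold. The published interestingness criterion differs; this only
# shows the per-row, data-parallel flavor of the computation.

def threshold_runs(row, thresh):
    """Return [start, end) index pairs of maximal runs with row[i] < thresh."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v < thresh and start is None:
            start = i
        elif v >= thresh and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs

ndvi_row = [0.6, 0.2, 0.1, 0.5, 0.3, 0.2, 0.2, 0.7]
print(threshold_runs(ndvi_row, 0.35))  # [(1, 3), (4, 7)]
```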
advances in geographic information systems | 2016
Danial Aghajarian; Satish Puri; Sushil K. Prasad
Given two layers of large polygonal datasets, detecting the pairs of cross-layer polygons that satisfy a join predicate, such as intersects or contains, is one of the most computationally intensive primitive operations in spatial domain applications. In this work, we introduce GCMF, an end-to-end software system that can handle spatial join (with the ST_Intersect operation) over non-indexed polygonal datasets of over 3 GB, comprising more than 600,000 polygons, on a single GPU in less than 8 seconds by applying innovative filter and refinement techniques. GCMF performs a two-step filtering phase: 1) a sort-based minimum bounding rectangle (MBR) filtering step detects potentially overlapping polygon pairs up to 20 times faster than the optimized GEOS library routine; 2) a linear-time common-MBR filtering step (based on the overlapping area of two given MBRs) not only eliminates two-thirds of the candidate polygon pairs but also reduces the number of edges to be considered in the refinement phase by 40-fold on average, based on our experimental results with real datasets. Furthermore, for the refinement phase, GCMF implements load-balanced parallel point-in-polygon and edge-intersection tests on the GPU. Our experimental results with three different real datasets show up to 39-fold end-to-end speedup versus optimized sequential routines of the GEOS C++ library as well as the PostgreSQL spatial database with PostGIS.
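A sequential Python sketch of the sort-based MBR filtering idea (GCMF's actual GPU kernels and data layout differ; this only shows the principle): sort boxes by their left edge and sweep, so only boxes whose x-extents currently overlap the sweep line are ever compared on y.

```python
# Sequential sketch of sort-based MBR filtering: sort boxes by left edge
# and sweep along x, comparing y-extents only among "active" boxes.
# GCMF's GPU kernels differ in detail; this shows the principle only.

def sweep_filter(boxes1, boxes2):
    """Boxes are (xmin, ymin, xmax, ymax); returns overlapping cross-set pairs."""
    events = [(b[0], 0, i, b) for i, b in enumerate(boxes1)] + \
             [(b[0], 1, j, b) for j, b in enumerate(boxes2)]
    events.sort()  # sweep by left edge
    active = {0: [], 1: []}  # boxes whose x-interval is still open
    pairs = []
    for x, side, idx, box in events:
        other = 1 - side
        # drop boxes from the other set that ended before the sweep line
        active[other] = [(k, b) for k, b in active[other] if b[2] >= x]
        for k, b in active[other]:
            if box[3] >= b[1] and b[3] >= box[1]:  # y-extents overlap
                pairs.append((idx, k) if side == 0 else (k, idx))
        active[side].append((idx, box))
    return pairs

A = [(0, 0, 2, 2), (5, 5, 6, 6)]
B = [(1, 1, 3, 3), (8, 0, 9, 1)]
print(sweep_filter(A, B))  # [(0, 0)]
```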
ieee international symposium on parallel & distributed processing, workshops and phd forum | 2013
Satish Puri; Sushil K. Prasad
Polygon overlay is one of the complex operations in Geographic Information Systems (GIS). In GIS, a typical polygon tends to be large, often consisting of thousands of vertices. Sequential algorithms for this problem abound in the literature, and most parallel algorithms concentrate only on parallelizing the edge-intersection phase. Our research aims to develop parallel algorithms to find the overlay of two input polygons, extensible to handle multiple polygons, and to implement them on General-Purpose Graphics Processing Units (GPGPUs), which offer massive parallelism at relatively low cost. Moreover, spatial data files tend to be large (in the GBs), and the underlying overlay computation is highly irregular and compute intensive. The MapReduce paradigm is now standard in industry and academia for processing large-scale data. Motivated by the MapReduce programming model, we propose to develop and implement scalable distributed algorithms to solve large-scale overlay processing in this dissertation.
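At the heart of the edge-intersection phase mentioned above is a segment-intersection predicate. A minimal, self-contained sketch using the standard orientation test follows; production code additionally needs robust handling of collinear and shared-endpoint cases.

```python
# Sketch of the edge-intersection primitive inside overlay's clipping
# phase: an orientation-based test for proper segment intersection.
# Illustrative; real code must also handle numerical robustness.

def orient(p, q, r):
    """Sign of the cross product (q-p) x (r-p): >0 left turn, <0 right, 0 collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def segments_cross(a, b, c, d):
    """True if segments ab and cd properly intersect (collinear cases excluded)."""
    return (orient(a, b, c) != orient(a, b, d) and
            orient(c, d, a) != orient(c, d, b) and
            orient(a, b, c) != 0 and orient(c, d, a) != 0)

print(segments_cross((0, 0), (2, 2), (0, 2), (2, 0)))  # True
print(segments_cross((0, 0), (1, 0), (2, 0), (3, 0)))  # False
```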
international congress on big data | 2017
Sushil K. Prasad; Danial Aghajarian; Michael McDermott; Dhara Shah; Mohamed F. Mokbel; Satish Puri; Sergio J. Rey; Shashi Shekhar; Yiqun Xie; Ranga Raju Vatsavai; Fusheng Wang; Yanhui Liang; Hoang Vo; Shaowen Wang
This vision paper reviews the current state of the art and lays out emerging research challenges in parallel processing of spatio-temporal large datasets relevant to a variety of scientific communities. Spatio-temporal data, whether captured through remote sensors (global earth observations), ground and ocean sensors (e.g., soil moisture sensors, buoys), social media and hand-held, traffic-related sensors and cameras, medical imaging (e.g., MRI), or large-scale simulations (e.g., climate), have always been “big.” A common thread among all these big collections of datasets is that they are spatial and temporal. Processing and analyzing these datasets requires high-performance computing (HPC) infrastructures. Various agencies, scientific communities, and, increasingly, society at large rely on spatial data management, analysis, and spatial data mining to gain insights and produce actionable plans. Therefore, an ecosystem of integrated and reliable software infrastructure is required for spatio-temporal big data management and analysis that will serve as a crucial tool for solving a wide set of research problems from different scientific and engineering areas and will empower users with next-generation tools. This vision requires a multidisciplinary effort to significantly advance domain research and have a broad impact on society. The areas of research discussed in this paper include (i) spatial data mining, (ii) data analytics over remote sensing data, (iii) processing medical images, (iv) spatial econometrics analyses, (v) MapReduce-based systems for spatial computation and visualization, (vi) CyberGIS systems, and (vii) foundational parallel algorithms and data structures for polygonal datasets, and why HPC infrastructures, including harnessing graphics accelerators, are needed for time-critical applications.
international parallel and distributed processing symposium | 2015
Sushil K. Prasad; Michael McDermott; Xi He; Satish Puri
An R-tree is a data structure for organizing and querying multi-dimensional, non-uniform, and overlapping data. Efficient parallelization of the R-tree is an important problem due to societal applications such as geographic information systems (GIS), spatial database management systems, and VLSI layout, which employ R-trees for spatial analysis tasks such as map overlay. As graphics processing units (GPUs) have emerged as powerful computing platforms, these R-tree-related applications demand efficient R-tree construction and search algorithms on GPUs. This problem is hard due both to (i) the non-linear tree topology of the data structure itself and (ii) the unconventional single-instruction multiple-thread (SIMT) architecture of modern GPUs, which requires careful engineering of a host of issues. Consequently, the current best parallelizations of the R-tree on GPUs have achieved limited speedups of only about 20-fold. We present a space-efficient data structure design and a non-trivial bottom-up construction algorithm for the R-tree on GPUs. This has yielded the first demonstrated 226-fold speedup in parallel construction of an R-tree on a GPU compared to one-core execution on a CPU. We also present innovative R-tree search algorithms designed to overcome the GPU's architectural and resource limitations. The best of these algorithms gives a speedup of 91-fold to 180-fold on an R-tree with 16384 base objects for query sizes ranging from 2k to 16k.
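A sequential Python sketch of the "sort, then pack" bottom-up bulk-loading idea behind such construction (the paper's GPU node layout and kernels differ; the fanout, Morton ordering, and helpers here are illustrative assumptions): order leaf MBRs along a space-filling curve, pack consecutive runs into nodes, and repeat level by level.

```python
# Sequential sketch of bottom-up R-tree bulk loading ("sort, then pack"):
# order leaf MBRs by a Z-order (Morton) key, pack runs of FANOUT entries
# into nodes, and repeat level by level until a root remains.

FANOUT = 4  # assumed node capacity

def morton(x, y, bits=16):
    """Interleave the bits of integers x and y into a Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return key

def union_mbr(boxes):
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def bulk_load(boxes):
    """Return the tree's levels, leaves first; each node is (mbr, children)."""
    # sort leaf boxes by the Morton key of their MBR centers
    level = sorted(boxes, key=lambda b: morton(int((b[0] + b[2]) / 2),
                                               int((b[1] + b[3]) / 2)))
    level = [(b, None) for b in level]  # leaf entries carry no children
    levels = [level]
    while len(level) > 1:
        level = [(union_mbr([e[0] for e in level[i:i + FANOUT]]), level[i:i + FANOUT])
                 for i in range(0, len(level), FANOUT)]
        levels.append(level)
    return levels

boxes = [(x, y, x + 1, y + 1) for x in range(0, 8, 2) for y in range(0, 8, 2)]
root_level = bulk_load(boxes)[-1]
print(len(root_level), root_level[0][0])  # 1 (0, 0, 7, 7)
```

Because every level is produced by a sort and a segmented reduction, the same construction order maps naturally onto massively parallel hardware.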
cluster computing and the grid | 2014
Dinesh Agarwal; Sara Karamati; Satish Puri; Sushil K. Prasad
The Message Passing Interface (MPI) has been the predominant standardized system for writing parallel and distributed applications. However, while MPI has been the software system of choice for traditional parallel and distributed computing platforms such as large compute clusters and the Grid, it is not the system of choice for cloud platforms. The primary reasons are the lack of low-latency, high-bandwidth network capabilities on cloud platforms and their inherent architectural differences from traditional compute clusters. Prior studies suggest that the message latency of cloud platforms could be as much as 35x higher than that of an InfiniBand-connected cluster [1] for popular MPI implementations. An MPI-like environment on cloud platforms is desirable for a large class of applications that run for long time spans with varying computing needs, such as modeling and analysis to predict the swath of a hurricane. Such applications could benefit from the cloud's resiliency and on-demand access for a robust and green solution. Interestingly, most cloud vendors provide APIs to access cloud resources in an efficient manner different from how an MPI implementation would avail itself of those resources. We have done extensive research to identify the pain points in designing and implementing an MPI-like framework for cloud platforms. Our research has provided us with vital guidelines that we share in this paper. We present the details of the key components required for such a framework, along with our experience implementing a preliminary MPI-like framework over Azure, dubbed cloud MPI, and evaluate its pros and cons. A large GIS application has been ported over cloud MPI to study its effectiveness and limitations.
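A minimal sketch of the rank-addressed send/recv abstraction such a framework exposes, simulated here with in-process Python queues. Cloud MPI's real transport is built on Azure storage and queue primitives, and every name below is an illustrative assumption rather than its API.

```python
# Sketch of an MPI-like point-to-point abstraction (send/recv by rank),
# simulated with in-process queues. A real cloud framework would back
# each rank's inbox with a cloud queue and serialize payloads to storage.
import queue
import threading

class CloudComm:
    def __init__(self, size):
        self.size = size
        self._inbox = [queue.Queue() for _ in range(size)]  # one inbox per rank

    def send(self, dest, tag, payload):
        self._inbox[dest].put((tag, payload))

    def recv(self, rank, timeout=5.0):
        return self._inbox[rank].get(timeout=timeout)

def worker(comm, rank):
    if rank == 0:  # "master" scatters work and gathers results
        for r in range(1, comm.size):
            comm.send(r, "task", {"partition": r})
        results = [comm.recv(0) for _ in range(comm.size - 1)]
        print("gathered:", sorted(p["partition"] for _, p in results))
    else:
        tag, task = comm.recv(rank)
        comm.send(0, "result", task)  # echo the partition id back

comm = CloudComm(size=4)
threads = [threading.Thread(target=worker, args=(comm, r)) for r in range(4)]
for t in threads: t.start()
for t in threads: t.join()
```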
Archive | 2019
Dinesh Agarwal; Satish Puri; Sushil K. Prasad
Efficient end-to-end parallel/distributed processing of vector-based spatial data has been a long-standing research question in the GIS community. The irregular and data-intensive nature of the underlying computation has impeded research in this space. We have created an open-architecture-based system named Crayons for the Azure cloud platform using state-of-the-art techniques. The design and development of the Crayons system is an engineering feat, both due to (i) the emerging nature of the Azure cloud platform, which lacks traditional support for parallel processing, and (ii) the tedious exploration of the design space for suitable techniques for parallelizing various workflow components, including file I/O, partitioning, task creation, and load balancing. Crayons is an open-source system available for both download and online access, to foster academic activities. We believe Crayons to be the first distributed GIS system over the cloud capable of end-to-end spatial overlay analysis. We demonstrate how the Azure platform's storage, communication, and computation mechanisms can support high-performance computing (HPC) application development. Crayons scales well for sufficiently large data sets, achieving an end-to-end absolute speedup of over 28-fold employing 100 Azure processors. For smaller, more irregular workloads, it still yields over a 9-fold absolute speedup.
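One reason such an irregular workload scales is dynamic load balancing. A tiny sketch of the shared-task-pool pattern follows (in Crayons the pool is an Azure queue; the in-process queue, task costs, and worker count below are illustrative assumptions): workers pull tasks as they finish, so no static assignment can be skewed by a few expensive partitions.

```python
# Sketch of shared-task-pool load balancing: workers pull variably sized
# overlay tasks from a common pool, so irregular work self-balances.
# In Crayons the pool is an Azure queue; this uses an in-process queue.
import queue
import threading
import time

tasks = queue.Queue()
for cost in [5, 1, 1, 1, 8, 2, 1, 1]:   # irregular per-task overlay costs (ms)
    tasks.put(cost)

done = []
lock = threading.Lock()

def worker(wid):
    while True:
        try:
            cost = tasks.get_nowait()   # pull the next task; no static assignment
        except queue.Empty:
            return
        time.sleep(cost / 1000.0)       # stand-in for the actual overlay computation
        with lock:
            done.append((wid, cost))

workers = [threading.Thread(target=worker, args=(w,)) for w in range(3)]
for t in workers: t.start()
for t in workers: t.join()
print(f"{len(done)} tasks completed across 3 workers")
```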