Philip J. Rhodes | Researchain

Publication

Featured research published by Philip J. Rhodes.

international conference on e science | 2005

Iteration aware prefetching for remote data access

Philip J. Rhodes; Sridhar Ramakrishnan

Although processing speed, storage capacity and network bandwidth are steadily increasing, network latency remains a bottleneck for scientists accessing large remote data sets. This problem is most acute with n-dimensional data. Grid researchers have only recently begun to develop tools for efficient remote access to n-dimensional data sets. Within the context of the Granite Scientific Database system, we show that latency penalties can be dramatically reduced using explicit knowledge of a user's access pattern, represented as an iterator. The iterator not only performs an n-dimensional iteration for the user, but also communicates the access pattern to Granite so that a prefetching cache can be constructed that is tuned to the user's access pattern. We experimentally evaluate a scenario for incorporating Granite's prefetching mechanism into the grid, demonstrating extraordinary performance gains. In light of these results, we describe planned additions to existing grid services to allow selection of datasets according to the user's access pattern.
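The idea of an iterator that both drives the traversal and tells the cache what will be needed next can be illustrated with a minimal Python sketch. This is not Granite's API: the class name `PrefetchingReader`, the `fetch_many` callback, and the `batch` parameter are all hypothetical, and the "remote" store is just an in-memory dictionary.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Index = Tuple[int, int]


class PrefetchingReader:
    """Hypothetical reader: iterates in a known order and prefetches ahead.

    `order` is the full access pattern, known in advance (the role the
    iterator plays in the abstract). `fetch_many` stands in for one
    high-latency remote request that returns several values at once.
    """

    def __init__(self, fetch_many: Callable[[List[Index]], Dict[Index, int]],
                 order: List[Index], batch: int) -> None:
        self.fetch_many = fetch_many
        self.order = list(order)
        self.batch = batch
        self.cache: Dict[Index, int] = {}
        self.requests = 0  # number of simulated remote round trips

    def __iter__(self) -> Iterator[Tuple[Index, int]]:
        for i, idx in enumerate(self.order):
            if idx not in self.cache:
                # Cache miss: fetch the next `batch` indices in iteration
                # order with a single request instead of one per element.
                self.cache = self.fetch_many(self.order[i:i + self.batch])
                self.requests += 1
            yield idx, self.cache[idx]


# Simulated remote 4 x 6 array, traversed column by column.
ROWS, COLS = 4, 6
remote = {(r, c): r * 10 + c for r in range(ROWS) for c in range(COLS)}
column_major = [(r, c) for c in range(COLS) for r in range(ROWS)]

reader = PrefetchingReader(lambda idxs: {i: remote[i] for i in idxs},
                           column_major, batch=8)
values = [v for _, v in reader]
```

Because the cache is filled along the iteration order rather than the storage order, the 24 elements here are served with 3 simulated requests instead of 24.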

Archive | 2003

A Data Model for Distributed Multiresolution Multisource Scientific Data

Philip J. Rhodes; R. Daniel Bergeron; Ted M. Sparr

Modern dataset sizes present major obstacles to understanding and interpreting the significant underlying phenomena represented in the data. There is a critical need to support scientists in the process of interactive exploration of these very large data sets. Using multiple resolutions of the data set (multiresolution), the scientist can identify potentially interesting regions with a coarse overview, followed by narrower views at higher resolutions.

visualization and data analysis | 2005

Out of core visualization using iterator aware multidimensional prefetching

Philip J. Rhodes; Xuan Tang; R. Daniel Bergeron; Ted M. Sparr

Visualization of multidimensional data presents special challenges for the design of efficient out-of-core data access. Elements that are nearby in the visualization may not be nearby in the underlying data file, which can severely tax the operating system’s disk cache. The Granite Scientific Database System can address these problems because it is aware of the organization of the data on disk, and it knows the visualization method’s pattern of access. The access pattern is expressed using a toolkit of iterators that both describe the access pattern and perform the iteration itself. Because our system has knowledge of both the data organization and the access pattern, we are able to provide significant performance improvements while hiding the details of out-of-core access from the visualization programmer. This paper presents a brief description of our disk access system, placing special emphasis on the benefits offered to a visualization application. We describe a simple demonstration application that shows dramatic performance improvements when used with the 39 GB Visible Woman Dataset.

grid computing | 2011

A Fast Location Service for Partial Spatial Replicas

Yun Tian; Philip J. Rhodes

This paper describes the design and implementation of a distributed high-performance partial spatial replica location service. Our replica location service identifies the set of partial replicas that intersect with a region of interest, an important component of partial spatial replica selection. We find that using an R-tree data structure is superior to relying on a relational database alone when handling spatial data queries. We have also added a collection of optimizations that together improve performance. In particular, database query aggregation and the use of a Morton curve during R-tree construction produce significant performance gains. Experimental results show that the proposed partial spatial replica location service scales well for multi-client and distributed large spatial queries, that is, queries that return more than 10,000 replicas. Individual servers with one million pieces of replica metadata in the backend database can support up to 100 clients concurrently when handling large spatial queries. Our previous work solved the same problem using an unmodified Globus Toolkit, but the work described here modifies and extends existing Globus Toolkit code to handle spatial metadata operations.
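The core query, finding every partial replica whose extent intersects a region of interest, can be sketched in Python as follows. This is a linear-scan stand-in for the query that an R-tree would accelerate, not the service's implementation; the `Box` layout, the function names, and the replica names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Axis-aligned 2-D extent: (xmin, ymin, xmax, ymax). Hypothetical layout.
Box = Tuple[float, float, float, float]


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap or touch along an edge or corner."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def locate(replicas: Dict[str, Box], region: Box) -> List[str]:
    """Return the names of all replicas whose extent intersects `region`.

    Checks every replica in turn (O(n) per query); an R-tree answers the
    same question while visiting only a fraction of the entries.
    """
    return sorted(name for name, box in replicas.items()
                  if intersects(box, region))


# Three hypothetical partial replicas tiling part of a dataset.
replicas = {
    "A": (0.0, 0.0, 10.0, 10.0),
    "B": (10.0, 0.0, 20.0, 10.0),
    "C": (0.0, 10.0, 10.0, 20.0),
}
hits = locate(replicas, (8.0, 2.0, 12.0, 8.0))
```

In this example the region straddles the boundary between replicas A and B, so both are returned and C is not.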

acm southeast regional conference | 2010

Optimizing memory access on GPUs using morton order indexing

Anthony Nocentino; Philip J. Rhodes

High performance computing environments are not freely available to every scientist or programmer. However, massively parallel computational devices are available in nearly every workstation-class computer and laptop sold today. The programmable GPU gives immense computational power to a user in a standard office environment; however, programming a GPU to function efficiently is not a trivial task. An issue of primary concern is memory latency: if not managed properly, it can cost the GPU performance, increasing runtimes while it waits for data. In this paper we describe an optimization of memory access methods on GPUs using Morton order indexing, sometimes referred to as Z-order indexing.
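Morton (Z-order) indexing interleaves the bits of the coordinates so that elements close together in 2-D space tend to be close together in linear memory. The following Python sketch shows the standard bit-interleaving computation for 16-bit coordinates; it illustrates the index itself rather than the paper's GPU code, and the function names are chosen here for illustration.

```python
def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so that a zero bit follows each one."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x: int, y: int) -> int:
    """Interleave x (even bit positions) and y (odd bit positions)."""
    return part1by1(x) | (part1by1(y) << 1)


# Morton codes for a 4 x 4 grid, one list per row (y = 0..3).
grid = [[morton2d(x, y) for x in range(4)] for y in range(4)]
```

Each aligned 2 x 2 block of the grid receives four consecutive codes, which is the locality property that makes the ordering attractive for memory access.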

conference on current trends in theory and practice of informatics | 2002

Database Support for Multisource Multiresolution Scientific Data

Philip J. Rhodes; R. Daniel Bergeron; Ted M. Sparr

We extend database technology to provide more meaningful support for exploration of scientific data. We have developed a new data model that incorporates spatial semantics with localized error and are implementing a prototype database system based on the model. Our data model and system focus on support for retrieval and visualization of gridded scientific data at multiple resolutions. While these semantics may not apply naturally to every scientific application, they are common to many. This paper summarizes the data model and describes the key functionality of our prototype system.

international conference on big data | 2013

Iteration aware prefetching for unstructured grids

Oyindamola O. Akande; Philip J. Rhodes

Due to the increasing quality of instruments and availability of computational resources, the size of spatial scientific datasets has been steadily increasing. However, much of the research on efficient storage and access to spatial datasets has focused on large multidimensional arrays. In contrast, unstructured datasets consisting of collections of simplices (e.g. triangles or tetrahedra) present special challenges that have received less attention. Data values found at the vertices of the simplices may be dispersed throughout a datafile, producing especially poor disk locality. In this paper, we address this important problem of poor locality in two major ways. First, we reorganize the unstructured dataset to improve locality in both the dataset space and in the data file on disk using a specialized chunking approach that maintains the spatial neighborhood relationships inherent in the unstructured data. This reorganization produces significant gains in performance by reducing the number of accesses made to the data file. Second, we extend our previous work and describe a prefetching method that takes advantage of prior knowledge of the user's access pattern. Applying this prefetching method to unstructured data produces further performance gains over and above the gains seen from reorganization alone.

international conference on e-science | 2012

Partial replica selection for spatial datasets

Yun Tian; Philip J. Rhodes

The implementation of partial or incomplete replicas, which represent only a subset of a larger dataset, has been an active topic of research. Partial Spatial Replicas extend this functionality to spatial data, allowing us to distribute a spatial dataset in pieces over several locations. Accessing only a subset of a spatial replica usually results in a large number of relatively small read requests made to the underlying storage device. For this reason, an accurate model of disk access is important when working with spatial subsets. We make two primary contributions in this paper. First, we describe a model for disk access performance that takes filesystem prefetching into account and is sufficiently accurate for spatial replica selection. Second, making a few simplifying assumptions, we propose a fast replica selection algorithm for partial spatial replicas. The algorithm uses a greedy approach that attempts to maximize performance by choosing a collection of replica subsets that allow fast data retrieval by a client machine. Experiments show that the performance of the solution found by our algorithm is, on average, at least 91% and 93.4% of the performance of the optimal solution in 4-node and 8-node tests, respectively.

grid computing | 2010

The Globus Toolkit R-tree for partial spatial replica selection

Yun Tian; Philip J. Rhodes

Partial Replicas have been used to parallelize access to regions of large spatial data sets on geographically distributed machines, saving network bandwidth and improving data availability. In this paper, we present the Globus Toolkit R-tree (GTR-tree) to efficiently select partial replicas using the Globus Toolkit Replica Location Service (RLS) middleware. First, the limitations inherent in the Globus RLS for spatial data are analyzed, motivating the usefulness of the GTR-tree for solving the partial replica selection problem. We then describe our implementation of the R-tree data structure on top of an unmodified Globus RLS. The R-tree is an important data structure for spatial computation, and it yields very significant performance gains. Our performance results and evaluation demonstrate enormous improvements for spatial replica selection over a plain RLS.

high performance distributed computing | 2008

Toward automatic parallelization of spatial computation for computing clusters

Baoqiang Yan; Philip J. Rhodes

High performance parallel computing infrastructures, such as computing clusters, have recently become freely available for scientific researchers to solve problems of unprecedented scale through data parallelization. However, scientists are not necessarily skilled in writing efficient parallel code, especially when dealing with spatial datasets. Two important performance issues involved are the heavy I/O costs and the communication overhead. To address these issues, we are developing a scheme that helps scientists realize I/O-friendly and scalable data parallelization for spatial computation. Built upon our iteration aware spatial prefetching and caching techniques, this data parallelization scheme takes an explicit specification of data dependency, identifies the best feasible access patterns while applying some I/O efficiency rules, and then wraps them in separate spatial data iterators for efficient cache loading and data partitioning, respectively. This scheme prioritizes but reconciles the I/O costs in the different stages of a data intensive cluster application to achieve the overall best I/O performance while maintaining fair computational scalability.
