Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Daniel W. Watson is active.

Publication


Featured research published by Daniel W. Watson.


IEEE Transactions on Parallel and Distributed Systems | 1998

Parallel genetic simulated annealing: a massively parallel SIMD algorithm

Hao Chen; Nicholas S. Flann; Daniel W. Watson

Many significant engineering and scientific problems involve optimization of some criterion over a combinatorial configuration space. The two methods most often used to solve these problems effectively, simulated annealing (SA) and genetic algorithms (GA), do not easily lend themselves to massively parallel implementations. Simulated annealing is a naturally serial algorithm, while GA involves a selection process that requires global coordination. This paper introduces a new hybrid algorithm that inherits those aspects of GA that lend themselves to parallelization and avoids the serial bottlenecks of GA approaches by incorporating elements of SA, providing a completely parallel, easily scalable hybrid GA/SA method. This new method, called Genetic Simulated Annealing, does not require parallelization of any problem-specific portions of a serial implementation; existing serial implementations can be incorporated as-is. Results of a study on two difficult combinatorial optimization problems, a 100-city traveling salesperson problem and a 24-word, 12-bit error-correcting code design problem, performed on a 16K-PE MasPar MP-1, indicate advantages over previous parallel GA and SA approaches. One of the key results is that the performance of the algorithm scales linearly with the number of processing elements, a feature not demonstrated by any previous parallel GA or SA approach, which enables the new algorithm to utilize massively parallel architectures with maximum effectiveness. Additionally, the algorithm does not require careful choice of control parameters, a significant advantage over SA and GA.
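The GA/SA hybrid described in this abstract can be sketched in miniature. The serial Python below is illustrative only: the ring topology, one-point crossover, toy bit-string objective, and all parameter values are assumptions, not details from the paper, which runs one individual per SIMD processing element.

```python
import math
import random

def gsa_minimize(cost, n_bits, pop_size=16, steps=200, t0=2.0, cooling=0.98, seed=1):
    """Toy serial emulation of the GA/SA hybrid: each 'PE' holds one bitstring;
    in each generation it recombines with its ring neighbour and accepts the
    child with the Metropolis criterion at temperature t."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    t = t0
    for _ in range(steps):
        nxt = []
        for i, parent in enumerate(pop):
            mate = pop[(i + 1) % pop_size]      # local, fixed neighbour: no global selection
            cut = rng.randrange(1, n_bits)
            child = parent[:cut] + mate[cut:]   # one-point crossover
            j = rng.randrange(n_bits)
            child[j] ^= 1                       # single-bit mutation
            delta = cost(child) - cost(parent)
            # Metropolis acceptance replaces global GA selection (the SA element)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                nxt.append(child)
            else:
                nxt.append(parent)
        pop = nxt
        t *= cooling                            # geometric cooling schedule
    return min(pop, key=cost)

# Toy objective: number of zero bits (minimized by the all-ones string)
best = gsa_minimize(lambda s: s.count(0), n_bits=12)
```

Because acceptance is decided locally by the Metropolis rule rather than by a global selection step, every individual's update is independent of all but one neighbour, which is what makes the scheme fully parallel.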


Environmental Modelling and Software | 2011

Extraction of hydrological proximity measures from DEMs using parallel processing

Teklu K. Tesfa; David G. Tarboton; Daniel W. Watson; K. A. T. Schreuders; Matthew E. Baker; Robert M. Wallace

Land surface topography is one of the most important terrain properties impacting the hydrological, geomorphological, and ecological processes active on a landscape. In our previous efforts to develop a soil depth model based upon topographic and land cover variables, we derived a set of hydrological proximity measures (HPMs) from a Digital Elevation Model (DEM) as potential explanatory variables for soil depth. These HPMs are variations of the distance up to ridge points (cells with no incoming flow) and of the distance down to stream points (cells with a contributing area greater than a threshold), following the flow path. The HPMs were computed using the D-infinity flow model, which apportions flow between adjacent neighbors based on the direction of steepest downward slope on the eight triangular facets constructed in a 3 x 3 grid cell window using the center cell and each pair of adjacent neighboring grid cells in turn. The D-infinity model typically results in multiple flow paths between two points on the topography, so distances may be computed as the minimum, maximum, or average of the individual flow paths. In addition, each of the HPMs is calculated vertically, horizontally, and along the land surface. Previously, these HPMs were calculated using recursive serial algorithms, which suffered from stack overflow problems when used to process large datasets, limiting the size of DEMs that could be analyzed. To overcome this limitation, we developed a message passing interface (MPI) parallel approach designed to increase both the size of the DEMs that can be processed and the speed with which the HPMs are computed. The parallel HPM algorithms spatially partition the input grid into stripes, which are each assigned to separate processes for computation. Each of those processes then uses a queue data structure to order the processing of cells so that each cell is visited only once and the cross-process communications that are a standard part of MPI are handled in an efficient manner. This parallel approach allows efficient analysis of much larger DEMs than were possible using the serial recursive algorithms. The HPMs given here may also have other, more general modeling applicability in hydrology, geomorphology, and ecology, and so are described here from a general perspective. In this paper, we present the definitions of the HPMs, the serial and parallel algorithms used in their computation, and their potential applications.
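The queue-ordered traversal that replaced the recursive serial algorithms can be illustrated with a deliberately simplified sketch. This uses a single-flow-direction (D8-style) grid rather than the paper's D-infinity apportioning, runs serially rather than over MPI stripes, and all names are illustrative:

```python
from collections import deque

def distance_to_stream(flow_dir, stream):
    """flow_dir[r][c] = (dr, dc) offset of the single downslope neighbour
    (None at pits, edges, and stream cells); stream[r][c] = True for stream
    cells.  Returns horizontal flow-path distance (cell steps; diagonals count
    sqrt(2)) down to the nearest stream cell; unreachable cells stay None."""
    rows, cols = len(stream), len(stream[0])
    dist = [[None] * cols for _ in range(rows)]
    # Invert the drainage: which cells flow into each cell?
    upstream = {(r, c): [] for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            d = flow_dir[r][c]
            if d is not None:
                tr, tc = r + d[0], c + d[1]
                if 0 <= tr < rows and 0 <= tc < cols:
                    upstream[(tr, tc)].append((r, c))
    q = deque()
    for r in range(rows):
        for c in range(cols):
            if stream[r][c]:
                dist[r][c] = 0.0
                q.append((r, c))
    # Each cell is visited exactly once, in order of known downstream distance,
    # so no recursion (and no stack-depth limit) is involved.
    while q:
        r, c = q.popleft()
        for ur, uc in upstream[(r, c)]:
            if dist[ur][uc] is None:
                step = 2 ** 0.5 if ur != r and uc != c else 1.0
                dist[ur][uc] = dist[r][c] + step
                q.append((ur, uc))
    return dist
```

The queue seeds at the stream cells (distance zero) and propagates upstream, mirroring the visit-each-cell-once ordering described in the abstract.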


Information Sciences | 1998

Generational scheduling for dynamic task management in heterogeneous computing systems

Brent R. Carter; Daniel W. Watson; Richard F. Freund; Elaine G. Keith; Francesca Mirabile; Howard Jay Siegel

Heterogeneous computing (HC) is the coordinated use of different types of machines, networks, and interfaces to maximize performance and/or cost effectiveness. In recent years, research related to HC has addressed one of its most fundamental challenges: how to develop a schedule of tasks on a set of heterogeneous hosts that minimizes the time required to execute the given tasks. The development of such a schedule is made difficult by diverse processing abilities among the hosts, data and precedence dependencies among the tasks, and other factors. This paper outlines a straightforward approach to solving this problem, termed generational scheduling (GS). GS provides fast, efficient matching of tasks to hosts and requires little overhead to implement. This study introduces the GS approach and illustrates its effectiveness in terms of the time to determine schedules and the quality of the schedules produced. A communication-inclusive extension of GS is presented to illustrate how GS can be used when the overhead of transferring data produced by some tasks and consumed by others is significant. Finally, to illustrate the effectiveness of GS in a real-world environment, a series of experiments is presented using GS in the SmartNet scheduling framework, developed at the US Navy's Naval Command, Control, and Ocean Surveillance Center in San Diego, California.
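The generational idea can be sketched as follows. The greedy minimum-completion-time matching rule and all names here are assumptions for illustration (the abstract does not specify the matching heuristic), and a DAG of precedence dependencies is assumed:

```python
def generational_schedule(etc, deps):
    """etc[task][host] = estimated time to compute task on host;
    deps[task] = set of predecessor tasks (assumed acyclic).  Schedules each
    'generation' of ready tasks onto the host that finishes it earliest.
    Returns a dict of task completion times."""
    hosts = range(len(next(iter(etc.values()))))
    ready_time = {h: 0.0 for h in hosts}   # when each host becomes free
    finish = {}                             # task -> completion time
    unscheduled = set(etc)
    while unscheduled:
        # A generation: tasks whose predecessors have all completed.
        gen = [t for t in unscheduled if deps.get(t, set()) <= finish.keys()]
        for t in sorted(gen):
            # Start = max(host free time, latest predecessor completion)
            start = {h: max([ready_time[h]] + [finish[p] for p in deps.get(t, set())])
                     for h in hosts}
            h = min(hosts, key=lambda h: start[h] + etc[t][h])
            finish[t] = start[h] + etc[t][h]
            ready_time[h] = finish[t]
            unscheduled.discard(t)
    return finish
```

Because only the currently ready generation is matched at a time, newly arriving or newly enabled tasks can be folded into the next generation, which is what makes the approach suitable for dynamic task management.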


Proceedings Heterogeneous Computing Workshop | 1994

Static program decomposition among machines in an SIMD/SPMD heterogeneous environment with non-constant mode switching costs

Daniel W. Watson; John K. Antonio; Howard Jay Siegel; Mikhail J. Atallah

The problem of minimizing the execution time of programs within a heterogeneous environment is considered. Different computational characteristics within a parallel algorithm may make switching execution from one machine to another beneficial; however, the cost of switching between machines during the execution of a program must be considered. This cost is not constant, but depends on data transfers needed as a result of the move. Therefore, determining a minimum-cost assignment of machines to program segments is not straightforward. A previously presented block-based mode selection (BBMS) approach is used as a basis to develop a heuristic method for assigning machines to program segments of data-parallel algorithms. Simulation results of parallel program behavior using the heuristic indicate that good assignments are possible without resorting to exhaustive search techniques.
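The structure of the assignment problem can be illustrated as a shortest path over (segment, mode) states with data-dependent switching costs. This generic dynamic-programming sketch is an illustration of the problem, not the BBMS heuristic itself, and all names are assumed:

```python
def assign_modes(block_times, switch_cost):
    """block_times[i][m] = execution time of program segment i in mode m
    (e.g. 0 = SIMD, 1 = SPMD); switch_cost(i, m_from, m_to) = cost of switching
    modes between segment i-1 and i (0 when m_from == m_to, data-dependent
    otherwise).  Returns (total_time, mode per segment) minimizing the sum,
    via dynamic programming over (segment, mode) states."""
    n_modes = len(block_times[0])
    best = {m: block_times[0][m] for m in range(n_modes)}
    back = []                                   # per segment: chosen predecessor mode
    for i in range(1, len(block_times)):
        prev, best, choice = best, {}, {}
        for m in range(n_modes):
            cands = {pm: prev[pm] + switch_cost(i, pm, m) for pm in range(n_modes)}
            pm = min(cands, key=cands.get)
            best[m] = cands[pm] + block_times[i][m]
            choice[m] = pm
        back.append(choice)
    m = min(best, key=best.get)
    total, modes = best[m], [m]
    for choice in reversed(back):               # backtrack the optimal mode chain
        m = choice[m]
        modes.append(m)
    return total, modes[::-1]
```

With two modes and n segments this runs in O(n) time, in contrast to the 2^n exhaustive enumeration the abstract mentions avoiding.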


Computers & Geosciences | 2015

A virtual tile approach to raster-based calculations of large digital elevation models in a shared-memory system

Ahmet Artu Yildirim; Daniel W. Watson; David G. Tarboton; Robert M. Wallace

Grid digital elevation models (DEMs) are commonly used in hydrology to derive information related to topographically driven flow. Advances in technology for creating DEMs have increased their resolution and data size, with the result that algorithms for processing them are frequently memory limited. This paper presents a new approach to the management of memory in the parallel solution of hydrologic terrain processing using a user-level virtual memory system for shared-memory multithreaded systems. The method includes tailored virtual memory management of raster-based calculations for datasets that are larger than available memory and a novel order-of-calculations approach to parallel hydrologic terrain analysis applications. The method is illustrated for the pit filling algorithm used first in most hydrologic terrain analysis workflows.

Highlights:
- We present a tiled virtual memory system to process large DEM datasets on a machine with limited memory.
- Efficiencies were achieved using algorithm refinements such as multithreading, load balancing, and prefetching.
- We found optimal tile sizes for best performance and memory usage efficiency.
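The user-level virtual-tile idea can be sketched as a tile cache with LRU eviction. This single-threaded Python miniature, with an in-memory dict standing in for disk, is an assumption-laden illustration of the concept, not the paper's shared-memory multithreaded system:

```python
from collections import OrderedDict

class VirtualRaster:
    """Tile-cached view over a raster too large for memory.  Here 'disk' is a
    dict of evicted tiles for illustration; a real system would read and write
    files, prefetch, and coordinate across threads."""
    def __init__(self, rows, cols, tile=256, max_tiles=4, fill=0.0):
        self.rows, self.cols, self.tile = rows, cols, tile
        self.max_tiles, self.fill = max_tiles, fill
        self.disk = {}                      # evicted tiles live here
        self.cache = OrderedDict()          # tile key -> 2-D list, LRU order
        self.faults = 0
    def _get_tile(self, tr, tc):
        key = (tr, tc)
        if key not in self.cache:
            self.faults += 1                # tile fault: load or materialise
            data = self.disk.pop(key, None)
            if data is None:                # first touch: blank tile
                data = [[self.fill] * self.tile for _ in range(self.tile)]
            if len(self.cache) >= self.max_tiles:
                old_key, old = self.cache.popitem(last=False)   # evict LRU tile
                self.disk[old_key] = old
            self.cache[key] = data
        self.cache.move_to_end(key)         # mark most recently used
        return self.cache[key]
    def get(self, r, c):
        t = self.tile
        return self._get_tile(r // t, c // t)[r % t][c % t]
    def set(self, r, c, v):
        t = self.tile
        self._get_tile(r // t, c // t)[r % t][c % t] = v
```

Raster algorithms with good tile locality (such as the pit-filling pass mentioned above) touch each tile few times, so the fault count, and hence I/O, stays small even when `max_tiles` covers only a fraction of the grid.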


Journal of Parallel and Distributed Computing | 1994

A Block-Based Mode Selection Model for SIMD/SPMD Parallel Environments

Daniel W. Watson; Howard Jay Siegel; John K. Antonio; Mark A. Nichols; Mikhail J. Atallah

One of the challenges for parallel compilers and compiler-related tools is, given a machine-independent parallel language, to generate executable code for a variety of computational models, and to identify those specific parallel modes for which a program is well-suited. One portion of this problem, developing a method for estimating the relative execution time of a data-parallel algorithm in an environment capable of the SIMD and SPMD (MIMD) modes of parallelism, is presented. Given a data-parallel program in a language whose syntax is mode-independent and empirical information about instruction execution time characteristics, the goal is to use static source-code analysis to determine an implementation that results in an optimal execution time for a mixed-mode machine capable of SIMD and SPMD parallelism. Statistical information about individual operation execution times and paths of execution through a parallel program is assumed. A secondary goal of this study is to indicate language, algorithm, and machine characteristics that must be researched to learn how to provide the information needed to obtain an optimal assignment of parallel modes to program segments.


collaboration technologies and systems | 2005

Path planning for altruistically negotiating processes

Deepthi Devalarazu; Daniel W. Watson

Autonomous negotiating systems are composed of logically (even geographically) separated software agents, controlling logical or physical resources, that altruistically seek to perform useful work in a cooperative manner. These systems are multi-agent systems that consist of a population of autonomous agents collaborating toward a common goal while simultaneously performing their individual tasks (i.e., computational resources are distributed among interconnected agents). With the increasing capabilities of the collaborative agents, the need for faster and more efficient methods of utilizing the distributed resources has also increased. This paper focuses on improving the performance of one such multi-agent system that deals with path planning for autonomous robots. This is achieved by exploiting parallelism among processing resources embedded in the autonomous vehicles, using a distributed-memory, message-passing execution model.


parallel computing | 1999

Aspects of computational mode and data distribution for parallel range image segmentation

Nicholas Giolmas; Daniel W. Watson; David M. Chelberg; Peter V. Henstock; Juneho Yi; Howard Jay Siegel

Parallel processing methods are a means to achieve significant speedup of computationally expensive image understanding algorithms, such as those applied to range images. Practical implementations of these algorithms must deal with the problems of selecting an appropriate parallel architecture and mapping the algorithm onto that architecture. The parallel implementation approaches for range image segmentation presented here are applicable to many low-level image understanding algorithms on a variety of parallel architectures. An evaluation of initial data distribution is presented to determine whether a square subimage or a striped subimage distribution would result in the greatest overall reduction in execution time for the given range image segmentation problem. Novel implementations that consider each data distribution's treatment of edge pixels in window operations yield a trade-off between the number of data transfers and the amount of computation. This trade-off is examined both analytically and experimentally. Additionally, using the same initial data distributions, a technique is introduced for changing the allocation of work to each of the processors to reduce the number of network settings by half. This technique and the method for determining the better initial data distribution can be used with any machine and any window-based technique that requires a full window to perform image calculations. Comparisons of range image processing algorithms are performed using “pure” SIMD algorithms, “pure” MIMD algorithms, and mixed-mode implementations with both SIMD and MIMD elements. Each of these approaches is quantitatively analyzed and compared for implementing the different phases of a particular hybrid range segmentation algorithm. Results of this implementation study indicate that quantifiable reductions in execution time result from the proper choice of parallel mode for each portion of the segmentation process.
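The square-versus-striped question can be made concrete with a back-of-envelope count of the halo (border) pixels each processor must receive for a window operator. The formulas below are the standard partitioning estimates, not the paper's exact model:

```python
import math

def halo_pixels(n, p, window, layout):
    """Border pixels one processor must receive for a (window x window)
    operator on an n x n image split across p processors.  'square' assumes p
    is a perfect square (sqrt(p) x sqrt(p) blocks); 'stripe' uses p row
    stripes spanning the full image width."""
    h = window // 2                          # halo width on each side
    if layout == "stripe":
        return 2 * h * n                     # top + bottom halo rows
    if layout == "square":
        side = n / math.isqrt(p)
        return 4 * h * side + 4 * h * h      # four edges + four corners
    raise ValueError(layout)
```

For a 512 x 512 image on 16 processors with a 5 x 5 window, a stripe receives 2048 border pixels from 2 neighbors, while a square block receives 1040 pixels from up to 8 neighbors: less data but more messages, one face of the transfer-versus-computation trade-off the paper examines.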


international conference on parallel processing | 1996

A massively parallel SIMD algorithm for combinatorial optimization

Ranjit A. Henry; Nicholas S. Flann; Daniel W. Watson

Many significant engineering and scientific problems involve optimization of some criterion over a combinatorial configuration space. The two methods most often used to solve these problems effectively, simulated annealing (SA) and genetic algorithms (GA), do not easily lend themselves to massively parallel implementations. This paper introduces a new hybrid algorithm that inherits those aspects of GA that lend themselves to parallelization and avoids the serial bottlenecks of GA approaches by incorporating elements of SA to provide a completely parallel, easily scalable hybrid GA/SA method. This new method, called genetic simulated annealing, does not require parallelization of any problem-specific portions of a serial implementation; existing serial implementations can be incorporated as-is. Results of a study on two difficult combinatorial optimization problems, a 100-city traveling salesperson problem and a 24-word, 12-bit error-correcting code design problem, performed on a 16K-PE MasPar MP-1, indicate significant advantages of the method. One of the key results is that the performance of the algorithm scales almost linearly with the number of processing elements. Additionally, the algorithm does not require careful choice of control parameters, a significant advantage over SA and GA.


The Journal of Supercomputing | 2015

A comparative study of the parallel wavelet-based clustering algorithm on three-dimensional dataset

Ahmet Artu Yildirim; Daniel W. Watson

Cluster analysis, a technique for grouping a set of objects into similar clusters, is an integral part of data analysis and has received wide interest among data mining specialists. The parallel wavelet-based clustering algorithm has been shown to use discrete wavelet transforms to extract the approximation component of the input data, on which the objects of clusters are detected based on the object connectivity property. However, this algorithm suffers from inefficient I/O operations and performance degradation due to redundant data processing. We address these issues to improve the parallel algorithm's efficiency and extend the algorithm further by investigating two merging techniques (merge-table and priority-queue based approaches), applying them to three-dimensional data. In this study, we compare the two parallel WaveCluster algorithms and a parallel K-means algorithm to evaluate the implemented algorithms' effectiveness.
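The wavelet-based clustering idea can be sketched serially in two dimensions. The paper's algorithms are parallel and three-dimensional; the single-level Haar averaging, density threshold, and 4-connectivity below are simplifying assumptions for illustration:

```python
from collections import deque

def wavecluster_2d(counts, density_thresh):
    """counts: 2-D grid of point counts (even dimensions).  One level of the
    2-D Haar transform keeps only the approximation component (2x2 averages);
    cells above the density threshold are then grouped into clusters by
    4-connectivity (the object-connectivity step).  Returns (labels, count)."""
    rows, cols = len(counts) // 2, len(counts[0]) // 2
    approx = [[(counts[2*r][2*c] + counts[2*r][2*c+1] +
                counts[2*r+1][2*c] + counts[2*r+1][2*c+1]) / 4.0
               for c in range(cols)] for r in range(rows)]
    label = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if approx[r][c] >= density_thresh and label[r][c] == 0:
                next_label += 1              # flood-fill one connected cluster
                q = deque([(r, c)])
                label[r][c] = next_label
                while q:
                    cr, cc = q.popleft()
                    for nr, nc in ((cr-1, cc), (cr+1, cc), (cr, cc-1), (cr, cc+1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and label[nr][nc] == 0
                                and approx[nr][nc] >= density_thresh):
                            label[nr][nc] = next_label
                            q.append((nr, nc))
    return label, next_label
```

In a parallel version each process labels its own partition, and the merging techniques compared in the paper reconcile labels for clusters that straddle partition boundaries.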

Collaboration


Dive into Daniel W. Watson's collaborations.

Top Co-Authors

Robert M. Wallace

Engineer Research and Development Center

Teklu K. Tesfa

Pacific Northwest National Laboratory
