Publication


Featured research published by Walid Keyrouz.


International Conference on Parallel Processing | 2014

A Hybrid CPU-GPU System for Stitching Large Scale Optical Microscopy Images

Timothy Blattner; Walid Keyrouz; Joe Chalfoun; Bertrand C. Stivalet; Mary Brady; Shujia Zhou

Researchers in various fields are using optical microscopy to acquire very large images, 10,000 to 200,000 pixels per side. Optical microscopes acquire these images as grids of overlapping partial images (thousands of pixels per side) that are then stitched together in software. Composing such large images is a compute- and data-intensive task even for modern machines. Researchers compound this difficulty by acquiring time-series, volumetric, or multi-channel images, so that the resulting data sets reach or approach terabyte sizes. We present a scalable hybrid CPU-GPU implementation of image stitching that processes large image sets at near-interactive rates. Our implementation scales well with both image size and the number of CPU cores and GPU cards in a machine. It processes a grid of 42 × 59 tiles into a 17k × 22k pixel image in 43 s (end-to-end execution time) when using one NVIDIA Tesla C2070 card and two Intel Xeon E-5620 quad-core CPUs, and in 29 s when using two Tesla C2070 cards and the same two CPUs. It also composes and renders the composite image without saving it in 15 s. In comparison, ImageJ/Fiji, which is widely used by biologists, has an image stitching plugin that takes > 3.6 h for the same workload despite being multithreaded and executing the same mathematical operators; it composes and saves the large image in an additional 1.5 h. Our implementation takes advantage of coarse-grain parallelism. It organizes the computation into a pipeline architecture that spans CPU and GPU resources and overlaps computation with data motion. The implementation achieves a nearly 10× performance improvement over our optimized non-pipelined GPU implementation and demonstrates near-linear speedup as the CPU thread count and the number of GPUs increase.
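The pipeline idea in this abstract (stages connected by queues so that reading and computing overlap) can be illustrated with a minimal sketch. This is not the authors' implementation: the stage names, the tile payload, and the `sum` stand-in for the per-tile computation are all hypothetical, and only Python's standard `queue` and `threading` modules are used.

```python
import queue
import threading

SENTINEL = None  # marks the end of the tile stream


def reader(tile_ids, out_q):
    """Stage 1: hypothetical stand-in for reading tiles from disk."""
    for tile_id in tile_ids:
        tile = [tile_id] * 4  # made-up tile payload
        out_q.put((tile_id, tile))
    out_q.put(SENTINEL)


def transformer(in_q, out_q):
    """Stage 2: hypothetical stand-in for the per-tile computation."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)
            break
        tile_id, tile = item
        out_q.put((tile_id, sum(tile)))


def run_pipeline(tile_ids, queue_size=2):
    """Run both stages concurrently and collect results keyed by tile id."""
    q1 = queue.Queue(maxsize=queue_size)  # bounded: limits tiles in flight
    q2 = queue.Queue()
    stages = [
        threading.Thread(target=reader, args=(tile_ids, q1)),
        threading.Thread(target=transformer, args=(q1, q2)),
    ]
    for t in stages:
        t.start()
    results = {}
    while True:
        item = q2.get()
        if item is SENTINEL:
            break
        tile_id, value = item
        results[tile_id] = value
    for t in stages:
        t.join()
    return results
```

The bounded queue between the stages is the design choice worth noting: it lets the reader run ahead of the computation by a few tiles while capping how many tiles are held in memory at once.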


Scientific Reports | 2017

MIST: Accurate and Scalable Microscopy Image Stitching Tool with Stage Modeling and Error Minimization

Joe Chalfoun; Michael P. Majurski; Tim Blattner; Kiran Bhadriraju; Walid Keyrouz; Peter Bajcsy; Mary Brady

Automated microscopy can image specimens larger than the microscope’s field of view (FOV) by stitching overlapping image tiles. It also enables time-lapse studies of entire cell cultures in multiple imaging modalities. We created MIST (Microscopy Image Stitching Tool) for rapid and accurate stitching of large 2D time-lapse mosaics. MIST estimates the mechanical stage model parameters (actuator backlash and stage repeatability ‘r’) from computed pairwise translations and then minimizes stitching errors by optimizing the translations within a (4r)² square area. MIST has a performance-oriented implementation utilizing multicore hybrid CPU/GPU computing resources, which can process terabytes of time-lapse multi-channel mosaics 15 to 100 times faster than existing tools. We created 15 reference datasets to quantify MIST’s stitching accuracy. The datasets consist of three preparations of stem cell colonies seeded at low density and imaged with varying overlap (10 to 50%). The location and size of 1150 colonies are measured to quantify stitching accuracy. MIST generated stitched images with an average centroid distance error that is less than 2% of a FOV. The sources of these errors include mechanical uncertainties, specimen photobleaching, segmentation, and stitching inaccuracies. MIST produced higher stitching accuracy than three open-source tools. MIST is available in ImageJ at isg.nist.gov.


IEEE Global Conference on Signal and Information Processing | 2015

A hybrid task graph scheduler for high performance image processing workflows

Timothy Blattner; Walid Keyrouz; Milton Halem; Mary Brady; Shuvra S. Bhattacharyya

Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) increases programmer productivity when implementing hybrid workflows that scale to multi-core and multi-GPU systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. We present an implementation of hybrid microscopy image stitching using HTGS that reduces code size by ≈ 25% and shows favorable performance compared to a similar hybrid workflow implementation without HTGS. The HTGS-based implementation reuses the computational functions of the hybrid workflow implementation.
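To make the notion of dependency management between tasks concrete, here is a minimal sketch of a dependency-driven task runner. It does not use or imitate the HTGS API: `run_task_graph`, its arguments, and the wave-by-wave execution strategy are assumptions chosen for brevity, built only on Python's standard `concurrent.futures` module.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task_graph(tasks, deps, max_workers=4):
    """Run callables in dependency order, one wave of ready tasks at a time.

    tasks: dict mapping a task name to a function that takes a dict of
           its dependencies' results
    deps:  dict mapping a task name to the list of names it depends on
    """
    results = {}
    remaining = set(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # A task is ready once every dependency has a result.
            ready = [n for n in remaining
                     if all(d in results for d in deps.get(n, []))]
            if not ready:
                raise ValueError("cycle or missing dependency in task graph")
            futures = {
                n: pool.submit(tasks[n],
                               {d: results[d] for d in deps.get(n, [])})
                for n in ready
            }
            # Wait for the whole wave before looking for newly ready tasks.
            for n, fut in futures.items():
                results[n] = fut.result()
            remaining -= set(ready)
    return results
```

A small usage example, with hypothetical task names: two independent "read" tasks run in the same wave, and a "combine" task runs once both have finished.

```python
tasks = {
    "read_a": lambda _: 2,
    "read_b": lambda _: 3,
    "combine": lambda r: r["read_a"] * r["read_b"],
}
deps = {"combine": ["read_a", "read_b"]}
print(run_task_graph(tasks, deps)["combine"])
```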


Signal Processing Systems | 2017

Model-based dynamic scheduling for multicore implementation of image processing systems

Jiahao Wu; Timothy Blattner; Walid Keyrouz; Shuvra S. Bhattacharyya

In this paper, we present a new software tool, called HTGS Model-based Engine (HMBE), for the design and implementation of multicore signal processing applications. HMBE provides complementary capabilities to HTGS (Hybrid Task Graph Scheduler), a recently introduced software tool for implementing scalable workflows for high performance computing applications. HMBE integrates advanced design optimization techniques provided in HTGS with model-based approaches that are founded on dataflow principles. Such integration contributes to (a) making the application of HTGS more systematic and less time consuming, (b) incorporating additional dataflow-based optimization capabilities with HTGS optimizations, and (c) automating significant parts of the HTGS-based design process. We present HMBE with an emphasis on novel dynamic scheduling techniques that are developed as part of the tool. We demonstrate the utility of HMBE through a case study involving an image stitching application for large scale microscopy images.


Bioinformatics and Biomedicine | 2015

MIST: Microscopy Image Stitching Tool

Joe Chalfoun; Michael P. Majurski; Timothy Blattner; Walid Keyrouz; Peter Bajcsy; Mary Brady

Summary form only given.
Motivation: Automated microscopy enables scientists to image an area of an experimental sample that is much larger than the microscope's field of view (FOV) and to carry out time-lapse studies of cell cultures. An automated microscope acquires these images by generating a grid of partially overlapping images. This process generates hundreds to hundreds of thousands of image tiles that need to be stitched into a wide image. We address the problem of creating image mosaics from a grid of overlapping tiles constrained to only translational offsets. The challenges of creating a large mosaic image are: (1) sensitivity to image features in the overlapping regions of adjacent tiles (e.g., during the early period of cell colony growth), (2) computational requirements needed to assemble the resulting mosaic image, and (3) absence of ground truth needed for evaluating the accuracy of a stitching method.
Results: This paper describes a stitching method called MIST (Microscopy Image Stitching Tool) with minimized translational uncertainty for large collections of grid-based microscopy tiles. The method improves tile translations computed using a registration method, such as Fourier transform based phase correlation, by optimizing the normalized cross correlation between the overlap of adjacent tiles. The optimization incorporates mechanical properties of a microscope stage to filter translations with high errors. We estimate the microscope stage repeatability from the computed translations of the grid-based image tiles and then improve all translations using constrained hill climbing restricted to searching a square area of 4 times the stage repeatability per side. We also present a methodology for evaluating stitching accuracy based on creating reference centroid distance and area measurements of regions of interest (ROIs) that fit inside one FOV. The ROIs are segmented first, and their mutual centroid distances and areas are measured using the microscope stage coordinates. The stitching accuracy is quantified by comparing the reference measurements to the measurements obtained by stitching a set of grid-based tiles, by means of four NIST-derived metrics: false positives (added ROIs), false negatives (undetected ROIs), centroid distance error, and area error. Following this methodology, we prepared three large reference datasets of stem cell colonies with low colony seeding density, which results in high uncertainty associated with the translation offsets. MIST generated a stitched image with an average colony centroid distance error less than 2% of a field of view and an average area error of 5%. The sources of these errors include mechanical uncertainties, sample photobleaching, segmentation, and stitching. We also show that the area error is mainly due to photobleaching and not stitching. We compared MIST stitching to the five most popular methods used in the literature. MIST produced the most accurate stitching result among all methods.
Conclusions: MIST is an accurate stitching tool that can be applied to grid-based tiles with unknown translational offsets. Its performance-oriented implementation yields a fast execution time that makes the algorithm suitable for creating large mosaics (up to TBs in size). The evaluation methodology for stitching accuracy, along with the four NIST-derived performance metrics, provides a general approach to characterizing stitching algorithm performance. Applying the methodology in our case generated three reusable reference datasets with cell colonies.
Availability: MIST is available as a Matlab executable or an ImageJ plugin. The MIST ImageJ plugin has a CPU and a GPU implementation. All information regarding this tool and its source code can be found at the following link: https://isg.nist.gov/.
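The refinement step described in this abstract (hill climbing on the normalized cross correlation of the overlap, restricted to a square of side 4 times the stage repeatability) can be sketched as follows. This is an illustrative sketch, not the MIST code: the function names, the list-of-lists tile representation, the four-neighbour move set, and the choice to centre the search square on the starting estimate are all assumptions.

```python
import math


def ncc(a, b, dx, dy):
    """Normalized cross correlation of the overlap when tile b's origin
    sits at (dx, dy) in tile a's coordinates. Returns -1.0 if undefined."""
    h, w = len(a), len(a[0])
    pa, pb = [], []
    for y in range(max(0, dy), min(h, dy + len(b))):
        for x in range(max(0, dx), min(w, dx + len(b[0]))):
            pa.append(a[y][x])
            pb.append(b[y - dy][x - dx])
    if len(pa) < 2:
        return -1.0
    ma, mb = sum(pa) / len(pa), sum(pb) / len(pb)
    num = sum((p - ma) * (q - mb) for p, q in zip(pa, pb))
    den = math.sqrt(sum((p - ma) ** 2 for p in pa)
                    * sum((q - mb) ** 2 for q in pb))
    return num / den if den > 0 else -1.0


def hill_climb(a, b, start, r):
    """Greedy ascent on NCC from `start`, staying inside a square of
    side 4*r centred on `start` (at most 2*r away along each axis)."""
    x0, y0 = start
    best, best_score = start, ncc(a, b, x0, y0)
    improved = True
    while improved:
        improved = False
        bx, by = best
        # Evaluate the four axis neighbours and keep the best one.
        for nx, ny in ((bx + 1, by), (bx - 1, by), (bx, by + 1), (bx, by - 1)):
            if abs(nx - x0) > 2 * r or abs(ny - y0) > 2 * r:
                continue  # outside the constrained search square
            score = ncc(a, b, nx, ny)
            if score > best_score:
                best, best_score = (nx, ny), score
                improved = True
    return best, best_score
```

In this sketch `start` would come from an initial registration estimate and `r` from the estimated stage repeatability in pixels. Like any hill climb, the search can stop at a local maximum, which is one reason to keep it confined to a small, physically motivated window.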


Signal Processing Systems | 2017

A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows

Timothy Blattner; Walid Keyrouz; Shuvra S. Bhattacharyya; Milton Halem; Mary Brady

Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both HTGS-based implementations show good performance. In image stitching, the HTGS implementation achieves performance similar to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k matrices, respectively.
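As a rough illustration of why matrix multiplication maps naturally onto a set of tasks, the sketch below splits the output matrix into blocks and computes each block as an independent task. It does not use HTGS or OpenBLAS; `block_product`, `blocked_matmul`, and the block size are hypothetical, and because it runs pure Python in threads it shows the decomposition only, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def block_product(a, b, rows, cols):
    """Compute the block of a @ b covering the given row and column ranges."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in cols]
            for i in rows]


def blocked_matmul(a, b, block=2, max_workers=4):
    """Multiply two list-of-lists matrices, one task per output block."""
    n, m = len(a), len(b[0])
    c = [[0] * m for _ in range(n)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        jobs = []
        for r0 in range(0, n, block):
            for c0 in range(0, m, block):
                rows = range(r0, min(r0 + block, n))
                cols = range(c0, min(c0 + block, m))
                jobs.append(
                    (rows, cols, pool.submit(block_product, a, b, rows, cols)))
        # Copy each finished block into its place in the result.
        for rows, cols, job in jobs:
            blk = job.result()
            for bi, i in enumerate(rows):
                for bj, j in enumerate(cols):
                    c[i][j] = blk[bi][bj]
    return c
```

Each output block depends only on a band of rows from the first matrix and a band of columns from the second, so the tasks share no writable state until their results are copied into place.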


Journal of Research of the National Institute of Standards and Technology | 2017

ZENO: Software for calculating hydrodynamic, electrical, and shape properties of polymer and particle suspensions

Derek Juba; Debra J. Audus; Michael V. Mascagni; Jack F. Douglas; Walid Keyrouz


International Conference on Conceptual Structures | 2016

Acceleration and Parallelization of ZENO/Walk-on-Spheres

Derek Juba; Walid Keyrouz; Michael Mascagni; Mary Brady


Design, Automation, and Test in Europe | 2018

A design tool for high performance image processing on multicore platforms

Jiahao Wu; Timothy Blattner; Walid Keyrouz; Shuvra S. Bhattacharyya


Journal of Signal Processing Systems | 2018

Model-Based Dynamic Scheduling for Multicore Signal Processing

Jiahao Wu; Timothy Blattner; Walid Keyrouz; Shuvra S. Bhattacharyya

Collaboration


Walid Keyrouz's collaborations.

Top Co-Authors

Timothy Blattner
National Institute of Standards and Technology

Mary Brady
National Institute of Standards and Technology

Joe Chalfoun
National Institute of Standards and Technology

Derek Juba
National Institute of Standards and Technology

Michael P. Majurski
National Institute of Standards and Technology

Peter Bajcsy
National Institute of Standards and Technology

Shujia Zhou
University of Maryland

Jack F. Douglas
National Institute of Standards and Technology

Beatriz A Pazmino Betancourt
National Institute of Standards and Technology