50th International Conference on Parallel Processing Workshop | 2021

An Intelligent Parallel Distributed Streaming Framework for near Real-time Science Sensors and High-Resolution Medical Images

 
 
 
 
 
 

Abstract


Our goals are to address challenges such as latency, scalability, throughput and heterogeneous data sources of streaming analytics and deep learning pipelines in science sensors and medical imaging applications. We present a prototype Intelligent Parallel Distributed Streaming Framework (IPDSF) that is capable of distributed streaming processing as well as performing distributed deep training in batch mode. IPDSF is designed to run streaming Artificial Intelligent (AI) analytic tasks using data parallelism including partitions of multiple streams of short time sensing data and high-resolution 3D medical images, and fine grain tasks distribution. We will show the implementation of IPDSF for two real world applications, (i) an Air Quality Index based on near real time streaming of aerosol Lidar backscatter and (ii) data generation of Covid-19 Computing Tomography (CT) scans using deep learning. We evaluate the latency, throughput, scalability, and quantitative evaluation of training and prediction compared against a baseline single instance. As the results, IPDSF scales to process thousands of streaming science sensors in parallel for Air Quality Index application. IPDSF uses novel 3D conditional Generative Adversarial Network (cGAN) training using parallel distributed Graphic Processing Units (GPU) nodes to generate realistic 3D high resolution Computed Tomography scans of Covid-19 patient lungs. We will show that IPDSF can reduce cGAN training time linearly with the number of GPUs.

Volume None
Pages None
DOI 10.1145/3458744.3474039
Language English
Journal 50th International Conference on Parallel Processing Workshop

Full Text