Drew Schmidt
University of Tennessee
Publication
Featured research published by Drew Schmidt.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2012
Drew Schmidt; George Ostrouchov; Wei-Chen Chen; Pragneshkumar Patel
We present a new distributed programming extension of the R programming language. By tightly coupling R to the well-known ScaLAPACK and MPI libraries, we are able to achieve highly scalable implementations of common statistical methods, allowing the user to analyze bigger datasets with R than ever before. Early benchmarks give grounds for optimism about the project and its future.
Big Data Research | 2017
Drew Schmidt; Wei-Chen Chen; Michael A. Matheson; George Ostrouchov
We present a tutorial overview showing how one can achieve scalable performance with R. We do so by utilizing several package extensions, including those from the pbdR project. These packages consist of high performance, high-level interfaces to and extensions of MPI, PBLAS, ScaLAPACK, I/O libraries, profiling libraries, and more. While these libraries shine brightest on large distributed platforms, they also work rather well on small clusters and often, surprisingly, even on a laptop with only two cores. Our tutorial begins with recommendations on how to get more performance out of your R code before considering parallel implementations. Because R is a high-level language, a function can have a deep hierarchy of operations. For big data, this can easily lead to inefficiency. Profiling is an important tool for understanding the performance of R code, for both serial and parallel improvements. The pbdR packages provide a highly scalable capability for the development of novel distributed data analysis algorithms. This level of scalability is unmatched in other analysis software. Interactive speeds (seconds) are achieved for complex analysis algorithms on datasets of 100 GB or more. This is possible because the interfaces add little overhead to the scalable libraries and their extensions. Furthermore, this is often achieved with little or no change to serial R codes. Our overview includes codes of varying complexity, illustrating reading data in parallel, the process of changing a serial code to a distributed parallel code, and how to engage distributed matrix computation from within R.
Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale | 2016
Drew Schmidt; Wei-Chen Chen; George Ostrouchov
Historically, large scale computing and interactivity have been at odds. This is a particularly sore spot for data analytics applications, which are typically interactive in nature. To help address this problem, we introduce a new client/server framework for the R language. This framework allows the R programmer to remotely control anywhere from one to thousands of batch servers running as cooperating instances of R, all from the user's local R session. Additionally, no specialized software environment is needed; the framework is a series of R packages, available from CRAN. The communication between client and server(s) is handled by the well-known ZeroMQ library. To handle server-side computations, we use our established pbdR packages for large scale distributed computing. These packages utilize HPC standards like MPI and ScaLAPACK to handle complex, tightly-coupled computations on large datasets. In this paper, we outline the new client/server architecture components, discuss the pros and cons of this approach, and provide several example workflows that bring interactivity to potentially terabyte-size computations.
Archive | 2016
Wei-Chen Chen; George Ostrouchov; Drew Schmidt; Pragneshkumar Patel; Hao Yu
Archive | 2017
George Ostrouchov; Wei-Chen Chen; Drew Schmidt
Archive | 2016
Wei-Chen Chen; Drew Schmidt; George Ostrouchov; Pragneshkumar Patel
Archive | 2016
Drew Schmidt; Wei-Chen Chen; George Ostrouchov; Pragneshkumar Patel
Archive | 2016
Wei-Chen Chen; Drew Schmidt; Christian Heckendorf; George Ostrouchov
Archive | 2014
Pragneshkumar Patel; George Ostrouchov; Wei-Chen Chen; Drew Schmidt; David Pierce
Archive | 2014
Drew Schmidt; Wei-Chen Chen; George Ostrouchov; Pragneshkumar Patel