Joseph K. Bradley | Researchain

Publication

Featured research published by Joseph K. Bradley.

International Conference on Management of Data | 2015

Spark SQL: Relational Data Processing in Spark

Michael Armbrust; Reynold S. Xin; Cheng Lian; Yin Huai; Davies Liu; Joseph K. Bradley; Xiangrui Meng; Tomer Kaftan; Michael J. Franklin; Ali Ghodsi; Matei Zaharia

Spark SQL is a new module in Apache Spark that integrates relational processing with Spark's functional programming API. Built on our experience with Shark, Spark SQL lets Spark programmers leverage the benefits of relational processing (e.g. declarative queries and optimized storage), and lets SQL users call complex analytics libraries in Spark (e.g. machine learning). Compared to previous systems, Spark SQL makes two main additions. First, it offers much tighter integration between relational and procedural processing, through a declarative DataFrame API that integrates with procedural Spark code. Second, it includes a highly extensible optimizer, Catalyst, built using features of the Scala programming language, that makes it easy to add composable rules, control code generation, and define extension points. Using Catalyst, we have built a variety of features (e.g. schema inference for JSON, machine learning types, and query federation to external databases) tailored for the complex needs of modern data analysis. We see Spark SQL as an evolution of both SQL-on-Spark and of Spark itself, offering richer APIs and optimizations while keeping the benefits of the Spark programming model.
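The idea of "composable rules" in the abstract above can be illustrated with a toy rule-based optimizer. This is only a sketch of the general technique (tree-rewriting rules applied until nothing changes), written in plain Python; it is not Catalyst's actual Scala API, and every name in it (`Lit`, `Col`, `Add`, `transform_up`, `fold_constants`, `drop_zero`, `optimize`) is hypothetical.

```python
from dataclasses import dataclass


# Hypothetical expression tree; these node types are illustrative only.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def transform_up(expr, rule):
    """Apply `rule` to every node, children first (bottom-up)."""
    if isinstance(expr, Add):
        expr = Add(transform_up(expr.left, rule), transform_up(expr.right, rule))
    return rule(expr)


def fold_constants(expr):
    """Rule: replace Lit + Lit with a single literal."""
    if isinstance(expr, Add) and isinstance(expr.left, Lit) and isinstance(expr.right, Lit):
        return Lit(expr.left.value + expr.right.value)
    return expr


def drop_zero(expr):
    """Rule: x + 0 and 0 + x both simplify to x."""
    if isinstance(expr, Add):
        if expr.left == Lit(0):
            return expr.right
        if expr.right == Lit(0):
            return expr.left
    return expr


def optimize(expr, rules):
    """Run all rules repeatedly until the tree stops changing (a fixed point)."""
    while True:
        new = expr
        for rule in rules:
            new = transform_up(new, rule)
        if new == expr:
            return expr
        expr = new


# x + (1 + (2 + -3)): folding yields x + 0, which drop_zero reduces to x.
tree = Add(Col("x"), Add(Lit(1), Add(Lit(2), Lit(-3))))
optimized = optimize(tree, [fold_constants, drop_zero])
```

Each rule here is a small, independent function, so adding a new optimization means appending one more function to the list passed to `optimize`; that is the sense of composability this sketch is meant to convey.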

International Symposium on Information Theory | 2014

The SPRIGHT algorithm for robust sparse Hadamard Transforms

Xiao Li; Joseph K. Bradley; Sameer Pawar; Kannan Ramchandran

In this paper, we consider the problem of computing a K-sparse N-point Hadamard Transform (HT) from noisy time-domain samples, where K = O(N^α) scales sub-linearly in N for some α ∈ (0, 1). The SParse Robust Iterative Graph-based Hadamard Transform (SPRIGHT) algorithm is proposed to recover the sparse HT coefficients in a stable manner that is robust to additive Gaussian noise. In particular, it is shown that the K-sparse HT of the signal can be reconstructed from noisy time-domain samples with a vanishing error probability using the same sample complexity O(K log N) as in the noiseless case of [1] and computational complexity O(N log N). Last but not least, given the complexity orders of the SPRIGHT algorithm, our numerical experiments further validate that the big-Oh constants in the complexity are small.
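To make the problem setting above concrete, the following sketch computes an ordinary dense fast Walsh-Hadamard transform in O(N log N) time and uses it to recover a K-sparse coefficient vector from noiseless time-domain samples. This is the standard textbook butterfly, not the SPRIGHT algorithm: it reads all N samples and ignores noise. The function name `fwht` and the example values (N = 8, K = 2) are assumptions chosen for illustration.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform; len(x) must be a power of two."""
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h into blocks of size 2h.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a


# A K-sparse spectrum with N = 8 and K = 2 (illustrative values).
N = 8
coeffs = [0.0] * N
coeffs[3] = 2.0
coeffs[5] = -1.0

# The Hadamard matrix H satisfies H * H = N * I, so the inverse transform
# is the forward transform divided by N.
samples = [v / N for v in fwht(coeffs)]

# Transforming the time-domain samples recovers the sparse coefficients.
recovered = fwht(samples)
```

The dense transform needs all N samples; the point of the paper's result is that a K-sparse spectrum can be recovered from only O(K log N) noisy samples.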

Journal of Machine Learning Research | 2016

MLlib: Machine Learning in Apache Spark

Xiangrui Meng; Joseph K. Bradley; Burak Yavuz; Evan R. Sparks; Shivaram Venkataraman; Davies Liu; Jeremy Freeman; D. B. Tsai; Manish Amde; Sean Owen; Doris Xin; Reynold S. Xin; Michael J. Franklin; Reza Bosagh Zadeh; Matei Zaharia; Ameet Talwalkar

International Conference on Machine Learning | 2011