Ryan R. Curtin | Researchain

Publication

Featured research published by Ryan R. Curtin.

Similarity Search and Applications | 2016

Fast Approximate Furthest Neighbors with Data-Dependent Candidate Selection

Ryan R. Curtin; Andrew B. Gardner

We present a novel strategy for approximate furthest neighbor search that selects a candidate set using the data distribution. This strategy leads to an algorithm, which we call DrusillaSelect, that is able to outperform existing approximate furthest neighbor strategies. Our strategy is motivated by an empirical study of the behavior of the furthest neighbor search problem, which lends intuition for where our algorithm is most useful. We also present a variant of the algorithm that gives an absolute approximation guarantee; under some assumptions, the guaranteed approximation can be achieved in provably less time than brute-force search. Performance studies indicate that DrusillaSelect can achieve comparable levels of approximation to other algorithms while giving up to an order of magnitude speedup. An implementation is available in the mlpack machine learning library (found at http://www.mlpack.org).
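To make the idea of data-dependent candidate selection concrete, here is a minimal C++ sketch. It is not the DrusillaSelect algorithm from the paper and not mlpack's API. It uses a much simpler heuristic chosen for illustration: keep the m points furthest from the dataset centroid as candidates, then answer each query by scanning only those candidates. The names sqDist, selectCandidates, and furthestAmong are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double sqDist(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

// Hypothetical candidate selection (an assumption, not the paper's method):
// return the indices of the m points furthest from the dataset centroid.
std::vector<std::size_t> selectCandidates(const std::vector<Point>& data,
                                          std::size_t m) {
    assert(!data.empty());
    Point centroid(data[0].size(), 0.0);
    for (const Point& p : data)
        for (std::size_t d = 0; d < centroid.size(); ++d)
            centroid[d] += p[d] / static_cast<double>(data.size());

    std::vector<std::size_t> order(data.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    m = std::min(m, order.size());
    // Only the first m positions need to be ordered.
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(m),
                      order.end(),
                      [&](std::size_t a, std::size_t b) {
                          return sqDist(data[a], centroid) >
                                 sqDist(data[b], centroid);
                      });
    order.resize(m);
    return order;
}

// Linear scan over the given indices; returns the index of the point in
// `data` that is furthest from `query`.
std::size_t furthestAmong(const std::vector<Point>& data,
                          const std::vector<std::size_t>& indices,
                          const Point& query) {
    assert(!indices.empty());
    std::size_t best = indices[0];
    double bestDist = sqDist(data[best], query);
    for (std::size_t i : indices) {
        const double dist = sqDist(data[i], query);
        if (dist > bestDist) {
            bestDist = dist;
            best = i;
        }
    }
    return best;
}
```

Passing every index to furthestAmong gives the brute-force answer. Passing the output of selectCandidates gives an approximate answer at a cost proportional to m rather than to the dataset size.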

International Conference on Signal Processing and Communication Systems | 2017

An open source C++ implementation of multi-threaded Gaussian mixture models, k-means and expectation maximisation

Conrad Sanderson; Ryan R. Curtin

Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded versions of the Expectation Maximisation (EM) and k-means training algorithms. Multi-threading is achieved through reformulation of the EM and k-means algorithms into a MapReduce-like framework. Furthermore, the implementation uses several techniques to improve numerical stability and modelling accuracy. We demonstrate that the multi-threaded implementation achieves a speedup of an order of magnitude on a recent 16-core machine, and that it can achieve higher modelling accuracy than a previously well-established publicly accessible implementation. The multi-threaded implementation is included as a user-friendly class in recent releases of the open source Armadillo C++ linear algebra library. The library is provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
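The MapReduce-like reformulation can be pictured with a small C++ sketch of one k-means update on one-dimensional data. This illustrates the general accumulate-then-combine pattern under stated assumptions and is not Armadillo's implementation. The names Partial, accumulateChunk, and kmeansStep are hypothetical. The chunks are processed sequentially here to keep the example dependency-free, but because each chunk writes only to its own Partial, each call to accumulateChunk could be handed to a separate thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Partial statistics for one chunk: per-centroid sum and count of the
// samples assigned to that centroid.
struct Partial {
    std::vector<double> sum;
    std::vector<std::size_t> count;
};

// "Map" step: assign each sample in [begin, end) to its nearest centroid
// and accumulate chunk-local statistics. Touches no shared state.
Partial accumulateChunk(const std::vector<double>& data,
                        std::size_t begin, std::size_t end,
                        const std::vector<double>& centroids) {
    Partial p{std::vector<double>(centroids.size(), 0.0),
              std::vector<std::size_t>(centroids.size(), 0)};
    for (std::size_t i = begin; i < end; ++i) {
        std::size_t best = 0;
        for (std::size_t k = 1; k < centroids.size(); ++k)
            if (std::abs(data[i] - centroids[k]) <
                std::abs(data[i] - centroids[best]))
                best = k;
        p.sum[best] += data[i];
        ++p.count[best];
    }
    return p;
}

// One k-means update: map over chunks, then reduce the partial statistics.
std::vector<double> kmeansStep(const std::vector<double>& data,
                               const std::vector<double>& centroids,
                               std::size_t numChunks) {
    assert(numChunks > 0 && !centroids.empty());
    const std::size_t chunkSize = (data.size() + numChunks - 1) / numChunks;

    std::vector<Partial> partials;
    for (std::size_t c = 0; c < numChunks; ++c) {
        // In a threaded version, each iteration would run on its own thread.
        const std::size_t begin = std::min(c * chunkSize, data.size());
        const std::size_t end = std::min(begin + chunkSize, data.size());
        partials.push_back(accumulateChunk(data, begin, end, centroids));
    }

    // "Reduce" step: combine the partials and recompute each centroid.
    // A centroid with no assigned samples keeps its previous value.
    std::vector<double> updated = centroids;
    for (std::size_t k = 0; k < centroids.size(); ++k) {
        double sum = 0.0;
        std::size_t count = 0;
        for (const Partial& p : partials) {
            sum += p.sum[k];
            count += p.count[k];
        }
        if (count > 0) updated[k] = sum / static_cast<double>(count);
    }
    return updated;
}
```

The same structure carries over to EM: the chunk-local statistics become weighted sums (responsibilities, weighted means, weighted second moments), and the reduce step combines them before the parameters are updated.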

Journal of Open Source Software | 2017

gmm_diag and gmm_full: C++ classes for multi-threaded Gaussian mixture models and Expectation-Maximisation

Conrad Sanderson; Ryan R. Curtin

Statistical modelling of multivariate data through a convex mixture of Gaussians, also known as a Gaussian mixture model (GMM), has many applications in fields such as signal processing, econometrics, and pattern recognition (Bishop 2006). Each component (Gaussian) in a GMM is parameterised with a weight, mean vector (centroid), and covariance matrix.
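As a small illustration of this parameterisation, the following C++ sketch evaluates the log-likelihood of one sample under a GMM whose covariance matrices are diagonal, so each component stores only a vector of per-dimension variances. It is a standalone sketch and does not use the gmm_diag class or its interface. DiagGaussian, logComponent, and logLikelihood are hypothetical names. The log-sum-exp step is a common numerical stability device and is an assumption here, not something the abstract states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One mixture component: weight, mean vector, and the diagonal of the
// covariance matrix (per-dimension variances).
struct DiagGaussian {
    double weight;
    std::vector<double> mean;
    std::vector<double> var;
};

// log(weight * N(x | mean, diag(var))) for a single component.
double logComponent(const DiagGaussian& g, const std::vector<double>& x) {
    assert(x.size() == g.mean.size() && x.size() == g.var.size());
    const double log2pi = std::log(2.0 * 3.14159265358979323846);
    double lp = std::log(g.weight);
    for (std::size_t d = 0; d < x.size(); ++d) {
        const double diff = x[d] - g.mean[d];
        lp += -0.5 * (log2pi + std::log(g.var[d]) + diff * diff / g.var[d]);
    }
    return lp;
}

// log p(x) under the mixture, combined with log-sum-exp so that very small
// component densities do not underflow to zero.
double logLikelihood(const std::vector<DiagGaussian>& gmm,
                     const std::vector<double>& x) {
    assert(!gmm.empty());
    std::vector<double> terms;
    for (const DiagGaussian& g : gmm) terms.push_back(logComponent(g, x));
    const double maxTerm = *std::max_element(terms.begin(), terms.end());
    double sum = 0.0;
    for (double t : terms) sum += std::exp(t - maxTerm);
    return maxTerm + std::log(sum);
}
```

For the mixture to be convex, the weights should be non-negative and sum to one. The sketch does not enforce this.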

International Congress on Mathematical Software | 2018

A User-Friendly Hybrid Sparse Matrix Class in C++

Conrad Sanderson; Ryan R. Curtin

When implementing functionality which requires sparse matrices, there are numerous storage formats to choose from, each with advantages and disadvantages. To achieve good performance, several formats may need to be used in one program, requiring explicit selection and conversion between the formats. This can be both tedious and error-prone, especially for non-expert users. Motivated by this issue, we present a user-friendly sparse matrix class for the C++ language, with a high-level application programming interface deliberately similar to the widely used MATLAB language. The class internally uses two main approaches to achieve efficient execution: (i) a hybrid storage framework, which automatically and seamlessly switches between three underlying storage formats (compressed sparse column, coordinate list, Red-Black tree) depending on which format is best suited for specific operations, and (ii) template-based meta-programming to automatically detect and optimise execution of common expression patterns. To facilitate relatively quick conversion of research code into production environments, the class and its associated functions provide a suite of essential sparse linear algebra functionality (e.g., arithmetic operations, submatrix manipulation) as well as high-level functions for sparse eigendecompositions and linear equation solvers. The latter are achieved by providing easy-to-use abstractions of the low-level ARPACK and SuperLU libraries. The source code is open and provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
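To show the kind of conversion a hybrid storage framework has to perform behind the scenes, here is a minimal C++ sketch that converts a coordinate list (COO) into compressed sparse column (CSC) form. It is a standalone illustration, not the internals of the class described above. The Coo and Csc structs and the toCsc and at functions are hypothetical names. The sketch assumes the input contains no duplicate (row, column) entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Coordinate list: one (row, column, value) triplet per non-zero entry.
struct Coo {
    std::size_t rows, cols;
    std::vector<std::size_t> rowIdx, colIdx;
    std::vector<double> values;
};

// Compressed sparse column: the entries of column c occupy positions
// colPtr[c] .. colPtr[c + 1] - 1 of rowIdx and values.
struct Csc {
    std::size_t rows, cols;
    std::vector<std::size_t> colPtr, rowIdx;
    std::vector<double> values;
};

// Convert COO to CSC with a counting sort on the column index.
// Within a column, entries keep their input order.
Csc toCsc(const Coo& coo) {
    const std::size_t nnz = coo.values.size();
    assert(coo.rowIdx.size() == nnz && coo.colIdx.size() == nnz);
    Csc csc{coo.rows, coo.cols,
            std::vector<std::size_t>(coo.cols + 1, 0),
            std::vector<std::size_t>(nnz),
            std::vector<double>(nnz)};

    // Count the entries in each column, shifted by one position.
    for (std::size_t c : coo.colIdx) ++csc.colPtr[c + 1];
    // A prefix sum turns the counts into column start offsets.
    for (std::size_t c = 0; c < coo.cols; ++c)
        csc.colPtr[c + 1] += csc.colPtr[c];

    // Scatter each triplet into the next free slot of its column.
    std::vector<std::size_t> next(csc.colPtr.begin(), csc.colPtr.end() - 1);
    for (std::size_t i = 0; i < nnz; ++i) {
        const std::size_t dest = next[coo.colIdx[i]]++;
        csc.rowIdx[dest] = coo.rowIdx[i];
        csc.values[dest] = coo.values[i];
    }
    return csc;
}

// Element lookup in CSC: scan the entries of column c for row r.
double at(const Csc& m, std::size_t r, std::size_t c) {
    assert(r < m.rows && c < m.cols);
    for (std::size_t i = m.colPtr[c]; i < m.colPtr[c + 1]; ++i)
        if (m.rowIdx[i] == r) return m.values[i];
    return 0.0;
}
```

The trade-off is visible in the code: appending an entry to the coordinate list is a cheap push to three arrays, while CSC gives contiguous per-column access but needs a rebuild when entries are inserted.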

Journal of Open Source Software | 2018

mlpack 3: a fast, flexible machine learning library

Ryan R. Curtin; Marcus Edel; Mikhail Lozhnikov; Yannis Mentekidis; Sumedh Ghaisas; Shangtong Zhang

In the past several years, the field of machine learning has seen an explosion of interest and excitement, with hundreds or thousands of algorithms developed for different tasks every year. But a primary problem faced by the field is scaling to larger and larger datasets, since training on larger datasets is known to typically produce better results (Halevy, Norvig, and Pereira 2009). Therefore, the development of new algorithms for the continued growth of the field depends largely on the existence of good tooling and libraries that enable researchers and practitioners to quickly prototype and develop solutions (Sonnenburg et al. 2007). Simultaneously, useful libraries must also be efficient and well-implemented. This has motivated our development of mlpack.

Information Systems | 2018

Exploiting the structure of furthest neighbor search for fast approximate results

Ryan R. Curtin; Javier Echauz; Andrew B. Gardner

We present a novel strategy for approximate furthest neighbor search that selects a set of candidate points using the data distribution. This strategy leads to an algorithm, which we call DrusillaSelect, that is able to outperform existing approximate furthest neighbor strategies. Our strategy is motivated by a study of the behavior of the furthest neighbor search problem, which has significantly different structure than the nearest neighbor search problem, and can be understood with the help of an information-theoretic hardness measure that we introduce. We also present a variant of the algorithm that gives an absolute approximation guarantee; under some assumptions, the guaranteed approximation can be achieved in provably less time than brute-force search. Performance studies indicate that DrusillaSelect can achieve comparable levels of approximation to other algorithms, even on the hardest datasets, while giving up to an order of magnitude speedup. An implementation is available in the mlpack machine learning library (found at http://www.mlpack.org).

Journal of Open Source Software | 2016

Armadillo: a template-based C++ library for linear algebra

Conrad Sanderson; Ryan R. Curtin

Archive | 2017

Designing and building the mlpack open-source machine learning library.

Ryan R. Curtin; Marcus Edel

arXiv: Mathematical Software | 2018

ensmallen: a flexible C++ library for efficient function optimization

Shikhar Bhardwaj; Ryan R. Curtin; Marcus Edel; Yannis Mentekidis; Conrad Sanderson

Archive | 2018

mlpack 3.0.2

Ryan R. Curtin; Marcus Edel; Mikhail Lozhnikov; Yannis Mentekidis; Sumedh Ghaisas; Shangtong Zhang

Collaboration

Dive into Ryan R. Curtin's collaborations.

Top Co-Authors

Conrad Sanderson

University of Queensland

Andrew B. Gardner

Symantec

Javier Echauz

Symantec
