Publication


Featured research published by David Duling.


European Conference on Machine Learning | 2012

Massively parallel feature selection: an approach based on variance preservation

Zheng Zhao; James Cox; David Duling; Warren Sarle

Advances in computer technologies have enabled corporations to accumulate data at an unprecedented speed. Large-scale business data might contain billions of observations and thousands of features, which easily brings their scale to the level of terabytes. Most traditional feature selection algorithms are designed for a centralized computing architecture. Their usability significantly deteriorates when data size exceeds hundreds of gigabytes. High-performance distributed computing frameworks and protocols, such as the Message Passing Interface (MPI) and MapReduce, have been proposed to facilitate software development on grid infrastructures, enabling analysts to process large-scale problems efficiently. This paper presents a novel large-scale feature selection algorithm that is based on variance analysis. The algorithm selects features by evaluating their abilities to explain data variance. It supports both supervised and unsupervised feature selection and can be readily implemented in most distributed computing environments. The algorithm was developed as a SAS High-Performance Analytics procedure, which can read data in distributed form and perform parallel feature selection in both symmetric multiprocessing mode and massively parallel processing mode. Experimental results demonstrated the superior performance of the proposed method for large-scale feature selection.
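To make the idea of selecting features by "their abilities to explain data variance" concrete, here is a minimal single-machine sketch of the unsupervised case. It is not the procedure from the paper: it assumes a generic greedy scheme (pick the column whose regression explains the most remaining variance across all columns, then remove that column's contribution from the data), and the function name `select_features` is hypothetical.

```python
def select_features(rows, k):
    """Greedily pick up to k column indices that explain the most variance.

    Illustrative assumption only: a generic greedy column-selection scheme,
    not the distributed algorithm described in the abstract.
    """
    n, d = len(rows), len(rows[0])

    # Work column-wise on mean-centered data.
    cols = []
    for j in range(d):
        col = [float(row[j]) for row in rows]
        mean = sum(col) / n
        cols.append([v - mean for v in col])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    selected = []
    for _ in range(k):
        best_j, best_score = None, 0.0
        for j in range(d):
            if j in selected:
                continue
            norm = dot(cols[j], cols[j])
            if norm < 1e-12:
                continue  # column is already fully explained
            # Variance in all residual columns explained by regressing on column j.
            score = sum(dot(cols[j], c) ** 2 for c in cols) / norm
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break  # no remaining column explains any variance
        selected.append(best_j)

        # Deflate: remove the chosen column's direction from every column.
        pivot = cols[best_j][:]
        norm = dot(pivot, pivot)
        deflated = []
        for c in cols:
            coef = dot(pivot, c) / norm
            deflated.append([v - coef * p for v, p in zip(c, pivot)])
        cols = deflated
    return selected


# Column 1 duplicates column 0 (scaled by 2); column 2 is independent of both.
data = [
    [1.0, 2.0, 1.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, -1.0],
    [4.0, 8.0, 1.0],
]
print(select_features(data, 2))  # [0, 2]
```

The redundant column is skipped because, once column 0 is chosen, nothing in column 1 remains to be explained. In a distributed setting the dot products would be the natural quantities to compute per data partition and then sum, though how the paper organizes that computation is not stated in the abstract.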


arXiv: Numerical Analysis | 2014

Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization

Amy N. Langville; Carl D. Meyer; Russell Albright; James Cox; David Duling


Archive | 2008

Two-Stage Variable Clustering for Large Data Sets

Taiyeong Lee; David Duling; Song Liu; Dominique Latour


Archive | 2011

Systems And Methods For Clustering Time Series Data Based On Forecast Distributions

Taiyeong Lee; David Duling


Archive | 2008

Computer-implemented systems and methods for variable clustering in large data sets

Taiyeong Lee; David Duling; Dominique Latour


Archive | 2008

Constrained Optimized Binning For Scorecards

Ivan Oliveira; Manoj Keshavmurthi Chari; David Duling; Susan Haller; Robert William Pratt


Archive | 2013

Systems and Methods for Providing a Unified Variable Selection Approach Based on Variance Preservation

Zheng Zhao; James Cox; David Duling; Warren Sarle


Archive | 2010

Improving Credit Risk Scorecards with Memory-Based Reasoning to Reject Inference with SAS® Enterprise Miner™

Billie Anderson; Susan Haller; Naeem Siddiqi; James Cox; David Duling


Archive | 2009

Predictive Models Based on Reduced Input Space That Uses Rejected Variables

Taiyeong Lee; David Duling; Dominique Latour


Archive | 2008

From Soup to Nuts: Practices in Data Management for Analytical Performance

David Duling; Howard Plemmons; Nancy Rausch
