
Publication


Featured research published by Vu Dinh.


International Conference on Tools with Artificial Intelligence | 2012

Mel-frequency Cepstral Coefficients for Eye Movement Identification

Nguyen Viet Cuong; Vu Dinh; Lam Si Tung Ho

Human identification is an important task for various activities in society. In this paper, we consider the problem of human identification using eye movement information. This problem, usually called the eye movement identification problem, can be solved by training a multiclass classification model to predict a person's identity from his or her eye movements. In this work, we propose using Mel-frequency cepstral coefficients (MFCCs) to encode various features for the classification model. Our experiments show that using MFCCs to represent useful features such as eye position, eye difference, and eye velocity results in much better accuracy than using Fourier transform, cepstrum, or raw representations. We also compare various classification models for the task. In our experiments, linear-kernel SVMs achieve the best performance, with 93.56% and 91.08% accuracy on the small and large datasets, respectively. In addition, we conduct experiments to study how the movements of each eye contribute to the final classification accuracy.
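As an illustration of the feature pipeline described in the abstract (framing, power spectrum, mel filterbank, DCT), here is a minimal MFCC sketch for a one-dimensional eye-position trace. This is not the authors' implementation; the sample rate, frame length, and filter counts are assumed values for illustration.

```python
import numpy as np
from scipy.fftpack import dct

def mfcc(signal, sample_rate=250, frame_len=64, n_filters=12, n_coeffs=8):
    """MFCC-style features for a 1-D signal (e.g. horizontal eye position)."""
    # Split into non-overlapping frames and apply a Hamming window.
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    frames = frames * np.hamming(frame_len)

    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Triangular filterbank with centers equally spaced on the mel scale.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((frame_len + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, power.shape[1]))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fbank[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[i - 1, k] = (r - k) / max(r - c, 1)

    # Log filterbank energies, then a DCT to get the cepstral coefficients.
    energies = np.log(power @ fbank.T + 1e-10)
    return dct(energies, type=2, axis=1, norm='ortho')[:, :n_coeffs]

# Toy usage: one 8-coefficient feature vector per frame of a synthetic trace.
rng = np.random.default_rng(0)
trace = np.sin(np.linspace(0, 20, 1000)) + 0.1 * rng.standard_normal(1000)
features = mfcc(trace)
print(features.shape)
```

In the paper's setting, feature vectors like these (for eye position, eye difference, and eye velocity) would then be fed to a multiclass classifier such as a linear-kernel SVM.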


Advances in Computing and Communications | 2014

Robust explicit nonlinear model predictive control with integral sliding mode

Ankush Chakrabarty; Vu Dinh; Gregery T. Buzzard; Stanislaw H. Zak; Ann E. Rundell

A robust control strategy for stabilizing nonlinear systems in the presence of additive bounded disturbances is proposed. The proposed control architecture is a novel combination of explicit nonlinear model predictive control (EMPC) and integral sliding mode control (ISMC). Feasibility analysis of a finite-horizon optimal control problem involved in deriving the EMPC control action is performed over a polytope of interest in the state space. A sparse sampling-based boundary detection algorithm is employed to compute an approximating polynomial bounding the feasible region. A sparse-grid interpolation scheme with Chebyshev-Gauss-Lobatto nodes and Legendre-basis polynomials is used to design the stabilizing EMPC surface. The proposed method is appealing because of the simplicity of the controller construction in conjunction with its applicability to higher-dimensional problems, which stems from the scalability of sparse grids. Robustness of the designed EMPC is provided by the ISMC. A simulated example is provided to illustrate the efficacy and performance of the proposed control strategy for the stabilization of an uncertain nonlinear dynamical system.
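The offline step of an explicit MPC scheme of this kind amounts to tabulating the optimal control action at interpolation nodes and fitting a polynomial surface through them. A one-dimensional sketch with Chebyshev-Gauss-Lobatto nodes (a stand-in control law replaces the finite-horizon optimal control solves; the degree is an assumed value):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Chebyshev-Gauss-Lobatto nodes on [-1, 1]: x_k = cos(pi*k/N).
N = 32
nodes = np.cos(np.pi * np.arange(N + 1) / N)

# Stand-in for the offline-computed EMPC control law u*(x); in the actual
# method each nodal value would come from solving a finite-horizon
# optimal control problem at that state.
def u_star(x):
    return np.tanh(3.0 * x) - 0.5 * x  # smooth, saturating, control-like shape

# Fit a degree-N Chebyshev interpolant through the sampled control actions.
coeffs = C.chebfit(nodes, u_star(nodes), deg=N)

# Online evaluation of the explicit controller is just polynomial
# evaluation -- no optimization is solved at run time.
xq = np.linspace(-1, 1, 201)
max_err = np.max(np.abs(C.chebval(xq, coeffs) - u_star(xq)))
print(max_err < 1e-2)
```

The scalability claim in the abstract comes from replacing the full tensor grid of such nodes with a sparse grid in higher dimensions.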


IEEE Transactions on Automatic Control | 2017

Support Vector Machine Informed Explicit Nonlinear Model Predictive Control Using Low-Discrepancy Sequences

Ankush Chakrabarty; Vu Dinh; Martin J. Corless; Ann E. Rundell; Stanislaw H. Zak; Gregery T. Buzzard

In this paper, an explicit nonlinear model predictive controller (ENMPC) for the stabilization of nonlinear systems is investigated. The proposed ENMPC is constructed using tensored polynomial basis functions and samples drawn from low-discrepancy sequences. Solutions of a finite-horizon optimal control problem at the sampled nodes are used (1) to learn an inner and outer approximation of the feasible region of the ENMPC using support vector machines, and (2) to construct the ENMPC control surface on the computed feasible region using regression or sparse-grid interpolation, depending on the shape of the feasible region. The attractiveness of the proposed control scheme lies in its tractability for higher-dimensional systems with feasibility and stability guarantees, its small online computation times, and its ease of implementation.
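The two ingredients named in the abstract — low-discrepancy sampling and an SVM-style classifier for the feasible region — can be sketched in a toy form. Below, a hand-rolled Halton sequence supplies the samples, a synthetic half-plane stands in for the feasibility oracle, and a hinge-loss classifier trained by sub-gradient descent stands in for a full SVM solver; all of these are simplifications, not the paper's method.

```python
import numpy as np

def halton(n, primes=(2, 3)):
    """First n points of a 2-D Halton low-discrepancy sequence in [0, 1)^2."""
    def van_der_corput(i, base):
        x, f = 0.0, 1.0 / base
        while i > 0:
            x += f * (i % base)
            i //= base
            f /= base
        return x
    return np.array([[van_der_corput(i, b) for b in primes] for i in range(1, n + 1)])

# Stand-in feasibility oracle: in the actual method each label would record
# whether the finite-horizon optimal control problem is solvable at x.
X = 2.0 * halton(400) - 1.0                         # samples in [-1, 1]^2
y = np.where(X[:, 0] + 0.5 * X[:, 1] < 0.0, 1.0, -1.0)

# Tiny linear SVM trained by sub-gradient descent on the regularized hinge
# loss (a simplified stand-in for the SVM solvers used in practice).
w, lam = np.zeros(2), 1e-2
for t in range(1, 3001):
    eta = 1.0 / (lam * t)
    viol = y * (X @ w) < 1.0
    grad = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
    w -= eta * grad

# The learned separator approximates the feasible-region boundary.
acc = np.mean(np.sign(X @ w) == y)
print(acc > 0.95)
```

In the paper, the classifier's decision surface is then used to delimit where the explicit control surface is constructed and evaluated.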


Systematic Biology | 2018

Online Bayesian Phylogenetic Inference: Theoretical Foundations via Sequential Monte Carlo

Vu Dinh; Aaron E. Darling; Frederick A. Matsen; Edward Susko

Phylogenetics, the inference of evolutionary trees from molecular sequence data such as DNA, is an enterprise that yields valuable evolutionary understanding of many biological systems. Bayesian phylogenetic algorithms, which approximate a posterior distribution on trees, have become a popular if computationally expensive means of doing phylogenetics. Modern data collection technologies are quickly adding new sequences to already substantial databases. With all current techniques for Bayesian phylogenetics, computation must start anew each time a sequence becomes available, making it costly to maintain an up-to-date estimate of a phylogenetic posterior. These considerations highlight the need for an online Bayesian phylogenetic method which can update an existing posterior with new sequences. Here, we provide theoretical results on the consistency and stability of methods for online Bayesian phylogenetic inference based on Sequential Monte Carlo (SMC) and Markov chain Monte Carlo. We first show a consistency result, demonstrating that the method samples from the correct distribution in the limit of a large number of particles. Next, we derive the first reported set of bounds on how phylogenetic likelihood surfaces change when new sequences are added. These bounds enable us to characterize the theoretical performance of sampling algorithms by bounding the effective sample size (ESS) with a given number of particles from below. We show that the ESS is guaranteed to grow linearly as the number of particles in an SMC sampler grows. Surprisingly, this result holds even though the dimensions of the phylogenetic model grow with each new added sequence.
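The effective sample size that the paper bounds is the standard importance-sampling quantity (sum of weights)^2 / (sum of squared weights). A toy numerical check of the "ESS grows linearly in the particle count" behavior, with i.i.d. log-normal weights standing in for a fixed likelihood update (an illustration only, not the paper's phylogenetic setting):

```python
import numpy as np

def effective_sample_size(log_weights):
    """ESS of normalized importance weights: (sum w)^2 / sum(w^2)."""
    w = np.exp(log_weights - np.max(log_weights))  # stabilize before normalizing
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

# When the weight distribution stays fixed as particles are added,
# ESS/n stays roughly constant, i.e. ESS grows linearly in n.
rng = np.random.default_rng(1)
for n in (100, 1000, 10000):
    ess = effective_sample_size(rng.normal(0.0, 1.0, size=n))
    print(n, round(ess / n, 2))
```

For log-weights that are standard normal, the limiting ratio is exp(-1), about 0.37.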


Algorithmic Learning Theory | 2013

Generalization and Robustness of Batched Weighted Average Algorithm with V-Geometrically Ergodic Markov Data

Nguyen Viet Cuong; Lam Si Tung Ho; Vu Dinh

We analyze the generalization and robustness of the batched weighted average algorithm for V-geometrically ergodic Markov data. This algorithm is a good alternative to the empirical risk minimization algorithm when the latter suffers from overfitting or when optimizing the empirical risk is hard. For the generalization of the algorithm, we prove a PAC-style bound on the training sample size for the expected L1-loss to converge to the optimal loss when training data are V-geometrically ergodic Markov chains. For the robustness, we show that if the training target variable’s values contain bounded noise, then the generalization bound of the algorithm deviates at most by the range of the noise. Our results can be applied to the regression problem, the classification problem, and the case where there exists an unknown deterministic target hypothesis.
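A weighted-average scheme of this flavor can be sketched on a toy finite hypothesis class with Markov-dependent data. The exponential-weighting form and the temperature constant below are assumptions for illustration, not the paper's exact algorithm; an AR(1) chain serves as a simple geometrically ergodic data source.

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite hypothesis class: constant-slope predictors h_a(x) = a*x.
slopes = np.linspace(-2.0, 2.0, 41)

# Markov-dependent training data: an AR(1) chain is a standard example of
# a geometrically ergodic Markov process; the target slope is 1.3.
n = 500
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
y = 1.3 * x + 0.1 * rng.normal(size=n)

# Weighted average over the batch: each hypothesis is weighted by the
# exponential of its negative cumulative L1-loss on the training batch
# (the constant c is an assumed temperature parameter).
c = 5.0
losses = np.mean(np.abs(slopes[None, :] * x[:, None] - y[:, None]), axis=0)
weights = np.exp(-c * n * (losses - losses.min()))
weights /= weights.sum()
a_hat = float(np.sum(weights * slopes))
print(round(a_hat, 2))
```

Unlike empirical risk minimization, the prediction averages over all hypotheses, so no optimization over the class is required.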


Systematic Biology | 2018

Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals

Mathieu Fourment; Brian C. Claywell; Vu Dinh; Connor O. McCoy; Frederick A. Matsen; Aaron E. Darling

Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy.
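Why heating helps can be seen in a one-dimensional toy: an overconfident "guided" proposal misses mass of the target and yields a poor effective sample size, while the heated proposal q^beta (beta < 1) is flatter and covers the target better. The Gaussian target, proposal parameters, and beta below are all assumed values for illustration, not the paper's phylogenetic proposals.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 5000

def ess(log_w):
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

# Target: a 1-D stand-in for (a slice of) a phylogenetic posterior.
target = norm(0.0, 1.0)

# A "guided" but overconfident Gaussian proposal, and a heated version
# q^beta with beta < 1: for a Gaussian, heating inflates the standard
# deviation by 1/sqrt(beta) while keeping the location.
mu, sigma, beta = 0.5, 0.5, 0.25
ratios = {}
for label, s in (("guided", sigma), ("heated", sigma / np.sqrt(beta))):
    xs = rng.normal(mu, s, size=n)
    log_w = target.logpdf(xs) - norm.logpdf(xs, mu, s)
    ratios[label] = ess(log_w) / n
    print(label, round(ratios[label], 2))
```

The heated proposal trades a little peak efficiency for much better tail coverage, which is what rescues the pathological cases described in the abstract.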


Theory and Applications of Models of Computation | 2015

Learning from Non-iid Data: Fast Rates for the One-vs-All Multiclass Plug-in Classifiers

Vu Dinh; Lam Si Tung Ho; Nguyen Viet Cuong; Duy Nguyen; Binh T. Nguyen

We prove new fast learning rates for the one-vs-all multiclass plug-in classifiers trained either from exponentially strongly mixing data or from data generated by a converging drifting distribution. These are two typical scenarios where training data are not iid. The learning rates are obtained under a multiclass version of Tsybakov’s margin assumption, a type of low-noise assumption, and do not depend on the number of classes. Our results are general and include a previous result for binary-class plug-in classifiers with iid data as a special case. In contrast to previous works for least squares SVMs under the binary-class setting, our results retain the optimal learning rate in the iid case.
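A one-vs-all plug-in classifier estimates the regression function eta_k(x) = P(Y = k | X = x) for each class and predicts the argmax. The sketch below uses Nadaraya-Watson kernel regression as the estimator and an AR(1) chain as a simple mixing (non-iid) data source; the bandwidth and problem setup are assumed for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

# Three-class toy problem with Markov-dependent inputs: an AR(1) chain is
# a simple example of exponentially mixing (non-iid) training data.
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()
y = np.digitize(x + 0.3 * rng.normal(size=n), [-0.8, 0.8])  # classes 0, 1, 2

# One-vs-all plug-in rule: estimate eta_k(x) = P(Y = k | X = x) for each
# class with a Nadaraya-Watson kernel regression, then take the argmax.
def eta_hat(x_train, y_train, x_query, k, h=0.3):
    kern = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / h) ** 2)
    return kern @ (y_train == k) / kern.sum(axis=1)

xq = np.linspace(-2.5, 2.5, 201)
etas = np.stack([eta_hat(x, y, xq, k) for k in range(3)])
pred = np.argmax(etas, axis=0)
print(pred[0], pred[100], pred[-1])  # low / middle / high classes
```

Tsybakov-style margin assumptions, as used in the paper, control how much probability mass sits near the points where two of the eta_k curves cross, which is what drives the fast rates.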


Molecular Biology and Evolution | 2018

A Surrogate Function for One-Dimensional Phylogenetic Likelihoods

Brian C. Claywell; Vu Dinh; Mathieu Fourment; Connor O. McCoy; Frederick A. Matsen

Phylogenetics has seen a steady increase in data set size and substitution model complexity, which require increasing amounts of computational power to compute likelihoods. This motivates strategies to approximate the likelihood functions for branch length optimization and Bayesian sampling. In this article, we develop an approximation to the 1D likelihood function as parametrized by a single branch length. Our method uses a four-parameter surrogate function abstracted from the simplest phylogenetic likelihood function, the binary symmetric model. We show that it offers a surrogate that can be fit over a variety of branch lengths, that it is applicable to a wide variety of models and trees, and that it can be used effectively as a proposal mechanism for Bayesian sampling. The method is implemented as a stand-alone open-source C library for calling from phylogenetics algorithms; it has proven essential for good performance of our online phylogenetic algorithm sts.
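The shape of such a four-parameter surrogate, and the fitting step, can be sketched with scipy. The parametrization below (c, m, r, b entering through a binary-symmetric-style transition probability) is an illustrative assumption abstracted from the abstract's description, not the paper's library code, and the "expensive" likelihood is a synthetic curve from the same family.

```python
import numpy as np
from scipy.optimize import curve_fit

# Four-parameter surrogate abstracted from the binary symmetric model:
#   log L(t) ~ c*log(p(t)) + m*log(1 - p(t)),  p(t) = (1 + exp(-r*(t + b))) / 2.
def surrogate(t, c, m, r, b):
    p = (1.0 + np.exp(-r * (t + b))) / 2.0
    return c * np.log(p) + m * np.log(1.0 - p)

# Stand-in for an expensive one-dimensional log-likelihood: a curve from the
# same family with fixed "true" parameters, sampled at a few branch lengths.
def truth(t):
    return surrogate(t, 90.0, 10.0, 2.0, 0.05)

t_obs = np.linspace(0.02, 1.5, 30)

# Fit the surrogate; the bounds keep p(t) inside (1/2, 1) during optimization.
params, _ = curve_fit(surrogate, t_obs, truth(t_obs), p0=(80.0, 20.0, 1.5, 0.1),
                      bounds=([1.0, 1.0, 0.1, 0.0], [1000.0, 1000.0, 10.0, 0.5]))

# Once fitted, the cheap surrogate replaces likelihood calls over a range
# of branch lengths, e.g. inside a Bayesian proposal mechanism.
t_query = np.linspace(0.05, 1.2, 50)
max_err = np.max(np.abs(surrogate(t_query, *params) - truth(t_query)))
print(max_err < 1e-3)
```

The appeal is that each surrogate evaluation costs a few elementary operations, versus a full pass over the sites for the true likelihood.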


Journal of Statistical Computation and Simulation | 2017

Convergence of Griddy Gibbs sampling and other perturbed Markov chains

Vu Dinh; Ann E. Rundell; Gregery T. Buzzard

The Griddy Gibbs sampler was proposed by Ritter and Tanner [Facilitating the Gibbs Sampler: the Gibbs Stopper and the Griddy–Gibbs Sampler. J Am Stat Assoc. 1992;87(419):861–868] as a computationally efficient approximation of the well-known Gibbs sampling method. The algorithm is simple and effective and has been used successfully to address problems in various fields of applied science. However, the approximate nature of the algorithm has prevented it from being widely used: the Markov chains generated by the Griddy Gibbs method are not reversible in general, so the existence and uniqueness of its invariant measure is not guaranteed. Even when such an invariant measure uniquely exists, there was no estimate of the distance between it and the probability distribution of interest, hence no means to ensure the validity of the algorithm as a means to sample from the true distribution. In this paper, we show, subject to some fairly natural conditions, that the Griddy Gibbs method has a unique invariant measure. Moreover, we provide estimates on the distance between this invariant measure and the corresponding measure obtained from Gibbs sampling. These results provide a theoretical foundation for the use of the Griddy Gibbs sampling method. We also address a more general result about the sensitivity of invariant measures under small perturbations of the transition probability: if we replace the transition probability P of any Markov chain Monte Carlo method by another transition probability Q close to P, we can still estimate the distance between the two invariant measures. The distinguishing feature between our approach and previous work on the convergence of perturbed Markov chains is that, by considering the invariant measures as fixed points of linear operators on function spaces, we do not need to impose any further conditions on the rate of convergence of the Markov chain. For example, the results we derive in this paper can address the case when the considered Markov chains are not uniformly ergodic.
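The approximation at the heart of Griddy Gibbs is a grid-based inverse-CDF draw from each full conditional. A toy version on a bivariate normal target (grid range, resolution, and the within-cell jitter are assumed implementation choices):

```python
import numpy as np

rng = np.random.default_rng(5)

def griddy_gibbs_draw(log_cond, grid, rng):
    """One Griddy Gibbs update: evaluate the (unnormalized) full conditional
    on a grid, build a discrete approximation of its CDF, and sample by
    inverse-CDF lookup with a uniform jitter inside the chosen cell."""
    logp = log_cond(grid)
    p = np.exp(logp - logp.max())
    cdf = np.cumsum(p)
    cdf /= cdf[-1]
    i = np.searchsorted(cdf, rng.uniform())
    half = 0.5 * (grid[1] - grid[0])
    return grid[i] + rng.uniform(-half, half)

# Toy target: bivariate normal with correlation 0.8; each full conditional
# is N(0.8 * other, 1 - 0.64), evaluated here only through its log-density.
grid = np.linspace(-4.0, 4.0, 401)
x, y = 0.0, 0.0
samples = []
for _ in range(5000):
    x = griddy_gibbs_draw(lambda g: -0.5 * (g - 0.8 * y) ** 2 / 0.36, grid, rng)
    y = griddy_gibbs_draw(lambda g: -0.5 * (g - 0.8 * x) ** 2 / 0.36, grid, rng)
    samples.append((x, y))

s = np.array(samples[500:])
print(np.round(np.corrcoef(s.T)[0, 1], 1))  # close to the target correlation 0.8
```

The grid approximation perturbs the Gibbs transition kernel, which is exactly the kind of perturbation whose effect on the invariant measure the paper quantifies.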


Annals of Applied Probability | 2017

The shape of the one-dimensional phylogenetic likelihood function

Vu Dinh; Frederick A. Matsen

By fixing all parameters in a phylogenetic likelihood model except for one branch length, one obtains a one-dimensional likelihood function. In this work, we introduce a mathematical framework to characterize the shapes of such one-dimensional phylogenetic likelihood functions. This framework is based on analyses of algebraic structures on the space of all frequency patterns with respect to a polynomial representation of the likelihood functions. Using this framework, we provide conditions under which the one-dimensional phylogenetic likelihood functions are guaranteed to have at most one stationary point, and this point is the maximum likelihood branch length. These conditions are satisfied by common simple models including all binary models, the Jukes-Cantor model and the Felsenstein 1981 model. We then prove that for the simplest model that does not satisfy our conditions, namely, the Kimura 2-parameter model, the one-dimensional likelihood functions may have multiple stationary points. As a proof of concept, we construct a non-degenerate example in which the phylogenetic likelihood function has two local maxima and a local minimum. To construct such examples, we derive a general method of constructing a tree and sequence data with a specified frequency pattern at the root. We then extend the result to prove that the space of all rescaled and translated one-dimensional phylogenetic likelihood functions under the Kimura 2-parameter model is dense in the space of all non-negative continuous functions on
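For the Jukes-Cantor model, one of the cases where the paper guarantees a single stationary point, the one-dimensional likelihood and its maximizer can be checked numerically. The sketch below uses the standard JC69 transition probabilities and the classical closed-form ML branch length; it is an illustration of the claim, not code from the paper.

```python
import numpy as np

# One-dimensional likelihood for two aligned sequences under Jukes-Cantor,
# as a function of the branch length t. With k differing sites out of n:
#   p_same(t) = 1/4 + 3/4 * exp(-4t/3)
#   p_diff(t) = 1/4 - 1/4 * exp(-4t/3)   (each specific mismatching base)
n, k = 100, 20

def log_likelihood(t):
    e = np.exp(-4.0 * t / 3.0)
    return (n - k) * np.log(0.25 + 0.75 * e) + k * np.log(0.25 - 0.25 * e)

# For p_hat = k/n < 3/4 the function has a single stationary point, the
# classical ML branch length t_hat = -(3/4) * log(1 - 4*p_hat/3).
p_hat = k / n
t_hat = -0.75 * np.log(1.0 - 4.0 * p_hat / 3.0)

# Grid search agrees with the closed form.
ts = np.linspace(0.01, 2.0, 2000)
t_grid_max = ts[np.argmax(log_likelihood(ts))]
print(round(t_hat, 3), round(t_grid_max, 3))
```

Under the Kimura 2-parameter model, by contrast, the paper shows that curves of this kind can have multiple stationary points, so a grid or multi-start search becomes necessary.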

Collaboration


Dive into Vu Dinh's collaborations.

Top Co-Authors

Frederick A. Matsen (Fred Hutchinson Cancer Research Center)

Lam Si Tung Ho (University of Wisconsin-Madison)

Nguyen Viet Cuong (National University of Singapore)

Brian C. Claywell (Fred Hutchinson Cancer Research Center)

Cheng Zhang (Fred Hutchinson Cancer Research Center)

Connor O. McCoy (Fred Hutchinson Cancer Research Center)