Publication


Featured research published by Donald E. K. Martin.


The Annals of Applied Statistics | 2007

Distributions associated with general runs and patterns in hidden Markov models

John A. D. Aston; Donald E. K. Martin

This paper gives a method for computing distributions associated with patterns in the state sequence of a hidden Markov model, conditional on observing all or part of the observation sequence. Probabilities are computed for very general classes of patterns (competing patterns and generalized later patterns), and thus, the theory includes as special cases results for a large class of problems that have wide application. The unobserved state sequence is assumed to be Markovian with a general order of dependence. An auxiliary Markov chain is associated with the state sequence and is used to simplify the computations. Two examples are given to illustrate the use of the methodology. Whereas the first application is more to illustrate the basic steps in applying the theory, the second is a more detailed application to DNA sequences, and shows that the methods can be adapted to include restrictions related to biological knowledge.
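The auxiliary Markov chain idea in the abstract above can be sketched in a much simpler setting than the paper's. The Python below is a minimal illustration, not the authors' method: it assumes a fully observed first-order Markov chain (no hidden layer and no conditioning on observations) and a single pattern. The function names `kmp_automaton` and `prob_pattern_occurs` and the input layout (`init`, `trans` as dictionaries) are hypothetical choices made for this example.

```python
def kmp_automaton(pattern, alphabet):
    """delta[q][a] = length of the longest prefix of `pattern` that is a
    suffix of pattern[:q] + (a,), for every non-final state q."""
    pattern = tuple(pattern)
    m = len(pattern)
    delta = []
    for q in range(m):
        row = {}
        for a in alphabet:
            s = pattern[:q] + (a,)
            k = len(s)
            while k > 0 and pattern[:k] != s[len(s) - k:]:
                k -= 1
            row[a] = k
        delta.append(row)
    return delta


def prob_pattern_occurs(pattern, n, init, trans):
    """P(pattern occurs at least once in X_1..X_n), n >= 1, for a first-order
    Markov chain with init[a] = P(X_1 = a) and
    trans[a][b] = P(X_{t+1} = b | X_t = a).
    Illustrative assumption: the chain is observed, not hidden."""
    pattern = tuple(pattern)
    alphabet = sorted(init)
    m = len(pattern)
    delta = kmp_automaton(pattern, alphabet)
    # Auxiliary chain state: (matched prefix length, last symbol).
    # Reaching prefix length m is absorbing; its mass accumulates in `hit`.
    dist, hit = {}, 0.0
    for a in alphabet:
        q = delta[0][a]
        if q == m:
            hit += init[a]
        else:
            dist[(q, a)] = dist.get((q, a), 0.0) + init[a]
    for _ in range(n - 1):
        new = {}
        for (q, a), p in dist.items():
            for b in alphabet:
                pb = p * trans[a][b]
                q2 = delta[q][b]
                if q2 == m:
                    hit += pb
                else:
                    new[(q2, b)] = new.get((q2, b), 0.0) + pb
        dist = new
    return hit
```

The auxiliary state pairs the pattern-matching progress with the last symbol because the next symbol's probability depends on the previous one; a higher-order chain would need a longer suffix in the state.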


Journal of Computational Biology | 2014

A Coverage Criterion for Spaced Seeds and its applications to Support Vector Machine String Kernels and k-mer Distances

Laurent Noé; Donald E. K. Martin

Spaced seeds have been recently shown to not only detect more alignments, but also to give a more accurate measure of phylogenetic distances, and to provide a lower misclassification rate when used with Support Vector Machines (SVMs). We confirm by independent experiments these two results, and propose in this article to use a coverage criterion to measure the seed efficiency in both cases in order to design better seed patterns. We show first how this coverage criterion can be directly measured by a full automaton-based approach. We then illustrate how this criterion performs when compared with two other criteria frequently used, namely the single-hit and multiple-hit criteria, through correlation coefficients with the correct classification/the true distance. At the end, for alignment-free distances, we propose an extension by adopting the coverage criterion, show how it performs, and indicate how it can be efficiently computed.
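To make the three criteria named in the abstract concrete, the sketch below evaluates a spaced seed on one alignment by direct enumeration. It is not the automaton-based computation the abstract describes. The encoding is an assumption made for this example: the alignment is a string of `1` (match) and `0` (mismatch), the seed is a string of `1` (must match) and `0` (don't care), and coverage is taken to be the number of alignment positions covered by a must-match position of at least one hit. The function names are hypothetical.

```python
def seed_hits(alignment, seed):
    """Start positions where every '1' of the seed faces a '1' (match)
    in the alignment."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    return [s for s in range(len(alignment) - len(seed) + 1)
            if all(alignment[s + i] == "1" for i in care)]


def coverage(alignment, seed):
    """Number of alignment positions covered by a must-match position of at
    least one seed hit (assumed definition, see the text above)."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    covered = set()
    for s in seed_hits(alignment, seed):
        covered.update(s + i for i in care)
    return len(covered)


alignment = "1110111"
hits = seed_hits(alignment, "1101")
single_hit = bool(hits)       # single-hit criterion: is there any hit?
multiple_hit = len(hits)      # multiple-hit criterion: how many hits?
covered = coverage(alignment, "1101")
```

On this toy alignment the spaced seed `1101` hits once, at position 1, and covers three positions, while the contiguous seed `111` hits twice and covers six.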


Statistics and Computing | 2012

Implied distributions in multiple change point problems

John A. D. Aston; Jyh-Ying Peng; Donald E. K. Martin

A method for efficiently calculating exact marginal, conditional and joint distributions for change points defined by general finite state Hidden Markov Models is proposed. The distributions are not subject to any approximation or sampling error once parameters of the model have been estimated. It is shown that, in contrast to sampling methods, very little computation is needed. The method provides probabilities associated with change points within an interval, as well as at specific points.


Journal of Applied Statistics | 1999

Paired comparison models applied to the design of the Major League baseball play-offs

Donald E. K. Martin

This paper presents an analysis of the effect of various baseball play-off configurations on the probability of advancing to the World Series. Play-off games are assumed to be independent. Several paired comparisons models are considered for modeling the probability of a home team winning a single game as a function of the winning percentages of the contestants over the course of the season. The uniform and logistic regression models are both adequate, whereas the Bradley-Terry model (modified for within-pair order effects, i.e. the home field advantage) is not. The single-game probabilities are then used to compute the probability of winning the play-offs under various structures. The extra round of play-offs, instituted in 1994, significantly lowers the probability of the team with the best record advancing to the World Series, whereas home field advantage and the different possible play-off draws have a minimal effect.
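The step from single-game probabilities to a series probability can be illustrated with a short calculation. The sketch below relies only on the independence assumption stated in the abstract; the function name, the two-probability parameterization (`p_home`, `p_away`) and the 2-3-2 hosting pattern are hypothetical choices for this example, not values or structures taken from the paper.

```python
def series_win_prob(p_home, p_away, home_pattern):
    """Probability that team A wins a best-of-len(home_pattern) series.

    p_home / p_away: A's single-game win probability when hosting / visiting.
    home_pattern[i] is True when A hosts game i. Games are independent.
    Because games are independent, A wins the series exactly when A would
    win a majority of all games if every game were played, so it suffices
    to build the distribution of A's total wins."""
    dist = [1.0]  # dist[w] = P(A has w wins after the games processed so far)
    for at_home in home_pattern:
        p = p_home if at_home else p_away
        new = [0.0] * (len(dist) + 1)
        for w, pr in enumerate(dist):
            new[w] += pr * (1 - p)
            new[w + 1] += pr * p
        dist = new
    need = len(home_pattern) // 2 + 1
    return sum(dist[need:])


# Hypothetical 2-3-2 hosting pattern for a best-of-seven series.
pattern_232 = [True, True, False, False, False, True, True]
with_home_edge = series_win_prob(0.6, 0.5, pattern_232)
without_home_edge = series_win_prob(0.6, 0.5, [not h for h in pattern_232])
```

Comparing `with_home_edge` and `without_home_edge` shows how much hosting four games instead of three matters under the chosen single-game probabilities.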


Communications in Statistics - Simulation and Computation | 2015

p-values for the Discrete Scan Statistic through Slack Variables

Donald E. K. Martin

The discrete scan statistic is used in many areas of applied probability and statistics to study local clumping of patterns. Testing based on the statistic requires tail probabilities. Whereas the distribution has been studied extensively, most of the results are approximations, due to the difficulties associated with the computation. Results for exact p-values for the statistic have been given for a binary sequence that is independent or first-order Markovian. We give an algorithm to obtain probabilities for the statistic over multi-state trials that are Markovian of a general order of dependence, and explore the algorithm's usefulness.
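For the simplest case the abstract mentions, an independent binary sequence, an exact tail probability can be obtained by tracking the most recent outcomes in a small Markov chain. The sketch below covers only that case and is not the paper's algorithm, which handles multi-state trials with higher-order dependence and uses slack variables. The function name `scan_tail_prob` is hypothetical, and the state space grows as 2^(w-1), so this is practical only for short windows.

```python
def scan_tail_prob(n, w, k, p):
    """P(some window of w consecutive trials holds at least k successes)
    for n >= w independent Bernoulli(p) trials.

    State: the most recent (at most w - 1) outcomes. Probability mass that
    reaches a window with k or more successes is dropped, so the mass that
    remains after n trials is P(scan statistic < k)."""
    dist = {(): 1.0}
    for _ in range(n):
        new = {}
        for state, pr in dist.items():
            for x, px in ((1, p), (0, 1.0 - p)):
                window = state + (x,)
                # A short early window with >= k successes sits inside the
                # first full window (n >= w), so it can be dropped at once.
                if sum(window) >= k:
                    continue
                nxt = window[max(0, len(window) - (w - 1)):]
                new[nxt] = new.get(nxt, 0.0) + pr * px
        dist = new
    return 1.0 - sum(dist.values())
```

A call such as `scan_tail_prob(8, 3, 2, 0.3)` gives the p-value for observing at least two successes in some window of three among eight trials under the independent null model.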


Journal of Applied Statistics | 2015

Multiple window discrete scan statistic for higher-order Markovian sequences

Deidra A. Coleman; Donald E. K. Martin; Brian J. Reich

Accurate and efficient methods to detect unusual clusters of abnormal activity are needed in many fields such as medicine and business. Often the size of clusters is unknown; hence, multiple (variable) window scan statistics are used to identify clusters using a set of different potential cluster sizes. We give an efficient method to compute the exact distribution of multiple window discrete scan statistics for higher-order, multi-state Markovian sequences. We define a Markov chain to efficiently keep track of probabilities needed to compute p-values for the statistic. The state space of the Markov chain is set up by a criterion developed to identify strings that are associated with observing the specified values of the statistic. Using our algorithm, we identify cases where the available approximations do not perform well. We demonstrate our methods by detecting unusual clusters of made free throw shots by National Basketball Association players during the 2009–2010 regular season.


IAENG TRANSACTIONS ON ENGINEERING TECHNOLOGIES VOLUME 2: Special Edition of the World Congress on Engineering and Computer Science | 2009

Exact Distribution Of Statistics Of Hidden State Sequences Via Message Passing in Factor Graphs

Donald E. K. Martin; John A. D. Aston

We compute exact distributions of statistics of hidden state sequences through the sum‐product algorithm defined over cycle‐free factor graphs. Matrix operators are included to sequentially update a vector that indicates the statistic value corresponding to sums of products of evaluated potential functions. The methodology may be used for both undirected and directed models, with applications to discrete hidden state sequences perturbed by noise and/or missing values, and state sequences that serve to classify observations. Examples are given to illustrate the computational procedure.


Communications in Statistics - Simulation and Computation | 2018

Minimal auxiliary Markov chains through sequential elimination of states

Donald E. K. Martin

When using an auxiliary Markov chain to compute the distribution of a pattern statistic, the computational complexity is directly related to the number of Markov chain states. Theory related to minimal deterministic finite automata has been applied to large state spaces to reduce the number of Markov chain states so that only a minimal set remains. In this paper, a characterization of equivalent states is given so that extraneous states are deleted during the process of forming the state space, improving computational efficiency. The theory extends the applicability of Markov chain based methods for computing the distribution of pattern statistics.


Journal of Applied Probability | 2005

Waiting time distributions of competing patterns in higher-order Markovian sequences

John A. D. Aston; Donald E. K. Martin


Methodology and Computing in Applied Probability | 2013

Distribution of Statistics of Hidden State Sequences Through the Sum-Product Algorithm

Donald E. K. Martin; John A. D. Aston

Collaboration


Dive into Donald E. K. Martin's collaborations.

Top Co-Authors

Deidra A. Coleman

North Carolina State University

Brian J. Reich

North Carolina State University
