Jason M. Schwier | Researchain

Publication

Featured researches published by Jason M. Schwier.

Pattern Recognition Letters | 2009

Zero knowledge hidden Markov model inference

Jason M. Schwier; Richard R. Brooks; Christopher Griffin; Satish T. S. Bukkapatnam

Hidden Markov models (HMMs) are widely used in pattern recognition. HMM construction requires an initial model structure that is used as a starting point to estimate the model's parameters. To construct an HMM without a priori knowledge of the structure, we use an approach developed by Crutchfield and Shalizi that requires only a sequence of observations and a maximum data window size. Values of the maximum data window size that are too small result in incorrect models being constructed. Values that are too large reduce the number of data samples that can be considered and exponentially increase the algorithm's computational complexity. In this paper, we present a method for automatically inferring this parameter directly from training data as part of the model construction process. We present theoretical and experimental results that confirm the utility of the proposed extension.

systems man and cybernetics | 2009

Behavior Detection Using Confidence Intervals of Hidden Markov Models

Richard R. Brooks; Jason M. Schwier; Christopher Griffin

Markov models are commonly used to analyze real-world problems. Their combination of discrete states and stochastic transitions is suited to applications with deterministic and stochastic components. Hidden Markov models (HMMs) are a class of Markov models commonly used in pattern recognition. Currently, HMMs recognize patterns using a maximum-likelihood approach. One major drawback with this approach is that data observations are mapped to HMMs without considering the number of data samples available. Another problem is that this approach is only useful for choosing between HMMs. It does not provide a criterion for determining whether or not a given HMM adequately matches the data stream. In this paper, we recognize complex behaviors using HMMs and confidence intervals. The certainty of a data match increases with the number of data samples considered. Receiver operating characteristic curves are used to find the optimal threshold for either accepting or rejecting an HMM description. We present one example using a family of HMMs to show the utility of the proposed approach. A second example using models extracted from a database of consumer purchases provides additional evidence that this approach can perform better than existing techniques.
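
As a rough illustration of why sample count matters when accepting or rejecting a model, the sketch below tests a fully observable first-order Markov chain against a symbol sequence. It is not the authors' algorithm: the function names, the dictionary layout of `model`, and the normal-approximation interval with `z=1.96` are all assumptions made for this example.

```python
import math
from collections import Counter

def transition_counts(seq):
    """Count observed (state, next_state) pairs in a symbol sequence."""
    return Counter(zip(seq, seq[1:]))

def consistent_with_model(seq, model, z=1.96):
    """Return True if no modeled transition probability is rejected.

    `model` maps (state, next_state) -> probability (hypothetical layout).
    For each transition, the observed frequency must lie within
    z standard errors of the modeled probability, where the standard
    error shrinks as the state is visited more often.
    """
    counts = transition_counts(seq)
    visits = Counter(seq[:-1])  # how often each state has a successor
    for (state, nxt), p in model.items():
        n = visits[state]
        if n == 0:
            continue  # no evidence about this state, so nothing to reject
        p_hat = counts[(state, nxt)] / n
        half_width = z * math.sqrt(p * (1.0 - p) / n)
        if abs(p_hat - p) > half_width:
            return False
    return True

# Hypothetical models over the symbols 'a' and 'b'.
fair = {(s, t): 0.5 for s in "ab" for t in "ab"}
alternating = {("a", "b"): 1.0, ("b", "a"): 1.0}

short = list("ab")
long_run = list("ab" * 50)

print(consistent_with_model(short, fair))            # too little data to reject
print(consistent_with_model(long_run, fair))         # enough data to reject
print(consistent_with_model(long_run, alternating))  # matches the data
```

With two samples the fair-coin model cannot be rejected, while the same model is rejected once one hundred alternating symbols are available, which mirrors the abstract's point that the certainty of a match grows with the number of samples.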

IEEE Transactions on Knowledge and Data Engineering | 2013

Inferring Statistically Significant Hidden Markov Models

Lu Yu; Jason M. Schwier; Ryan Craven; Richard R. Brooks; Christopher Griffin

Hidden Markov models (HMMs) are used to analyze real-world problems. We consider an approach that constructs minimum entropy HMMs directly from a sequence of observations. If an insufficient amount of observation data is used to generate the HMM, the model will not represent the underlying process. Current methods assume that observations completely represent the underlying process. It is often the case that the training data size is not large enough to adequately capture all statistical dependencies in the system. It is, therefore, important to know the statistical significance level at which the constructed model represents the underlying process, not only the training set. In this paper, we present a method to determine if the observation data and constructed model fully express the underlying process with a given level of statistical significance. We use the statistics of the process to calculate an upper bound on the number of samples required to guarantee that the model has a given level of significance. We provide theoretical and experimental results that confirm the utility of this approach. The experiment is conducted on a real private Tor network.

IEEE Transactions on Automatic Control | 2011

A Hybrid Statistical Technique for Modeling Recurrent Tracks in a Compact Set

Christopher Griffin; Richard R. Brooks; Jason M. Schwier

In this technical note, we present a hybrid statistical approach for modeling a vehicle's behavior as it traverses a compact set in Euclidean space. We use Symbolic Transfer Functions (STF), developed by the authors for modeling stochastic input/output systems whose inputs and outputs are both purely symbolic. We apply STF to our problem by assuming that the input symbols represent regions of space through which a track is passing while the output represents specific linear functions that more precisely model the behavior of the track. A target's behavior is modeled at two levels of precision: the symbolic model provides a probability distribution on the next region of space and behavior (linear function) that a vehicle will execute, while the continuous model predicts the position of the vehicle using classical statistical methods. The following results are presented: (i) an algorithm that parsimoniously partitions the space of the vehicle and models the behavior in the partitions with linear functions; (ii) a demonstration of our approach using real-world ship track data.

IEEE Transactions on Systems, Man, and Cybernetics | 2013

A Normalized Statistical Metric Space for Hidden Markov Models

Chen Lu; Jason M. Schwier; Ryan Craven; Lu Yu; Richard R. Brooks; Christopher Griffin

In this paper, we present a normalized statistical metric space for hidden Markov models (HMMs). HMMs are widely used to model real-world systems. Like graph matching, some previous approaches compare HMMs by evaluating the correspondence, or goodness of match, between every pair of states, concentrating on the structure of the models instead of the statistics of the process being observed. To remedy this, we present a new metric space that compares the statistics of HMMs within a given level of statistical significance. Compared with the Kullback-Leibler divergence, which is another widely used approach for measuring model similarity, our approach is a true metric, can always return an appropriate distance value, and provides a confidence measure on the metric value. Experimental results are given for a sample application, which quantify the similarity of HMMs of network traffic in the Tor anonymization system. This application is interesting since it considers models extracted from a system that is intentionally trying to obfuscate its internal workings. In the conclusion, we discuss applications in less-challenging domains, such as data mining.

the internet of things | 2011

Side-Channel Analysis for Detecting Protocol Tunneling

Harakrishnan Bhanu; Jason M. Schwier; Ryan Craven; Richard R. Brooks; Kathryn Hempstalk; Daniele Gunetti; Christopher Griffin

Protocol tunneling is widely used to add security and/or privacy to Internet applications. Recent research has exposed side channel vulnerabilities that leak information about tunneled protocols. We first discuss the timing side channels that have been found in protocol tunneling tools. We then show how to infer hidden Markov models (HMMs) of network protocols from timing data and use the HMMs to detect when protocols are active. Unlike previous work, the HMM approach we present requires no a priori knowledge of the protocol. To illustrate the utility of this approach, we detect the use of English or Italian in interactive SSH sessions. For this example application, keystroke-timing data associates inter-packet delays with keystrokes. We first use clustering to extract discrete information from continuous timing data. We use discrete symbols to infer an HMM, and finally use statistical tests to determine if the observed timing is consistent with the language typing statistics. In our tests, if the correct window size is used, fewer than 2% of data windows are incorrectly identified. Experimental verification shows that on-line detection of language use in interactive encrypted protocol tunnels is reliable. We compare maximum likelihood and statistical hypothesis testing for detecting protocol tunneling. We also discuss how this approach is useful in monitoring mix networks like The Onion Router (Tor).
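
The step of turning continuous timing data into discrete symbols can be sketched as follows. The abstract does not say which clustering method was used, so this example substitutes a plain one-dimensional k-means (Lloyd's algorithm); the function names, the choice of `k`, and the sample delays are invented for illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values with Lloyd's algorithm; returns k centers."""
    ordered = sorted(values)
    # Deterministic start: centers at evenly spaced positions in sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def to_symbols(values, centers):
    """Map each value to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            for v in values]

# Made-up inter-packet delays in seconds: short gaps and long pauses.
delays = [0.10, 0.12, 0.11, 0.50, 0.52, 0.09, 0.49]
centers = kmeans_1d(delays, k=2)
symbols = to_symbols(delays, centers)
print(symbols)  # symbol 0 = short delay, symbol 1 = long delay
```

The resulting symbol sequence is the kind of discrete input from which a Markov model could then be inferred.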

systems man and cybernetics | 2011

Methods to Window Data to Differentiate Between Markov Models

Jason M. Schwier; Richard R. Brooks; Christopher Griffin

In this paper, we consider how we can detect patterns in data streams that are serial Markovian, where target behaviors are Markovian, but targets may switch from one Markovian behavior to another. We want to reliably and promptly detect behavior changes. Traditional Markov-model-based pattern detection approaches, such as hidden Markov models, use maximum likelihood techniques over the entire data stream to detect behaviors. To detect changes between behaviors, we use statistical pattern matching calculations performed on a sliding window of data samples. If the window size is very small, the system will suffer from excessive false-positive rates. If the window is very large, change-point detection is delayed. This paper finds both necessary and sufficient bounds on the window size. We present two methods of calculating window sizes based on the state and transition structures of the Markov models. Two application examples are presented to verify our results. Our first example problem uses simulations to illustrate the utility of the proposed approaches. The second example uses models extracted from a database of consumer purchases to illustrate their use in a real application.
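
A minimal sketch of windowed behavior detection is shown below. It labels each sliding window with whichever candidate Markov chain gives the higher log-likelihood, which is a simple stand-in for the statistical pattern matching described in the abstract, not the paper's method. The model names, the transition probabilities, and the window size of 4 are hypothetical.

```python
import math

def log_likelihood(window, model, floor=1e-9):
    """Sum of log transition probabilities for consecutive symbols.

    `model` maps (state, next_state) -> probability; unseen transitions
    are floored to avoid log(0).
    """
    return sum(math.log(max(model.get((s, t), 0.0), floor))
               for s, t in zip(window, window[1:]))

def label_windows(stream, models, size):
    """Label every length-`size` window with its best-scoring model name."""
    labels = []
    for start in range(len(stream) - size + 1):
        window = stream[start:start + size]
        best = max(models, key=lambda name: log_likelihood(window, models[name]))
        labels.append(best)
    return labels

# Two hypothetical behaviors over the symbols 'a' and 'b'.
models = {
    "sticky": {("a", "a"): 0.9, ("a", "b"): 0.1, ("b", "b"): 0.9, ("b", "a"): 0.1},
    "flippy": {("a", "a"): 0.1, ("a", "b"): 0.9, ("b", "b"): 0.1, ("b", "a"): 0.9},
}

# The source switches behavior halfway through the stream.
stream = list("aaaaaaaa" + "babababa")
labels = label_windows(stream, models, size=4)
print(labels[0], labels[-1])
```

A smaller `size` reacts to the switch sooner but bases each decision on fewer transitions, while a larger `size` is steadier but lags behind the change point, which is the trade-off the abstract describes.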

systems man and cybernetics | 2009

Markovian Search Games in Heterogeneous Spaces

Richard R. Brooks; Jason M. Schwier; Christopher Griffin

In this paper, we consider how to search for a mobile evader in a large heterogeneous region when sensors are used for detection. Sensors are modeled using probability of detection. Due to environmental effects, this probability will not be constant over the entire region. We map this problem to a graph-search problem, and even though deterministic graph search is NP-complete, we derive a tractable optimal probabilistic search strategy. We do this by defining the problem as a dynamic game played on a Markov chain. We prove that this strategy is optimal in the sense of Nash. Simulations of an example problem illustrate our approach and verify our claims.

american control conference | 2008

Determining a purely symbolic transfer function from symbol streams: Theory and algorithms

Christopher Griffin; Richard R. Brooks; Jason M. Schwier

Transfer function modeling is a standard technique in classical linear time invariant and statistical process control. The work of Box and Jenkins was seminal in developing methods for identifying parameters associated with classical (r, s, k) transfer functions. Discrete event systems are often used for modeling hybrid control structures and high-level decision problems. Examples include discrete time, discrete strategy repeated games. For these games, a discrete transfer function in the form of an accurate hidden Markov model of input-output relations could be used to derive optimal response strategies. In this paper, we develop an algorithm for creating probabilistic Mealy machines that act as transfer function models for discrete event dynamic systems (DEDS). Our models are defined by three parameters, (l1, l2, k), just as the Box-Jenkins transfer function models are. Here l1 is the maximal input history length to consider, l2 is the maximal output history length to consider, and k is the response lag. Using related results, we show that our Mealy machine transfer functions are optimal in the sense that they maximize the mutual information between the current known state of the DEDS and the next observed input/output pair.

international conference on wireless communications and mobile computing | 2011

Noise tolerant symbolic learning of Markov models of tunneled protocols

Harakrishnan Bhanu; Jason M. Schwier; Ryan Craven; İlker Özçelik; Christopher Griffin; Richard R. Brooks

Recent research has exposed timing side channel vulnerabilities in many security applications. Hidden Markov models (HMMs) have used timing data to extract passwords from cryptographically protected communications tunnels. We extend that work to show how HMMs of protocols can be extracted directly from observations of protocol timing artifacts with no a priori knowledge. Since our approach uses symbolic reasoning, an important question is how to best translate continuous data observations to symbolic data. This translation is problematic when observation variance makes continuous-to-symbolic translation unreliable. We examine this problem and show that the HMMs we infer compensate automatically for significant observation jitter and symbol misclassification. Experimental verification is presented.
