Blake Hunter
University of California, Davis
Publication
Featured research published by Blake Hunter.
International Conference on Data Mining | 2012
Yves van Gennip; Huiyi Hu; Blake Hunter; Mason A. Porter
We apply spectral clustering and multislice modularity optimization to a Los Angeles Police Department field interview card data set. To detect communities (i.e., cohesive groups of vertices), we use both geographic and social information about stops involving street gang members in the LAPD district of Hollenbeck. We then compare the algorithmically detected communities with known gang identifications and argue that discrepancies are due to sparsity of social connections in the data as well as complex underlying sociological factors that blur distinctions between communities.
Journal of Statistical Theory and Practice | 2008
Blake Hunter; Alan Krinik; C. Nguyen; Jennifer Switkes; H.F. von Bremen
We compute ruin probabilities, in both infinite and finite time, for a Gambler’s Ruin problem with both catastrophes and windfalls in addition to the customary win/loss probabilities. For constant transition probabilities, the infinite-time ruin probabilities are derived using difference equations. Finite-time ruin probabilities of a system having constant win/loss probabilities and variable catastrophe/windfall probabilities are determined using lattice path combinatorics. Formulae for the expected time until ruin and the expected duration of gambling are also developed. The infinite-time ruin probabilities for a system having variable win/loss/catastrophe probabilities but no windfall probability are found. Finally, the infinite-time ruin probabilities of a system with variable win/loss/catastrophe/windfall probabilities are determined.
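As a rough illustration of the difference-equation approach mentioned in this abstract, the sketch below solves for ruin probabilities on a finite state space. The model is an assumption made for illustration, not taken from the paper: fortunes run from 0 to n, and from any interior state the gambler wins one unit with probability p, loses one unit with probability q, suffers a catastrophe (jump to 0) with probability c, or receives a windfall (jump to n) with probability w. The function name `ruin_probabilities` and its parameters are hypothetical.

```python
def ruin_probabilities(n, p, q, c=0.0, w=0.0):
    """Return [r_0, ..., r_n], where r_i is the probability of ruin from fortune i.

    Assumed model (illustrative only): states 0..n, with 0 (ruin) and n absorbing.
    From an interior state i: +1 w.p. p, -1 w.p. q, jump to 0 w.p. c, jump to n w.p. w.
    The difference equation is r_i = p*r_{i+1} + q*r_{i-1} + c, with r_0 = 1, r_n = 0.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if min(p, q, c, w) < 0 or abs(p + q + c + w - 1.0) > 1e-12:
        raise ValueError("p, q, c, w must be nonnegative and sum to 1")

    m = n - 1                 # number of unknowns r_1 .. r_{n-1}
    rhs = [c] * m             # catastrophe term on the right-hand side
    rhs[0] += q               # boundary condition r_0 = 1 moves q*r_0 to the right

    # Thomas algorithm for the tridiagonal system -q*r_{i-1} + r_i - p*r_{i+1} = rhs_i.
    sup = [0.0] * m           # modified super-diagonal coefficients
    val = [0.0] * m           # modified right-hand side
    sup[0] = -p
    val[0] = rhs[0]
    for k in range(1, m):
        denom = 1.0 + q * sup[k - 1]
        sup[k] = -p / denom
        val[k] = (rhs[k] + q * val[k - 1]) / denom

    r = [0.0] * (n + 1)
    r[0] = 1.0                # ruin is certain at fortune 0; r[n] stays 0
    r[m] = val[m - 1]
    for k in range(m - 2, -1, -1):
        r[k + 1] = val[k] - sup[k] * r[k + 2]
    return r
```

With c = w = 0 and p = q = 0.5, this reduces to the classical fair-game result r_i = 1 - i/n, which is a convenient sanity check on the solver.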
ICNAAM 2010: International Conference of Numerical Analysis and Applied Mathematics 2010 | 2010
Blake Hunter; Thomas Strohmer
Data mining has become one of the fastest-growing research topics in mathematics and computer science. Data such as high-dimensional signals, magnetic resonance images, and hyperspectral images can be costly to acquire, or it may be infeasible to make even simple direct comparisons. Compressed sensing is a technique that addresses this issue. It is used for exact recovery of sparse signals using fewer measurements than the ambient dimension. Compressed sensing provides a bound on the error derived from making these few measurements of a signal. Our goal is to take advantage of these compressed sensing techniques to perform spectral clustering using far fewer measurements than the ambient dimension. The goal of clustering is to partition objects into groups such that objects within the same group are similar. Standard clustering such as k-means requires the space in which the objects are represented to be linearly separable. Spectral clustering methods allow for a wider range of underlying geometries, making them more flexible. Classification is the procedure of assigning labels to objects such that the labels of objects within the same cluster match those of previously labeled objects from a training set. Classification is traditionally a type of supervised learning problem that tries to learn a function from the data in order to predict the output of an unknown input from known input and output pairs. Clustering is an unsupervised learning problem where one is only given the unlabeled data and the goal is to learn the underlying structure. Spectral clustering uses local distance between data points (e.g., the Euclidean distance, d(x_i, x_j) = ‖x_i − x_j‖_2) to construct a graph G = (V, E). A traditional choice of edge weights uses the Gaussian kernel,
Medical Imaging 2018: Image Processing | 2018
Baichuan Yuan; Yoni Dukler; Long Zhao; Yizhou Qian; Yurun Ge; Shintaro Yamamoto; Blake Hunter; Andrea L. Bertozzi; Jesse T. Yen; Rafael Llerena
We consider the problem of automatically tracking the mitral valve in cardiac ultrasound time series and present an unsupervised method for decomposing and segmenting the mitral valve from noisy ultrasound videos. To do so we propose a Robust Nonnegative Matrix Factorization (RNMF) method that naturally decomposes the time series into three separate parts, highlighting the cardiac cycle, mitral valve, and ultrasound noise. The low rank component of RNMF captures the simple motions of the cardiac cycle effectively aside from the sporadic motion of the mitral valve tissue that is captured innately in our RNMF sparse signal term. Using the RNMF representation, we introduce a simple valve object detection algorithm. Our method performs especially well in noisy time series when existing methods fail, differentiating general noise from the subtle and complex motions of the mitral valve. The valve is then segmented using simple thresholding and diffusion. The method presented is highly robust to low quality ultrasound video, and does not require manual preprocessing, prior labeling, or any training data.
Journal of Data Science | 2018
Yves van Gennip; Blake Hunter; Anna Ma; Daniel Moyer; Ryan de Vera; Andrea L. Bertozzi
We consider the problem of duplicate detection in noisy and incomplete data: given a large data set in which each record has multiple entries (attributes), detect which distinct records refer to the same real-world entity. This task is complicated by noise (such as misspellings) and missing data, which can lead to records being different despite referring to the same entity. Our method consists of three main steps: creating a similarity score between records, grouping records together into “unique entities”, and refining the groups. We compare various methods for creating similarity scores between noisy records, considering different combinations of string matching, term frequency-inverse document frequency methods, and n-gram techniques. In particular, we introduce a vectorized soft term frequency-inverse document frequency method, with an optional refinement step. We also discuss two methods to deal with missing data in computing similarity scores. We test our method on the Los Angeles Police Department Field Interview Card data set, the Cora Citation Matching data set, and two sets of restaurant review data. The results show that the methods that use words as the basic units are preferable to those that use 3-grams. Moreover, in some (but certainly not all) parameter ranges, soft term frequency-inverse document frequency methods can outperform the standard term frequency-inverse document frequency method. The results also confirm that our method for automatically determining the number of groups works well in many cases and allows for accurate results in the absence of a priori knowledge of the number of unique entities in the data set.
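The first step described in this abstract, scoring the similarity between records, can be illustrated with a minimal sketch of standard word-level term frequency-inverse document frequency with cosine similarity. This is the baseline method, not the vectorized soft variant the paper introduces. The smoothed inverse document frequency formula is one common convention chosen here as an assumption, and the function names and sample records are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(records):
    """Map each record (a string) to a unit-length sparse TF-IDF vector (a dict)."""
    token_counts = [Counter(rec.lower().split()) for rec in records]
    n = len(token_counts)

    # Document frequency: in how many records does each word appear?
    doc_freq = Counter()
    for counts in token_counts:
        doc_freq.update(counts.keys())

    vectors = []
    for counts in token_counts:
        # Smoothed IDF (an assumed convention): log((1 + n) / (1 + df)) + 1.
        vec = {
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({t: v / norm for t, v in vec.items()} if norm > 0 else {})
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(weight * v.get(term, 0.0) for term, weight in u.items())


# Hypothetical sample records; the first two are noisy variants of one entity.
sample_records = [
    "joe's pizza main street",
    "joes pizza main st",
    "golden dragon noodle house",
]
sample_vectors = tfidf_vectors(sample_records)
```

Here `cosine_similarity(sample_vectors[0], sample_vectors[1])` is positive because the two records share the words "pizza" and "main", while the score against the third record is zero. Exact word matching misses the "joe's"/"joes" misspelling, which is the kind of noise that soft matching methods are designed to tolerate.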
SIAM Journal on Applied Mathematics | 2013
Yves van Gennip; Blake Hunter; Raymond Ahn; Peter Elliott; Kyle Luh; Megan Halvorson; Shannon E. Reid; Matthew Valasik; James Wo; George E. Tita; Andrea L. Bertozzi; P. Jeffrey Brantingham
arXiv: Numerical Analysis | 2010
Blake Hunter; Thomas Strohmer
IMA Journal of Applied Mathematics | 2016
Eric L. Lai; Daniel Moyer; Baichuan Yuan; Eric Warren Fox; Blake Hunter; Andrea L. Bertozzi; P. Jeffrey Brantingham
International Conference on Data Mining | 2012
Huiyi Hu; Yves van Gennip; Blake Hunter; Andrea L. Bertozzi; Mason A. Porter
Archive | 2011
Blake Hunter