Publication


Featured research published by Jerod J. Weinman.


International Journal of Computer Vision | 2012

On Learning Conditional Random Fields for Stereo

Chris Pal; Jerod J. Weinman; Lam C. Tran; Daniel Scharstein

Until recently, the lack of ground truth data has hindered the application of discriminative structured prediction techniques to the stereo problem. In this paper we use ground truth data sets that we have recently constructed to explore different model structures and parameter learning techniques. To estimate parameters in Markov random fields (MRFs) via maximum likelihood one usually needs to perform approximate probabilistic inference. Conditional random fields (CRFs) are discriminative versions of traditional MRFs. We explore a number of novel CRF model structures including a CRF for stereo matching with an explicit occlusion model. CRFs require expensive inference steps for each iteration of optimization and inference is particularly slow when there are many discrete states. We explore belief propagation, variational message passing and graph cuts as inference methods during learning and compare with learning via pseudolikelihood. To accelerate approximate inference we have developed a new method called sparse variational message passing which can reduce inference time by an order of magnitude with negligible loss in quality. Learning using sparse variational message passing improves upon previous approaches using graph cuts and allows efficient learning over large data sets when energy functions violate the constraints imposed by graph cuts.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation

Jerod J. Weinman; Erik G. Learned-Miller; Allen R. Hanson

Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and storefronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are used post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19 percent, the lexicon reduces word recognition error by 35 percent, and sparse belief propagation reduces the lexicon words considered by 99.9 percent with a 12X speedup and no loss in accuracy.
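
The abstract describes sparse belief propagation only at a high level. As a rough illustration of the pruning idea, the sketch below keeps the smallest set of hypotheses that holds almost all of a belief's probability mass and drops the rest. The function name `sparsify`, the threshold `eps`, and the cumulative-mass rule are assumptions made for this example; the paper's actual criterion may differ.

```python
def sparsify(belief, eps=1e-3):
    """Keep the smallest set of states holding at least 1 - eps of the mass.

    belief: list of non-negative scores, one per state (hypothetical input).
    Returns a dict mapping each kept state index to its renormalized weight.
    """
    total = sum(belief)
    # Visit states from most to least supported.
    order = sorted(range(len(belief)), key=lambda i: belief[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += belief[i] / total
        if mass >= 1.0 - eps:
            break
    # Renormalize over the surviving states only.
    kept_mass = sum(belief[i] for i in kept)
    return {i: belief[i] / kept_mass for i in kept}
```

Messages computed from a sparsified belief only need to range over the kept states, which is where the savings would come from when the state space (for example, a large lexicon) is mostly weakly supported.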


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

Toward Integrated Scene Text Reading

Jerod J. Weinman; Zachary Butler; Dugan Knoll; Jacqueline L. Feild

The growth in digital camera usage combined with a worldly abundance of text has translated to a rich new era for a classic problem of pattern recognition, reading. While traditional document processing often faces challenges such as unusual fonts, noise, and unconstrained lexicons, scene text reading amplifies these challenges and introduces new ones such as motion blur, curved layouts, perspective projection, and occlusion among others. Reading scene text is a complex problem involving many details that must be handled effectively for robust, accurate results. In this work, we describe and evaluate a reading system that combines several pieces, using probabilistic methods for coarsely binarizing a given text region, identifying baselines, and jointly performing word and character segmentation during the recognition process. By using scene context to recognize several words together in a line of text, our system gives state-of-the-art performance on three difficult benchmark data sets.


Computer Vision and Pattern Recognition | 2005

Automatic Sign Detection and Recognition in Natural Scenes

Piyanuch Silapachote; Jerod J. Weinman; Allen R. Hanson; Marwan A. Mattar; Richard S. Weiss

Visually impaired individuals are unable to utilize the significant amount of information in signs. VIDI is a system for detecting and recognizing signs in the environment and voice synthesizing their contents. The wide variety of signs and unconstrained imaging conditions make the problem challenging. We detect signs using local color and texture features to classify image regions with a conditional maximum entropy model. Detected sign regions are then recognized by matching them against a known database of signs. A support vector machine classifier uses color to focus the search, and a match is found based on the correspondences of corners and their associated shape contexts. Our dataset includes images of downtown scenes with several signs exhibiting both illumination differences and projective distortions. A wide range of signs are detected and recognized including both text and symbolic information. The detection and the recognition components each perform well on their respective tasks, and initial evaluations of a complete detection and recognition system are promising.


Proceedings of the 2004 14th IEEE Signal Processing Society Workshop Machine Learning for Signal Processing, 2004. | 2004

Sign detection in natural images with conditional random fields

Jerod J. Weinman; Allen R. Hanson; Andrew McCallum

Traditional generative Markov random fields for segmenting images model the image data and corresponding labels jointly, which requires extensive independence assumptions for tractability. We present the conditional random field for an application in sign detection, using typical scale and orientation selective texture filters and a nonlinear texture operator based on the grating cell. The resulting model captures dependencies between neighboring image region labels in a data-dependent way that escapes the difficult problem of modeling image formation, instead focusing effort and computation on the labeling task. We compare the results of training the model with pseudo-likelihood against an approximation of the full likelihood with the iterative tree reparameterization algorithm and demonstrate improvement over previous methods.


Computer Vision and Pattern Recognition | 2006

Improving Recognition of Novel Input with Similarity

Jerod J. Weinman; Erik G. Learned-Miller

Many sources of information relevant to computer vision and machine learning tasks are often underused. One example is the similarity between the elements from a novel source, such as a speaker, writer, or printed font. By comparing instances emitted by a source, we help ensure that similar instances are given the same label. Previous approaches have clustered instances prior to recognition. We propose a probabilistic framework that unifies similarity with prior identity and contextual information. By fusing information sources in a single model, we eliminate unrecoverable errors that result from processing the information in separate stages and improve overall accuracy. The framework also naturally integrates dissimilarity information, which has previously been ignored. We demonstrate with an application in printed character recognition from images of signs in natural scenes.


International Conference on Pattern Recognition | 2008

A discriminative semi-Markov model for robust scene text recognition

Jerod J. Weinman; Erik G. Learned-Miller; Allen R. Hanson

We present a semi-Markov model for recognizing scene text that integrates character and word segmentation with recognition. Using wavelet features, it requires only approximate location of the text baseline and font size; no binarization or prior word segmentation is necessary. Our system is aided by a lexicon, yet it also allows non-lexicon words. To facilitate inference with a large lexicon, we use an approximate Viterbi beam search. Our system performs robustly on low-resolution images of signs containing text in fonts atypical of documents.
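
To make the idea of an approximate Viterbi beam search concrete, here is a minimal sketch on a plain first-order chain, not the semi-Markov model of the paper. The function name `beam_viterbi`, the input layout, and the `beam_width` parameter are assumptions chosen for illustration.

```python
def beam_viterbi(emissions, transition, beam_width):
    """Approximate Viterbi decoding that keeps only the top hypotheses.

    emissions: list of dicts, one per position, mapping state -> log score.
    transition: function (prev_state, state) -> log score.
    Returns (score, path) for the best hypothesis that survives pruning.
    """
    # Each hypothesis is a (score, path) pair; start from the first position.
    beam = [(s, [state]) for state, s in emissions[0].items()]
    beam = sorted(beam, key=lambda h: h[0], reverse=True)[:beam_width]
    for scores in emissions[1:]:
        best = {}  # best extension found so far for each end state
        for prev_score, path in beam:
            for state, s in scores.items():
                cand = prev_score + transition(path[-1], state) + s
                if state not in best or cand > best[state][0]:
                    best[state] = (cand, path + [state])
        # Prune: keep only the highest-scoring end states.
        beam = sorted(best.values(), key=lambda h: h[0], reverse=True)[:beam_width]
    return beam[0]
```

With a beam as wide as the state space this reduces to exact Viterbi; a narrower beam trades exactness for speed, which matters when the states include entries from a large lexicon.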


International Conference on Pattern Recognition | 2010

Typographical Features for Scene Text Recognition

Jerod J. Weinman

Scene text images feature an abundance of font style variety but a dearth of data in any given query. Recognition methods must be robust to this variety or adapt to the query data's characteristics. To achieve this, we augment a semi-Markov model, which integrates character segmentation and recognition, with a bigram model of character widths. Softly promoting segmentations that exhibit font metrics consistent with those learned from examples, we use the limited information available while avoiding error-prone direct estimates and hard constraints. Incorporating character width bigrams in this fashion improves recognition on low-resolution images of signs containing text in many fonts.


European Conference on Computer Vision | 2008

Efficiently Learning Random Fields for Stereo Vision with Sparse Message Passing

Jerod J. Weinman; Lam C. Tran; Chris Pal

As richer models for stereo vision are constructed, there is a growing interest in learning model parameters. To estimate parameters in Markov Random Field (MRF) based stereo formulations, one usually needs to perform approximate probabilistic inference. Message passing algorithms based on variational methods and belief propagation are widely used for approximate inference in MRFs. Conditional Random Fields (CRFs) are discriminative versions of traditional MRFs and have recently been applied to the problem of stereo vision. However, CRF parameter training typically requires expensive inference steps for each iteration of optimization. Inference is particularly slow when there are many discrete disparity levels, due to high state space cardinality. We present a novel CRF for stereo matching with an explicit occlusion model and propose sparse message passing to dramatically accelerate the approximate inference needed for parameter optimization. We show that sparse variational message passing iteratively minimizes the KL divergence between the approximation and model distributions by optimizing a lower bound on the partition function. Our experimental results show reductions in inference time of one order of magnitude with no loss in approximation quality. Learning using sparse variational message passing improves results over prior work using graph cuts.
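
The link between minimizing the KL divergence and bounding the partition function follows from a standard variational identity, sketched here in generic notation. The symbols are assumptions for this note and not the paper's own: the model is written as $p(x) = \exp(-E(x))/Z$ with energy $E$, the approximation is $q$, and $H(q)$ is its entropy; conditioning on the input images is omitted.

```latex
\begin{align}
\mathrm{KL}(q \,\|\, p) &= \sum_x q(x) \log \frac{q(x)}{p(x)}
  = \mathbb{E}_q[E(x)] - H(q) + \log Z \;\ge\; 0, \\
\log Z &\ge H(q) - \mathbb{E}_q[E(x)].
\end{align}
```

Because $\log Z$ does not depend on $q$, raising the right-hand side of the bound is equivalent to lowering $\mathrm{KL}(q \,\|\, p)$.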


International Conference on Document Analysis and Recognition | 2007

Fast Lexicon-Based Scene Text Recognition with Sparse Belief Propagation

Jerod J. Weinman; Erik G. Learned-Miller; Allen R. Hanson

Using a lexicon can often improve character recognition under challenging conditions, such as poor image quality or unusual fonts. We propose a flexible probabilistic model for character recognition that integrates local language properties, such as bigrams, with lexical decision, having open and closed vocabulary modes that operate simultaneously. Lexical processing is accelerated by performing inference with sparse belief propagation, a bottom-up method for hypothesis pruning. We give experimental results on recognizing text from images of signs in outdoor scenes. Incorporating the lexicon reduces word recognition error by 42% and sparse belief propagation reduces the number of lexicon words considered by 97%.

Collaboration


Dive into Jerod J. Weinman's collaborations.

Top Co-Authors

Allen R. Hanson
University of Massachusetts Amherst

Erik G. Learned-Miller
University of Massachusetts Amherst

Chris Pal
École Polytechnique de Montréal

Andrew McCallum
University of Massachusetts Amherst

Edward M. Riseman
University of Massachusetts Amherst

George Dean Bissias
University of Massachusetts Amherst

Joseph Horowitz
University of Massachusetts Amherst

Lam C. Tran
University of California