Gregory K. Myers
SRI International
Publication
Featured research published by Gregory K. Myers.
International Conference on Document Analysis and Recognition | 2005
Hrishikesh B. Aradhye; Gregory K. Myers; James A. Herson
To circumvent prevalent text-based anti-spam filters, spammers have begun embedding advertisement text in images. Analogously, proprietary information (such as source code) may be communicated as screenshots to defeat text-based monitoring of outbound e-mail. The proposed method separates spam images from other common categories of e-mail images based on extracted overlay text and color features. No expensive OCR processing is necessary. Our method works robustly in spite of complex backgrounds, compression artifacts, and a wide variety of formats and fonts of overlaid spam text. It is also shown to successfully detect screenshots in outbound e-mail.
International Journal on Document Analysis and Recognition | 2005
Gregory K. Myers; Robert C. Bolles; Quang-Tuan Luong; James A. Herson; Hrishikesh B. Aradhye
Real-world text on street signs, nameplates, etc. often lies in an oblique plane and hence cannot be recognized by traditional OCR systems due to perspective distortion. Furthermore, such text often comprises only one or two lines, preventing the use of existing perspective rectification methods that were primarily designed for images of document pages. We propose an approach that reliably rectifies and subsequently recognizes individual lines of text. Our system, which includes novel algorithms for extraction of text from real-world scenery, perspective rectification, and binarization, has been rigorously tested on still imagery as well as on MPEG-2 video clips in real time.
Computer Vision and Pattern Recognition | 2005
Katherine Donaldson; Gregory K. Myers
To increase the range of sizes of video scene text recognizable by optical character recognition (OCR), we developed a Bayesian super-resolution algorithm that uses a text-specific bimodal prior. We evaluated the effectiveness of the bimodal prior, compared with and in conjunction with a piecewise smoothness prior, visually and by measuring the accuracy of the OCR results on the variously super-resolved images. The bimodal prior improved the readability of 4- to 7-pixel-high scene text significantly better than bicubic interpolation, and increased the accuracy of OCR results better than the piecewise smoothness prior.
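The idea of a bimodal prior can be illustrated with a small sketch. The abstract does not give the functional form used, so the quartic penalty below is an assumption chosen only because it has two minima, one at an assumed background intensity and one at an assumed text intensity. The names `bimodal_penalty`, `mu_dark`, and `mu_light` are hypothetical.

```python
def bimodal_penalty(pixels, mu_dark, mu_light):
    """Sum of a per-pixel penalty that is zero at either mode.

    Assumed form (not taken from the paper): (p - mu_dark)^2 * (p - mu_light)^2.
    Pixels near the text or background intensity cost little;
    pixels halfway between the two modes cost the most.
    """
    return sum(((p - mu_dark) ** 2) * ((p - mu_light) ** 2) for p in pixels)


# Intensities in [0, 1]; modes assumed at 0.0 (dark) and 1.0 (light).
crisp = bimodal_penalty([0.0, 1.0, 1.0, 0.0], 0.0, 1.0)
blurry = bimodal_penalty([0.5, 0.5, 0.5, 0.5], 0.0, 1.0)
print(crisp, blurry)
```

In a Bayesian (MAP) formulation, a term like this would typically be added, with a weight, to a data-fidelity term, so that the super-resolved image is pulled toward crisp two-tone text.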
Graphics Recognition | 1995
Gregory K. Myers; Prasanna G. Mulgaonkar; Chien-Huei Chen; Jeff L. DeCurtins; Edward Chen
Existing systems for converting maps to an object-oriented form suitable for a geographic information system (GIS) are only partially automated. Most published approaches for automated interpretation of raster-scanned maps assume that the map is composed of various graphic entities, and that the vast majority of pixel positions on the map each belong to only one type of graphic entity and can therefore be geometrically segmented. However, complex color topographic maps contain several layers of information that overlap substantially (often within a single color plane), making it impossible to geometrically segment the map data into distinct regions containing a single class of graphic object. Here we describe a verification-based approach that uses various knowledge bases to detect, extract, and attribute map features without requiring the presegmentation of graphical entities. This approach builds on SRI International's (SRI's) verification-based computer vision and character recognition methodologies. The approach can also be applied to other types of documents containing a mix of text and graphics, such as engineering drawings, electrical schematics, and technical illustrations.
Machine Vision and Applications | 2014
Gregory K. Myers; Ramesh Nallapati; Julien van Hout; Stephanie Pancoast; Ramakant Nevatia; Chen Sun; Amirhossein Habibian; Dennis Koelma; Koen E. A. van de Sande; Arnold W. M. Smeulders; Cees G. M. Snoek
Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME’s performance in the 2012 TRECVID MED evaluation was one of the best reported.
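Arithmetic-mean late fusion, the simple method the abstract singles out, can be sketched as follows. This is a minimal illustration, not SESAME's implementation. It assumes each classifier already outputs scores on a comparable scale, and the function name `fuse_mean` and the example scores are hypothetical.

```python
from statistics import fmean


def fuse_mean(scores_per_classifier):
    """Average the detection scores of several classifiers, video by video.

    scores_per_classifier: list of equal-length score lists, one per classifier.
    Returns one fused score per video.
    """
    if not scores_per_classifier:
        raise ValueError("need at least one classifier")
    n_videos = len(scores_per_classifier[0])
    if any(len(s) != n_videos for s in scores_per_classifier):
        raise ValueError("all classifiers must score the same videos")
    # zip(*...) regroups the scores so each tuple holds one video's scores.
    return [fmean(video_scores) for video_scores in zip(*scores_per_classifier)]


# Hypothetical scores for three videos from three single-modality classifiers.
visual = [0.9, 0.2, 0.4]
motion = [0.7, 0.1, 0.6]
audio = [0.8, 0.3, 0.2]
fused = fuse_mean([visual, motion, audio])
print(fused)
```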
International Journal on Document Analysis and Recognition | 2005
Katherine Donaldson; Gregory K. Myers
To increase the range of sizes of video scene text recognizable by optical character recognition (OCR), we developed a Bayesian super-resolution algorithm that uses a text-specific bimodal prior. We evaluated the effectiveness of the bimodal prior, compared with and in conjunction with a piecewise smoothness prior, visually and by measuring the accuracy of the OCR results on the variously super-resolved images. The bimodal prior improved the readability of 4- to 7-pixel-high scene text significantly better than bicubic interpolation and increased the accuracy of OCR results better than the piecewise smoothness prior.
International Conference on Multimedia Retrieval | 2014
Chen Sun; Brian Burns; Ram Nevatia; Cees G. M. Snoek; Bob Bolles; Gregory K. Myers; Wen Wang; Eric Yeh
This paper describes a system for multimedia event detection and recounting. The goal is to detect a high-level event class in unconstrained web videos and generate event-oriented summarization for display to users. For this purpose, we detect informative segments and collect observations for them, leading to our ISOMER system. We combine a large collection of both low-level and semantic-level visual and audio features for event detection. For event recounting, we propose a novel approach to identify event-oriented discriminative video segments and their descriptions with a linear SVM event classifier. User-friendly concepts including objects, actions, scenes, speech, and optical character recognition are used in generating descriptions. We also develop several mapping and filtering strategies to cope with noisy concept detectors. Our system performed competitively in the TRECVID 2013 Multimedia Event Detection task with nearly 100,000 videos and was the highest performer in the TRECVID 2013 Multimedia Event Recounting task.
International Conference on Document Analysis and Recognition | 2003
Kenneth Nitz; Wayne T. Cruz; Hrishikesh B. Aradhye; Talia Shaham; Gregory K. Myers
When mixed mail enters a postal facility, it must first be faced and oriented so that the address is readable by mail processors. Existing USPS systems face and orient domestic mail by searching for fluorescing indicia on each mail piece. However, stamps on foreign-originated mail do not fluoresce, so the processing systems cannot sort foreign mail. Furthermore, even the facing of domestic mail is subject to processing problems such as mechanical malfunction and misplaced or partially fluorescing indicia, which cause a significant fraction of mail to be rejected. Previously, rejected domestic mail and all foreign mail had to be faced and oriented by hand, thus increasing mail processing costs for the USPS. This work aims to eliminate these costs by developing an image-based facing system that processes scanned images of both sides of each envelope and automatically faces and orients it in real time. The USPS began to deploy this technology nationwide in November 2002.
International Conference on Document Analysis and Recognition | 2007
Hrishikesh B. Aradhye; Gregory K. Myers
Text in video, whether overlay or in-scene, contains a wealth of information vital to automated content analysis systems. However, the low resolution of the imagery, coupled with the richness of the background and compression artifacts, limits the detection accuracy that can be achieved in practice using existing text detection algorithms. This paper presents a novel, non-causal temporal aggregation method that acts as a second pass over the output of an existing text detector over the entire video clip. A multiresolution change detection algorithm is used along the time axis to detect the appearance and disappearance of multiple, concurrent lines of text, followed by recursive time-averaged projections on the Y and X axes. This algorithm detects and rectifies instances of missed text and enhances spatial boundaries of detected text lines using consensus estimates. Experimental results, which demonstrate significant performance gain on publicly collected and annotated data, are presented.
International Conference on Acoustics, Speech, and Signal Processing | 2014
J. van Hout; Eric Yeh; Dennis Koelma; Cees G. M. Snoek; Chen Sun; Ram Nevatia; Julie Wong; Gregory K. Myers
The state of the art in example-based multimedia event detection (MED) rests on heterogeneous classifiers whose scores are typically combined in a late-fusion scheme. Recent studies on this topic have failed to reach a clear consensus as to whether machine learning techniques can outperform rule-based fusion schemes with varying amounts of training data. In this paper, we present two parametric approaches to late fusion: a normalization scheme for arithmetic mean fusion (logistic averaging) and a fusion scheme based on logistic regression, and compare them to widely used rule-based fusion schemes. We also describe how logistic regression can be used to calibrate the fused detection scores to predict an optimal threshold given a detection prior and costs on errors. We discuss the advantages and shortcomings of each approach when the amount of positives available for training varies from 10 positives (10Ex) to 100 positives (100Ex). Experiments were run using video data from the NIST TRECVID MED 2013 evaluation, and results were reported in terms of a ranking metric, the mean average precision (mAP), and R0, a cost-based metric introduced in TRECVID MED 2013.
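The two ideas named in the abstract, logistic normalization before averaging and a threshold derived from a prior and error costs, can be sketched as below. The paper's exact formulation is not given here, so this is an assumption-laden illustration: the per-classifier `(slope, offset)` parameters are hypothetical and assumed to be fitted elsewhere, and the threshold is the textbook Bayes decision rule on a calibrated log-likelihood ratio, not necessarily the authors' procedure.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_average(raw_scores, params):
    """Normalize each classifier's raw score with its own logistic map, then average.

    raw_scores: one raw score per classifier for a single video.
    params: one (slope, offset) pair per classifier (hypothetical, fitted elsewhere).
    """
    if len(raw_scores) != len(params):
        raise ValueError("need one (slope, offset) pair per classifier")
    probs = [sigmoid(a * s + b) for s, (a, b) in zip(raw_scores, params)]
    return sum(probs) / len(probs)


def bayes_llr_threshold(prior, cost_miss, cost_false_alarm):
    """Log-likelihood-ratio threshold that minimizes expected cost.

    Standard Bayes decision rule: declare a detection when the calibrated
    log-likelihood ratio exceeds log(C_fa * (1 - prior) / (C_miss * prior)).
    """
    return math.log(cost_false_alarm * (1.0 - prior) / (cost_miss * prior))


fused = logistic_average([0.0, 0.5], [(1.0, 0.0), (2.0, -1.0)])
neutral = bayes_llr_threshold(prior=0.5, cost_miss=1.0, cost_false_alarm=1.0)
rare = bayes_llr_threshold(prior=0.001, cost_miss=80.0, cost_false_alarm=1.0)
print(fused, neutral, rare)
```

With equal costs and a prior of 0.5 the threshold is zero. A rare event raises the threshold unless the cost of a miss is high enough to offset the low prior.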