Publications


Featured research published by Hrishikesh B. Aradhye.


International Conference on Document Analysis and Recognition | 2005

Image analysis for efficient categorization of image-based spam e-mail

Hrishikesh B. Aradhye; Gregory K. Myers; James A. Herson

To circumvent prevalent text-based anti-spam filters, spammers have begun embedding the advertisement text in images. Analogously, proprietary information (such as source code) may be communicated as screenshots to defeat text-based monitoring of outbound e-mail. The proposed method separates spam images from other common categories of e-mail images based on extracted overlay text and color features. No expensive OCR processing is necessary. Our method works robustly in spite of complex backgrounds, compression artifacts, and a wide variety of formats and fonts of overlaid spam text. It is also demonstrated successfully to detect screen-shots in outbound e-mail.
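
A minimal sketch of the kind of OCR-free decision the abstract describes, classifying an e-mail image from overlay-text and color cues alone. The thresholds, the transition-count proxy for overlaid text, and the coarse color-bin measure are all illustrative assumptions, not the paper's actual features:

```python
import numpy as np

def overlay_text_score(gray):
    """Fraction of rows with many strong horizontal intensity transitions,
    a crude proxy for rows containing overlaid text (no OCR involved)."""
    edges = np.abs(np.diff(gray.astype(float), axis=1)) > 60
    transitions_per_row = edges.sum(axis=1)
    return float((transitions_per_row > gray.shape[1] * 0.05).mean())

def color_simplicity_score(rgb):
    """Spam graphics tend to use few, saturated colors; measure how much
    of the image falls in the few most common coarse color bins."""
    bins = (rgb // 64).reshape(-1, 3)
    _, counts = np.unique(bins, axis=0, return_counts=True)
    top = np.sort(counts)[::-1][:4].sum()
    return top / counts.sum()

def looks_like_spam_image(rgb, text_thresh=0.2, color_thresh=0.6):
    """Hypothetical decision rule combining the two cheap cues."""
    gray = rgb.mean(axis=2)
    return (overlay_text_score(gray) > text_thresh
            and color_simplicity_score(rgb) > color_thresh)
```

The point of the sketch is that both cues are cheap raster statistics computed directly from pixels; no character recognition is attempted.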


International Journal on Document Analysis and Recognition | 2005

Rectification and recognition of text in 3-D scenes

Gregory K. Myers; Robert C. Bolles; Quang-Tuan Luong; James A. Herson; Hrishikesh B. Aradhye

Real-world text on street signs, nameplates, etc. often lies in an oblique plane and hence cannot be recognized by traditional OCR systems due to perspective distortion. Furthermore, such text often comprises only one or two lines, preventing the use of existing perspective rectification methods that were primarily designed for images of document pages. We propose an approach that reliably rectifies and subsequently recognizes individual lines of text. Our system, which includes novel algorithms for extraction of text from real-world scenery, perspective rectification, and binarization, has been rigorously tested on still imagery as well as on MPEG-2 video clips in real time.


International Conference on Multimedia and Expo | 2001

End-to-end videotext recognition for multimedia content analysis

Chitra Dorai; Hrishikesh B. Aradhye; Jae-Chang Shim

Videotext refers to text superimposed on still images and video frames, and a videotext-based Multimedia Description Scheme has recently been adopted into the MPEG-7 standard as one of the normative media content description interfaces. While much of the previous work, including ours, concentrates on the task of locating and extracting text from video frames automatically, very little research has focused on reliably recognizing the segmented text. The low resolution of videotext, unconstrained font styles and sizes, and poor separation of characters, often resulting from video compression and decoding, all pose severe problems even to commercial OCRs in recognizing videotext accurately. This paper describes a novel end-to-end video character recognition system featuring new character attributes emphasizing macro shapes, a Support Vector Machine-based character classifier, videotext object synthesis, font context analysis, and temporal contiguity analysis to successfully address the issues confounding accurate videotext recognition. We present results from our experiments with real video data that demonstrate the strengths of this system.


International Conference on Document Analysis and Recognition | 2003

An image-based mail facing and orientation system for enhanced postal automation

Kenneth Nitz; Wayne T. Cruz; Hrishikesh B. Aradhye; Talia Shaham; Gregory K. Myers

When mixed mail enters a postal facility, it must first be faced and oriented so that the address is readable by mail processors. Existing USPS systems face and orient domestic mail by searching for fluorescing indicia on each mail piece. However, stamps on foreign-originated mail do not fluoresce, so the processing systems cannot sort foreign mail. Furthermore, even the facing of domestic mail is subject to processing problems such as mechanical malfunction and misplaced or partially fluorescing indicia, which cause a significant fraction of mail to be rejected. Previously, rejected domestic mail and all foreign mail had to be faced and oriented by hand, thus increasing mail processing costs for the USPS. This work aims to eliminate these costs by developing an image-based facing system that processes scanned images of both sides of each envelope and automatically faces and orients it in real time. The USPS began to deploy this technology nationwide in November 2002.


International Conference on Multimedia and Expo | 2002

New kernels for analyzing multimodal data in multimedia using kernel machines

Hrishikesh B. Aradhye; Chitra Dorai

Research in automated analysis of digital media content has led to a large collection of low-level feature extractors, such as face detectors, videotext extractors, speech and speaker identifiers, people/vehicle trackers, and event locators. These media metadata are often symbolic rather than continuous-valued, and pose significant difficulty to subsequent tasks such as classification and dimensionality reduction which traditionally deal with continuous-valued data. This paper proposes a novel mechanism that extends tasks traditionally limited to continuous-valued feature spaces, such as (a) dimensionality reduction, (b) de-noising, and (c) clustering, to domains with symbolic features. To this end, we introduce new kernels based on well-known distance metrics, and prove Mercer validity of these kernels for analyzing symbolic feature spaces. We demonstrate their usefulness within the context of kernel-space methods such as Kernel PCA and SVM, in classifying machine learning datasets from the UCI repository and in temporal clustering and tracking of videotext in multimedia. We show that the generalized kernels help capture information from symbolic feature spaces, visualize symbolic data, and aid tasks such as classification and clustering, and therefore are useful in multimodal analysis of multimedia.
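
The kernel construction can be illustrated with one concrete example: an overlap kernel that scores two symbolic vectors by the fraction of attributes on which they agree. Because this equals a scaled inner product of one-hot encodings, it is Mercer-valid; the specific kernel below is an illustrative instance, not necessarily one of the kernels proposed in the paper:

```python
import numpy as np

def overlap_kernel(X, Y):
    """k(x, y) = fraction of symbolic attributes on which x and y agree.
    Equivalent to a scaled inner product of one-hot encodings, so every
    Gram matrix it produces is positive semidefinite (Mercer-valid)."""
    X, Y = np.asarray(X, dtype=object), np.asarray(Y, dtype=object)
    m = X.shape[1]
    return np.array([[np.sum(x == y) / m for y in Y] for x in X])

# Tiny symbolic dataset: color, shape, size attributes.
data = [["red", "circle", "small"],
        ["red", "square", "small"],
        ["blue", "square", "large"]]
K = overlap_kernel(data, data)
eigvals = np.linalg.eigvalsh(K)  # all >= 0 up to rounding => PSD
```

Such a Gram matrix can be fed directly to kernel-space methods (Kernel PCA, SVMs with a precomputed kernel) even though the raw features are purely symbolic.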


International Conference on Document Analysis and Recognition | 2007

Exploiting Videotext "Events" for Improved Videotext Detection

Hrishikesh B. Aradhye; Gregory K. Myers

Text in video, whether overlay or in-scene, contains a wealth of information vital to automated content analysis systems. However, low resolution of the imagery, coupled with richness of the background and compression artifacts limit the detection accuracy that can be achieved in practice using existing text detection algorithms. This paper presents a novel, non-causal temporal aggregation method that acts as a second pass over the output of an existing text detector over the entire video clip. A multiresolution change detection algorithm is used along the time axis to detect the appearance and disappearance of multiple, concurrent lines of text followed by recursive time-averaged projections on Y and X axes. This algorithm detects and rectifies instances of missed text and enhances spatial boundaries of detected text lines using consensus estimates. Experimental results, which demonstrate significant performance gain on publicly collected and annotated data, are presented.
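
A toy sketch of the second-pass aggregation idea, assuming one static text line per clip and a hypothetical [x1, y1, x2, y2] box format: per-frame detections are pooled over time into a median consensus box, which both reinstates the line in frames where the detector missed it and tightens noisy boundaries. The actual method's multiresolution change detection and recursive axis projections are far richer than this:

```python
import numpy as np

def temporal_consensus(per_frame_boxes):
    """Non-causal second pass over one text line's per-frame detections
    ([x1, y1, x2, y2], or None where the detector missed it). The median
    of the detected coordinates becomes the consensus box, applied to
    every frame between the first and last detection."""
    hits = [b for b in per_frame_boxes if b is not None]
    consensus = np.median(np.array(hits, dtype=float), axis=0)
    first = next(i for i, b in enumerate(per_frame_boxes) if b is not None)
    last = len(per_frame_boxes) - 1 - next(
        i for i, b in enumerate(reversed(per_frame_boxes)) if b is not None)
    return [consensus.tolist() if first <= i <= last else None
            for i in range(len(per_frame_boxes))]
```

Frames before the first and after the last detection stay empty, so the pass recovers misses inside the line's lifetime without hallucinating text elsewhere.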


Lecture Notes in Computer Science | 2005

Headprint – person reacquisition using visual features of hair in overhead surveillance video

Hrishikesh B. Aradhye; Martin A. Fischler; Robert C. Bolles; Gregory K. Myers

In this paper, we present the results of our investigation of the use of the visual characteristics of human hair as a primary recognition attribute for human ID in indoor video imagery. The emerging need for unobtrusive biometrics has led to recent research interest in using the features of the face, gait, voice, and clothes, among others, for human authentication. However, the characteristics of hair have been almost completely excluded as a recognition attribute from state-of-the-art authentication methods. We contend that people often use hair as a principal visual biometric. Furthermore, hair is the part of the human body most likely to be visible to overhead surveillance cameras free of occlusion. Although hair can hardly be trusted to be a reliable long-term indicator of human identity, we show that the visual characteristics of hair can be effectively used to unobtrusively re-establish human ID in the task of short-term recognition and reacquisition in a video-based multiple-person continuous tracking application. We propose new pixel-based and line-segment-based features designed specifically to characterize hair, and recognition schemes that use just a few training images per subject. Our results demonstrate the feasibility of this approach, which we hope can form a basis for further research in this area.


Document Recognition and Retrieval | 2003

Syntax-directed content analysis of videotext: application to a map detection recognition system

Hrishikesh B. Aradhye; James A. Herson; Gregory K. Myers

Video is an increasingly important and ever-growing source of information to the intelligence and homeland defense analyst. A capability to automatically identify the contents of video imagery would enable the analyst to index relevant foreign and domestic news videos in a convenient and meaningful way. To this end, the proposed system aims to help determine the geographic focus of a news story directly from video imagery by detecting and geographically localizing political maps from news broadcasts, using the results of videotext recognition in lieu of a computationally expensive, scale-independent shape recognizer. Our novel method for the geographic localization of a map is based on the premise that the relative placement of text superimposed on a map roughly corresponds to the geographic coordinates of the locations the text represents. Our scheme extracts and recognizes videotext, and iteratively identifies the geographic area, while allowing for OCR errors and artistic freedom. The fast and reliable recognition of such maps by our system may provide valuable context and supporting evidence for other sources, such as speech recognition transcripts. The concepts of syntax-directed content analysis of videotext presented here can be extended to other content analysis systems.
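
The relative-placement premise can be made concrete with a sketch: fit a least-squares affine map from the on-screen positions of recognized labels to the geographic coordinates of candidate places, and use the residual to accept or reject that candidate region. The function name, coordinate conventions, and acceptance-by-residual step are assumptions for illustration, not the paper's algorithm:

```python
import numpy as np

def fit_geo_alignment(pixel_xy, lonlat):
    """Least-squares affine map from on-screen label positions to the
    geographic coordinates of the places those labels name. A small RMS
    residual supports the candidate region; a large one rejects it (or
    flags an OCR mismatch)."""
    P = np.hstack([np.asarray(pixel_xy, float),
                   np.ones((len(pixel_xy), 1))])  # rows [x, y, 1]
    G = np.asarray(lonlat, float)
    A, _, _, _ = np.linalg.lstsq(P, G, rcond=None)
    rms = float(np.sqrt(np.mean((P @ A - G) ** 2)))
    return A, rms
```

Because the fit is overdetermined with four or more labels, a few OCR errors can be handled by refitting with the worst-residual label dropped.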


Multimedia Signal Processing | 2001

Augmented edit distance based temporal contiguity analysis for improved videotext recognition

Hrishikesh B. Aradhye; Chitra Dorai

Videotext refers to text superimposed on video frames and it enables automatic content annotation and indexing of large video and image collections. Its importance is underscored by the fact that a videotext-based multimedia description scheme has recently been adopted into the MPEG-7 standard. A study of published work in the area of automatic videotext extraction and recognition reveals that, despite recent interest, a reliable general purpose video character recognition (VCR) system is yet to be developed. In our development of a VCR system designed specifically to handle the low resolution output from videotext extractors, we observed that raw VCR accuracies obtained using various classifiers including kernel space methods such as SVM, are inadequate for accurate video annotation. We propose an intelligent postprocessing mechanism that is supported by general data characteristics of this domain for VCR performance improvement. We describe temporal contiguity analysis, which works independently of the raw character recognition technique and works well even for moving videotext. This novel mechanism can be easily implemented in conjunction with VCR algorithms being developed elsewhere to offer the same performance gains. Experimental results on various video streams show notable improvements in recognition rates with our system incorporating a SVM-based recognition engine and temporal contiguity analysis.
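
A stripped-down sketch of the voting idea behind temporal contiguity analysis: treat the per-frame readings of one text line as noisy strings and return the medoid under edit distance, so isolated per-frame recognition errors are outvoted. The paper's augmented edit distance is richer than the textbook Levenshtein distance used here:

```python
def edit_distance(a, b):
    """Standard Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def consensus_reading(frame_readings):
    """Pick the per-clip consensus as the reading with the smallest total
    edit distance to all other frames' readings (a medoid)."""
    return min(frame_readings,
               key=lambda s: sum(edit_distance(s, t) for t in frame_readings))
```

Because the vote happens after recognition, the postprocessor is independent of the underlying classifier, which is the property the abstract emphasizes.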


Document Recognition and Retrieval | 2006

Address block features for image-based automated mail orientation

M. Shahab Khan; Hrishikesh B. Aradhye; Wayne T. Cruz

When mixed mail enters a postal facility, it must first be faced and oriented so that the address is readable by automated mail processing machinery. Existing US Postal Service (USPS) automated systems face and orient domestic mail by searching for fluorescing stamps on each mail piece. However, misplaced or partially fluorescing postage causes a significant fraction of mail to be rejected. Previously, rejected mail had to be faced and oriented by hand, thus increasing mail processing cost and time. Our earlier work successfully demonstrated the utility of machine-vision-based extraction of postal delimiters, such as cancellation marks and barcodes, for camera-based mail facing and orientation. Arguably, of all the localized information sources on the envelope image, the destination address block is the richest in content and the most structured in its form and layout. This paper focuses exclusively on the destination address block image and describes new vision-based features that can be extracted and used for mail orientation. Our results on real USPS datasets indicate robust performance. The algorithms described herein will be deployed nationwide on USPS hardware in the near future.

Collaboration


Dive into Hrishikesh B. Aradhye's collaborations.

Top Co-Authors

James F. Davis

University of California
