Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where R. Manmatha is active.

Publication


Featured researches published by R. Manmatha.


international acm sigir conference on research and development in information retrieval | 2003

Automatic image annotation and retrieval using cross-media relevance models

Jiwoon Jeon; Victor Lavrenko; R. Manmatha

Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor intensive procedure and hence there has been great interest in coming up with automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) than a model based on word-blob co-occurrence model and twice as good as a state of the art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.


computer vision and pattern recognition | 2004

Multiple Bernoulli relevance models for image and video annotation

Shaolei Feng; R. Manmatha; Victor Lavrenko

Retrieving images in response to textual queries requires some knowledge of the semantics of the picture. Here, we show how we can do both automatic image annotation and retrieval (using one word queries) from images and videos using a multiple Bernoulli relevance model. The model assumes that a training set of images or videos along with keyword annotations is provided. Multiple keywords are provided for an image and the specific correspondence between a keyword and an image is not provided. Each image is partitioned into a set of rectangular regions and a real-valued feature vector is computed over these regions. The relevance model is a joint probability distribution of the word annotations and the image feature vectors and is computed using the training set. The word probabilities are estimated using a multiple Bernoulli model and the image feature probabilities using a non-parametric kernel density estimate. The model is then used to annotate images in a test set. We show experiments on both images from a standard Corel data set and a set of video key frames from NISTs video tree. Comparative experiments show that the model performs better than a model based on estimating word probabilities using the popular multinomial distribution. The results also show that our model significantly outperforms previously reported results on the task of image and video annotation.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1999

Textfinder: an automatic system to detect and recognize text in images

Victor Wu; R. Manmatha; Edward M. Riseman

A robust system is proposed to automatically detect and extract text in images from different sources, including video, newspapers, advertisements, stock certificates, photographs, and checks. Text is first detected using multiscale texture segmentation and spatial cohesion constraints, then cleaned up and extracted using a histogram-based binarization algorithm. An automatic performance evaluation scheme is also proposed.


International Journal on Document Analysis and Recognition | 2007

Word spotting for historical documents

Tony M. Rath; R. Manmatha

Searching and indexing historical handwritten collections are a very challenging problem. We describe an approach called word spotting which involves grouping word images into clusters of similar words by using image matching to find similarity. By annotating “interesting” clusters, an index that links words to the locations where they occur can be built automatically. Image similarities computed using a number of different techniques including dynamic time warping are compared. The word similarities are then used for clustering using both K-means and agglomerative clustering techniques. It is shown in a subset of the George Washington collection that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rates.


acm international conference on digital libraries | 1997

Finding text in images

Victor Wu; R. Manmatha; Edward M. Riseman

There are many applications in which the automatic detection and recognition of text embedded in images is useful. These applications include digad libraries, multimedia systems, and Geographical Information Systems. When machine generated text is prdnted against clean backgrounds, it can be converted to a computer readable form (ASCII) using current Optical Character Recognition (OCR) technology. However, text is often printed against shaded or textured backgrounds or is embedded in images. Examples include maps, advertisements, photographs, videos and stock certificates. Current document segmentation and recognition technologies cannot handle these situafons well. In this paper, a four-step system which automaticnlly detects and extracts text in images i& proposed. First, a texture segmentation scheme is used to focus attention on regions where text may occur. Second, strokes are extracted from the segmented text regions. Using reasonable heuristics on text strings such as height similarity, spacing and alignment, the extracted strokes are then processed to form rectangular boxes surrounding the corresponding ttzt strings. To detect text over a wide range of font sizes, the above steps are first applied to a pyramid of images generated from the input image, and then the boxes formed at each resolution level of the pyramid are fused at the image in the original resolution level. Third, text is extracted by cleaning up the background and binarizing the detected ted strings. Finally, better text bounding boxes are generated by srsiny the binarized text as strokes. Text is then cleaned and binarized from these new boxes, and can then be passed through a commercial OCR engine for recognition if the text is of an OCR-recognizable font. The system is stable, robust, and works well on imayes (with or without structured layouts) from a wide van’ety of sources, including digitized video frames, photographs, *This material is based on work supported in part by the National Science Foundation, Library of Congress and Department of Commerce under cooperative agreement number EEC9209623, in part by the United States Patent and mademark Office and Defense Advanced Research Projects Agency/IT0 under ARPA order number D468, issued by ESC/AXS contract number F19628-96-C-0235, in part by the National Science Foundation under grant number IF&9619117 and in part by NSF Multimedia CDA-9502639. Any opinions, findings and conclusions or recommendations expressed in this material are the author(s) and do not necessarily reflect those of the sponsors. Prrmission to make digital/hard copies ofall or part oflhis material for personal or clrrssroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title ofthe publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fe DL 97 Philadelphia PA, USA Copyright 1997 AChi 0-89791~868-1197/7..


computer vision and pattern recognition | 1996

Word spotting: a new approach to indexing handwriting

R. Manmatha; Chengfeng Han; Edward M. Riseman

3.50 newspapers, advertisements, stock certifimtes, and personal checks. All parameters remain the same for-all the experiments.


international acm sigir conference on research and development in information retrieval | 2003

Challenges in information retrieval and language modeling: report of a workshop held at the center for intelligent information retrieval, University of Massachusetts Amherst, September 2002

James Allan; Jay Aslam; Nicholas J. Belkin; Chris Buckley; James P. Callan; W. Bruce Croft; Susan T. Dumais; Norbert Fuhr; Donna Harman; David J. Harper; Djoerd Hiemstra; Thomas Hofmann; Eduard H. Hovy; Wessel Kraaij; John D. Lafferty; Victor Lavrenko; David Lewis; Liz Liddy; R. Manmatha; Andrew McCallum; Jay M. Ponte; John M. Prager; Dragomir R. Radev; Philip Resnik; Stephen E. Robertson; Ron G. Rosenfeld; Salim Roukos; Mark Sanderson; Richard M. Schwartz; Amit Singhal

There are many historical manuscripts written in a single hand which it would be useful to index. Examples include the W.B. DuBois collection at the University of Massachusetts and the early Presidential libraries at the Library of Congress. Since Optical Character Recognition (OCR) does not work well on handwriting, an alternative scheme based on matching the images of the words is proposed for indexing such texts. The current paper deals with the matching aspects of this process. Two different techniques for matching words are discussed. The first method matches words assuming that the transformation between the words may be modelled by a translation (shift). The second method matches words assuming that the transformation between the words may be modelled by an affine transform. Experiments are shown demonstrating the feasibility of the approach for indexing handwriting. The method should also be applicable to retrieving previously stored material from personal digital assistants (PDAs).


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

A Novel Word Spotting Method Based on Recurrent Neural Networks

Volkmar Frinken; Andreas Fischer; R. Manmatha; Horst Bunke

Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took place at a recent workshop. The attendees of the workshop considered information retrieval research in a range of areas chosen to give broad coverage of topic areas that engage information retrieval researchers. Those areas are retrieval models, cross-lingual retrieval, Web search, user modeling, filtering, topic detection and tracking, classification, summarization, question answering, metasearch, distributed retrieval, multimedia retrieval, information extraction, as well as testbed requirements for future work. The potential use of language modeling techniques in these areas was also discussed. The workshop identified major challenges within each of those areas. The following are recurring themes that ran throughout: • User and context sensitive retrieval • Multi-lingual and multi-media issues • Better target tasks • Improved objective evaluations • Substantially more labeled data • Greater variety of data sources • Improved formal models Contextual retrieval and global information access were identified as particularly important long-term challenges.


First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings. | 2004

Holistic word recognition for handwritten historical documents

Victor Lavrenko; Toni M. Rath; R. Manmatha

Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system, based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

A scale space approach for automatically segmenting words from historical handwritten documents

R. Manmatha; Jamie L. Rothfeder

Most offline handwriting recognition approaches proceed by segmenting words into smaller pieces (usually characters) which are recognized separately. The recognition result of a word is then the composition of the individually recognized parts. Inspired by results in cognitive psychology, researchers have begun to focus on holistic word recognition approaches. Here we present a holistic word recognition approach for single-author historical documents, which is motivated by the fact that for severely degraded documents a segmentation of words into characters will produce very poor results. The quality of the original documents does not allow us to recognize them with high accuracy - our goal here is to produce transcriptions that will allow successful retrieval of images, which has been shown to be feasible even in such noisy environments. We believe that this is the first systematic approach to recognizing words in historical manuscripts with extensive experiments. Our experiments show recognition accuracy of 65%, which exceeds performance of other systems which operate on non-degraded input images (nonhistorical documents).

Collaboration


Dive into the R. Manmatha's collaboration.

Top Co-Authors

Avatar

Toni M. Rath

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar

Edward M. Riseman

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar

Shaolei Feng

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

James Allan

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar

Ismet Zeki Yalniz

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Madirakshi Das

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar

Srinivas Ravela

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar

C. V. Jawahar

International Institute of Information Technology

View shared research outputs
Researchain Logo
Decentralizing Knowledge