Jonathan J. Hull
Ricoh
Publications
Featured research published by Jonathan J. Hull.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1994
Tin Kam Ho; Jonathan J. Hull; Sargur N. Srihari
A multiple classifier system is a powerful solution to difficult pattern recognition problems involving large class sets and noisy input because it allows simultaneous use of arbitrary feature descriptors and classification procedures. Decisions by the classifiers can be represented as rankings of classes so that they can be compared across different types of classifiers and different instances of a problem. The rankings can be combined by methods that either reduce or rerank a given set of classes. An intersection method and a union method are proposed for class set reduction. Three methods based on the highest rank, the Borda count, and logistic regression are proposed for class set reranking. These methods have been tested in applications of degraded machine-printed characters and words from large lexicons, resulting in substantial improvement in overall correctness.
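The Borda-count combination mentioned in the abstract can be sketched briefly: each classifier ranks the candidate classes, each class earns points equal to the number of classes ranked below it, and the combined ranking sorts by total score. The class labels below are illustrative, not taken from the paper's data.

```python
def borda_combine(rankings):
    """Combine ranked class lists (best first) by Borda count."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, cls in enumerate(ranking):
            # A class at position p outranks n - 1 - p other classes.
            scores[cls] = scores.get(cls, 0) + (n - 1 - position)
    return sorted(scores, key=scores.get, reverse=True)

# Three classifiers rank four candidate classes for one input:
combined = borda_combine([
    ["e", "c", "o", "a"],
    ["c", "e", "a", "o"],
    ["e", "o", "c", "a"],
])
print(combined[0])  # → "e", the consensus top-ranked class
```

Unlike the highest-rank method, the Borda count rewards classes that are ranked consistently well by all classifiers, even when no single classifier puts them first.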
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1994
Jonathan J. Hull
An image database for handwritten text recognition research is described. Digital images of approximately 5000 city names, 5000 state names, 10000 ZIP Codes, and 50000 alphanumeric characters are included. Each image was scanned from mail in a working post office at 300 pixels/in in 8-bit gray scale on a high-quality flatbed digitizer. The data were unconstrained for writer, style, and method of preparation. These characteristics help overcome the limitations of earlier databases that contained only isolated characters or were prepared in a laboratory setting under prescribed circumstances. Also, the database is divided into explicit training and testing sets to facilitate the sharing of results among researchers as well as performance comparisons.
ACM Multimedia | 2003
Berna Erol; Jonathan J. Hull; Dar-Shyang Lee
An algorithm is presented that automatically matches images of presentation slides to the symbolic source file (e.g., PowerPoint™ or Acrobat™) from which they were generated. The images are captured either by tapping the video output from a laptop connected to a projector or by taking a picture of what's displayed on the screen in a conference room. The matching algorithm extracts features from the image data, including OCR output, edges, projection profiles, and layout, and determines the symbolic file that contains the most similar collection of features. This algorithm enables several unique applications for enhancing a meeting in real time and accessing the audio and video that were recorded while a presentation was being given. These applications include the simultaneous translation of presentation slides during a meeting, linking video clips inside a PowerPoint file that show how each slide was described by the presenter, and retrieving presentation recordings using digital camera images as queries.
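One of the features named in the abstract, projection profiles, can be illustrated with a toy sketch: each candidate slide image is reduced to row-wise ink counts, and a captured image is matched to the candidate whose profile is closest. The 0/1 grids below are illustrative stand-ins for binarized slide images; the paper's actual matching also uses OCR output, edges, and layout.

```python
def projection_profile(image):
    """Row-wise sum of 'ink' pixels in a binary image (list of rows)."""
    return [sum(row) for row in image]

def best_match(query, candidates):
    """Return the index of the candidate with the closest profile."""
    q = projection_profile(query)
    def distance(candidate):
        p = projection_profile(candidate)
        return sum((a - b) ** 2 for a, b in zip(q, p))
    return min(range(len(candidates)), key=lambda i: distance(candidates[i]))

slide_a = [[1, 1, 1, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
slide_b = [[0, 1, 0, 0], [1, 1, 1, 1], [1, 1, 1, 0]]
camera  = [[1, 1, 0, 0], [0, 0, 0, 0], [1, 1, 1, 0]]  # noisy capture of slide_a
print(best_match(camera, [slide_a, slide_b]))  # → 0
```

Projection profiles are attractive for camera captures because row-level ink distribution is far more robust to blur and compression artifacts than individual pixel values.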
International Conference on Image Processing | 2003
Dar-Shyang Lee; Jonathan J. Hull; Berna Erol
Background subtraction is an essential processing component for many video applications. However, its development has largely been application driven and done in an ad hoc manner. In this paper, we provide a Bayesian formulation of background segmentation based on Gaussian mixture models. We show that the problem consists of two density estimation problems, one application independent and one application dependent, and a set of intuitive and theoretically optimal solutions can be derived for both. The proposed framework was tested on meeting and traffic videos and compared favorably to other well-known algorithms.
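A minimal per-pixel sketch of background subtraction is shown below. The paper's formulation keeps a Gaussian mixture per pixel; for brevity this simplification keeps a single running Gaussian per pixel and flags values more than a few standard deviations from the mean as foreground. The frames, learning rate, and threshold are illustrative, not values from the paper.

```python
class RunningGaussianBackground:
    """Single-Gaussian-per-pixel simplification of GMM background models."""

    def __init__(self, first_frame, alpha=0.05, threshold=2.5):
        self.mean = [float(v) for v in first_frame]
        self.var = [25.0] * len(first_frame)  # initial variance guess
        self.alpha = alpha                    # learning rate
        self.threshold = threshold            # in standard deviations

    def apply(self, frame):
        """Return a foreground mask and update the background model."""
        mask = []
        for i, v in enumerate(frame):
            d = v - self.mean[i]
            foreground = d * d > (self.threshold ** 2) * self.var[i]
            mask.append(foreground)
            if not foreground:  # only background pixels update the model
                self.mean[i] += self.alpha * d
                self.var[i] += self.alpha * (d * d - self.var[i])
        return mask

bg = RunningGaussianBackground([100, 100, 100, 100])
print(bg.apply([101, 99, 180, 100]))  # → [False, False, True, False]
```

The full mixture model extends this by keeping several Gaussians per pixel, which lets the background absorb multi-modal phenomena such as flickering projector screens or swaying foliage.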
International Conference on Document Analysis and Recognition | 2003
Jonathan J. Hull; Berna Erol; Jamey Graham; Dar-Shyang Lee
The components of a key frame selection algorithm for a paper-based multimedia browsing interface called Video Paper are described. Analysis of video image frames is combined with the results of processing the closed caption to select key frames that are printed on a paper document together with the closed caption. Bar codes positioned near the key frames allow a user to play the video from the corresponding times. This paper describes several component techniques that are being investigated for key frame selection in the Video Paper system, including face detection and text recognition. The Video Paper system implementation is also discussed.
International Conference on Pattern Recognition | 2004
Berna Erol; Jonathan J. Hull; Jamey Graham; Dar-Shyang Lee
A system is described for creating paper documents that show images of presentation slides and bar codes that point to a multimedia recording of a presentation that has not yet occurred. An image-matching algorithm applied after a presentation determines when each slide was displayed. These time-stamps map the bar codes onto commands that control a multimedia player. We describe the system infrastructure that allows us to prepare such prescient documents and the document image-matching algorithm that enables the mapping of bar codes onto the times when slides were displayed. This provides a multimedia annotation tool that requires no electronic device at capture time.
ACM Multimedia | 2003
Jamey Graham; Berna Erol; Jonathan J. Hull; Dar-Shyang Lee
Video Paper is a prototype system for multimedia browsing, analysis, and replay. Key frames extracted from a video recording are printed on paper together with bar codes that allow for random access and replay. A transcript for the audio track can also be shown so that users can read what was said, thus making the document a stand-alone representation for the contents of the multimedia recording. The Video Paper system has been used for several applications, including the analysis of recorded meetings, broadcast news, oral histories and personal recordings. This demonstration will show how the Video Paper system was applied to these domains and the various replay systems that were developed, including a self-contained portable implementation on a PDA and a fixed implementation on desktop PC.
ACM Multimedia | 2002
Dar-Shyang Lee; Berna Erol; Jamey Graham; Jonathan J. Hull; Norihiko Murata
The design and implementation of a portable meeting recorder is presented. Composed of an omni-directional video camera with four-channel audio capture, the system saves a view of all the activity in a meeting and the directions from which people spoke. Subsequent analysis computes metadata that includes video activity analysis of the compressed data stream and audio processing that helps locate events that occurred during the meeting. Automatic calculation of the room in which the meeting occurred allows for efficient navigation of a collection of recorded meetings. A user interface is populated from the metadata description to allow for simple browsing and location of significant events.
ACM Multimedia | 2008
Berna Erol; Emilio R. Antúnez; Jonathan J. Hull
The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. This interaction takes the form of reading and writing electronic information, such as images, web URLs, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or watermarks on the paper document. Instead, we propose a document recognition algorithm that automatically determines the location of a patch of text in a large collection of document images given a small document image. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and produce low-quality images and video. We developed a novel algorithm, Brick Wall Coding (BWC), that performs image-based document recognition using the mobile phone video frames. Given a document patch image, BWC utilizes the layout, i.e., the relative locations, of word boxes in order to determine the original file, page, and the location on the page. BWC runs in real time (4 frames per second) on a Treo 700w smartphone with a 312 MHz processor and 64MB RAM. Using our method we can recognize blurry document patch frames that contain as little as 4-5 lines of text and a video resolution as low as 176x144. We performed experiments by indexing 4397 document pages and querying this database with 533 document patches. Besides describing the basic algorithm, this paper also describes several applications that are enabled by mobile phone-paper interaction, such as inserting electronic annotations into paper, using paper as a tangible interface to collect and communicate multimedia data, and collaborative homework.
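An illustrative sketch in the spirit of the word-box layout idea: fingerprint a text patch by the relative widths of consecutive word boxes on each line, which survive scale changes and blur far better than pixel content. The quantization and hash-table lookup below are simplified stand-ins, not the published BWC algorithm, and the document names are hypothetical.

```python
def patch_fingerprint(lines_of_word_widths, levels=4):
    """Quantize each word width by the line's widest word into a tuple key."""
    fingerprint = []
    for widths in lines_of_word_widths:
        widest = max(widths)
        fingerprint.append(tuple(min(levels - 1, w * levels // widest)
                                 for w in widths))
    return tuple(fingerprint)

# Index pages by the fingerprints of their patches (toy database):
database = {
    patch_fingerprint([[30, 62, 41], [55, 28]]): ("doc1.pdf", 3),
    patch_fingerprint([[20, 20, 80], [10, 70]]): ("doc2.pdf", 7),
}

# A camera patch scaled by roughly 1.5x still hashes to the same key:
query = patch_fingerprint([[45, 93, 62], [83, 42]])
print(database.get(query))  # → ("doc1.pdf", 3)
```

Because widths are normalized within each line before quantization, the same patch photographed at a different distance produces the same key, which is what makes a direct hash lookup over thousands of indexed pages feasible.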
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1982
Jonathan J. Hull; Sargur N. Srihari
The binary n-gram and Viterbi algorithms have been suggested as alternative approaches to contextual postprocessing for text produced by a noisy channel such as an optical character recognizer. This correspondence describes the underlying theory of each approach in unified terminology, and presents new implementation algorithms for each approach. In particular, a storage efficient data structure is proposed for the binary n-gram algorithm and a recursive formulation is given for the Viterbi algorithm. Results of extensive experiments with each algorithm are described.
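The binary n-gram idea can be sketched compactly: a binary digram table records which letter pairs occur anywhere in a lexicon, and OCR candidates containing an unseen pair are rejected. The tiny lexicon and the h/b confusion below are illustrative, not from the paper's experiments.

```python
def build_binary_digrams(lexicon):
    """Set of letter pairs observed anywhere in the lexicon."""
    return {word[i:i + 2] for word in lexicon for i in range(len(word) - 1)}

def plausible(candidate, digrams):
    """True if every adjacent letter pair of the candidate is permitted."""
    return all(candidate[i:i + 2] in digrams
               for i in range(len(candidate) - 1))

digrams = build_binary_digrams(["the", "them", "then", "hen", "ten"])
# An OCR confusion of 'h' with 'b' yields an implausible candidate:
print(plausible("the", digrams), plausible("tbe", digrams))  # → True False
```

Because the table stores only presence or absence rather than probabilities, it can be packed into a compact bit matrix, which is the kind of storage-efficient structure the correspondence proposes; the Viterbi approach instead scores candidates with transition probabilities and channel confusion likelihoods.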