Publication


Featured research published by Berna Erol.


ACM Multimedia | 2003

Linking multimedia presentations with their symbolic source documents: algorithm and applications

Berna Erol; Jonathan J. Hull; Dar-Shyang Lee

An algorithm is presented that automatically matches images of presentation slides to the symbolic source file (e.g., PowerPoint™ or Acrobat™) from which they were generated. The images are captured either by tapping the video output from a laptop connected to a projector or by taking a picture of what's displayed on the screen in a conference room. The matching algorithm extracts features from the image data, including OCR output, edges, projection profiles, and layout, and determines the symbolic file that contains the most similar collection of features. This algorithm enables several unique applications for enhancing a meeting in real time and accessing the audio and video that were recorded while a presentation was being given. These applications include the simultaneous translation of presentation slides during a meeting, linking video clips inside a PowerPoint file that show how each slide was described by the presenter, and retrieving presentation recordings using digital camera images as queries.
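
As a rough illustration of the matching step, the sketch below scores candidate slides by overlap of OCR tokens only; the paper's actual matcher also combines edge, projection-profile, and layout features, and all data and function names here are hypothetical.

```python
# Minimal sketch of slide-to-source matching: a captured frame is compared
# against every slide in the candidate source files, and the file holding
# the best-matching slide wins. Only OCR token overlap is used here.

def token_overlap(a, b):
    """Jaccard similarity between two sets of OCR tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_frame_to_file(frame_tokens, source_files):
    """source_files: {filename: [set_of_tokens_per_slide, ...]}.
    Returns (best_filename, best_slide_index)."""
    best = (None, None, -1.0)
    for name, slides in source_files.items():
        for i, slide_tokens in enumerate(slides):
            score = token_overlap(frame_tokens, slide_tokens)
            if score > best[2]:
                best = (name, i, score)
    return best[0], best[1]

# Hypothetical usage:
sources = {
    "talk.ppt": [{"introduction", "outline"}, {"matching", "algorithm"}],
    "demo.pdf": [{"results", "evaluation"}],
}
print(match_frame_to_file({"matching", "algorithm", "slide"}, sources))
```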


International Conference on Image Processing | 2003

A Bayesian framework for Gaussian mixture background modeling

Dar-Shyang Lee; Jonathan J. Hull; Berna Erol

Background subtraction is an essential processing component for many video applications. However, its development has largely been application driven and done in an ad hoc manner. In this paper, we provide a Bayesian formulation of background segmentation based on Gaussian mixture models. We show that the problem consists of two density estimation problems, one application independent and one application dependent, and a set of intuitive and theoretically optimal solutions can be derived for both. The proposed framework was tested on meeting and traffic videos and compared favorably to other well-known algorithms.
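
The following is a minimal per-pixel Gaussian-mixture update in the classic online style that such systems build on; it does not reproduce the paper's Bayesian formulation, and all constants and initial values are illustrative.

```python
import numpy as np

K = 3            # number of Gaussians per pixel
ALPHA = 0.01     # learning rate
T = 2.5          # match threshold in standard deviations

weights = np.full(K, 1.0 / K)
means = np.array([50.0, 120.0, 200.0])   # hypothetical initial means
variances = np.full(K, 100.0)

def update(x):
    """Update the mixture with grayscale observation x; return True if x
    matches a component whose weight marks it as likely background."""
    global weights, means, variances
    d = np.abs(x - means) / np.sqrt(variances)
    matched = int(np.argmin(d)) if d.min() < T else None
    weights *= (1 - ALPHA)
    if matched is None:
        # Replace the weakest component with a new one centered on x.
        k = int(np.argmin(weights))
        means[k], variances[k], weights[k] = x, 100.0, ALPHA
    else:
        weights[matched] += ALPHA
        means[matched] += ALPHA * (x - means[matched])
        variances[matched] += ALPHA * ((x - means[matched]) ** 2
                                       - variances[matched])
    weights /= weights.sum()
    # Crude background test: matched component has above-average weight.
    return matched is not None and weights[matched] > 1.0 / K

for x in [52, 51, 180, 53, 50]:
    print(x, "background" if update(x) else "foreground")
```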


International Conference on Document Analysis and Recognition | 2003

Visualizing multimedia content on paper documents: components of key frame selection for Video Paper

Jonathan J. Hull; Berna Erol; Jamey Graham; Dar-Shyang Lee

The components of a key frame selection algorithm for a paper-based multimedia browsing interface called Video Paper are described. Analysis of video image frames is combined with the results of processing the closed caption to select key frames that are printed on a paper document together with the closed caption. Bar codes positioned near the key frames allow a user to play the video from the corresponding times. This paper describes several component techniques that are being investigated for key frame selection in the Video Paper system, including face detection and text recognition. The Video Paper system implementation is also discussed.
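
A toy version of the scoring idea: candidate frames are ranked by a weighted combination of two of the cues mentioned above, face detection and overlap with closed-caption keywords. The frame records, weights, and keyword set are invented for illustration.

```python
def score_frame(frame, caption_keywords, w_face=1.0, w_text=0.5):
    """Combine a face-detection hit with caption-keyword overlap."""
    face_score = w_face if frame["has_face"] else 0.0
    text_score = w_text * len(frame["ocr_tokens"] & caption_keywords)
    return face_score + text_score

def select_key_frames(frames, caption_keywords, n=3):
    """Return the n highest-scoring candidate frames."""
    return sorted(frames, key=lambda f: score_frame(f, caption_keywords),
                  reverse=True)[:n]

frames = [
    {"t": 12.0, "has_face": True,  "ocr_tokens": {"budget"}},
    {"t": 40.5, "has_face": False, "ocr_tokens": {"schedule", "review"}},
    {"t": 88.2, "has_face": True,  "ocr_tokens": set()},
]
print(select_key_frames(frames, {"schedule", "budget"}, n=2))
```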


International Conference on Pattern Recognition | 2004

Prescient paper: multimedia document creation with document image matching

Berna Erol; Jonathan J. Hull; Jamey Graham; Dar-Shyang Lee

A system is described for creating paper documents that show images of presentation slides and bar codes that point to a multimedia recording of a presentation that has not yet occurred. An image-matching algorithm applied after a presentation determines when each slide was displayed. These time-stamps map the bar codes onto commands that control a multimedia player. We describe the system infrastructure that allows us to prepare such prescient documents and the document image-matching algorithm that enables the mapping of bar codes onto the times when slides were displayed. This provides a multimedia annotation tool that requires no electronic device at capture time.
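
The mapping step can be pictured as a small lookup from bar code IDs (printed per slide before the talk) to the display times recovered afterward by image matching. The sketch below assumes one bar code per slide; identifiers and the command format are hypothetical.

```python
# Display times recovered by the post-presentation image matcher
# (slide id -> seconds into the recording). Values are invented.
slide_display_times = {1: 0.0, 2: 95.3, 3: 240.8}

def barcode_to_command(barcode_slide_id):
    """Translate a scanned bar code into a multimedia-player seek command."""
    t = slide_display_times.get(barcode_slide_id)
    if t is None:
        raise KeyError(f"slide {barcode_slide_id} was never displayed")
    return {"action": "seek_and_play", "time_seconds": t}

print(barcode_to_command(2))  # {'action': 'seek_and_play', 'time_seconds': 95.3}
```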


ACM Multimedia | 2003

The video paper multimedia playback system

Jamey Graham; Berna Erol; Jonathan J. Hull; Dar-Shyang Lee

Video Paper is a prototype system for multimedia browsing, analysis, and replay. Key frames extracted from a video recording are printed on paper together with bar codes that allow for random access and replay. A transcript of the audio track can also be shown so that users can read what was said, making the document a stand-alone representation of the contents of the multimedia recording. The Video Paper system has been used for several applications, including the analysis of recorded meetings, broadcast news, oral histories, and personal recordings. This demonstration will show how the Video Paper system was applied to these domains and the various replay systems that were developed, including a self-contained portable implementation on a PDA and a fixed implementation on a desktop PC.


ACM Multimedia | 2002

Portable meeting recorder

Dar-Shyang Lee; Berna Erol; Jamey Graham; Jonathan J. Hull; Norihiko Murata

The design and implementation of a portable meeting recorder is presented. Composed of an omni-directional video camera with four-channel audio capture, the system saves a view of all the activity in a meeting and the directions from which people spoke. Subsequent analysis computes metadata that includes video activity analysis of the compressed data stream and audio processing that helps locate events that occurred during the meeting. Automatic identification of the room in which the meeting occurred allows for efficient navigation of a collection of recorded meetings. A user interface is populated from the metadata description to allow simple browsing and location of significant events.
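
To give a flavor of the audio-direction analysis, the sketch below turns the relative energies of four microphones, assumed to face 0, 90, 180, and 270 degrees, into a single bearing via a weighted circular mean. The actual system's localization is more sophisticated; the mic arrangement and data here are assumptions.

```python
import math

MIC_ANGLES = [0.0, 90.0, 180.0, 270.0]  # assumed microphone bearings

def dominant_direction(channel_energies):
    """Weighted circular mean of microphone bearings, in degrees."""
    sx = sum(e * math.cos(math.radians(a))
             for e, a in zip(channel_energies, MIC_ANGLES))
    sy = sum(e * math.sin(math.radians(a))
             for e, a in zip(channel_energies, MIC_ANGLES))
    return math.degrees(math.atan2(sy, sx)) % 360.0

# A frame where most energy arrives at the mic facing 0 degrees:
print(dominant_direction([0.9, 0.4, 0.1, 0.2]))
```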


ACM Multimedia | 2008

HOTPAPER: multimedia interaction with paper using mobile phones

Berna Erol; Emilio R. Antúnez; Jonathan J. Hull

The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. This interaction takes the form of reading and writing electronic information, such as images, web URLs, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or watermarks on the paper document. Instead, we propose a document recognition algorithm that automatically determines the location of a patch of text in a large collection of document images, given a small document image. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and produce low-quality images and video. We developed a novel algorithm, Brick Wall Coding (BWC), that performs image-based document recognition using mobile phone video frames. Given a document patch image, BWC utilizes the layout, i.e., the relative locations, of word boxes to determine the original file, page, and location on the page. BWC runs in real time (4 frames per second) on a Treo 700w smartphone with a 312 MHz processor and 64 MB of RAM. Using our method, we can recognize blurry document patch frames that contain as few as 4-5 lines of text at video resolutions as low as 176×144. We performed experiments by indexing 4397 document pages and querying this database with 533 document patches. Besides describing the basic algorithm, this paper also describes several applications that are enabled by mobile phone-paper interaction, such as adding electronic annotations to paper, using paper as a tangible interface to collect and communicate multimedia data, and collaborative homework.
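
A highly simplified sketch of layout-based lookup in the spirit of BWC: quantized widths of consecutive word boxes form short codes, the codes populate an inverted index, and a query patch votes for the pages its codes appear on. The real algorithm differs in detail; the bin counts, code length, and data below are invented.

```python
from collections import defaultdict

def line_codes(word_widths, n=4, bins=8, max_width=200):
    """Quantize word-box widths and emit overlapping n-gram codes."""
    q = [min(w * bins // max_width, bins - 1) for w in word_widths]
    return [tuple(q[i:i + n]) for i in range(len(q) - n + 1)]

index = defaultdict(set)  # code -> set of (document, page)

def add_page(doc, page, lines):
    """Index every layout code on a page. lines: word widths per line."""
    for widths in lines:
        for code in line_codes(widths):
            index[code].add((doc, page))

def query(patch_lines):
    """Vote for the page whose indexed codes best cover the patch."""
    votes = defaultdict(int)
    for widths in patch_lines:
        for code in line_codes(widths):
            for loc in index[code]:
                votes[loc] += 1
    return max(votes, key=votes.get) if votes else None

add_page("paper.pdf", 1, [[40, 85, 30, 120, 55], [60, 60, 140, 35, 90]])
add_page("paper.pdf", 2, [[150, 20, 70, 45, 110]])
print(query([[40, 85, 30, 120, 55]]))  # -> ('paper.pdf', 1)
```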


Asilomar Conference on Signals, Systems and Computers | 2003

Linking presentation documents using image analysis

Berna Erol; Jonathan J. Hull

Systems for recording presentations are becoming commonly available. Commercial solutions include authoring tools that let users create online representations by recording audio, video, and presentation slides while a talk is being given. A typical collection of presentation recordings may contain hundreds or even thousands of recordings, making it very difficult to retrieve particular presentations and find specific points within them. This paper describes a retrieval technique that utilizes digital camera pictures taken during presentations. An enabling image matching algorithm is also described. The algorithm utilizes the text and the layout of presentation slides captured in different multimedia streams and documents. Experimental results show that our method yields high retrieval accuracy.


IEEE Transactions on Multimedia | 2008

Multimedia Clip Generation From Documents for Browsing on Mobile Devices

Berna Erol; Kathrin Berkner; Siddharth Joshi

Small displays on mobile handheld devices, such as personal digital assistants (PDAs) and cellular phones, are the bottleneck for usability of most content browsing applications. Generally, conventional content such as documents and Web pages needs to be modified for effective presentation on mobile devices. This paper proposes a novel visualization for documents, called multimedia thumbnails (MMNails), which consists of text and image content converted into playable multimedia clips. A multimedia thumbnail utilizes the visual and audio channels of small portable devices, as well as both spatial and time dimensions, to communicate the text and image information of a single document. The proposed algorithm for generating multimedia thumbnails includes 1) a semantic document analysis step, where salient content is extracted from a source document; 2) an optimization step, where a subset of this extracted content is selected based on time, display, and application constraints; and 3) a composition step, where the selected visual and audible document content is combined into a multimedia thumbnail. The scalability of MMNails, which allows generation of multimedia clips of various lengths, is also described. A user study is presented that evaluates the effectiveness of the proposed multimedia thumbnail visualization.
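
The optimization step can be read as a 0/1 knapsack: choose the subset of extracted elements that maximizes total salience under a playback-time budget. The sketch below implements that reading with whole-second durations; element names, saliences, and the 15-second budget are hypothetical, and the paper's formulation also handles display and application constraints.

```python
def select_content(elements, time_budget):
    """elements: list of (name, duration_seconds, salience).
    Returns the max-salience subset fitting in time_budget
    (dynamic-programming knapsack over whole-second durations)."""
    B = int(time_budget)
    best = [(0.0, []) for _ in range(B + 1)]  # capacity -> (value, picks)
    for name, dur, sal in elements:
        d = int(dur)
        for b in range(B, d - 1, -1):  # descending: each item used once
            cand = best[b - d][0] + sal
            if cand > best[b][0]:
                best[b] = (cand, best[b - d][1] + [name])
    return best[B][1]

elements = [("title", 2, 5.0), ("abstract_audio", 8, 7.0),
            ("figure_1", 4, 6.0), ("keywords", 3, 2.5)]
print(select_content(elements, 15))  # fits 14 s of the most salient content
```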


International Conference on Multimedia and Expo | 2003

Multimodal summarization of meeting recordings

Berna Erol; Dar-Shyang Lee; Jonathan J. Hull

Recorded meetings are useful only if people can find, access, and browse them easily. Key frames and video skims are useful representations that enable quick previewing of the content without watching a meeting recording from beginning to end. This paper proposes a new method for creating meeting video skims based on audio and visual activity analysis together with text analysis. Audio activity analysis is performed by analyzing sound directions, which indicate different speakers, and audio amplitude. Detection of important visual events in a meeting is achieved by analyzing localized luminance variations, taking into account the omni-directional property of the video captured by our meeting recording system. Text analysis is based on the term frequency-inverse document frequency (TF-IDF) measure. The resulting video skims capture the important meeting content better than skims obtained by uniform sampling.
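
As a sketch of the text-analysis component, the snippet below scores segment transcripts with TF-IDF so that segments with distinctive vocabulary rank higher in the skim; the transcripts are invented, and the full system fuses this score with the audio- and visual-activity measures.

```python
import math
from collections import Counter

def tfidf_segment_scores(segment_texts):
    """Score each segment transcript by the sum of its terms' TF-IDF."""
    docs = [Counter(t.lower().split()) for t in segment_texts]
    n = len(docs)
    df = Counter()                      # document frequency per term
    for d in docs:
        df.update(d.keys())
    scores = []
    for d in docs:
        total = sum(d.values())
        s = sum((tf / total) * math.log(n / df[w]) for w, tf in d.items())
        scores.append(s)
    return scores

segments = [
    "ok so the budget the budget is due friday",
    "let's move on to the demo of the recorder prototype",
    "ok so let's move on",
]
print(tfidf_segment_scores(segments))  # distinctive segments score higher
```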
