Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Marcel Worring is active.

Publication


Featured researches published by Marcel Worring.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2000

Content-based image retrieval at the end of the early years

Arnold W. M. Smeulders; Marcel Worring; Simone Santini; Amarnath Gupta; Ramesh Jain

Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.


acm multimedia | 2006

The challenge problem for automated detection of 101 semantic concepts in multimedia

Cees G. M. Snoek; Marcel Worring; Jan C. van Gemert; Jan-Mark Geusebroek; Arnold W. M. Smeulders

We introduce the challenge problem for generic video indexing to gain insight in intermediate steps that affect performance of multimedia analysis methods, while at the same time fostering repeatability of experiments. To arrive at a challenge problem, we provide a general scheme for the systematic examination of automated concept detection methods, by decomposing the generic video indexing problem into 2 unimodal analysis experiments, 2 multimodal analysis experiments, and 1 combined analysis experiment. For each experiment, we evaluate generic video indexing performance on 85 hours of international broadcast news data, from the TRECVID 2005/2006 benchmark, using a lexicon of 101 semantic concepts. By establishing a minimum performance on each experiment, the challenge problem allows for component-based optimization of the generic indexing issue, while simultaneously offering other researchers a reference for comparison during indexing methodology development. To stimulate further investigations in intermediate analysis steps that inuence video indexing performance, the challenge offers to the research community a manually annotated concept lexicon, pre-computed low-level multimedia features, trained classifier models, and five experiments together with baseline performance, which are all available at http://www.mediamill.nl/challenge/.


IEEE Transactions on Multimedia | 2009

Learning Social Tag Relevance by Neighbor Voting

Xirong Li; Cees G. M. Snoek; Marcel Worring

Social image analysis and retrieval is important for helping people organize and access the increasing amount of user tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications.


Foundations and Trends in Information Retrieval | 2008

Concept-Based Video Retrieval

Cees G. M. Snoek; Marcel Worring

In this paper, we review 300 references on video retrieval, indicating when text-only solutions are unsatisfactory and showing the promising alternatives which are in majority concept-based. Therefore, central to our discussion is the notion of a semantic concept: an objective linguistic description of an observable entity. Specifically, we present our view on how its automated detection, selection under uncertainty, and interactive usage might solve the major scientific problem for video retrieval: the semantic gap. To bridge the gap, we lay down the anatomy of a concept-based video search engine. We present a component-wise decomposition of such an interdisciplinary multimedia system, covering influences from information retrieval, computer vision, machine learning, and human–computer interaction. For each of the components we review state-of-the-art solutions in the literature, each having different characteristics and merits. Because of these differences, we cannot understand the progress in video retrieval without serious evaluation efforts such as carried out in the NIST TRECVID benchmark. We discuss its data, tasks, results, and the many derived community initiatives in creating annotations and baselines for repeatable experiments. We conclude with our perspective on future challenges and opportunities.


IEEE Transactions on Multimedia | 2007

Adding Semantics to Detectors for Video Retrieval

Cees G. M. Snoek; Bouke Huurnink; Laura Hollink; M. de Rijke; Guus Schreiber; Marcel Worring

In this paper, we propose an automatic video retrieval method based on high-level concept detectors. Research in video analysis has reached the point where over 100 concept detectors can be learned in a generic fashion, albeit with mixed performance. Such a set of detectors is very small still compared to ontologies aiming to capture the full vocabulary a user has. We aim to throw a bridge between the two fields by building a multimedia thesaurus, i.e., a set of machine learned concept detectors that is enriched with semantic descriptions and semantic structure obtained from WordNet. Given a multimodal user query, we identify three strategies to select a relevant detector from this thesaurus, namely: text matching, ontology querying, and semantic visual querying. We evaluate the methods against the automatic search task of the TRECVID 2005 video retrieval benchmark, using a news video archive of 85 h in combination with a thesaurus of 363 machine learned concept detectors. We assess the influence of thesaurus size on video search performance, evaluate and compare the multimodal selection strategies for concept detectors, and finally discuss their combined potential using oracle fusion. The set of queries in the TRECVID 2005 corpus is too small for us to be definite in our conclusions, but the results suggest promising new lines of research.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Watersnakes: energy-driven watershed segmentation

Hieu Tat Nguyen; Marcel Worring; R. van den Boomgaard

The watershed algorithm from mathematical morphology is powerful for segmentation. However, it does not allow incorporation of a priori information as segmentation methods that are based on energy minimization. In particular, there is no control of the smoothness of the segmentation result. In this paper, we show how to represent watershed segmentation as an energy minimization problem using the distance-based definition of the watershed line. A priori considerations about smoothness can then be imposed by adding the contour length to the energy function. This leads to a new segmentation method called watersnakes, integrating the strengths of watershed segmentation and energy based segmentation. Experimental results show that, when the original watershed segmentation has noisy boundaries or wrong limbs attached to the object of interest, the proposed method overcomes those drawbacks and yields a better segmentation.


International Journal of Human-computer Studies \/ International Journal of Man-machine Studies | 2004

Classification of user image descriptions

Laura Hollink; A.Th. Schreiber; Bob J. Wielinga; Marcel Worring

In order to resolve the mismatch between user needs and current image retrieval techniques, we conducted a study to get more information about what users look for in images. First, we developed a framework for the classification of image descriptions by users, based on various classification methods from the literature. The classification framework distinguishes three related viewpoints on images, namely nonvisual metadata, perceptual descriptions and conceptual descriptions. For every viewpoint a set of descriptive classes and relations is specified. We used the framework in an empirical study, in which image descriptions were formulated by 30 participants. The resulting descriptions were split into fragments and categorized in the framework. The results suggest that users prefer general descriptions as opposed to specific or abstract descriptions. Frequently used categories were objects, events and relations between objects in the image.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing

G.G.M. Snoek; Marcel Worring; Jan-Mark Geusebroek; Dennis Koelma; Frank J. Seinstra; Arnold W. M. Smeulders

This paper presents the semantic pathfinder architecture for generic indexing of multimedia archives. The semantic pathfinder extracts semantic concepts from video by exploring different paths through three consecutive analysis steps, which we derive from the observation that produced video is the result of an authoring-driven process. We exploit this authoring metaphor for machine-driven understanding. The pathfinder starts with the content analysis step. In this analysis step, we follow a data-driven approach of indexing semantics. The style analysis step is the second analysis step. Here, we tackle the indexing problem by viewing a video from the perspective of production. Finally, in the context analysis step, we view semantics in context. The virtue of the semantic pathfinder is its ability to learn the best path of analysis steps on a per-concept basis. To show the generality of this novel indexing approach, we develop detectors for a lexicon of 32 concepts and we evaluate the semantic pathfinder against the 2004 NIST TRECVID video retrieval benchmark, using a news archive of 64 hours. Top ranking performance in the semantic concept detection task indicates the merit of the semantic pathfinder for generic indexing of multimedia archives


Journal of Visual Languages and Computing | 2008

Interactive access to large image collections using similarity-based visualization

Giang P. Nguyen; Marcel Worring

Image collections are getting larger and larger. To access those collections, systems for managing, searching, and browsing are necessary. Visualization plays an essential role in such systems. Existing visualization systems do not analyze all the problems occurring when dealing with large visual collections. In this paper, we make these problems explicit. From there, we establish three general requirements: overview, visibility, and structure preservation. Solutions for each requirement are proposed, as well as functions balancing the different requirements. We present an optimal visualization scheme, supporting users in interacting with large image collections. Experimental results with a collection of 10,000 Corel images, using simulated user actions, show that the proposed scheme significantly improves performance for a given task compared to the 2D grid-based visualizations commonly used in content-based image retrieval.


International Journal on Document Analysis and Recognition | 2002

Document Understanding for a Broad Class of Documents

Marco Aiello; Christof Monz; Leon Todoran; Marcel Worring

We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and spatial relations to the textual features and content are employed in the analysis. To deal effectively with these information sources, we define a document representation general and flexible enough to represent complex documents. To handle such a broad document class, it uses generic document knowledge only, which is identified explicitly. The proposed system inte- grates components based on computer vision, artificial intelligence, and natural language processing techniques. The system is fully implemented and experimental re- sults on heterogeneous collections of documents for each component and for the entire system are presented.

Collaboration


Dive into the Marcel Worring's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Ork de Rooij

University of Amsterdam

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

O. de Rooij

University of Amsterdam

View shared research outputs
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge