Bouke Huurnink
University of Amsterdam
Publications
Featured research published by Bouke Huurnink.
IEEE Transactions on Multimedia | 2007
Cees G. M. Snoek; Bouke Huurnink; Laura Hollink; M. de Rijke; Guus Schreiber; Marcel Worring
In this paper, we propose an automatic video retrieval method based on high-level concept detectors. Research in video analysis has reached the point where over 100 concept detectors can be learned in a generic fashion, albeit with mixed performance. Such a set of detectors is still very small compared to ontologies that aim to capture the full vocabulary a user has. We aim to bridge the two fields by building a multimedia thesaurus, i.e., a set of machine-learned concept detectors that is enriched with semantic descriptions and semantic structure obtained from WordNet. Given a multimodal user query, we identify three strategies to select a relevant detector from this thesaurus, namely: text matching, ontology querying, and semantic visual querying. We evaluate the methods against the automatic search task of the TRECVID 2005 video retrieval benchmark, using a news video archive of 85 h in combination with a thesaurus of 363 machine-learned concept detectors. We assess the influence of thesaurus size on video search performance, evaluate and compare the multimodal selection strategies for concept detectors, and finally discuss their combined potential using oracle fusion. The set of queries in the TRECVID 2005 corpus is too small for us to be definite in our conclusions, but the results suggest promising new lines of research.
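The text matching strategy mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the Jaccard overlap score, the function names, and the two detector descriptions below are all assumptions chosen only to show the idea of picking the detector whose textual description best matches the query.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_detector(query, detectors):
    """Return (name, score) of the detector whose description best overlaps the query.

    `detectors` maps a detector name to a textual description (hypothetical data).
    The score is the Jaccard overlap of token sets, an assumption of this sketch.
    Returns (None, 0.0) when no description shares a token with the query.
    """
    query_terms = tokenize(query)
    best_name, best_score = None, 0.0
    for name, description in detectors.items():
        terms = tokenize(description)
        union = query_terms | terms
        score = len(query_terms & terms) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical two-entry thesaurus, for illustration only.
example_detectors = {
    "boat": "a vessel that travels on water",
    "road": "a paved way for vehicles",
}
best, score = select_detector("find shots of a boat on water", example_detectors)
```

A real system would likely weight terms and remove stop words; the plain set overlap here keeps the example short.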
International Semantic Web Conference | 2009
Edgar Meij; Marc Bron; Laura Hollink; Bouke Huurnink; Maarten de Rijke
An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.
Theory and Practice of Digital Libraries | 2011
Marc Bron; Bouke Huurnink; Maarten de Rijke
News, multimedia and cultural heritage archives are increasingly offering opportunities to create connections between their collections. We consider the task of linking archives: connecting an item in one archive to one or more items in other, often complementary archives. We focus on a specific instance of the task: linking items with a rich textual representation in a news archive to items with sparse annotations in a multimedia archive, where items should be linked if they describe the same or a related event. We find that the difference in textual richness of annotations presents a challenge and investigate two approaches: (i) to enrich sparsely annotated items with textually rich content; and (ii) to reduce rich news archive items using term selection. We demonstrate the positive impact of both approaches on linking to same events and linking to related events.
Multimedia Information Retrieval | 2008
Bouke Huurnink; Katja Hofmann; Maarten de Rijke
We explore the use of benchmarks to address the problem of assessing concept selection in video retrieval systems. Two benchmarks are presented, one created by human association of queries to concepts, the other generated from an extensively tagged collection. They are compared in terms of reliability, captured semantics, and retrieval performance. Recommendations are given for using the benchmarks to assess concept selection algorithms; the assessment is demonstrated on two existing algorithms. The benchmarks are released to the research community.
Interdisciplinary Science Reviews | 2009
Laura Hollink; Guus Schreiber; Bouke Huurnink; Michiel van Liempt; Maarten de Rijke; Arnold W. M. Smeulders; Johan Oomen; Annemieke de Jong
Audiovisual material is a vital component of the world's heritage but it remains difficult to access. With the Netherlands Institute for Sound and Vision as one of its partners, the MuNCH project aims to investigate new methods for improving access to a wide range of audiovisual documents. MuNCH brings together three research fields: multimedia analysis, language technology and semantic technologies. Within the MuNCH project, we have investigated several combinations of these fields. We have compared text matching, ontology querying, and semantic visual querying as methods to translate a multimedia query to the vocabulary of the retrieval system. In addition, we have investigated how users make such a translation, and have used this as a benchmark to create automatic methods. We have used multimedia technology to automatically detect objects and scenes as they occur in video, and made use of language technology to exploit automatic transcriptions of speech. We have enriched the Sound and Vision thesaurus that is used to annotate the TV programmes in order to provide a user with a wider range of search results. In order to verify the results of the project against real user needs, MuNCH has participated in the creation of a logging system which monitors the usage of the Sound and Vision catalogue system. Insights into the needs of real users will be used as input for all three of MuNCH's research strands.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2010
Katja Hofmann; Bouke Huurnink; Marc Bron; Maarten de Rijke
Traditional retrieval evaluation uses explicit relevance judgments which are expensive to collect. Relevance assessments inferred from implicit feedback such as click-through data can be collected inexpensively, but may be less reliable. We compare assessments derived from click-through data to another source of implicit feedback that we assume to be highly indicative of relevance: purchase decisions. Evaluating retrieval runs based on a log of an audio-visual archive, we find agreement between system rankings and purchase decisions to be surprisingly high.
Cross-Language Evaluation Forum | 2010
Bouke Huurnink; Katja Hofmann; Maarten de Rijke; Marc Bron
We design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained using logged commercial searches and associated purchase decisions. While no simulator reproduces an ideal ranking, there is a large variation in simulator performance that allows us to distinguish those that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
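The validation idea described here, checking whether simulated data ranks retrieval systems in the same way as real-world data, can be sketched with a rank correlation. The abstract does not name a specific measure; Kendall's tau is used below as an assumption, and the system names and scores are made up for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts defined over the same systems.

    +1.0 means both sources order every pair of systems identically,
    -1.0 means every pair is ordered oppositely. Tied pairs count as neither.
    """
    if set(scores_a) != set(scores_b):
        raise ValueError("both score dicts must cover the same systems")
    systems = sorted(scores_a)
    n_pairs = len(systems) * (len(systems) - 1) // 2
    if n_pairs == 0:
        raise ValueError("need at least two systems to compare rankings")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores for three systems, illustration only.
real_scores = {"A": 0.30, "B": 0.25, "C": 0.10}
simulated_scores = {"A": 0.50, "B": 0.20, "C": 0.30}
tau = kendall_tau(real_scores, simulated_scores)
```

Under this sketch, a simulator whose output yields a tau closer to 1.0 against the real-world ranking would be considered better suited to building artificial testbeds.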
Conference on Image and Video Retrieval | 2010
Bouke Huurnink; Cees G. M. Snoek; Maarten de Rijke; Arnold W. M. Smeulders
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. We investigate to what extent content-based video retrieval methods can improve search in the audiovisual archive. In particular, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches and content purchases from an existing audiovisual archive to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. We find that incorporating content-based video retrieval into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. Our experiments also indicate that individual content-based retrieval methods yield approximately equal performance gains. We conclude that the time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
IEEE Transactions on Multimedia | 2012
Bouke Huurnink; Cees G. M. Snoek; M. de Rijke; Arnold W. M. Smeulders
Content-based video retrieval is maturing to the point where it can be used in real-world retrieval practices. One such practice is the audiovisual archive, whose users increasingly require fine-grained access to broadcast television content. In this paper, we take into account the information needs and retrieval data already present in the audiovisual archive, and demonstrate that retrieval performance can be significantly improved when content-based methods are applied to search. To the best of our knowledge, this is the first time that the practice of an audiovisual archive has been taken into account for quantitative retrieval evaluation. To arrive at our main result, we propose an evaluation methodology tailored to the specific needs and circumstances of the audiovisual archive, which are typically missed by existing evaluation initiatives. We utilize logged searches, content purchases, session information, and simulators to create realistic query sets and relevance judgments. To reflect the retrieval practice of both the archive and the video retrieval community as closely as possible, our experiments with three video search engines incorporate archive-created catalog entries as well as state-of-the-art multimedia content analysis results. A detailed query-level analysis indicates that individual content-based retrieval methods such as transcript-based retrieval and concept-based retrieval yield approximately equal performance gains. We find that, when these methods are combined, incorporating content-based video retrieval into the archive's practice results in significant performance increases for shot retrieval and for retrieving entire television programs. The time has come for audiovisual archives to start accommodating content-based video retrieval methods into their daily practice.
Multimedia Information Retrieval | 2007
Bouke Huurnink; Maarten de Rijke
Video producers, in telling a news story, tend to repeat important visual and speech material multiple times in adjacent shots, thus creating a certain level of redundancy. We describe this phenomenon, and use it to develop a framework to incorporate redundancy for cross-channel retrieval of visual items using speech. Testing our models in a series of retrieval experiments, we find that incorporating the fact that information occurs redundantly into cross-channel retrieval leads to significant improvements in retrieval performance.