Remko Scha
University of Amsterdam
Publication
Featured research published by Remko Scha.
IEEE Transactions on Multimedia | 2010
Jasper R. R. Uijlings; Arnold W. M. Smeulders; Remko Scha
As datasets grow increasingly large in content-based image and video retrieval, computational efficiency of concept classification is important. This paper reviews techniques to accelerate concept classification, where we show the trade-off between computational efficiency and accuracy. As a basis, we use the Bag-of-Words algorithm that in the 2008 benchmarks of TRECVID and PASCAL led to the best performance scores. We divide the evaluation into three steps: 1) Descriptor Extraction, where we evaluate SIFT, SURF, DAISY, and Semantic Textons. 2) Visual Word Assignment, where we compare a k-means visual vocabulary with a Random Forest and evaluate subsampling, dimension reduction with PCA, and division strategies of the Spatial Pyramid. 3) Classification, where we evaluate the χ2, RBF, and Fast Histogram Intersection kernel for the SVM. Apart from the evaluation, we accelerate the calculation of densely sampled SIFT and SURF, accelerate nearest neighbor assignment, and improve accuracy of the Histogram Intersection kernel. We conclude by discussing whether further acceleration of the Bag-of-Words pipeline is possible. Our results lead to a 7-fold speed increase without accuracy loss, and a 70-fold speed increase with 3% accuracy loss. The latter system does classification in real-time, which opens up new applications for automatic concept classification. For example, this system permits five standard desktop PCs to automatically tag for 20 classes all images that are currently uploaded to Flickr.
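To make the classification step in this abstract more concrete, the following is a minimal sketch of two of the kernels it names, computed on a pair of L1-normalized visual-word histograms. The function names, the 0.5 factor in the χ2 distance, the exponential form of the χ2 kernel, and the `gamma` parameter are illustrative assumptions; the abstract does not specify the exact formulation used in the paper.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of element-wise minima."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.minimum(h1, h2).sum())

def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance, assumed here as 0.5 * sum((a - b)^2 / (a + b))."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    denom = h1 + h2
    mask = denom > eps  # skip bins that are empty in both histograms
    return float(0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask]))

def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel (assumed form): exp(-gamma * distance)."""
    return float(np.exp(-gamma * chi2_distance(h1, h2)))

# Hypothetical 3-bin visual-word histograms, each summing to 1.
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.4, 0.4, 0.2])

hi_ab = histogram_intersection(a, b)
chi_ab = chi2_kernel(a, b)
```

Either kernel value could then be placed in a precomputed Gram matrix for an SVM; the intersection kernel needs only minima and additions, which is one reason it is attractive when speed matters.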
conference on image and video retrieval | 2009
Jasper R. R. Uijlings; Arnold W. M. Smeulders; Remko Scha
We start from the state-of-the-art Bag of Words pipeline that in the 2008 benchmarks of TRECVID and PASCAL yielded the best performance scores. We have contributed to that pipeline, which now forms the basis to compare various fast alternatives for all of its components: (i) For descriptor extraction we propose a fast algorithm to densely sample SIFT and SURF, and we compare several variants of these descriptors. (ii) For descriptor projection we compare a k-means visual vocabulary with a Random Forest. As a preprojection step we experiment with PCA on the descriptors to decrease projection time. (iii) For classification we use Support Vector Machines and compare the χ2 kernel with the RBF kernel. Our results lead to a 10-fold speed increase without any loss of accuracy and to a 30-fold speed increase with 17% loss of accuracy, where the latter system does real-time classification at 26 images per second.
international conference on computational linguistics | 1988
Remko Scha; Livia Polanyi
This paper presents an augmented context free grammar which describes important features of the surface structure and the semantics of discourse in a formal way, integrating new as well as previously existing insights into a unified framework. The structures covered include lists, narratives, subordinating and coordinating rhetorical relations, topic chains and interruptions. The paper discusses the problem of parsing discourse, and compares different grammatical formalisms which could be used for describing discourse structure.
meeting of the association for computational linguistics | 1997
Remko Bonnema; Rens Bod; Remko Scha
In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotated sentences is used, the same approach can also generate the most probable semantic interpretation of an input sentence. The present paper explains this semantic interpretation method. A data-oriented semantic interpretation algorithm was tested on two semantically annotated corpora: the English ATIS corpus and the Dutch OVIS corpus. Experiments show an increase in semantic accuracy if larger corpus fragments are taken into consideration.
computer vision and pattern recognition | 2009
Jasper R. R. Uijlings; Arnold W. M. Smeulders; Remko Scha
This paper discusses the question: Can we improve the recognition of objects by using their spatial context? We start from Bag-of-Words models and use the Pascal 2007 dataset. We use the rough object bounding boxes that come with this dataset to investigate the fundamental gain context can bring. Our main contributions are: (I) The result of Zhang et al. in CVPR07 that context is superfluous, derived from the Pascal 2005 dataset of 4 classes, does not generalize to this dataset. For our larger and more realistic dataset, context is indeed important. (II) Using the rough bounding box to limit or extend the scope of an object during both training and testing, we find that the spatial extent of an object is determined by its category: (a) Well-defined, rigid objects have the object itself as the preferred spatial extent. (b) Non-rigid objects have an unbounded spatial extent: all spatial extents produce equally good results. (c) Objects primarily categorised based on their function have the whole image as their spatial extent. Finally, (III) using the rough bounding box to treat object and context separately, we find that the upper bound of improvement is 26% (12% absolute) in terms of mean average precision, and this bound is likely to be higher if the localisation is done using segmentation. It is concluded that object localisation, if done sufficiently precisely, helps considerably in the recognition of objects for the Pascal 2007 dataset.
Linguistics and Philosophy | 1994
Hub Prüst; Remko Scha; Martin van den Berg
We argue that an adequate treatment of verb phrase anaphora (VPA) must depart in two major respects from the standard approaches. First of all, VP anaphors cannot be resolved by simply identifying the anaphoric VP with an antecedent VP. The resolution process must establish a syntactic/semantic parallelism between larger units (clauses or discourse constituent units) that the VPs occur in. Secondly, discourse structure has a significant influence on the reference possibilities of VPA. This influence must be accounted for. We propose a treatment which meets these requirements. It builds on a discourse grammar which characterizes discourse cohesion by means of a syntactic/semantic matching procedure which recognizes parallel structures in discourse. It turns out that this independently motivated procedure yields the resolution of VPA as a side effect.
Corpus-Based Methods in Language and Speech | 1997
Rens Bod; Remko Scha
In this chapter, we discuss the data-oriented approach to language processing (Scha, 1990–92; Bod, 1992–96; Sima’an, 1995–96; Sekine & Grishman, 1995; Rajman, 1995a/b; Charniak, 1996; Goodman, 1996). Systems based on this approach maintain a corpus of analyses of previously occurring utterances, and analyze new input by combining fragments of the utterance-analyses from the corpus; the occurrence-frequencies of these fragments are used to estimate which analysis is the most probable one. This chapter motivates this idea, discusses some algorithms that implement it, and reports on a number of experiments. We will only consider syntactic aspects of language here, focusing on parsing and syntactic disambiguation. For a treatment of semantic interpretation in this framework, see Bod et al. (1996).
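As a rough illustration of how fragment occurrence frequencies can yield a probability for an analysis, the sketch below follows one common formulation of data-oriented parsing: a fragment's probability is its count divided by the total count of fragments sharing its root label, a derivation's probability is the product of its fragment probabilities, and an analysis's probability is the sum over its derivations. This formulation, the toy counts, and all names in the code are assumptions for illustration; the abstract itself does not spell out the estimator.

```python
from collections import Counter
from math import prod

# Hypothetical fragment counts, keyed by (root_label, fragment_description).
fragment_counts = Counter({
    ("S", "S -> NP VP"): 3,
    ("S", "S -> [NP she] VP"): 1,
    ("NP", "NP -> she"): 2,
    ("NP", "NP -> he"): 2,
    ("VP", "VP -> sleeps"): 4,
})

def fragment_probability(fragment, counts):
    """Relative frequency of a fragment among fragments with the same root."""
    root = fragment[0]
    total = sum(c for (r, _), c in counts.items() if r == root)
    return counts[fragment] / total

def derivation_probability(derivation, counts):
    """Product of the probabilities of the fragments combined in a derivation."""
    return prod(fragment_probability(f, counts) for f in derivation)

def analysis_probability(derivations, counts):
    """Sum over all derivations that produce the same analysis."""
    return sum(derivation_probability(d, counts) for d in derivations)

# Two derivations of the same tree for "she sleeps".
d1 = [("S", "S -> NP VP"), ("NP", "NP -> she"), ("VP", "VP -> sleeps")]
d2 = [("S", "S -> [NP she] VP"), ("VP", "VP -> sleeps")]

p_tree = analysis_probability([d1, d2], fragment_counts)
```

The second derivation uses a larger fragment that already contains the subject, which shows how one analysis can be assembled from the corpus in more than one way.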
Leonardo Music Journal | 2002
Arthur Elsenaar; Remko Scha
The authors trace the history of electric performance art. They begin with the roots of this art form in 18th-century experiments with animal electricity and artificial electricity, which were often performed as public demonstrations in royal courts and anatomical theaters. Next, the authors sketch the development of increasingly powerful techniques for the generation of electric current and their applications in destructive body manipulation, culminating in the electric chair. Finally, they discuss the development of electric muscle-control technology, from its 18th-century beginnings through Duchenne de Boulogne's photo sessions to the current work of Stelarc and Arthur Elsenaar.
empirical methods in natural language processing | 2009
Reut Tsarfaty; Khalil Sima'an; Remko Scha
Applying statistical parsers developed for English to languages with freer word-order has turned out to be harder than expected. This paper investigates the adequacy of different statistical parsing models for dealing with a (relatively) free word-order language. We show that the recently proposed Relational-Realizational (RR) model consistently outperforms state-of-the-art Head-Driven (HD) models on the Hebrew Treebank. Our analysis reveals a weakness of HD models: their intrinsic focus on configurational information. We conclude that the form-function separation ingrained in RR models makes them better suited for parsing nonconfigurational phenomena.
Journal of Mathematical Psychology | 2003
Mehdi Dastani; Remko Scha
Any formal model of visual Gestalt perception requires a language for representing possible perceptual structures of visual stimuli, as well as a decision criterion that selects the actually perceived structure of a stimulus among its possible alternatives. This paper discusses an existing model of visual Gestalt perception that is based on Structural Information Theory. We investigate two factors that determine the representational power of this model: the domain of visual stimuli that can be analyzed, and the class of perceptual structures that can be generated for these stimuli. We show that the representational power of the existing model of Structural Information Theory is limited, and that some of the generated structures are perceptually inadequate. We argue that these limitations do not imply the implausibility of the underlying ideas of Structural Information Theory and introduce alternative models based on the same ideas. For each of these models, the domain of visual stimuli that can be analyzed properly is formally defined. We show that the models are conservative modifications of the original model of Structural Information Theory: for cases that are adequately analyzed in the original model of Structural Information Theory, they yield the same results.