Gerben Klaas Dirk de Vries
University of Amsterdam
Publications
Featured research published by Gerben Klaas Dirk de Vries.
Expert Systems With Applications | 2012
Gerben Klaas Dirk de Vries; Maarten van Someren
In this paper we present a machine learning framework to analyze moving object trajectories from maritime vessels. Within this framework we perform the tasks of clustering, classification and outlier detection with vessel trajectory data. First, we apply a piecewise linear segmentation method to the trajectories to compress them. We adapt an existing technique to better retain stop and move information and demonstrate its improved performance experimentally. Second, we use a similarity-based approach to perform the clustering, classification and outlier detection tasks using kernel methods. We present experiments that investigate different alignment kernels and the effect of piecewise linear segmentation in the three different tasks. The experimental results show that compression does not negatively impact task performance and greatly reduces computation time for the alignment kernels. Finally, the alignment kernels allow for easy integration of geographical domain knowledge. In experiments we show that this added domain knowledge enhances performance in the clustering and classification tasks.
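The piecewise linear segmentation step can be sketched as a classic top-down (Douglas-Peucker-style) algorithm. This is a minimal illustration under the assumption of 2D trajectory points, not the authors' exact stop-and-move-preserving adaptation:

```python
import math

def point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    chord_len = math.hypot(dx, dy)
    if chord_len == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / chord_len

def segment(traj, eps):
    """Top-down piecewise linear segmentation: recursively keep the point
    farthest from the chord between the endpoints while its error exceeds eps."""
    if len(traj) < 3:
        return list(traj)
    idx, dmax = 0, 0.0
    for i in range(1, len(traj) - 1):
        d = point_line_dist(traj[i], traj[0], traj[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= eps:
        # all intermediate points are within tolerance: compress to a line
        return [traj[0], traj[-1]]
    left = segment(traj[:idx + 1], eps)
    right = segment(traj[idx:], eps)
    return left[:-1] + right  # drop the duplicated split point
```

The paper's adaptation additionally preserves stops (long spans of little movement), which the plain error-threshold version above would compress away.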
International Journal of Geographical Information Science | 2010
Niels Willems; Willem Robert van Hage; Gerben Klaas Dirk de Vries; Jeroen H.M. Janssens; Véronique Malaisé
We present an integrated and multidisciplinary approach for analyzing the behavior of moving objects. The results originate from ongoing research by four different partners in the Dutch Poseidon project (Embedded Systems Institute (2007)), which aims to develop new methods for Maritime Safety and Security (MSS) systems to monitor vessel traffic in coastal areas. Our architecture enables an operator to visually test hypotheses about vessels with time-dependent sensor data and on-demand external knowledge. The system includes the following components: abstraction and simulation of trajectory sensor data, fusion of multiple heterogeneous data sources, reasoning, and visual analysis of the combined data sources. We start by extracting segments of consistent movement from simulated or real-world trajectory data, which we store as instances of the Simple Event Model (SEM), an event ontology represented in the Resource Description Framework (RDF). Next, we add data from the web about vessels and geography to enrich the sensor data. This additional information is integrated with the representation of the vessels (actors) and places in SEM. The enriched trajectory data are stored in a knowledge base, which can be further annotated by reasoning and is queried by a visual analytics tool to search for spatiotemporal patterns. Although our approach is dedicated to MSS systems, we expect it to be useful in other domains.
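The storage of one extracted movement segment as a SEM event instance can be sketched as a handful of RDF triples. The instance namespace, identifiers and timestamps below are hypothetical, and plain tuples stand in for an RDF library:

```python
# Standard SEM namespace; the EX instance namespace is a made-up example.
SEM = "http://semanticweb.cs.vu.nl/2009/11/sem/"
EX = "http://example.org/poseidon/"

def movement_event(event_id, vessel, place, begin, end):
    """Return (subject, predicate, object) triples describing one SEM event
    with its actor (the vessel), place, and begin/end timestamps."""
    e = EX + event_id
    return [
        (e, "rdf:type", SEM + "Event"),
        (e, SEM + "hasActor", EX + vessel),
        (e, SEM + "hasPlace", EX + place),
        (e, SEM + "hasBeginTimeStamp", begin),
        (e, SEM + "hasEndTimeStamp", end),
    ]

triples = movement_event("move42", "vessel_1234", "port_of_rotterdam",
                         "2009-06-01T10:00:00", "2009-06-01T10:20:00")
```

In the actual system such triples live in a knowledge base where reasoning can add further annotations.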
Caries Research | 2009
Willem Robert van Hage; Véronique Malaisé; Gerben Klaas Dirk de Vries; Guus Schreiber; Maarten van Someren
Bridging the gap between low-level features and semantics is a problem commonly acknowledged in the Multimedia community. Event modeling can fill the gap. In this paper we present the Simple Event Model (SEM) and its application in a Maritime Safety and Security use case about Situational Awareness. We show how we abstract over low-level features, recognize simple behavior events using a Piecewise Linear Segmentation algorithm, and model the events as instances of SEM. We apply deduction rules, spatial proximity reasoning, and semantic web reasoning in SWI-Prolog to derive abstract events from the recognized simple events. The use case described in this paper comes from the Dutch Poseidon project.
european conference on machine learning | 2013
Gerben Klaas Dirk de Vries
We introduce an approximation of the Weisfeiler-Lehman graph kernel algorithm aimed at improving the computation time of the kernel when applied to Resource Description Framework (RDF) data. RDF is the representation/storage format of the semantic web and it essentially represents a graph. One direction for learning from the semantic web is using graph kernel methods on RDF. This is a very generic and flexible approach to learning from the semantic web, since it requires no knowledge of the semantics of the dataset and can be applied to nearly all linked data. Graph kernel computation is in general slow, since it is often based on computing some form of expensive (iso)morphism between graphs. We present an approximation of the Weisfeiler-Lehman (WL) graph kernel [2] to speed up the computation of this kernel on RDF data. Typically, applying graph kernels to RDF is done by extracting subgraphs from a large underlying RDF graph and computing the kernel on this set of subgraphs. Our approximation exploits the fact that the subgraph instances are extracted from the same RDF graph. We adapt the WL algorithm to compute the kernel directly on the underlying graph, while maintaining a subgraph perspective for each instance. We compare the performance of this kernel to the graph kernels designed for RDF described in [1]. For this comparison we use three property prediction tasks on RDF data from two datasets. In each task we try to predict a property for a certain class of resources. For instance, the first task is predicting the affiliation of the people in a research institute, for which the data is modeled as RDF. Furthermore, we compare the computation time of the different kernels. In all three tasks, our kernel shows performance that is better than the regular Weisfeiler-Lehman kernel applied to RDF.
It also becomes increasingly efficient as the number of instances grows, by exploiting the fact that the RDF instance subgraphs share vertices and edges in the underlying large RDF graph. Furthermore, the presented kernel is faster and/or shows better classification performance than the intersection subtree and intersection graph kernels for RDF introduced in [1]. The performance difference between the presented approximation of the WL Subtree kernel and the regular version requires further investigation.
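The Weisfeiler-Lehman relabeling that the approximation builds on can be sketched as follows. This shows only the standard per-graph WL subtree feature map with a label-compression dictionary shared across graphs, not the adapted version computed directly on the shared underlying RDF graph:

```python
from collections import Counter

def wl_relabel(labels, adj, compressed):
    """One WL iteration: each vertex's new label encodes its old label plus
    the sorted multiset of its neighbours' labels. `compressed` maps these
    signatures to integers and must be shared across all compared graphs."""
    new_labels = {}
    for v in labels:
        sig = (labels[v], tuple(sorted(labels[u] for u in adj[v])))
        if sig not in compressed:
            compressed[sig] = len(compressed)
        new_labels[v] = compressed[sig]
    return new_labels

def wl_features(labels, adj, iterations, compressed):
    """Histogram of (iteration, label) pairs: the WL subtree feature map."""
    feats = Counter((0, l) for l in labels.values())
    for it in range(1, iterations + 1):
        labels = wl_relabel(labels, adj, compressed)
        feats.update((it, l) for l in labels.values())
    return feats

def wl_kernel(f1, f2):
    """WL subtree kernel value: dot product of two feature histograms."""
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())
```

The paper's contribution is that, for RDF, these feature maps can be maintained per instance on the single underlying graph instead of on separately extracted subgraphs.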
Multimedia Tools and Applications | 2012
Willem Robert van Hage; Véronique Malaisé; Gerben Klaas Dirk de Vries; Guus Schreiber; Maarten van Someren
Bridging the gap between low-level features and semantics is a problem commonly acknowledged in the Multimedia community. Event modeling can fill this gap by representing knowledge about the data at different levels of abstraction. In this paper we present the Simple Event Model (SEM) and its application in a Maritime Safety and Security use case about Situational Awareness, where the data also come as low-level features (of ship trajectories). We show how we abstract over these low-level features, recognize simple behavior events using a Piecewise Linear Segmentation algorithm, and model the resulting events as instances of SEM. We aggregate web data from different sources, apply deduction rules, spatial proximity reasoning, and semantic web reasoning in SWI-Prolog to derive abstract events from the recognized simple events. The use case described in this paper comes from the Dutch Poseidon project.
european conference on machine learning | 2010
Gerben Klaas Dirk de Vries; Maarten van Someren
In this paper we apply a selection of alignment measures, such as dynamic time warping and edit distance, to the problem of clustering vessel trajectories. Vessel trajectories are an example of moving object trajectories, which have recently become an important research topic. The alignment measures are defined as kernels and are used in the kernel k-means clustering algorithm. We investigate the performance of these alignment kernels in combination with a trajectory compression method. Experiments on a gold standard dataset indicate that compression has a positive effect on clustering performance for a number of alignment measures. Also, soft-max kernels, based on summing all alignments, perform worse than classic kernels, based on taking the score of the best alignment.
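Dynamic time warping, one of the alignment measures mentioned, can be sketched as follows. Wrapping the distance in a Gaussian is one common way to obtain a kernel value (the bandwidth here is an illustrative parameter, and such alignment kernels are not guaranteed positive semi-definite):

```python
import math

def dtw(a, b):
    """Classic dynamic time warping distance between two trajectories,
    given as sequences of (x, y) points, with Euclidean point cost."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.hypot(a[i - 1][0] - b[j - 1][0],
                              a[i - 1][1] - b[j - 1][1])
            # extend the cheapest of the three possible alignments
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def gaussian_dtw_kernel(a, b, sigma=1.0):
    """Turn the alignment score of the best warping path into a kernel value."""
    return math.exp(-dtw(a, b) ** 2 / (2 * sigma ** 2))
```

This corresponds to the "classic" kernels in the abstract, based on the score of the best alignment; the soft-max variants instead sum over all alignments.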
Journal of Web Semantics | 2015
Gerben Klaas Dirk de Vries; Steven de Rooij
In this paper we introduce a framework for learning from RDF data using graph kernels that count substructures in RDF graphs, which systematically covers most of the existing kernels previously defined and provides a number of new variants. Our definitions include fast kernel variants that are computed directly on the RDF graph. To improve the performance of these kernels we detail two strategies. The first strategy involves ignoring the vertex labels that have a low frequency among the instances. Our second strategy is to remove hubs to simplify the RDF graphs. We test our kernels in a number of classification experiments with real-world RDF datasets. Overall the kernels that count subtrees show the best performance. However, they are closely followed by simple bag-of-labels baseline kernels. The direct kernels substantially decrease computation time, while keeping performance the same. For the walk counting kernel this decrease in computation time is so large that it thereby becomes a computationally viable kernel to use. Ignoring low-frequency labels improves the performance for all datasets. The hub removal algorithm increases performance on two out of three of our smaller datasets, but has little impact when used on our larger datasets. Highlights: a systematic graph kernel framework for RDF; fast computation algorithms; low-frequency labels and hub removal on RDF to enhance machine learning.
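The hub-removal strategy can be illustrated with a minimal sketch; the in-degree threshold scheme below is a simplification of the paper's algorithm, with a hypothetical threshold parameter:

```python
from collections import Counter

def remove_hubs(triples, max_indegree):
    """Drop triples whose object is a hub, i.e. a node whose in-degree in
    the RDF graph exceeds max_indegree. Hubs (e.g. very common types or
    values) carry little discriminative information but inflate subgraphs."""
    indeg = Counter(o for _, _, o in triples)
    return [(s, p, o) for (s, p, o) in triples if indeg[o] <= max_indegree]
```

Applied before subgraph extraction, this shrinks the instance subgraphs and thus the kernel computation.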
extended semantic web conference | 2012
Rinke Hoekstra; Sara Magliacane; Laurens Rietveld; Gerben Klaas Dirk de Vries; Adianto Wibisono; Stefan Schlobach
The AERS dataset is one of the few remaining large, publicly available medical datasets that until now has not been published as Linked Data. It is uniquely positioned amidst other medical datasets. This paper describes the Hubble prototype system for clinical decision support that demonstrates the speed, ease and flexibility of producing and using a Linked Data version of the AERS dataset for clinical practice and research.
international conference on data mining | 2010
Gerben Klaas Dirk de Vries; Willem Robert van Hage; Maarten van Someren
This paper presents a similarity measure that combines low-level trajectory information with geographical domain knowledge to compare vessel trajectories. The similarity measure is largely based on alignment techniques. In a clustering experiment we show how the measure can be used to discover behavior concepts in vessel trajectory data that are dependent both on the low-level trajectories and the domain knowledge. We also apply this measure in a classification task to predict the type of vessel. In this task the combined measure performs better than similarities based on domain knowledge or low-level information alone.
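Combining a low-level trajectory similarity with a domain-knowledge similarity can be sketched as a convex combination of two kernels, which stays positive semi-definite when both inputs are; the weighting scheme and the toy kernels below are illustrative, not the paper's exact construction:

```python
def combined_kernel(k_low, k_domain, w=0.5):
    """Convex combination of two kernel functions; PSD if both inputs are
    PSD and 0 <= w <= 1. The weight w is an illustrative parameter."""
    def k(a, b):
        return w * k_low(a, b) + (1.0 - w) * k_domain(a, b)
    return k

# Toy usage: a low-level similarity from trajectory length and a domain
# similarity from a shared geographical region label (both hypothetical).
k_low = lambda a, b: 1.0 if len(a["points"]) == len(b["points"]) else 0.0
k_dom = lambda a, b: 1.0 if a["region"] == b["region"] else 0.0
k = combined_kernel(k_low, k_dom, w=0.7)
```

In the paper the two components are alignment-based rather than these toy comparisons, but the combination idea is the same.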
international provenance and annotation workshop | 2014
Adianto Wibisono; Peter Bloem; Gerben Klaas Dirk de Vries; Paul T. Groth; Adam Belloum; Marian Bubak
Electronic notebooks are a common mechanism for scientists to document and investigate their work. With the advent of tools such as IPython Notebooks and Knitr, these notebooks allow code and data to be mixed together and published online. However, these approaches assume that all work is done in the same notebook environment. In this work, we look at generating notebook documentation from multi-environment workflows by using provenance represented in the W3C PROV model. Specifically, using PROV generated from the Ducktape workflow system, we are able to generate IPython notebooks that include results tables, provenance visualizations as well as references to the software and datasets used. The notebooks are interactive and editable, so that the user can explore and analyze the results of the experiment without re-running the workflow. We identify specific extensions to PROV necessary for facilitating documentation generation. To evaluate, we recreate the documentation website for a paper which won the Open Science Award at the ECML/PKDD 2013 machine learning conference. We show that the documentation produced automatically by our system provides more detail and greater experimental insight than the original hand-crafted documentation. Our approach bridges the gap between user friendly notebook documentation and provenance generated by distributed heterogeneous components.