Sander Dieleman | Researchain

Publication

Featured research published by Sander Dieleman.

Monthly Notices of the Royal Astronomical Society | 2015

Rotation-invariant convolutional neural networks for galaxy morphology prediction

Sander Dieleman; Kyle W. Willett; Joni Dambre

Measuring the morphological parameters of galaxies is a key requirement for studying their formation and evolution. Surveys such as the Sloan Digital Sky Survey have resulted in the availability of very large collections of images, which have permitted population-wide analyses of galaxy morphology. Morphological analysis has traditionally been carried out mostly via visual inspection by trained experts, which is time consuming and does not scale to large (≳10^4) numbers of images. Although attempts have been made to build automated classification systems, these have not been able to achieve the desired level of accuracy. The Galaxy Zoo project successfully applied a crowdsourcing strategy, inviting online users to classify images by answering a series of questions. Unfortunately, even this approach does not scale well enough to keep up with the increasing availability of galaxy images. We present a deep neural network model for galaxy morphology classification which exploits translational and rotational symmetry. It was developed in the context of the Galaxy Challenge, an international competition to build the best model for morphology classification based on annotated images from the Galaxy Zoo project. For images with high agreement among the Galaxy Zoo participants, our model is able to reproduce their consensus with near-perfect accuracy (>99 per cent) for most questions. Confident model predictions are highly accurate, which makes the model suitable for filtering large collections of images and forwarding challenging images to experts for manual annotation. This approach greatly reduces the experts’ workload without affecting accuracy. The application of these algorithms to larger sets of training data will be critical for analysing results from future surveys such as the Large Synoptic Survey Telescope.
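
The filtering workflow mentioned in the abstract (keep confident predictions, forward the rest to experts) can be illustrated with a minimal sketch. The function name, the data layout and the 0.9 threshold are hypothetical and are not taken from the paper.

```python
def split_by_confidence(predictions, threshold=0.9):
    """Split images into automatically labelled and expert-review sets.

    predictions maps an image id to a dict of answer -> probability.
    The default threshold is an illustrative assumption, not a published value.
    """
    auto_labelled = {}
    needs_review = []
    for image_id, probs in predictions.items():
        # Pick the most probable answer and its probability.
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            auto_labelled[image_id] = label
        else:
            needs_review.append(image_id)
    return auto_labelled, needs_review


# Hypothetical model outputs for two images.
preds = {
    "img1": {"smooth": 0.97, "disk": 0.03},
    "img2": {"smooth": 0.55, "disk": 0.45},
}
auto_labelled, needs_review = split_by_confidence(preds)
```

Raising the threshold sends more images to the experts and fewer to the automatic path, so in practice it would be tuned on held-out annotated data.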

International Conference on Acoustics, Speech, and Signal Processing | 2014

End-to-end learning for music audio

Sander Dieleman; Benjamin Schrauwen

Content-based music information retrieval tasks have traditionally been solved using engineered features and shallow processing architectures. In recent years, there has been increasing interest in using feature learning and deep architectures instead, thus reducing the required engineering effort and the need for prior knowledge. However, this new approach typically still relies on mid-level representations of music audio, e.g. spectrograms, instead of raw audio signals. In this paper, we investigate whether it is possible to apply feature learning directly to raw audio signals. We train convolutional neural networks using both approaches and compare their performance on an automatic tagging task. Although they do not outperform a spectrogram-based approach, the networks are able to autonomously discover frequency decompositions from raw audio, as well as phase- and translation-invariant feature representations.

European Conference on Computer Vision | 2014

Sign Language Recognition Using Convolutional Neural Networks

Lionel Pigou; Sander Dieleman; Pieter-Jan Kindermans; Benjamin Schrauwen

There is an undeniable communication problem between the Deaf community and the hearing majority. Innovations in automatic sign language recognition try to tear down this communication barrier. Our contribution considers a recognition system using the Microsoft Kinect, convolutional neural networks (CNNs) and GPU acceleration. Instead of constructing complex handcrafted features, CNNs are able to automate the process of feature construction. We are able to recognize 20 Italian gestures with high accuracy. The predictive model is able to generalize on users and surroundings not occurring during training with a cross-validation accuracy of 91.7%. Our model achieves a mean Jaccard Index of 0.789 in the ChaLearn 2014 Looking at People gesture spotting competition.
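
For readers unfamiliar with the reported metric, the Jaccard index of two frame sets is the size of their intersection divided by the size of their union. The sketch below assumes each gesture is represented as a set of frame indices; the helper names and the toy intervals are hypothetical, and the exact per-class, per-sequence averaging used by the competition is not reproduced here.

```python
def jaccard_index(predicted, truth):
    """Intersection over union of two collections of frame indices."""
    predicted, truth = set(predicted), set(truth)
    union = predicted | truth
    if not union:
        return 0.0  # convention assumed here when both sets are empty
    return len(predicted & truth) / len(union)


def mean_jaccard(pairs):
    """Average the Jaccard index over (predicted, truth) pairs."""
    scores = [jaccard_index(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


# Toy example: frame ranges for two gestures.
pairs = [
    (range(10, 30), range(15, 35)),  # 15 shared frames, 25 frames in the union
    (range(50, 60), range(50, 60)),  # perfect overlap
]
score = mean_jaccard(pairs)  # (0.6 + 1.0) / 2
```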

International Journal of Computer Vision | 2018

Beyond temporal pooling: recurrence and temporal convolutions for gesture recognition in video

Lionel Pigou; Aäron van den Oord; Sander Dieleman; Mieke Van Herreweghe; Joni Dambre

Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition. For the task of capturing temporal structure in video, however, there still remain numerous open research questions. Current research suggests using a simple temporal feature pooling strategy to take into account the temporal aspect of video. We demonstrate that this method is not sufficient for gesture recognition, where temporal information is more discriminative compared to general video classification tasks. We explore deep architectures for gesture recognition in video and propose a new end-to-end trainable neural network architecture incorporating temporal convolutions and bidirectional recurrence. Our main contributions are twofold; first, we show that recurrence is crucial for this task; second, we show that adding temporal convolutions leads to significant improvements. We evaluate the different approaches on the Montalbano gesture recognition dataset, where we achieve state-of-the-art results.
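
A toy illustration of why plain temporal pooling can discard discriminative information: averaging per-frame features is invariant to frame order, whereas a temporal convolution is not. The one-dimensional "hand height" feature and the difference kernel are hypothetical and far simpler than the architecture proposed in the paper.

```python
def mean_pool(frames):
    """Average a per-frame feature over time; frame order is ignored."""
    return sum(frames) / len(frames)


def temporal_conv(frames, kernel):
    """Valid 1-D correlation of a feature sequence with a kernel."""
    k = len(kernel)
    return [
        sum(kernel[j] * frames[i + j] for j in range(k))
        for i in range(len(frames) - k + 1)
    ]


raise_hand = [0.0, 1.0, 2.0, 3.0]  # hypothetical hand height per frame
lower_hand = raise_hand[::-1]      # the same frames in reverse order
kernel = [-1.0, 1.0]               # difference filter: next frame minus current

pooled_equal = mean_pool(raise_hand) == mean_pool(lower_hand)
conv_raise = temporal_conv(raise_hand, kernel)
conv_lower = temporal_conv(lower_hand, kernel)
```

Both gestures pool to the same value, while the convolution outputs have opposite signs and so separate the two motions.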

International Conference on Artificial Neural Networks | 2012

Training restricted Boltzmann machines with multi-tempering: harnessing parallelization

Philemon Brakel; Sander Dieleman; Benjamin Schrauwen

Restricted Boltzmann Machines (RBMs) are unsupervised probabilistic neural networks that can be stacked to form Deep Belief Networks. Given the recent popularity of RBMs and the increasing availability of parallel computing architectures, it becomes interesting to investigate learning algorithms for RBMs that benefit from parallel computations. In this paper, we look at two extensions of the parallel tempering algorithm, which is a Markov Chain Monte Carlo method to approximate the likelihood gradient. The first extension is directed at a more effective exchange of information among the parallel sampling chains. The second extension estimates gradients by averaging over chains from different temperatures. We investigate the efficiency of the proposed methods and demonstrate their usefulness on the MNIST dataset. The weighted averaging in particular seems to benefit Maximum Likelihood learning.
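
As background, the exchange step of standard parallel tempering (not the extensions proposed in the paper) can be sketched as follows. Two chains at inverse temperatures beta_i and beta_j with state energies E_i and E_j swap states with probability min(1, exp((beta_i - beta_j)(E_i - E_j))). The function names and the list-based state layout are assumptions made for illustration.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two chain states."""
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    # Avoid overflow in exp() when the swap is certain to be accepted.
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def maybe_swap(states, energies, betas, i, j, rng=random):
    """Swap the states of chains i and j in place; return True if accepted."""
    p = swap_probability(betas[i], betas[j], energies[i], energies[j])
    if rng.random() < p:
        states[i], states[j] = states[j], states[i]
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False


# The cold chain (beta = 1.0) holds a high-energy state and the hot chain
# (beta = 0.5) a low-energy one, so this swap is always accepted.
states, energies, betas = ["a", "b"], [3.0, 1.0], [1.0, 0.5]
accepted = maybe_swap(states, energies, betas, 0, 1, random.Random(0))
```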

International Symposium on Neural Networks | 2013

The spectral radius remains a valid indicator of the Echo state property for large reservoirs

Ken Caluwaerts; Francis wyffels; Sander Dieleman; Benjamin Schrauwen

In the field of Reservoir Computing, scaling the spectral radius of the weight matrix of a random recurrent neural network to below unity is a commonly used method to ensure the Echo State Property. Recently it has been shown that this condition is too weak. To overcome this problem, other, more involved, sufficient conditions for the Echo State Property have been proposed. In this paper we provide a large-scale experimental verification of the Echo State Property for large recurrent neural networks with zero input and zero bias. Our main conclusion is that the spectral radius method remains a valid indicator of the Echo State Property; the probability that the Echo State Property does not hold drops for larger networks with spectral radius below unity, which are the ones of practical interest.
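
The spectral radius scaling discussed in the abstract can be sketched in a few lines. This is a minimal pure-Python illustration, not the experimental setup of the paper: the reservoir size, the Gaussian weights, the target radius of 0.9 and all function names are assumptions. The spectral radius is estimated with Gelfand's formula, rho(W) = lim ||W^k||^(1/k), rather than with an eigenvalue solver.

```python
import math
import random


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def spectral_radius(w, squarings=30):
    """Estimate rho(W) as ||W^k||^(1/k) with k = 2**squarings.

    W is squared repeatedly and rescaled to unit Frobenius norm each time,
    while the logarithm of the accumulated norm is tracked separately.
    Assumes W is not nilpotent (otherwise the norm reaches zero).
    """
    log_norm = 0.0
    for _ in range(squarings):
        w = matmul(w, w)
        norm = math.sqrt(sum(x * x for row in w for x in row))
        w = [[x / norm for x in row] for row in w]
        log_norm = 2.0 * log_norm + math.log(norm)
    return math.exp(log_norm / 2.0 ** squarings)


def scale_to_radius(w, target):
    """Rescale W so that its estimated spectral radius equals target."""
    factor = target / spectral_radius(w)
    return [[factor * x for x in row] for row in w]


def final_state_gap(w, steps, rng):
    """Iterate x <- tanh(W x) (zero input, zero bias) from two random
    initial states and return the Euclidean distance between the results."""
    n = len(w)
    a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        a = [math.tanh(sum(wij * aj for wij, aj in zip(row, a))) for row in w]
        b = [math.tanh(sum(wij * bj for wij, bj in zip(row, b))) for row in w]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
n = 20  # small hypothetical reservoir, chosen to keep the example fast
w = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
w = scale_to_radius(w, 0.9)
print("estimated spectral radius:", spectral_radius(w))
print("state gap after 500 steps:", final_state_gap(w, 500, rng))
```

A gap that shrinks towards zero as the number of steps grows suggests that the two trajectories have forgotten their initial conditions. A single run like this is only an indication for one matrix, not a verification of the Echo State Property.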

International Conference on Graphic and Image Processing (ICGIP 2012) | 2013

Learning a piecewise linear transform coding scheme for images

Aäron van den Oord; Sander Dieleman; Benjamin Schrauwen

Gaussian mixture models are among the most widely accepted methods for clustering and probability density estimation. Recently it has been shown that these statistical methods are perfectly suited for learning patch-based image priors for various image restoration problems. In this paper we investigate the use of GMMs for image compression. A piecewise linear transform coding scheme based on Vector Quantization is proposed. In this scheme two different learning algorithms for GMMs are considered and compared. Experimental results demonstrate that the proposed techniques outperform JPEG, with results comparable to JPEG2000 for a broad class of images.

arXiv: Sound | 2016