Cédric Archambeau
Xerox
Publication
Featured research published by Cédric Archambeau.
Cryptographic Hardware and Embedded Systems | 2006
Cédric Archambeau; Eric Peeters; François-Xavier Standaert; Jean-Jacques Quisquater
Side-channel attacks are a serious threat to implementations of cryptographic algorithms. Secret information is recovered based on power consumption, electromagnetic emanations or any other form of physical information leakage. Template attacks are probabilistic side-channel attacks, which assume a Gaussian noise model. Using the maximum likelihood principle enables us to reveal (part of) the secret for each set of recordings (i.e., leakage trace). In practice, however, the major concerns are (i) how to select the points of interest of the traces, (ii) how to choose the minimal distance between these points, and (iii) how many points of interest are needed for attacking. So far, only heuristics were provided. In this work, we propose to perform template attacks in the principal subspace of the traces. This new type of attack addresses all practical issues in a principled way and automatically. The approach is validated by attacking stream ciphers such as RC4. We also report analysis results of template-style attacks against an FPGA implementation of AES Rijndael. Roughly, the template attack we carried out requires five times fewer encrypted messages than the best reported correlation attack against similar block cipher implementations.
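As a rough illustration of the idea in this abstract, the sketch below builds Gaussian templates in a principal subspace and classifies a trace by maximum likelihood. It is not the authors' implementation: the function names (`fit_templates`, `classify`, `simulate_traces`), the choice of computing the subspace from the class-mean traces, and the synthetic toy traces are all assumptions made for this example.

```python
import numpy as np


def fit_templates(traces, labels, n_components):
    """Fit one Gaussian template per class in a principal subspace.

    Assumption of this sketch: the subspace is spanned by the leading
    principal directions of the class-mean traces.
    """
    classes = np.unique(labels)
    means = np.stack([traces[labels == c].mean(axis=0) for c in classes])
    center = means.mean(axis=0)
    # Rows of vt are the principal directions of the centred class means.
    _, _, vt = np.linalg.svd(means - center, full_matrices=False)
    W = vt[:n_components].T  # shape: (trace_length, n_components)
    templates = []
    for c in classes:
        z = (traces[labels == c] - center) @ W
        cov = np.atleast_2d(np.cov(z, rowvar=False))
        cov = cov + 1e-9 * np.eye(n_components)  # keep the covariance invertible
        templates.append((z.mean(axis=0), cov))
    return classes, center, W, templates


def classify(trace, classes, center, W, templates):
    """Return the class whose template gives the highest Gaussian log-likelihood."""
    z = (trace - center) @ W
    scores = []
    for mu, cov in templates:
        d = z - mu
        _, logdet = np.linalg.slogdet(cov)
        scores.append(-0.5 * (logdet + d @ np.linalg.solve(cov, d)))
    return classes[int(np.argmax(scores))]


def simulate_traces(rng, n_per_class, n_classes=4, length=50, noise=0.5):
    """Toy traces: the class value leaks at three time samples, plus Gaussian noise."""
    pattern = np.zeros(length)
    pattern[[10, 20, 30]] = 2.0
    labels = np.repeat(np.arange(n_classes), n_per_class)
    noise_term = noise * rng.standard_normal((labels.size, length))
    traces = labels[:, None] * pattern + noise_term
    return traces, labels
```

In this toy setting no points of interest are picked by hand: the projection `W` concentrates the informative time samples automatically, which is the practical benefit the abstract describes.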
Neural Computation | 2009
Manfred Opper; Cédric Archambeau
The variational approximation of posterior distributions by multivariate gaussians has been much less popular in the machine learning community compared to the corresponding approximation by factorizing distributions. This is for a good reason: the gaussian approximation is in general plagued by an $O(N^2)$ number of variational parameters to be optimized, N being the number of random variables. In this letter, we discuss the relationship between the Laplace and the variational approximation, and we show that for models with gaussian priors and factorizing likelihoods, the number of variational parameters is actually $O(N)$. The approach is applied to gaussian process regression with nongaussian likelihoods.
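The parameter count can be made concrete with a short derivation. The sketch below assumes a zero-mean gaussian prior with covariance $\mathbf{K}$ (a simplification chosen here) and uses symbols ($\mathbf{m}$, $\boldsymbol{\Sigma}$, $\boldsymbol{\nu}$, $\boldsymbol{\Lambda}$) introduced for illustration; it follows from setting the gradients of the variational free energy to zero.

```latex
% Assumed model: p(x) = N(x | 0, K),  p(y | x) = \prod_n p(y_n | x_n),
% variational family: q(x) = N(x | m, \Sigma).
\begin{align}
\mathcal{F}(q) &= -\sum_{n=1}^{N} \mathbb{E}_{q}\!\left[\ln p(y_n \mid x_n)\right]
  + \mathrm{KL}\!\left[q(\mathbf{x}) \,\|\, p(\mathbf{x})\right], \\
\frac{\partial \mathcal{F}}{\partial \mathbf{m}} = \mathbf{0}
  &\;\Rightarrow\; \mathbf{m} = \mathbf{K}\boldsymbol{\nu},
  \qquad \nu_n = \frac{\partial\, \mathbb{E}_{q}\!\left[\ln p(y_n \mid x_n)\right]}{\partial m_n}, \\
\frac{\partial \mathcal{F}}{\partial \boldsymbol{\Sigma}} = \mathbf{0}
  &\;\Rightarrow\; \boldsymbol{\Sigma}^{-1} = \mathbf{K}^{-1} + \boldsymbol{\Lambda},
  \qquad \boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \dots, \lambda_N),
  \quad \lambda_n = -2\,\frac{\partial\, \mathbb{E}_{q}\!\left[\ln p(y_n \mid x_n)\right]}{\partial \Sigma_{nn}}.
\end{align}
```

Because each expectation depends on $q$ only through $m_n$ and $\Sigma_{nn}$, the optimum is described by the $2N$ numbers $(\boldsymbol{\nu}, \boldsymbol{\lambda})$ rather than the $N + N(N+1)/2$ entries of a free mean and full covariance.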
Cryptographic Hardware and Embedded Systems | 2008
François-Xavier Standaert; Cédric Archambeau
Power consumption and electromagnetic radiation are among the most extensively used side-channels for analyzing physically observable cryptographic devices. This paper tackles three important questions in this respect. First, we compare the effectiveness of these two side-channels. We investigate the common belief that electromagnetic leakages lead to more powerful attacks than their power consumption counterpart. Second, we study the best combination of the power and electromagnetic leakages. A quantified analysis based on sound information theoretic and security metrics is provided for these purposes. Third, we evaluate the effectiveness of two data dimensionality reduction techniques for constructing subspace-based template attacks. Automatically selecting the meaningful time samples in side-channel leakage traces is an important problem in the application of template attacks, and it usually relies on heuristics. We show how classical statistical tools such as Principal Component Analysis and Fisher Linear Discriminant Analysis can be used for efficiently preprocessing the leakage traces.
International Conference on Machine Learning | 2009
Frank D. Wood; Cédric Archambeau; Jan Gasthaus; Lancelot F. James; Yee Whye Teh
We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. We introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. We show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. We demonstrate the sequence memoizer by using it as a language model, achieving state-of-the-art results.
International Conference on Machine Learning | 2006
Cédric Archambeau; Nicolas Delannay; Michel Verleysen
Principal components and canonical correlations are at the root of many exploratory data mining techniques and provide standard pre-processing tools in machine learning. Lately, probabilistic reformulations of these methods have been proposed (Roweis, 1998; Tipping & Bishop, 1999b; Bach & Jordan, 2005). They are based on a Gaussian density model and are therefore, like their non-probabilistic counterparts, very sensitive to atypical observations. In this paper, we introduce robust probabilistic principal component analysis and robust probabilistic canonical correlation analysis. Both are based on a Student-t density model. The resulting probabilistic reformulations are more suitable in practice as they handle outliers in a natural way. We compute maximum likelihood estimates of the parameters by means of the EM algorithm.
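To show why a Student-t density handles outliers naturally, here is a minimal sketch of EM for the location and scale of a univariate Student-t with fixed degrees of freedom. It is a deliberately simpler problem than the robust PCA and CCA models of the paper; the function name `student_t_location_scale` and the fixed value of `nu` are assumptions of this example.

```python
import numpy as np


def student_t_location_scale(x, nu=3.0, n_iter=100):
    """EM estimate of location and squared scale of a 1-D Student-t (nu fixed)."""
    x = np.asarray(x, dtype=float)
    mu, sigma2 = np.median(x), np.var(x)
    for _ in range(n_iter):
        # E-step: expected latent precision scaling of each point.
        # Points far from mu receive small weights.
        u = (nu + 1.0) / (nu + (x - mu) ** 2 / sigma2)
        # M-step: weighted mean and weighted squared scale.
        mu = np.sum(u * x) / np.sum(u)
        sigma2 = np.sum(u * (x - mu) ** 2) / x.size
    return mu, sigma2, u
```

The weights `u` are what make the estimate robust: a Gaussian model corresponds to all weights being equal, so a few distant points can drag the mean, whereas here their influence shrinks with their distance.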
Neural Networks | 2007
Cédric Archambeau; Michel Verleysen
A new variational Bayesian learning algorithm for Student-t mixture models is introduced. This algorithm leads to (i) robust density estimation, (ii) robust clustering and (iii) robust automatic model selection. Gaussian mixture models are learning machines which are based on a divide-and-conquer approach. They are commonly used for density estimation and clustering tasks, but are sensitive to outliers. The Student-t distribution has heavier tails than the Gaussian distribution and is therefore less sensitive to any departure of the empirical distribution from Gaussianity. As a consequence, the Student-t distribution is suitable for constructing robust mixture models. In this work, we formalize the Bayesian Student-t mixture model as a latent variable model in a different way from Svensén and Bishop [Svensén, M., & Bishop, C. M. (2005). Robust Bayesian mixture modelling. Neurocomputing, 64, 235-252]. The main difference resides in the fact that it is not necessary to assume a factorized approximation of the posterior distribution on the latent indicator variables and the latent scale variables in order to obtain a tractable solution. Not neglecting the correlations between these unobserved random variables leads to a Bayesian model having an increased robustness. Furthermore, it is expected that the lower bound on the log-evidence is tighter. Based on this bound, the model complexity, i.e. the number of components in the mixture, can be inferred with a higher confidence.
The European Symposium on Artificial Neural Networks | 2008
Cédric Archambeau; Nicolas Delannay; Michel Verleysen
Mixtures of probabilistic principal component analyzers model high-dimensional nonlinear data by combining local linear models. Each mixture component is specifically designed to extract the local principal orientations in the data. An important issue with this generative model is its sensitivity to data lying off the low-dimensional manifold. In order to address this problem, the mixtures of robust probabilistic principal component analyzers are introduced. They take care of atypical points by means of a long-tailed distribution, the Student-t. It is shown that the resulting mixture model is an extension of the mixture of Gaussians, suitable for both robust clustering and dimensionality reduction. Finally, we briefly discuss how to construct a robust version of the closely related mixture of factor analyzers.
Communications of the ACM | 2011
Frank D. Wood; Jan Gasthaus; Cédric Archambeau; Lancelot F. James; Yee Whye Teh
Probabilistic models of sequences play a central role in most machine translation, automated speech recognition, lossless compression, spell-checking, and gene identification applications, to name but a few. Unfortunately, real-world sequence data often exhibit long-range dependencies which can only be captured by computationally challenging, complex models. Sequence data arising from natural processes also often exhibit power-law properties, yet common sequence models do not capture such properties. The sequence memoizer is a new hierarchical Bayesian model for discrete sequence data that captures long-range dependencies and power-law characteristics, while remaining computationally attractive. Its utility as a language model and general purpose lossless compressor is demonstrated.
Cryptographic Hardware and Embedded Systems | 2006
François-Xavier Standaert; Eric Peeters; Cédric Archambeau; Jean-Jacques Quisquater
In this paper, we consider a recently introduced framework that investigates physically observable implementations from a theoretical point of view. The model allows quantifying the effect of practically relevant leakage functions with a combination of security and information theoretic metrics. More specifically, we apply our evaluation methodology to an exemplary block cipher. We first consider a Hamming weight leakage function and evaluate the efficiency of two commonly investigated countermeasures, namely noise addition and masking. Then, we show that the proposed methodology allows capturing certain non-trivial intuitions, e.g. about the respective effectiveness of these countermeasures. Finally, we justify the need of combined metrics for the evaluation, comparison and understanding of side-channel attacks.
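The effect of the two countermeasures on a Hamming weight leakage can be illustrated with a small simulation. This is a simplified sketch, not the evaluation methodology of the paper: it uses plain correlation instead of the security and information theoretic metrics, it assumes the adversary observes only the masked share, and the function names and parameters are hypothetical.

```python
import numpy as np


def hamming_weight(values):
    """Number of set bits in each 8-bit value."""
    bits = np.unpackbits(np.asarray(values, dtype=np.uint8)[:, None], axis=1)
    return bits.sum(axis=1)


def leakage_correlation(rng, n_traces, noise_std, masked):
    """Correlation between HW(secret) and a noisy Hamming weight leakage.

    If `masked` is True, the device processes secret XOR mask with a fresh
    uniform mask, and only that masked share leaks (assumption of this sketch).
    """
    secret = rng.integers(0, 256, size=n_traces, dtype=np.uint8)
    if masked:
        mask = rng.integers(0, 256, size=n_traces, dtype=np.uint8)
        processed = secret ^ mask
    else:
        processed = secret
    noise = noise_std * rng.standard_normal(n_traces)
    leak = hamming_weight(processed) + noise
    return np.corrcoef(hamming_weight(secret), leak)[0, 1]
```

Under these assumptions, raising `noise_std` lowers the correlation gradually, while masking removes the first-order correlation altogether; a realistic evaluation would also account for leakage of the mask itself.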
Bioinformatics | 2009
Guido Sanguinetti; Andreas Ruttor; Manfred Opper; Cédric Archambeau
MOTIVATION: Stress response in cells is often mediated by quick activation of transcription factors (TFs). Given the difficulty in experimentally assaying TF activities, several statistical approaches have been proposed to infer them from microarray time courses. However, these approaches often rely on prior assumptions which rule out the rapid responses observed during stress response. RESULTS: We present a novel statistical model to infer how TFs mediate stress response in cells. The model is based on the assumption that sensory TFs quickly transit between active and inactive states. We therefore model mRNA production using a bistable dynamical system whose behaviour is described by a system of differential equations driven by a latent stochastic process. We assume the stochastic process to be a two-state continuous time jump process, and devise both an exact solution for the inference problem as well as an efficient approximate algorithm. We evaluate the method on both simulated data and real data describing Escherichia coli's response to sudden oxygen starvation. This highlights both the accuracy of the proposed method and its potential for generating novel hypotheses and testable predictions. AVAILABILITY: MATLAB and C++ code used in the article can be downloaded from http://www.dcs.shef.ac.uk/~guido/.
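The generative side of such a model can be sketched as a forward simulation. The code below is not the inference algorithm of the paper and is unrelated to the released MATLAB and C++ code. It assumes a linear production and decay equation, dx/dt = A·mu(t) + b − lam·x, driven by a two-state process mu(t), and approximates the continuous-time jumps on a fixed time grid. All names and parameter values are hypothetical.

```python
import numpy as np


def simulate_switching_mrna(rng, t_end=50.0, dt=0.01, f_on=0.2, f_off=0.2,
                            A=2.0, b=0.5, lam=1.0):
    """Simulate a TF switching between 0 and 1 and the mRNA level it drives.

    Jumps are approximated per time step with probability rate * dt;
    the mRNA equation is integrated with the explicit Euler method.
    """
    n = int(round(t_end / dt))
    mu = np.zeros(n, dtype=int)   # TF state: 0 = inactive, 1 = active
    x = np.empty(n)               # mRNA concentration
    x[0] = b / lam                # steady state of the inactive regime
    for k in range(1, n):
        rate = f_off if mu[k - 1] == 1 else f_on
        flip = rng.random() < rate * dt
        mu[k] = 1 - mu[k - 1] if flip else mu[k - 1]
        x[k] = x[k - 1] + dt * (A * mu[k - 1] + b - lam * x[k - 1])
    return mu, x
```

In this sketch the mRNA level relaxes towards `b / lam` while the TF is inactive and towards `(A + b) / lam` while it is active, which gives the two regimes that the inference problem in the abstract has to recover from noisy time courses.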