Publication


Featured research published by Anne Ruiz-Gazen.


Papers in Regional Science | 2003

Explaining the pattern of regional unemployment: The case of the Midi-Pyrénées region

Yves Aragon; Dominique Haughton; Jonathan Haughton; Eve Leconte; Eric Malin; Anne Ruiz-Gazen; Christine Thomas-Agnan

Unemployment rates vary widely at the sub-regional level. We seek to explain why such variation occurs, using data for 174 districts in the Midi-Pyrénées region of France for 1990–1991. A set of explanatory variables is derived from theory and the voluminous literature. The best model includes a correction for spatially autocorrelated errors. Unemployment rates are higher in urban areas and where per capita income is higher, which is consistent with the view that unemployment differences largely reflect variations in “amenities.” Along with a lack of evidence of housing market rigidities, these findings suggest that subregional variations in unemployment are not mainly the result of labor market disequilibrium.


Computational Statistics & Data Analysis | 2003

A monitoring display of multivariate outliers

Henri Caussinus; M. Fekri; S. Hakam; Anne Ruiz-Gazen

A projection pursuit approach is considered for the detection and visualization of multivariate outliers, the term outlier being used in a broad sense. Such a framework leads one to assess the significance of the displays themselves rather than the significance of suspected observations. The mathematical results needed to implement this strategy are provided in the case of a generalized principal component analysis aimed at displaying discordant observations. Three examples illustrate the various aspects of the proposed technique.


Annals of Mathematics and Artificial Intelligence | 2010

Genetic algorithms and particle swarm optimization for exploratory projection pursuit

Alain Berro; Souad Larabi Marie-Sainte; Anne Ruiz-Gazen

Exploratory Projection Pursuit (EPP) methods were developed some thirty years ago in the context of exploratory analysis of large data sets. These methods look for low-dimensional projections that reveal interesting structure that exists in the data set but is not visible in high dimension. Each projection is associated with a real-valued index whose optima correspond to valuable projections. Several EPP indices have been proposed in the statistics literature, but the main problem lies in their optimization. In the present paper, we propose to apply Genetic Algorithms (GA) and the more recent Particle Swarm Optimization (PSO) algorithm to the optimization of several projection pursuit indices. We explain how the EPP methods can be implemented in order to become an efficient and powerful tool for the statistician. We illustrate our proposal on several simulated and real data sets.
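As a rough illustration of what "optimizing a projection index" means, the sketch below scores one-dimensional projections with an excess-kurtosis index and keeps the best of many random unit directions. Both choices are assumptions made for illustration: the abstract does not say which indices the paper uses, and plain random search is only a stand-in for the GA and PSO optimizers it proposes. The names `kurtosis_index` and `random_search` are hypothetical.

```python
import numpy as np

def kurtosis_index(X, a):
    """Excess kurtosis of the data projected on direction a (illustrative index)."""
    z = X @ a
    z = (z - z.mean()) / z.std()
    return float(np.mean(z ** 4) - 3.0)

def random_search(X, n_trials=2000, seed=0):
    """Keep the unit direction with the largest |index| among random candidates.

    This is a naive stand-in for GA or PSO, not the method of the paper.
    """
    rng = np.random.default_rng(seed)
    best_a, best_val = None, -np.inf
    for _ in range(n_trials):
        a = rng.standard_normal(X.shape[1])
        a /= np.linalg.norm(a)  # projections are defined by unit vectors
        val = abs(kurtosis_index(X, a))
        if val > best_val:
            best_a, best_val = a, val
    return best_a, best_val

# Toy data: two clusters separated along the first of five coordinates.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 5))
X[:150, 0] += 6.0
best_a, best_val = random_search(X)
```

A bimodal projection has strongly negative excess kurtosis, which is why the absolute value is maximized here; a population-based optimizer would replace the loop in `random_search`.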


Electronic Journal of Statistics | 2012

Approximation of rejective sampling inclusion probabilities and application to high order correlations

Hélène Boistard; Hendrik P. Lopuhaä; Anne Ruiz-Gazen

This paper is devoted to rejective sampling. We provide an expansion of joint inclusion probabilities of any order in terms of the inclusion probabilities of order one, extending previous results by Hajek (1964) and Hajek (1981) and making the remainder term more precise. Following Hajek (1981), the proof is based on Edgeworth expansions. The main result is applied to derive bounds on higher order correlations, which are needed for the consistency and asymptotic normality of several complex estimators.


Archive | 2010

Detecting Multivariate Outliers Using Projection Pursuit with Particle Swarm Optimization

Anne Ruiz-Gazen; Souad Larabi Marie-Sainte; Alain Berro

Detecting outliers in multivariate data is known to be an important but difficult task, and several detection methods already exist. Most of the proposed methods are based either on the Mahalanobis distance of the observations to the center of the distribution or on a projection pursuit (PP) approach. In the present paper we focus on the one-dimensional PP approach, which may be of particular interest when the data are not elliptically symmetric. We give a survey of the statistical literature on PP for multivariate outlier detection and investigate the pros and cons of the different methods. We also propose the use of a recent heuristic optimization algorithm called Tribes for multivariate outlier detection in the projection pursuit context.


International Conference on Swarm Intelligence | 2010

An efficient optimization method for revealing local optima of projection pursuit indices

Souad Larabi Marie-Sainte; Alain Berro; Anne Ruiz-Gazen

In order to summarize and graphically represent multidimensional data in statistics, projection pursuit methods look for projection axes which reveal structures, such as possible groups or outliers, by optimizing a function called a projection index. To determine these possibly interesting structures, it is necessary to choose an optimization method capable of finding not only the global optimum of the projection index but also the local optima likely to reveal these structures. For this purpose, we suggest a metaheuristic which requires few parameters to be set and which induces premature convergence to local optima. This method, called Tribes, is a hybrid Particle Swarm Optimization (PSO) method based on a stochastic optimization technique developed in [2]. The computation is fast even for large volumes of data, so that the use of the method in the field of projection pursuit fulfills the statistician's expectations.


2015 IEEE 10th International Symposium on Diagnostics for Electrical Machines, Power Electronics and Drives (SDEMPED) | 2015

Variable importance assessment in lifespan models of insulation materials: A comparative study

Farah Salameh; Antoine Picot; Marie Chabert; Eve Leconte; Anne Ruiz-Gazen; Pascal Maussion

This paper presents and compares different methods for evaluating the relative importance of variables involved in insulation lifespan models. Parametric and non-parametric models are derived from accelerated aging tests on twisted pairs covered with an insulating varnish under different stress constraints (voltage, frequency and temperature). Parametric models establish a simple stress-lifespan relationship and the variable importance can be evaluated from the estimated parameters. As an alternative approach, non-parametric models explain the stress-lifespan relationship by means of regression trees or random forests (RF) for instance. Regression trees naturally provide a hierarchy between the variables. However, they suffer from a high dependency with respect to the training set. This paper shows that RF provide a more robust model while allowing a quantitative variable importance assessment. Comparisons of the different models are performed on different training and test sets obtained through experiments.


Archive | 2007

Classification and Generalized Principal Component Analysis

Henri Caussinus; Anne Ruiz-Gazen

In previous papers, we proposed a generalized principal component analysis (GPCA) aimed at displaying salient features of a multidimensional data set, in particular the existence of clusters. In the light of an example, this article shows how GPCA and clustering methods are complementary. The projections provided by GPCA and the sequence of eigenvalues give useful indications of the number and the type of clusters to be expected; submitting GPCA principal components to a clustering algorithm instead of the raw data can improve the classification. The use of a convenient robustification of GPCA is also discussed.


Statistical Analysis and Data Mining | 2015

Beyond multidimensional data in model visualization: High-dimensional and complex nonnumeric data

Anne Ruiz-Gazen

We greatly appreciate the opportunity to read and discuss the paper ‘Visualizing statistical models: removing the blindfold’ [1]. The article manages to close the gap between statistics and visualization by combining advanced methods from both fields to visualize statistical models and methods. The article provides a very nice overview of visualization tools for statistical models, which can be used to i) understand the model itself (i.e., what the model says about the data), ii) assess its relevance (i.e., if the model is accurate to describe the data, if the model has been well trained...) and iii) evaluate the variability of a family of models, use this information to select a model within a family, evaluate its robustness and combine several models in a relevant manner. We believe that this point of view is innovative and rarely addressed, because statisticians tend to rely more on graphics to visualize data themselves and on numeric criteria and statistics to evaluate models. However, as demonstrated in the article, applied statisticians would take great advantage of using interactive visualization methods for fitting and interpreting an adequate model. Moreover, such tools are now easily accessible, using, for instance, the R packages rggobi, classifly, clusterfly and meifly that are described in the article. Nowadays, data analysis is a highly developing field that has applications in many disciplines such as biology, genetics, economics, marketing and meteorology. Data are also increasingly challenging: high-dimensional data, ‘big data’, complex and possibly nonnumeric data. In this context, visualization must be part of the standard background available for any statistician or data scientist. Combining specialized statistical methods with visualization, as


Computational Statistics & Data Analysis | 2018

ICS for multivariate outlier detection with application to quality control

Aurore Archimbaud; Klaus Nordhausen; Anne Ruiz-Gazen

In fields with high reliability standards such as automotive, avionics or aerospace, the detection of anomalies is crucial. An efficient methodology for automatically detecting multivariate outliers is introduced. It takes advantage of the remarkable properties of the Invariant Coordinate Selection (ICS) method, which leads to an affine invariant coordinate system in which the Euclidean distance corresponds to a Mahalanobis Distance (MD) in the original coordinates. The limitations of MD are highlighted using theoretical arguments in a context where the dimension of the data is large. Owing to the resulting dimension reduction, ICS is expected to improve the power of outlier detection rules such as MD-based criteria. The paper includes practical guidelines for using ICS in the context of a small proportion of outliers. The use of the regular covariance matrix and the so-called matrix of fourth moments as the scatter pair is recommended. This choice combines simplicity of implementation with the possibility of deriving theoretical results. The selection of relevant invariant components through parallel analysis and normality tests is addressed. A simulation study confirms the good properties of the proposal and provides a comparison with Principal Component Analysis and MD. The performance of the proposal is also evaluated on two real data sets using a user-friendly R package accompanying the paper.
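The property stated in this abstract, that Euclidean distances in the invariant coordinates match Mahalanobis distances in the original ones, can be checked numerically. The following is a minimal NumPy sketch, not the R package mentioned above. It assumes the scatter pair is the sample covariance and a fourth-moment matrix normalized by 1/(n(p+2)); the function names (`mahalanobis_sq`, `cov4`, `ics`) and the toy data are hypothetical.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distances to the sample mean (sample covariance)."""
    Xc = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Xc, rowvar=False))
    return np.einsum("ij,jk,ik->i", Xc, S_inv, Xc)

def cov4(X):
    """Matrix of fourth moments; the 1/(n(p+2)) scaling is an assumption."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    d2 = mahalanobis_sq(X)
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

def ics(X):
    """Invariant coordinates for the scatter pair (covariance, cov4)."""
    Xc = X - X.mean(axis=0)
    S1 = np.cov(Xc, rowvar=False)
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T   # whitening matrix
    M = S1_inv_sqrt @ cov4(X) @ S1_inv_sqrt
    rho, U = np.linalg.eigh((M + M.T) / 2)       # symmetrize for stability
    order = np.argsort(rho)[::-1]                # largest eigenvalue first
    rho, U = rho[order], U[:, order]
    Z = Xc @ (U.T @ S1_inv_sqrt).T               # invariant coordinates
    return rho, Z

# Toy data: Gaussian bulk plus a handful of shifted observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
X[:5] += 8.0
rho, Z = ics(X)
```

With a small proportion of outliers, the components associated with the largest eigenvalues in `rho` are the natural candidates to inspect; how many to keep is the selection question the abstract addresses with parallel analysis and normality tests, which this sketch does not implement.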

Collaboration


Dive into Anne Ruiz-Gazen's collaborations.

Top Co-Authors

Klaus Nordhausen

Vienna University of Technology

Alain Berro

University of Toulouse

M. Fekri

Paul Sabatier University

David Haziza

Université de Montréal
