Publication


Featured research published by Nicolas Baskiotis.


ieee international conference on data science and advanced analytics | 2015

Hierarchical label partitioning for large scale classification

Raphael Puget; Nicolas Baskiotis

Extreme classification, where the number of classes is very large, has received considerable attention over the last decade. The usual efficient multi-class classification approaches were not designed to deal with such a large number of classes. A particular issue in large-scale problems is the computational complexity of classification: the best multi-class approaches generally have a complexity that is linear in the number of classes, which prevents them from scaling up. Recent work has focused on hierarchical classification processes to speed up the classification of new instances. Using a priori information on labels, such as a label hierarchy, makes it possible to build an efficient hierarchical structure over the labels in order to reduce classification time to logarithmic in the number of classes. However, such information on labels is not always available or useful. Finding a suitable hierarchical organization of the labels is thus a crucial issue, as the accuracy of the model depends highly on how labels are assigned within the label tree. In this work we propose a new algorithm that iteratively builds a hierarchical label structure through a partitioning algorithm that simultaneously optimizes the structure in terms of classification complexity and solves the label partitioning problem in order to achieve high classification performance. Beginning from a flat tree structure, our algorithm iteratively selects a node to expand by adding a new level of nodes between the considered node and its children. This operation increases the speed-up of the classification process. Once the node is selected, the best partitioning of the classes has to be computed. We propose to consider a measure based on the maximization of the expected loss of the sub-levels in order to minimize the global error of the structure. This choice forces classes that are hard to separate to be grouped together in the same partitions at the first levels of the tree structure, and it delays errors to a deep level of the structure, where they have no effect on the accuracy of other classes. Experiments on large real-world text data from a recent challenge assess the performance of our model.
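
The complexity argument in the abstract above can be illustrated with a minimal sketch. It is not the paper's partitioning algorithm: the balanced binary tree, the `Node` class, and the toy distance-based scorer are all assumptions made for illustration. The sketch only counts how many node scorers are evaluated when a prediction descends a label tree instead of scoring every class.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    labels: List[int]                      # labels reachable from this node
    score: Callable[[float], float]        # hypothetical per-node scorer
    children: List["Node"] = field(default_factory=list)


def build(labels: List[int]) -> Node:
    """Build a balanced binary label tree (an assumed, fixed structure)."""
    center = sum(labels) / len(labels)
    # Toy scorer: the closer x is to the node's label center, the higher the score.
    node = Node(labels, lambda x, c=center: -abs(x - c))
    if len(labels) > 1:
        mid = len(labels) // 2
        node.children = [build(labels[:mid]), build(labels[mid:])]
    return node


def predict(root: Node, x: float):
    """Greedy descent; returns (label, number of scorer evaluations)."""
    node, evals = root, 0
    while node.children:
        evals += len(node.children)        # one evaluation per child
        node = max(node.children, key=lambda child: child.score(x))
    return node.labels[0], evals


root = build(list(range(1024)))
label, evals = predict(root, 700.2)
print(label, evals)  # 20 scorer evaluations, versus 1024 for a flat one-vs-all pass
```

In this toy setting every greedy decision is correct, so the tree costs nothing in accuracy. With learned scorers an early mistake cannot be undone further down, which is why the abstract treats the assignment of labels to partitions as the crucial design choice.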


international conference on intelligent transportation systems | 2016

Smart card in public transportation: Designing a analysis system at the human scale

Emeric Tonnelier; Nicolas Baskiotis; Vincent Guigue; Patrick Gallinari

In the 20th century, most mobility studies were based on costly surveys with few samples; nowadays, data from static and mobile sensors make it possible to track the habits of a massive number of citizens. However, the downside of sensor data is that they generally provide noisy and partial signals lacking semantic information: the purpose of each human activity captured by the sensor is unknown. Extracting this latent semantic information from raw sensor data is a challenging and crucial task. In this paper, a novel algorithm based on non-negative matrix factorization (NMF) is proposed in order to extract precise and meaningful user temporal profiles from logs of smart card data in a transportation system. The proposed NMF-based algorithm allows a natural and informative clustering of the profiles, which can lead to semantic information on the mobility of the users. The approach is compared to four other algorithms and focuses on the human scale; indeed, individual profiles differ quite substantially from group profiles. Experiments are conducted on a three-month dataset supplied by the STIF, the Parisian public transport authority.
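
As a rough illustration of the factorization idea in the abstract above, here is a minimal sketch of plain NMF using the standard multiplicative updates of Lee and Seung. It is not the paper's novel variant. The toy matrix `v` (users by time-of-day bins), the rank, and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical data: rows are users, columns are time-of-day bins,
# entries are counts of card validations.
v = [[9, 0, 0, 8, 0, 1],
     [7, 1, 0, 9, 0, 0],
     [0, 5, 6, 0, 7, 0],
     [1, 6, 5, 0, 8, 0]]
w, h = nmf(v, rank=2)
```

Each row of `h` can be read as a typical temporal profile and each row of `w` as one user's weights on those profiles, which is what makes clustering on `w` a natural next step.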


international conference on intelligent transportation systems | 2016

Joint prediction of road-traffic and parking occupancy over a city with representation learning

Ali Ziat; Bertrand Leroy; Nicolas Baskiotis; Ludovic Denoyer

As journey planning services begin to include real-time traffic forecast features in order to compute more accurate routing along the journey, adaptive traffic control systems can also benefit from this prediction so as to minimize traffic congestion. But these two systems, dedicated to end users and road traffic management authorities, could also benefit from other information, and particularly from parking availability prediction, since cruising for a parking spot represents a significant part of urban traffic: when looking for a parking spot, drivers must guess where to go and, if they are wrong, may face long distances to find the next location, resulting in considerable time loss and a worsening of traffic congestion. We focus on the simultaneous prediction of traffic and parking availability. Our approach relies on machine learning techniques, and more precisely on representation learning methods: each road and car park is represented by a vector in a common high-dimensional space which captures both structural and dynamical information about the observed phenomenon. Such a model is thus able to jointly capture the spatio-temporal correlations between parking and traffic, resulting in a high-performance prediction system. The results of our experiments on the Grand Lyon (France) urban area show the effectiveness of our approach compared to state-of-the-art methods.


Neurocomputing | 2018

Anomaly detection in smart card logs and distant evaluation with Twitter: a robust framework

Emeric Tonnelier; Nicolas Baskiotis; Vincent Guigue; Patrick Gallinari

Smart card logs constitute a valuable source of information to model a public transportation network and characterize normal or abnormal events; however, this source of data is associated with a high level of noise and missing data and thus requires robust analysis tools. First, we define an anomaly as any perturbation in the transportation network with respect to a typical day: temporary interruption, intermittent habit shifts, closed stations, unusually high or low number of entrances in a station. The Parisian metro network, with 300 stations and millions of daily trips, is considered as a case study. In this paper, we present four approaches for the task of anomaly detection in a transportation network using smart card logs. The first three approaches involve the inference of a daily temporal prototype of each metro station and the use of a distance denoting the compatibility of a particular day and its inferred prototype. We introduce two simple and strong baselines relying on a differential modeling between stations and prototypes in the raw-log space. We implemented a raw version (sensitive to volume change) as well as a normalized version (sensitive to behavior changes). The third approach is an original matrix factorization algorithm that computes a dictionary of typical behaviors shared across stations and the corresponding weights allowing the reconstruction of denoised station profiles. We propose to measure the distance between stations and prototypes directly in the latent space. The main advantage resides in its compactness, which makes it possible to describe each station profile and the inherent variability with a few parameters. The last approach is a user-based model in which abnormal behaviors are first detected for each user at the log level and then aggregated spatially and temporally; as a consequence, this approach is heavier and requires following users, unlike the previous ones, which operate on anonymous log data. In addition, our contribution concerns the evaluation framework: we compiled a list of particular days, but we also mined the RATP Twitter account to obtain (partial) ground truth information about operating incidents. Experiments show that matrix factorization is very robust in various situations, while the last user-based model is particularly efficient at detecting small incidents reported in the Twitter dataset.
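
The two raw-log baselines described in the abstract above can be sketched as follows. The abstract does not specify the prototype estimator or the distance, so the per-slot median and the L1 distance used here are assumptions, as are the function names and the toy counts.

```python
from statistics import median


def prototype(days):
    """Per-time-slot median over a station's daily profiles (assumed estimator)."""
    return [median(slot) for slot in zip(*days)]


def normalize(profile):
    """Rescale a profile to sum to 1 so only its shape matters."""
    total = sum(profile)
    return [v / total for v in profile] if total else list(profile)


def distance(day, proto, normalized=False):
    """L1 distance between a day and the prototype (assumed metric).

    normalized=False is sensitive to volume changes;
    normalized=True is sensitive to changes in behavior (profile shape).
    """
    if normalized:
        day, proto = normalize(day), normalize(proto)
    return sum(abs(a - b) for a, b in zip(day, proto))


# Hypothetical entrance counts for one station, four time slots per day.
history = [[10, 50, 20, 40], [12, 48, 22, 38], [11, 52, 19, 41]]
proto = prototype(history)          # [11, 50, 20, 40]

half_volume = [5, 25, 10, 20]       # same shape, half the traffic
shifted = [40, 20, 50, 10]          # similar traffic, different shape

print(distance(half_volume, proto), distance(half_volume, proto, normalized=True))
print(distance(shifted, proto), distance(shifted, proto, normalized=True))
```

The half-volume day is far from the prototype in the raw version but close in the normalized one, while the shifted day is far in both, which matches the distinction the abstract draws between volume changes and behavior changes.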


international conference on neural information processing | 2017

Binary Stochastic Representations for Large Multi-class Classification

Thomas Gerald; Nicolas Baskiotis; Ludovic Denoyer

Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications, such as tagging images or textual documents in social networks. Although one-vs-all methods usually reach top performance in this context, these approaches suffer from a high inference complexity, linear w.r.t. the number of categories. Different models based on the notion of binary codes have been proposed to overcome this limitation, achieving a sublinear inference complexity. But they need to decide a priori which binary code to associate with which category before learning, using more or less complex heuristics. We propose a new end-to-end model which aims at simultaneously learning to associate binary codes with categories and learning to map inputs to binary codes. This approach, called Deep Stochastic Neural Codes (DSNC), keeps the sublinear inference complexity but does not need any a priori tuning. Experimental results on different datasets show the effectiveness of the approach w.r.t. baseline methods.
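
The sublinear inference claim in the abstract above can be made concrete with a minimal sketch. This is not DSNC: the codes here are fixed from the class index rather than learned, and the linear bit predictors, weights, and function names are hypothetical. The point is only that inference evaluates one predictor per bit and then does a table lookup, instead of one scorer per class.

```python
def code_length(num_classes):
    """Smallest number of bits that gives every class a distinct code."""
    return max(1, (num_classes - 1).bit_length())


def encode(class_id, bits):
    """Fixed code taken from the class index (DSNC learns this assignment instead)."""
    return tuple((class_id >> i) & 1 for i in range(bits))


def predict_bits(x, weights):
    """One hypothetical linear predictor per bit: sign of a dot product."""
    return tuple(1 if sum(w * v for w, v in zip(row, x)) > 0 else 0
                 for row in weights)


def classify(x, weights, code_to_class):
    """Evaluate len(weights) predictors, then look the code up in a hash table."""
    return code_to_class.get(predict_bits(x, weights))


num_classes = 1000
bits = code_length(num_classes)                      # 10 bits for 1000 classes
code_to_class = {encode(k, bits): k for k in range(num_classes)}

# Toy setup: the input already carries one signed feature per bit,
# and the weight matrix is the identity.
weights = [[1.0 if i == j else 0.0 for j in range(bits)] for i in range(bits)]
x = [1.0 if b else -1.0 for b in encode(700, bits)]
print(classify(x, weights, code_to_class))           # 700, using 10 dot products
```

Note that `classify` returns `None` when the predicted code is not assigned to any class, a case that real systems of this kind have to handle explicitly.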


international conference on intelligent transportation systems | 2016

Trajectory Bayesian indexing: The airport ground traffic case

Cynthia Delauney; Nicolas Baskiotis; Vincent Guigue

In this paper, we propose a new approach to indexing trajectories in order to efficiently distinguish abnormal behaviors from normal ones. After a discretization step, trajectories are considered as sets of triplets (location, velocity, direction). Those triplets are seen as words, and a multinomial model is learned to estimate the probability of each word. The originality of our work consists in computing the likelihood of all measures and aggregating them by trajectory and by spatial cell. The resulting representation is lightweight and offers new opportunities to query normal or abnormal behaviors. The interest of our approach is demonstrated on a plane trajectory dataset provided by Paris-Charles de Gaulle airport. Several experiments are carried out to evaluate the proposed likelihood descriptors; in particular, experiments show how to easily extract relevant specific trajectories. A t-SNE diagram is also presented to provide an overall discriminative representation of the whole dataset.
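
The word-based likelihood idea in the abstract above can be sketched in a few lines. The additive smoothing, the averaging of log-probabilities per trajectory, the function names, and the toy triplets are assumptions for illustration; the abstract does not give the exact estimator or aggregation.

```python
import math
from collections import Counter


def fit(trajectories, alpha=1.0):
    """Fit a smoothed multinomial over (location, velocity, direction) words."""
    counts = Counter(word for traj in trajectories for word in traj)
    total = sum(counts.values())
    vocab = len(counts) + 1            # one extra slot shared by unseen words

    def log_prob(word):
        return math.log((counts[word] + alpha) / (total + alpha * vocab))

    return log_prob


def score(trajectory, log_prob):
    """Mean log-likelihood of a trajectory's words; lower means more unusual."""
    return sum(log_prob(word) for word in trajectory) / len(trajectory)


# Hypothetical discretized trajectories: (cell, speed bin, heading bin).
usual = [("A", "slow", "N"), ("B", "slow", "N"), ("C", "slow", "E")]
log_prob = fit([usual] * 20)

odd = [("A", "slow", "N"), ("B", "fast", "S"), ("C", "slow", "E")]
print(score(usual, log_prob), score(odd, log_prob))
```

The same per-word log-probabilities could be grouped by the location component of each triplet instead of by trajectory, which would give the per-cell aggregation the abstract mentions.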


Archive | 2016

Anomaly Ranking in a High Dimensional Space: The Unsupervised TreeRank Algorithm

Stéphan Clémençon; Nicolas Baskiotis; Nicolas Vayatis

Ranking unsupervised data in a multivariate feature space $\mathcal{X} \subset \mathbb{R}^{d}$, $d \geq 1$, by degree of abnormality is of crucial importance in many applications (e.g., fraud surveillance, monitoring of complex systems/infrastructures such as energy networks or aircraft engines, system management in data centers). However, the learning aspect of unsupervised ranking has only received attention in the machine-learning community in the past few years. The Mass-Volume (MV) curve has been recently introduced in order to evaluate the performance of any scoring function $s: \mathcal{X} \rightarrow \mathbb{R}$ with regard to its ability to rank unlabeled data. It is expected that relevant scoring functions will induce a preorder similar to that induced by the density function $f(x)$ of the (supposedly continuous) probability distribution of the statistical population under study. As far as we know, there is no efficient algorithm to build a scoring function from (unlabeled) training data with nearly optimal MV curve when the dimension $d$ of the feature space is high. It is the major purpose of this chapter to introduce such an algorithm which we call the Unsupervised TreeRank algorithm. Beyond its description and the statistical analysis of its performance, numerical experiments are exhibited in order to provide empirical evidence of its accuracy.
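
For readers unfamiliar with the MV curve mentioned in the abstract above, the following is a sketch of how it is commonly defined in the anomaly-ranking literature. The abstract itself gives no formula, so the notation below ($\alpha_s$, $\lambda_s$, and $\lambda$ for Lebesgue measure) is an assumption and may differ from the chapter's.

```latex
% Mass and volume of the level sets of a scoring function s, for a threshold t:
\alpha_s(t) = \mathbb{P}\{ s(X) \geq t \}, \qquad
\lambda_s(t) = \lambda\bigl( \{ x \in \mathcal{X} : s(x) \geq t \} \bigr).

% The MV curve maps a mass level to the volume of the corresponding level set,
% with \alpha_s^{-1} denoting the generalized inverse of \alpha_s:
\mathrm{MV}_s(\alpha) = \lambda_s\bigl( \alpha_s^{-1}(\alpha) \bigr),
\qquad \alpha \in (0, 1).
```

Under this reading, a lower curve is better: a good scoring function concentrates a given probability mass in level sets of small volume, as level sets of the density $f$ do.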


international joint conference on artificial intelligence | 2007

A machine learning approach for statistical software testing

Nicolas Baskiotis; Michèle Sebag; Marie-Claude Gaudel; Sandrine-Dominique Gouraud


Workshop MUD, Mining Urban Data 201 | 2014

Car-traffic forecasting: A representation learning approach

Ali Ziat; Gabriella Contardo; Nicolas Baskiotis; Ludovic Denoyer


the european symposium on artificial neural networks | 2015

Learning Embeddings for Completion and Prediction of Relationnal Multivariate Time-Series

Ali Ziat; Gabriella Contardo; Nicolas Baskiotis; Ludovic Denoyer

Collaboration


Dive into Nicolas Baskiotis's collaborations.

Top Co-Authors

Nicolas Vayatis

École Normale Supérieure
