Sylvie Ratté

École de technologie supérieure

Publication

Featured research published by Sylvie Ratté.

databases knowledge and data applications | 2010

Adaptation of Apriori to MapReduce to Build a Warehouse of Relations between Named Entities across the Web

Jean-Daniel Cryans; Sylvie Ratté; Roger Champagne

The Semantic Web has made possible the use of the Internet to extract useful content, a task that could necessitate an infrastructure across the Web. With Hadoop, a free implementation of the MapReduce programming paradigm created by Google, we can process these data reliably over hundreds of servers. This article describes how the Apriori algorithm was adapted to MapReduce in the search for relations between entities to deal with thousands of Web pages coming from RSS feeds daily. First, every feed is looked up five times per day and each entry is registered in a database with MapReduce. Second, the entries are read and their content sent to the Web service OpenCalais for the detection of named entities. For each Web page, the set of all itemsets found is generated and stored in the database. Third, all generated sets, from first to last, are counted and their support is registered. Finally, various analytical tasks are executed to present the relationships found. Our tests show that the third step, executed over 3,000,000 sets, was 4.5 times faster using five servers than using a single machine. This approach allows us to easily and automatically distribute processing over as many machines as are available, and to process datasets that one server, even a very powerful one, would not be able to manage alone. We believe that this work is a step forward in processing semantic Web data efficiently and effectively.
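The itemset generation and support counting steps described above can be sketched in a map/reduce style. This is a minimal single-process illustration, not the authors' Hadoop implementation: the function names, the `max_size` cap on itemset length, and the sample pages are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def map_page(entities, max_size=2):
    """Map step: emit (itemset, 1) for each itemset of named entities on one page.

    max_size is a hypothetical cap that keeps the example small.
    """
    unique = sorted(set(entities))
    for size in range(1, max_size + 1):
        for itemset in combinations(unique, size):
            yield itemset, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per itemset to obtain its support."""
    support = Counter()
    for itemset, count in pairs:
        support[itemset] += count
    return support


def itemset_support(pages, max_size=2):
    """Run the map step over every page, then reduce the emitted pairs."""
    pairs = (pair for page in pages for pair in map_page(page, max_size))
    return reduce_counts(pairs)


# Hypothetical named entities detected on three Web pages.
pages = [
    ["Google", "Hadoop", "Yahoo"],
    ["Google", "Hadoop"],
    ["Yahoo", "Hadoop"],
]
support = itemset_support(pages)
print(support[("Google", "Hadoop")])  # number of pages mentioning both entities
```

Because each page is mapped independently and counts are only summed in the reduce step, the work can be split across machines without coordination between pages.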

Knowledge and Information Systems | 2011

Classifier-based acronym extraction for business documents

Pierre André Ménard; Sylvie Ratté

Acronym extraction for business documents has been neglected in favor of acronym extraction for biomedical documents. Although there are overlapping challenges, the semi-structured and non-predictive nature of business documents hinders the effectiveness of the extraction methods used on biomedical documents, which fail to deliver the expected performance. A classifier-based extraction subsystem is presented as part of the wider project, Binocle, for the analysis of French business corpora. Explicit and implicit acronym presentation cases are identified using textual and syntactical hints. Among the 7 features extracted from each candidate instance, we introduce “similarity” features, which compare a candidate’s characteristics with average length-related values calculated from a generic acronym repository. Commonly used rules for evaluating the candidate (matching first letters, ordered instances, etc.) are scored and aggregated in a single composite feature that permits a supple classification. One hundred and thirty-eight French business documents from 14 public organizations were used for the training and evaluation corpora, yielding a recall of 90.9% at a precision level of 89.1% for a search space size of 3 sentences.
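One of the commonly used rules mentioned above, matching the acronym's letters to the first letters of the candidate expansion in order, can be sketched as a score between 0 and 1. This is an illustrative simplification, not a feature taken from Binocle: the function name `first_letter_score` and the scoring scheme are assumptions, and accented initials are not normalized.

```python
def first_letter_score(acronym, expansion):
    """Fraction of acronym letters matched, in order, by word initials of the expansion."""
    if not acronym:
        return 0.0
    initials = [word[0].upper() for word in expansion.split()]
    position = 0  # next initial that is still available for matching
    matched = 0
    for letter in acronym.upper():
        try:
            index = initials.index(letter, position)
        except ValueError:
            continue  # this letter has no matching initial further right
        matched += 1
        position = index + 1
    return matched / len(acronym)


print(first_letter_score("MDM", "product master data management"))
print(first_letter_score("ABG", "alpha gamma"))
```

A score like this could be one input among several to a composite feature; on its own it ignores stop words, inner letters, and word order beyond the left-to-right constraint.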

automated software engineering | 2016

Concept extraction from business documents for software engineering projects

Pierre André Ménard; Sylvie Ratté

Acquiring relevant business concepts is a crucial first step for any software project for which the software experts are not domain experts. The wealth of information buried within an organization’s written documentation is a precious source of concepts, relationships and attributes which can be used to model the enterprise’s domain. The lack of targeted extraction tools can make perusing this type of resource a lengthy and costly process. We propose a domain-model-focused extraction process aimed at the rapid discovery of knowledge relevant to the software expert. To avoid undesirable noise from high-level linguistic tools, the process is mainly composed of positive and negative base filters that are less error-prone and more robust. The extracted candidates are then reordered using a weight propagation algorithm based on structural hints from source documents. When tested on French text corpora from public organizations, our process performs 2.7 times better than a statistical baseline for relevant concept discovery. A new metric to assess the discovery speed of relevant concepts is introduced. The annotation of a gold standard definition of software engineering oriented concepts for knowledge extraction tasks is also presented.

international work-conference on the interplay between natural and artificial computation | 2015

Identification of loitering human behaviour in video surveillance environments

F A Héctor Gómez; Rafael Martínez Tomás; Susana Arias Tapia; Antonio Fernández Caballero; Sylvie Ratté; Alexandra González Eras; Patricia Ludeña González

Loitering is a common behaviour of elderly people. Our goal is to develop an artificial intelligence system that automatically detects loitering behaviour in video surveillance environments. The first step in identifying this behaviour uses the Generalized Sequential Patterns algorithm to detect sequential micro-patterns in the input loitering video sequences. The test phase determines the appropriate percentage of inclusion of this set of micro-patterns in a new input sequence, namely those that are considered to form part of the profile, so that the sequence can then be identified as loitering. The system is dynamic; it obtains micro-patterns on a repetitive basis. During execution, the system takes the human operator into account and updates the performance values for loitering in a shopping mall. The profile obtained is consistent with what has been documented by experts in this field and is sufficient to focus the attention of the human operator on the surveillance monitor.

international conference on product lifecycle management | 2012

A Holistic Approach for the Architecture and Design of an Ontology-Based Data Integration Capability in Product Master Data Management

Daniel Fitzpatrick; François Coallier; Sylvie Ratté

In the context of a broadened product lifecycle management environment, traditional product information management, also referred to as product master data management (P-MDM), needs to be complemented by other MDM domains. Such MDM domains may include Customers, Financials, Suppliers, Human Resources, Events and other domains. Satisfying such a transversal set of requirements requires a true cross-enterprise semantic integration capability. This capability cannot be met by current off-the-shelf technologies. This paper proposes a research approach that would elicit the definition of a reference architecture and a multi-domain ontology, from research and development work performed notably in ontology engineering, in both academic and industry domains.

advances in social networks analysis and mining | 2012

A Multi-Classifier System for Sentiment Analysis and Opinion Mining

Luana Bezerra Batista; Sylvie Ratté

Although successfully employed to reduce error rates of difficult pattern recognition problems, multi-classifier systems (MCS) are not in widespread use in the field of Sentiment Analysis and Opinion Mining. The motivation for using an MCS stems from the fact that different classifiers usually make different errors on different samples. By using just the best classifier, it is possible to lose valuable information contained in the other suboptimal classifiers. In this work, we take advantage of unigrams, bigrams and trigrams to design a multi-classifier system for Sentiment Analysis and Opinion Mining. Three different Naive Bayes classifiers are trained, each one with a specific set of features, and then combined in the ROC space by using the Iterative Boolean Combination (IBC) technique. IBC iteratively combines the ROC curves produced by different classifiers using all Boolean functions, and does not require the prior assumption that the classifiers are statistically independent. An experimental study investigates the advantage of using the proposed MCS, over each individual classifier, in classifying Twitter messages as positive or negative. Stanford University's Twitter database is employed for this task. As a real-world application, the proposed MCS is used to identify the sentiment of electors regarding the main candidates for the 2012 United States Presidential Elections. Results indicate that the proposed MCS can provide useful information about people's opinions that is comparable to conventional opinion polls.

Archive | 2011

Evolvable Metaheuristics on Circuit Design

Felipe Padilla; Aurora Torres; Julio Ponce; María Dolores Torres; Sylvie Ratté; Eunice Ponce-de-Leon

Evolutionary computation algorithms are stochastic optimization methods; they are conveniently presented using the metaphor of natural evolution: a randomly initialized population of individuals evolves following a simulation of the Darwinian principle. New individuals are generated using genetic operations such as mutation and crossover. The probability of survival of the newly generated solutions depends on their fitness (Michalewicz et al., 1995). Evolutionary algorithms (EAs) have been successfully used to solve different types of optimization problems (Back, 1996). In the most general terms, evolution can be described as a two-step iterative process, consisting of random variation followed by selection. The structure of any evolutionary computation algorithm is shown in Figure 1.
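The two-step loop of random variation followed by selection can be sketched on a toy problem. The code below is a generic illustration under stated assumptions, not the algorithm from the chapter: it uses bit-string individuals, one-point crossover, bit-flip mutation, and deterministic truncation selection (a simplification of fitness-dependent survival probability). All parameter names and defaults are hypothetical.

```python
import random


def evolve(fitness, genome_length=20, population_size=30,
           generations=50, mutation_rate=0.05, seed=0):
    """Evolve bit strings toward higher fitness; return the best individual found."""
    rng = random.Random(seed)
    # Randomly initialized population of individuals.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Step 1, random variation: one-point crossover followed by bit-flip mutation.
        offspring = []
        for _ in range(population_size):
            mother, father = rng.sample(population, 2)
            cut = rng.randrange(1, genome_length)
            child = mother[:cut] + father[cut:]
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            offspring.append(child)
        # Step 2, selection: keep the fittest among parents and offspring.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:population_size]
    return max(population, key=fitness)


# Toy fitness (an assumption for the example): the number of 1 bits.
best = evolve(sum)
print(sum(best), "ones out of", len(best))
```

Keeping parents in the selection pool means the best fitness in the population never decreases from one generation to the next.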

technical symposium on computer science education | 2009

AARTIC: development of an intelligent environment for human learning

Faten M'hiri; Sylvie Ratté

The project's main objective is the design and development of an intelligent environment for human learning (EIAH). Our system is adaptive and intelligent and aims to guide students in the realization of their labs and to help them collaborate with peers. A pedagogical agent, integrated into the environment, incorporates functions of observation and personalized tutoring for learning. The system is currently being tested in an introductory course in AI given to undergraduate students at ETS.

computational intelligence | 2017

Bagged Subspaces for Unsupervised Outlier Detection

José Ramón Pasillas-Díaz; Sylvie Ratté

In many domains, important events are not represented as the common scenario, but as deviations from the rule. The importance and impact associated with these particular, outnumbered, deviant, and sometimes even previously unseen events is directly related to the application domain (e.g., breast cancer detection, satellite image classification, etc.). The detection of these rare events or outliers has recently been gaining popularity as evidenced by the wide variety of algorithms currently available. These algorithms are based on different assumptions about what constitutes an outlier, a characteristic pointing toward their integration in an ensemble to improve their individual detection rate. However, there are two factors that limit the use of current ensemble outlier detection approaches: first, in most cases, outliers are not detectable in full dimensionality, but instead are located in specific subspaces of data; and second, despite the expected improvement in detection rate achieved using an ensemble of detectors, the computational cost of the ensemble will increase linearly as the number of components increases. In this article, we propose an ensemble approach that identifies outliers based on different subsets of features and subsamples of data, providing more robust results while improving the computational efficiency of similar ensemble outlier detection approaches.
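The idea of scoring outliers over different subsets of features and subsamples of data can be sketched as follows. This is a minimal illustration under assumptions, not the method from the article: the base detector (distance to the k-th nearest neighbour in the subsample), the averaging of scores, and every name and default value are choices made for this example.

```python
import math
import random


def knn_distance(point, reference, features, k):
    """Distance from point to its k-th nearest reference point, using only `features`.

    If the point itself is in the reference sample, its zero distance counts as one neighbour.
    """
    distances = sorted(
        math.sqrt(sum((point[f] - other[f]) ** 2 for f in features))
        for other in reference
    )
    return distances[min(k, len(distances)) - 1]


def bagged_subspace_scores(data, n_members=25, n_features=2,
                           sample_size=10, k=3, seed=0):
    """Average outlier score over ensemble members built on random subspaces and subsamples."""
    rng = random.Random(seed)
    dimensions = range(len(data[0]))
    totals = [0.0] * len(data)
    for _ in range(n_members):
        features = rng.sample(dimensions, n_features)  # random subset of features
        reference = rng.sample(data, sample_size)      # random subsample of data
        for i, point in enumerate(data):
            totals[i] += knn_distance(point, reference, features, k)
    return [total / n_members for total in totals]


# Hypothetical data: 20 tightly grouped points plus one distant point (the last one).
data = [[(i % 5) * 0.1, (i // 5) * 0.1, ((i * 7) % 5) * 0.1] for i in range(20)]
data.append([5.0, 5.0, 5.0])
scores = bagged_subspace_scores(data)
print(scores.index(max(scores)))  # index of the highest-scoring point
```

Each member compares every point against only `sample_size` reference points in `n_features` dimensions, so the cost per member stays small even when the full dataset is large.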

Electronic Notes in Theoretical Computer Science | 2016

An Unsupervised Approach for Combining Scores of Outlier Detection Techniques, Based on Similarity Measures

José Ramón Pasillas-Díaz; Sylvie Ratté

Outlier detection, the discovery of observations that deviate from normal behavior, has become crucial in many application domains. Numerous and diverse algorithms have been proposed to detect them. These algorithms identify outliers using precise definitions of the concept of outliers; thus, their performance depends largely on the context of application. The construction of ensembles has been proposed as a solution to increase the individual capacity of each algorithm. However, the unsupervised scenario (absence of class labels) in the domains where outlier detection operates restricts the use of approaches relying on the existence of labels. In this paper, two novel unsupervised approaches using ensembles of heterogeneous types of detectors are proposed. Both approaches construct the ensemble using solely the results produced by each algorithm, identifying and giving more weight to the most suitable techniques depending on the particular dataset under examination. Through experimental evaluation on real-world datasets, we demonstrate that our proposed algorithm provides a significant improvement over the base algorithms and even over existing approaches for ensemble outlier detection.
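One way to weight detectors using only their own outputs is to favour those whose scores agree with the rest of the ensemble. The sketch below is an assumption-laden illustration, not either of the two approaches proposed in the paper: it uses min-max normalization, Pearson correlation as the similarity measure, and weights equal to each detector's mean similarity to the others (clipped at zero). All function names and sample scores are hypothetical.

```python
import math


def min_max(scores):
    """Rescale scores to [0, 1] so that detectors with different ranges are comparable."""
    low, high = min(scores), max(scores)
    span = high - low
    return [(s - low) / span if span else 0.0 for s in scores]


def pearson(a, b):
    """Pearson correlation between two equally long score lists (0.0 if undefined)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                     * sum((y - mean_b) ** 2 for y in b))
    return covariance / norm if norm else 0.0


def combine_scores(detector_scores):
    """Weighted average of normalized scores; needs at least two detectors.

    Returns (combined_scores, weights).
    """
    normalized = [min_max(scores) for scores in detector_scores]
    weights = []
    for i, own in enumerate(normalized):
        similarities = [pearson(own, other)
                        for j, other in enumerate(normalized) if j != i]
        weights.append(max(sum(similarities) / len(similarities), 0.0))
    if sum(weights) == 0:
        weights = [1.0] * len(normalized)  # fall back to a plain average
    total = sum(weights)
    combined = [
        sum(w * scores[i] for w, scores in zip(weights, normalized)) / total
        for i in range(len(normalized[0]))
    ]
    return combined, weights


# Hypothetical scores from three detectors over five observations.
detector_scores = [
    [0.1, 0.2, 0.1, 0.3, 0.9],
    [1.0, 2.0, 1.0, 2.0, 10.0],
    [0.9, 0.1, 0.5, 0.2, 0.3],  # disagrees with the other two
]
combined, weights = combine_scores(detector_scores)
print(combined.index(max(combined)), weights)
```

Agreement with the ensemble is only a proxy for suitability: a detector that is right when all others are wrong would be down-weighted by this scheme.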
