Pascal Poncelet | Researchain


Publication

Featured research published by Pascal Poncelet.

Data and Knowledge Engineering | 2003

Incremental mining of sequential patterns in large databases

Florent Masseglia; Pascal Poncelet; Maguelonne Teisseire

In this paper, we consider the problem of the incremental mining of sequential patterns when new transactions or new customers are added to an original database. We present a new algorithm for mining frequent sequences that uses information collected during an earlier mining process to cut down the cost of finding new sequential patterns in the updated database. Our tests show that the algorithm performs significantly faster than the naive approach of mining the whole updated database from scratch. The difference is so pronounced that the algorithm could also be useful for sequential pattern mining itself: in many cases, breaking the database down into an original database plus an increment and applying our algorithm is faster than mining sequential patterns with a standard algorithm.
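For context, the basic operation that any such approach must perform repeatedly is support counting: checking how many customer sequences contain a candidate pattern. The following is a minimal sketch of that operation only, not the incremental algorithm of the paper. The names `contains` and `support`, and the representation of a customer sequence as a list of itemsets, are assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def contains(data_seq: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """Return True if the pattern's itemsets occur, in order, as subsets
    of itemsets of the customer sequence (greedy earliest match)."""
    pos = 0
    for itemset in data_seq:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(db: List[Sequence[Itemset]], pattern: Sequence[Itemset]) -> float:
    """Fraction of customer sequences in db that contain the pattern."""
    if not db:
        return 0.0
    return sum(contains(seq, pattern) for seq in db) / len(db)


# Hypothetical data: an original database plus an increment of new customers.
original = [
    [frozenset({"a"}), frozenset({"b", "c"})],
    [frozenset({"a", "b"})],
]
increment = [
    [frozenset({"b"}), frozenset({"a"})],
]
pattern = [frozenset({"a"}), frozenset({"b"})]

# The naive baseline rescans the whole updated database.
updated_support = support(original + increment, pattern)
```

In this naive baseline every candidate is recounted over the whole updated database, which is the cost the abstract says the incremental approach reduces.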

Conference on Soft Computing as Transdisciplinary Science and Technology | 2008

Web opinion mining: how to extract opinions from blogs?

Ali Harb; Michel Plantié; Gérard Dray; Mathieu Roche; François Trousset; Pascal Poncelet

The growing popularity of Web 2.0 provides an increasing number of documents expressing opinions on different topics. Recently, new research approaches have been defined in order to automatically extract such opinions from the Internet. They usually consider opinions to be expressed through adjectives, and make extensive use of either general dictionaries or experts to provide the relevant adjectives. Unfortunately, these approaches suffer from the following drawback: in a specific domain, a given adjective may either not exist or have a different meaning than in another domain. In this paper, we propose a new approach consisting of two steps. First, we automatically extract a learning dataset for a specific domain from the Internet. Second, from this learning set we extract the set of positive and negative adjectives relevant to the domain. The usefulness of our approach was demonstrated by experiments performed on real data.

Data Mining and Knowledge Discovery | 2008

Web usage mining: extracting unexpected periods from web logs

Florent Masseglia; Pascal Poncelet; Maguelonne Teisseire; Alice Marascu

Existing Web usage mining techniques are currently based on an arbitrary division of the data (e.g. “one log per month”) or guided by presumed results (e.g. “what is the customers’ behaviour for the period of Christmas purchases?”). These approaches have two main drawbacks. First, they depend on the above-mentioned arbitrary organization of data. Second, they cannot automatically extract “seasonal peaks” from among the stored data. In this paper, we propose a specific data mining process (in particular, to extract frequent behaviour patterns) in order to reveal the densest periods automatically. From the whole set of possible combinations, our method extracts the frequent sequential patterns related to the extracted periods. A period is considered to be dense if it contains at least one frequent sequential pattern for the set of users connected to the website in that period. Our experiments show that the extracted periods are relevant and our approach is able to extract both frequent sequential patterns and the associated dense periods.

Expert Systems With Applications | 2009

Efficient mining of sequential patterns with time constraints: Reducing the combinations

Florent Masseglia; Pascal Poncelet; Maguelonne Teisseire

In this paper we consider the problem of discovering sequential patterns by handling time constraints as defined in the GSP algorithm. While sequential patterns could be seen as temporal relationships between facts embedded in the database, where the considered facts are merely characteristics of individuals or observations of individual behavior, generalized sequential patterns aim to provide the end user with a more flexible handling of the transactions embedded in the database. We thus propose a new efficient algorithm, called GTC (Graph for Time Constraints), for mining such patterns in very large databases. It is based on the idea that handling time constraints at an earlier stage of the data mining process can be highly beneficial. One of the most significant new features of our approach is that the handling of time constraints can easily be taken into account in traditional levelwise approaches, since it is carried out prior to and separately from the counting step of a data sequence. Our tests show that the proposed algorithm performs significantly faster than a state-of-the-art sequence mining algorithm.
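To make the notion of time constraints concrete, here is a minimal sketch of checking whether a timestamped data sequence supports a pattern under a minimum and a maximum gap between consecutive elements. It is an illustration under simplifying assumptions, not the graph-based method of the paper: each event carries a single item, events are sorted by timestamp, the sliding-window constraint is omitted, and the names `Event`, `supports`, `min_gap` and `max_gap` are hypothetical.

```python
from typing import Sequence, Tuple

Event = Tuple[int, str]  # (timestamp, item), assumed sorted by timestamp


def supports(data_seq: Sequence[Event], pattern: Sequence[str],
             min_gap: int, max_gap: int) -> bool:
    """Return True if the pattern occurs in data_seq such that every gap g
    between consecutive matched events satisfies min_gap < g <= max_gap."""

    def search(p_idx: int, start: int, prev_time: int) -> bool:
        if p_idx == len(pattern):
            return True
        for i in range(start, len(data_seq)):
            t, item = data_seq[i]
            if p_idx > 0:
                gap = t - prev_time
                if gap > max_gap:
                    break      # later events can only be further away
                if gap <= min_gap:
                    continue   # too close to the previous matched event
            # Backtrack over every admissible occurrence of the item.
            if item == pattern[p_idx] and search(p_idx + 1, i + 1, t):
                return True
        return False

    return search(0, 0, 0)


# Hypothetical data sequence of one customer.
seq = [(1, "a"), (2, "b"), (10, "a"), (12, "b")]
```

With `min_gap=1` and `max_gap=5`, the first `a` cannot be extended (the next `b` is too close, and the later one is too far from it), so the check has to backtrack to the second `a`. This kind of repeated verification during counting is the cost that handling constraints ahead of the counting step is meant to avoid.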

International Symposium on Temporal Representation and Reasoning | 2004

Pre-processing time constraints for efficiently mining generalized sequential patterns

Florent Masseglia; Pascal Poncelet; Maguelonne Teisseire

In this paper we consider the problem of discovering sequential patterns by handling time constraints. While sequential patterns could be seen as temporal relationships between facts embedded in the database, generalized sequential patterns aim to provide the end user with a more flexible handling of the transactions embedded in the database. We propose a new efficient algorithm, called GTC (graph for time constraints), for mining such patterns in very large databases. It is based on the idea that handling time constraints at an earlier stage of the algorithm can be highly beneficial, since it minimizes computational costs by preprocessing data sequences. Our tests show that the proposed algorithm performs significantly faster than a state-of-the-art sequence mining algorithm.

Data Mining and Knowledge Discovery | 2008

Mining conjunctive sequential patterns

Chedy Raïssi; Toon Calders; Pascal Poncelet

In this paper we aim to extend the non-derivable condensed representation in frequent itemset mining to sequential pattern mining. We start by showing a negative example: in the context of frequent sequences, the notion of non-derivability is meaningless. Therefore, we extend our focus to the mining of conjunctions of sequences. Besides being of practical importance, this class of patterns has some nice theoretical properties. Based on a new, unexploited theoretical definition of equivalence classes for sequential patterns, we are able to extend the notion of a non-derivable itemset to the sequence domain. We present a new depth-first approach to mine non-derivable conjunctive sequential patterns and show its use in mining association rules for sequences. This approach is based on a well-known combinatorial theorem: the Möbius inversion. A performance study using both synthetic and real datasets illustrates the efficiency of our mining algorithm. These newly introduced patterns have high potential for real-life applications, especially in network monitoring and biomedical fields, with the ability to obtain sequential association rules with all the classical statistical metrics such as confidence, conviction, lift, etc.
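The classical metrics named above have standard definitions in terms of supports. The sketch below computes them for a rule X => Y from relative supports; it uses the usual textbook formulas and is not taken from the paper. The function name `rule_metrics` and its parameters are assumptions made for illustration.

```python
import math
from typing import Dict


def rule_metrics(supp_x: float, supp_y: float, supp_xy: float) -> Dict[str, float]:
    """Confidence, lift and conviction of a rule X => Y.

    supp_x, supp_y: relative supports of X and Y.
    supp_xy: relative support of X and Y occurring together.
    """
    if not (0 < supp_x <= 1 and 0 < supp_y <= 1
            and 0 <= supp_xy <= min(supp_x, supp_y)):
        raise ValueError("supports must be consistent fractions in (0, 1]")
    confidence = supp_xy / supp_x
    lift = confidence / supp_y
    # Conviction is infinite by convention when the rule never fails.
    if confidence >= 1.0:
        conviction = math.inf
    else:
        conviction = (1.0 - supp_y) / (1.0 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


# Hypothetical supports, chosen only to exercise the formulas.
metrics = rule_metrics(supp_x=0.5, supp_y=0.4, supp_xy=0.3)
```

A lift or conviction above 1 indicates that X and Y occur together more often than would be expected if they were independent.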

Intelligent Information Systems | 2007

Towards a new approach for mining frequent itemsets on data stream

Chedy Raïssi; Pascal Poncelet; Maguelonne Teisseire

Mining frequent patterns on streaming data is a challenging new problem for the data mining community, since data arrives sequentially in the form of continuous rapid streams. In this paper we propose a new approach for mining itemsets. Our approach has the following advantages: an efficient representation of items and a novel data structure to maintain frequent patterns, coupled with a fast pruning strategy. At any time, users can issue requests for frequent itemsets over an arbitrary time interval. Furthermore, our approach produces an approximate answer with an assurance that it will not bypass user-defined frequency and temporal thresholds. Finally, the proposed method is analyzed by a series of experiments on different datasets.

Database and Expert Systems Applications | 1999

WebTool: An Integrated Framework for Data Mining

Florent Masseglia; Pascal Poncelet; Rosine Cicchetti

Large volumes of data, such as user addresses or requested URLs, are gathered automatically by Web servers and collected in access log files. Analysis of server access data can provide significant and useful information for performance enhancement and for restructuring a Web site for increased effectiveness. In this paper, we propose an integrated system (WebTool) for mining user patterns and association rules from one or more Web servers, and we pay particular attention to the handling of time constraints. Once interesting patterns are discovered, we illustrate how they can be used to customize the server hypertext organization dynamically.

Pacific Asia Workshop on Intelligence and Security Informatics | 2010

MAD-IDS: novel intrusion detection system using mobile agents and data mining approaches

Imen Brahmi; Sadok Ben Yahia; Pascal Poncelet

Intrusion detection has been investigated for many years and the field has reached maturity. Nevertheless, there are still important challenges, e.g., how an Intrusion Detection System (IDS) can detect distributed attacks. To tackle this problem, we propose a novel distributed IDS, based on the desirable features provided by the mobile agent methodology and the high accuracy offered by data mining techniques.

Data Warehousing and Knowledge Discovery | 2008

Is a Voting Approach Accurate for Opinion Mining?

Michel Plantié; Mathieu Roche; Gérard Dray; Pascal Poncelet

In this paper, we focus on classifying documents according to the opinions and value judgments they contain. The main originality of our approach is to combine linguistic pre-processing, classification and a voting system using several classification methods. In this context, a relevant representation of the documents makes it possible to determine the features for storing textual data in data warehouses. Experiments conducted on very large corpora from a French text mining challenge (DEFT) show the efficiency of our approach.
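As an illustration of what a voting system over several classifiers can look like, the following is a minimal sketch of plain majority voting. The abstract does not specify the voting scheme used, so this is an assumed, simplest-case variant; the function name `majority_vote` and the tie-breaking rule (the earliest classifier among the tied labels wins) are choices made here for illustration.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the most classifiers.

    predictions: one predicted label per classifier, for a single document.
    Ties are broken in favor of the label that appears first in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable: some label always has the top count")


# Hypothetical outputs of three classifiers for one document.
label = majority_vote(["positive", "negative", "positive"])
```

Other schemes, such as weighting each classifier by its validation accuracy, follow the same pattern with weighted counts.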