Bernard Levrat
University of Angers
Publications
Featured research published by Bernard Levrat.
international conference on tools with artificial intelligence | 2007
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
In this work, an image retrieval system that adapts to users' interests through relevance feedback driven by a genetic algorithm is presented. The retrieval process is based on local similarity patterns. The goal of the genetic algorithm is to infer weights for regions and features that better reflect the users' requirements, producing better-quality rankings. The main innovation of the genetic algorithm is an order-based fitness function, which suits the ranking requirements of a majority of users. This fitness function quickly drives the genetic algorithm in its search for an optimal solution. Evaluations on several databases have shown the robustness and efficiency of the proposed retrieval method, even when the query is a sketch or a damaged image.

The WindowDiff evaluation measure (Pevzner and Hearst, 2002) is becoming the standard criterion for evaluating text segmentation methods. Nevertheless, this metric is not fair with regard to the characteristics of the methods, and the results that it provides on different kinds of corpora are difficult to compare. Therefore, we first attempt to improve this measure according to the risks taken by each method on different kinds of text. In addition, since producing a reference segmentation is a rather difficult task, this paper describes a new evaluation metric that relies on the stability of the segmentations in the face of text transformations. Our experimental results appear to indicate that both proposed metrics are better indicators of text segmentation accuracy than existing measures.
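For readers unfamiliar with WindowDiff, the following is a minimal sketch of the original measure as it is commonly described, not of the improved metrics proposed in the paper. It assumes each segmentation is given as a list of 0/1 indicators, one per gap between consecutive text units, and that the window size `k` is supplied by the caller; the function name and input encoding are illustrative choices.

```python
def window_diff(reference, hypothesis, k):
    """Fraction of sliding windows in which the two segmentations
    disagree on the number of boundaries.

    reference, hypothesis: equal-length lists of 0/1, where 1 marks a
    boundary in the gap between two consecutive text units (assumed encoding).
    k: window size, counted in gaps.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same text")
    n_gaps = len(reference)
    if not 0 < k <= n_gaps:
        raise ValueError("k must be between 1 and the number of gaps")

    n_windows = n_gaps - k + 1
    errors = 0
    for i in range(n_windows):
        # Count the boundaries that fall inside the current window.
        ref_count = sum(reference[i:i + k])
        hyp_count = sum(hypothesis[i:i + k])
        if ref_count != hyp_count:
            errors += 1
    return errors / n_windows


# A hypothesis that places the first boundary one gap too late.
ref = [0, 0, 1, 0, 0, 0, 1, 0, 0]
hyp = [0, 0, 0, 1, 0, 0, 1, 0, 0]
score = window_diff(ref, hyp, k=2)
```

A score of 0 means the two segmentations agree in every window; larger values mean more windows disagree. The window size is often set to about half the average reference segment length, which is left to the caller here.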
artificial intelligence methodology systems applications | 2008
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
An alternative way to tackle Information Retrieval, called Passage Retrieval, considers text fragments independently rather than assessing the global relevance of documents. In such a context, a document is not penalized when its relevant information is surrounded by text that deviates from the topic of interest. In this paper, we study the impact of considering these text fragments on a document clustering process. The use of clustering in Information Retrieval is mainly supported by the cluster hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents, and hence a clustering process is likely to gather them. Previous experiments have shown that clustering the first retrieved documents in response to a user's query allows Information Retrieval systems to improve their effectiveness. In the clustering process used in these studies, documents were considered globally. Nevertheless, the assumption that a document can refer to more than one topic or concept may also affect the document clustering process. Considering passages of the retrieved documents separately may make it possible to create clusters that better represent the topics addressed. Different approaches have been assessed, and results show that using text fragments in the clustering process can indeed be beneficial.
international conference on tools with artificial intelligence | 2007
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
The document-length normalization problem has been widely studied in the field of information retrieval. Cosine normalization (Baeza-Yates and Ribeiro-Neto, 1999), maximum tf normalization (Allan et al., 1997) and byte length normalization (Robertson et al., 1992) are the most commonly used normalization techniques. Singhal et al. (1996) studied the retrieval probability of documents w.r.t. their size, using different similarity measures. They showed that none of the existing measures retrieves documents of different lengths with the same probability. We first show here that document and query sizes do strongly influence the expected similarity score. Therefore, we propose to perform a statistical regression of the distribution of similarity scores w.r.t. document and query sizes in order to normalize them. Experimental results appear to indicate that our approach, both in classical Information Retrieval and when applied to a document clustering process, allows similarities to be judged more fairly.
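As a point of reference for the normalization techniques listed above, here is a minimal sketch of cosine normalization over raw term-frequency vectors. It does not implement the regression-based normalization proposed in the paper; the whitespace tokenization and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(doc_tokens, query_tokens):
    """Cosine similarity between raw term-frequency vectors.

    Dividing the dot product by both vector norms is the length
    normalization step: repeating a document's content does not
    raise its score.
    """
    doc_tf = Counter(doc_tokens)
    query_tf = Counter(query_tokens)
    dot = sum(doc_tf[term] * freq for term, freq in query_tf.items())
    doc_norm = math.sqrt(sum(f * f for f in doc_tf.values()))
    query_norm = math.sqrt(sum(f * f for f in query_tf.values()))
    if doc_norm == 0 or query_norm == 0:
        return 0.0
    return dot / (doc_norm * query_norm)


query = "text segmentation".split()
short_doc = "text segmentation".split()
long_doc = short_doc * 2  # same content, twice as long

short_score = cosine_similarity(short_doc, query)
long_score = cosine_similarity(long_doc, query)
```

In this toy case the unnormalized dot product doubles for `long_doc`, while the cosine score stays the same for both documents.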
acm symposium on applied computing | 2007
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
This paper describes ClassStruggle, an algorithm for linear text segmentation on general corpora. It relies on an initial clustering of the sentences of the text. This preliminary partitioning provides a global view of the relations between the sentences of the text, considering similarities within a group rather than individually. ClassStruggle is based on the distribution of the occurrences of the members of each class. During the process, the clusters evolve by taking into account a notion of proximity and of layout in the text, with the aim of creating groups that contain only sentences related to the same topic development. Finally, boundaries are created between sentences belonging to two different classes. First experimental results are promising: ClassStruggle appears to be very competitive compared with existing methods.
Applied Artificial Intelligence | 2008
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
The automatic text segmentation task consists of identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. Text segmentation has motivated a large amount of research. We focus here on the statistical approaches that rely on an analysis of the distribution of the words in the text. Usually, texts are segmented sequentially on the basis of very local clues. However, such an approach prevents the text from being considered globally, particularly with respect to the degree of granularity adopted for expressing the different topics it addresses. We thus propose two new segmentation algorithms, ClassStruggle and SegGen, which use criteria that provide global views of texts. ClassStruggle is based on an initial clustering of the sentences of the text, thus allowing similarities to be considered within a group rather than individually. It relies on the distribution of the occurrences of the members of each class to segment the texts. SegGen evaluates potential segmentations of the whole text by means of a genetic algorithm. It attempts to find a segmentation that optimizes two criteria: maximizing the internal cohesion of the segments and minimizing the similarity between adjacent ones. According to experimental results, both approaches appear to be very competitive compared to existing methods.
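The two criteria attributed to SegGen can be illustrated with a small sketch that scores a candidate segmentation. This is not the authors' formulation: the bag-of-words sentence representation, the cosine measure, the averaging scheme, and the score of 0.0 for single-sentence segments are all assumptions, and the genetic search itself is omitted.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split(sentences, boundaries):
    """Cut the sentence list before each index in `boundaries` (sorted)."""
    cuts = [0] + list(boundaries) + [len(sentences)]
    return [sentences[start:end] for start, end in zip(cuts, cuts[1:])]


def internal_cohesion(segments):
    """Mean pairwise sentence similarity inside each segment, averaged
    over segments. Single-sentence segments score 0.0 (an assumption)."""
    scores = []
    for segment in segments:
        pairs = list(combinations(segment, 2))
        if pairs:
            scores.append(sum(cosine(a, b) for a, b in pairs) / len(pairs))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


def adjacent_similarity(segments):
    """Mean similarity between the merged vectors of neighbouring segments."""
    merged = [sum(segment, Counter()) for segment in segments]
    if len(merged) < 2:
        return 0.0
    sims = [cosine(a, b) for a, b in zip(merged, merged[1:])]
    return sum(sims) / len(sims)


sentences = [Counter(s.split()) for s in
             ["cats purr", "cats sleep", "stocks fall", "stocks rise"]]
good = split(sentences, [2])  # break between the two topics
bad = split(sentences, [1])   # break inside the first topic
```

Under these assumptions, `good` has higher internal cohesion and lower adjacent similarity than `bad`, which is the direction a two-objective search would favour.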
acm symposium on applied computing | 2010
Sylvain Lamprier; Tassadit Amghar; Frédéric Saubion; Bernard Levrat
Document clustering techniques have been widely applied in Information Retrieval to reorganize the results returned in response to users' queries. Following the Cluster Hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant ones, most relevant documents are likely to be gathered in a single cluster. Usually, systems that organize search results as a set of clusters regard this tendency as very advantageous, since it makes it possible to filter the results provided by the initial search. Adopting a different point of view, we instead consider the Cluster Hypothesis a hindrance to information access, since it prevents the various aspects of the query from emerging. The resulting risk is that the perception of the subject is restricted to a single point of view. Therefore, we propose instead to distribute the relevant documents over clusters by orienting the organization of the clusters according to the user's topic. The aim is to draw the clusters toward that topic in order to highlight the thematic differences between documents that are strongly connected to the query. Rather than modifying the inter-document similarity computation, as is the case in several studies, we propose to act directly on the organization of the clusters by using a multi-objective evolutionary clustering algorithm which, besides the classical internal cohesion, also optimizes the query proximity of the clusters. First experimental results highlight the benefit that may be gained from taking the query into account in this way.
Revue des Sciences et Technologies de l'Information - Série Document Numérique | 2010
Sylvain Lamprier; Tassadit Amghar; Bernard Levrat; Frédéric Saubion
ABSTRACT. Relying on the Cluster Hypothesis, which states that documents relevant to a query tend to be closer to one another than to non-relevant documents, most information retrieval systems that categorize their results aim to gather all relevant documents in a single group. We propose here, by introducing new evaluation measures, to reconsider the benefits of such a concentration of relevant information. Contrary to what is usually accepted, we finally show that systems which distribute the relevant information can prove at least as useful to the user as systems which gather all relevant documents in a single cluster.
international conference on vehicular electronics and safety | 2015
Can Gocmenoglu; Tankut Acarman; Bernard Levrat
VANET simulation schemes require a combination of mobility and wireless network simulation packages, coupled with custom scripts, visualization tools and various scenarios. The results of simulation studies need to be supported by special tools or scripts so that they can be analyzed or visualized easily. Additional difficulties arise when sharing the results, visually comparing simulation runs across different platforms, and showcasing research findings to a larger audience. As a solution, we have developed a 3D Web-based Visualization Tool for VANET Simulations (WGL-VANET), which takes advantage of HTML5 and WebGL technologies to create a cross-platform, easy-to-use and flexible visualization tool for VANET simulations. WGL-VANET reads simulation data from a JSON document, supports a variety of visual features, and displays the simulation run on a WebGL canvas inside a web browser.
text speech and dialogue | 2010
Amaria Adila Bouabdallah; Tassadit Amghar; Bernard Levrat
Many studies have been devoted to the temporal analysis of texts, and more precisely to the tagging of temporal entities and relations occurring in texts. Among the latter, events, in the many forms in which they occur, have been addressed by numerous works. We describe here a method for the detection of noun phrases denoting events. Our approach is based on the implementation of a simple test proposed by linguists for this task. Our method is applied to two different corpora: the first is composed of newspaper articles, and the second, a much larger one, is built through an interface that automatically queries the Yahoo search engine. Preliminary results are encouraging, and increasing the size of the learning corpus should allow for a real statistical validation of the results.
acm symposium on applied computing | 2010
Sylvain Lamprier; Tassadit Amghar; Frédéric Saubion; Bernard Levrat
Relying on the Cluster Hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents, most information retrieval systems that organize search results as a set of clusters seek to gather all relevant documents in the same cluster. We propose here to reconsider the benefits of this concentration of the relevant information. Contrary to what is commonly accepted, we believe that systems which aim to distribute the relevant documents across different clusters, being more likely to highlight different aspects of the subject, may be at least as useful for the user as systems that gather all relevant documents in a single group. Since existing evaluation measures tend to greatly favor the latter systems, we first investigate ways to assess more fairly the ability to reach the relevant information from the list of cluster descriptions. Finally, we show that systems distributing the relevant information across different clusters may actually provide better information access than classical systems.