Annie Morin
University of Rennes
Publications
Featured research published by Annie Morin.
Ninth International Conference on Information Visualisation (IV'05) | 2005
Nicolas Bonnel; Alexandre Cotarmanac'h; Annie Morin
While searching the Web, the user is often confronted with a great number of results, generally sorted by rank and displayed as a succession of ordered lists. Facing the limits of this approach, we propose a prototype that explores new ways of organizing and presenting search results, as well as new types of interaction with them, in order to make their exploration more intuitive and efficient. The main topic of this paper is the processing of results coming from an information retrieval system. Although relevance depends on result quality, effective result processing is an alternative way to improve relevance for the user. Given current expectations, this processing consists of an organization step and a visualization step. The proposed prototype organizes the results according to their meaning using a Kohonen self-organizing map, and visualizes them in a 3D scene to enlarge the representation space. The 3D metaphor proposed here is a city.
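The Kohonen self-organizing map at the heart of the prototype can be sketched compactly. The following minimal NumPy version is only an illustration of the mechanism (grid size, learning-rate and neighborhood schedules are our own illustrative choices, not the paper's settings): it maps high-dimensional result vectors onto a 2-D grid of cells, which is the kind of layout the city metaphor then renders in 3D.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a minimal 2-D Kohonen self-organizing map on row vectors."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))          # codebook vectors
    # grid coordinates of each cell, used by the neighborhood function
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w),
                                  indexing="ij"), axis=-1)
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                      # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 1e-3         # shrinking neighborhood
        for x in rng.permutation(data):
            # best-matching unit: cell whose codebook vector is closest to x
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), (h, w))
            # Gaussian neighborhood around the BMU on the grid
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def best_matching_unit(weights, x):
    """Grid cell where a vector x lands on the trained map."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)
```

Each search result, embedded as a vector, would be assigned to its best-matching unit; results landing in the same or neighboring cells end up close together in the scene.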
Portuguese Conference on Artificial Intelligence | 2007
Artur Šilić; Jean-Hugues Chauchat; Bojana Dalbelo Bašić; Annie Morin
In this paper, we compare n-grams and morphological normalization, two inherently different text-preprocessing methods used for text classification, on a Croatian-English parallel corpus. Our approach to comparing different text-preprocessing techniques is based on measuring computational performance (execution time and memory consumption) as well as classification performance. We show that although n-grams achieve classifier performance comparable to traditional word-based feature extraction and can act as a substitute for morphological normalization, they are computationally much more demanding.
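Character n-gram extraction, the language-independent alternative to morphological normalization compared above, is simple to sketch. A minimal version (lowercasing and space-padding at word boundaries are common conventions, not necessarily the paper's exact setup):

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Overlapping character n-grams, with spaces marking word boundaries."""
    padded = f" {text.strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def ngram_vector(text, n=3):
    """Bag-of-n-grams feature vector for a document."""
    return Counter(char_ngrams(text.lower(), n))
```

Because no stemmer or lemmatizer is involved, the same code serves Croatian and English alike; the cost is a much larger feature space, which is the computational drawback the paper measures.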
Information Technology Interfaces | 2004
Annie Morin
With the huge amount of available textual data, we need convenient ways to process it and to extract valuable information. It appears that factorial correspondence analysis recovers most of the information contained in the data. Moreover, even after processing, a large amount of material remains, and we need visualization tools to display it. In this paper, we show how to use correspondence analysis in a sensible way, and we give an application to the analysis of the internal scientific production of a major research center in France: INRIA, the French national institute for research in computer science and control.
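In matrix terms, correspondence analysis reduces to a singular value decomposition of the standardized residuals of a contingency table (e.g. a words-by-documents count table). A minimal NumPy sketch of this standard computation, not the authors' implementation:

```python
import numpy as np

def correspondence_analysis(table):
    """Correspondence analysis of a contingency table via SVD of the
    standardized residuals. Returns principal coordinates for rows and
    columns, plus the share of total inertia carried by each axis."""
    P = table / table.sum()
    r = P.sum(axis=1)                       # row masses
    c = P.sum(axis=0)                       # column masses
    # standardized residuals: departure from the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sv) / np.sqrt(r)[:, None]   # row principal coordinates
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]  # column principal coordinates
    inertia = sv ** 2 / (sv ** 2).sum()
    return rows, cols, inertia
```

Plotting the first two columns of `rows` and `cols` on the same axes gives the simultaneous display of rows and columns that makes the method useful for exploring such tables.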
Artificial Intelligence in Medicine in Europe | 2009
Lan Umek; Blaž Zupan; Marko Toplak; Annie Morin; Jean-Hugues Chauchat; Gregor Makovec; Dragica Smrke
Biomedical experimental data sets often include many features both at input (description of cases, treatments, or experimental parameters) and at output (outcome description). State-of-the-art data mining techniques can deal with such data, but consider only one output feature at a time, disregarding any dependencies among them. In this paper, we propose a technique that treats many output features simultaneously, aiming at finding subgroups of cases that are similar in both input and output space. The method is based on k-medoids clustering and analysis of contingency tables, and reports case subgroups with significant dependency between input and output space. We have used this technique in an explorative analysis of clinical data on femoral neck fractures. The subgroups discovered in our study were considered meaningful by the participating domain expert and sparked a number of ideas for hypotheses to be tested experimentally.
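The clustering half of such a method can be illustrated with a plain k-medoids routine over a precomputed distance matrix; the contingency-table significance analysis that the paper couples with it is omitted here. This is a simple alternating variant (not the full PAM swap search, and not the authors' code):

```python
import numpy as np

def k_medoids(dist, k, iters=20, seed=0):
    """Simple alternating k-medoids on a precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(dist.shape[0], size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(dist[:, medoids], axis=1)     # nearest medoid
        new = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members):
                # new medoid: member minimizing total distance within cluster
                within = dist[np.ix_(members, members)].sum(axis=1)
                new[j] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Because it only needs a distance matrix, the same routine can run on a distance that mixes input and output features, which is what allows subgroups to be coherent in both spaces at once.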
Journal of Classification | 2014
Mónica Bécue-Bertaut; Belchin Kostov; Annie Morin; Guilhem Naro
Rhetorical strategy is relevant in the law domain, where language is a vital instrument. Textual statistics have much to offer for uncovering such a strategy. We propose a methodology that starts from an unstructured text: first, breakpoints are automatically detected and lexically homogeneous parts are identified; then, the shape of the text, through the trajectory of these parts and their hierarchical structure, is uncovered; finally, the argument flow is tracked throughout. Several methods are combined. Chronological clustering of multidimensional count series detects the breakpoints; the shape of the text is revealed by applying correspondence analysis to the parts × words table, while the progression of the argument is described by labelled time-constrained hierarchical clustering. The methodology is illustrated on a forensic rhetoric application, specifically a closing speech delivered by a prosecutor at the Barcelona Criminal Court. The approach could also be useful in politics, communication, and professional writing.
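The key idea of time-constrained clustering is that only adjacent parts of the text may merge, so every cluster is a contiguous, chronological stretch of the speech. A toy contiguity-constrained agglomerative sketch of that idea alone (the merge criterion and significance tests of the actual chronological-clustering method are omitted):

```python
def chronological_clustering(parts, k):
    """Greedy contiguity-constrained agglomerative clustering: only
    adjacent parts may merge, so every cluster is a contiguous span."""
    # each cluster: [start, end, centroid, size] over consecutive parts
    clusters = [[i, i, list(v), 1] for i, v in enumerate(parts)]

    def gap(a, b):                       # squared distance between centroids
        return sum((x - y) ** 2 for x, y in zip(a[2], b[2]))

    while len(clusters) > k:
        # merge the closest pair of *neighboring* clusters
        i = min(range(len(clusters) - 1),
                key=lambda j: gap(clusters[j], clusters[j + 1]))
        a, b = clusters[i], clusters[i + 1]
        n = a[3] + b[3]
        clusters[i:i + 2] = [[a[0], b[1],
                              [(a[3] * x + b[3] * y) / n
                               for x, y in zip(a[2], b[2])],
                              n]]
    return [(c[0], c[1]) for c in clusters]
```

Run on lexical count profiles of successive text parts, the boundaries between the returned spans play the role of the breakpoints between lexically homogeneous sections.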
Intelligent Data Analysis | 2009
Sasa Petrovic; Bojana Dalbelo Bašić; Annie Morin; Blaž Zupan; Jean-Hugues Chauchat
Explorative data analysis in text mining relies essentially on effective visualization techniques that can expose hidden relationships among documents and reveal correspondences between documents and their features. In text mining, documents are most often represented by feature vectors of very high dimension, requiring dimensionality reduction to obtain visual projections in two- or three-dimensional space. Correspondence analysis is an unsupervised approach that constructs a low-dimensional projection space with simultaneous placement of both documents and features, making it ideal for explorative analysis in text mining. Its use to date, however, has been limited to word-based features. In this paper, we investigate how this document representation compares to representations based on letter n-grams and word n-grams, and find that these alternative representations yield better results in separating documents of different classes. We perform our experimental analysis on a bilingual Croatian-English parallel corpus, which additionally allows us to explore the impact of features in different languages on the quality of the visualizations.
Expert Systems With Applications | 2012
Artur Šilić; Annie Morin; Jean-Hugues Chauchat; Bojana Dalbelo Bašić
In this paper, we present CatViz (Temporally-Sliced Correspondence Analysis Visualization), a novel method that visualizes relationships through time and is suitable for large-scale temporal multivariate data. We couple CatViz with clustering methods and introduce the concept of final centroid transfer, which establishes the correspondence of clusters over time. Although CatViz can be used on any type of temporal data, we show how it can be applied to the exploratory visual analysis of text collections. We present a successful concept of feature-type filtering to present different aspects of textual data. We performed case studies on large collections of French and English news articles. In addition, we conducted a user study that confirms the usefulness of our method. We present typical tasks of exploratory text analysis and discuss application procedures that an analyst might perform. We believe that CatViz is general and highly applicable to large data sets because of its intuitiveness, effectiveness, and robustness. We expect that it will enable a better understanding of texts in large historical archives.
EGC (best of volume) | 2010
Nguyen-Khang Pham; Annie Morin; Patrick Gros; Quyet-Thang Le
In this paper, we investigate the intensive use of Correspondence Analysis (CA) for large-scale content-based image retrieval. Correspondence analysis is a useful method for analyzing textual data, and we adapt it to images using SIFT local descriptors. CA is used to reduce dimensionality and to limit the number of images considered during the search step. An incremental algorithm for CA is proposed to deal with large databases while giving exactly the same results as the standard algorithm. We also integrate the Contextual Dissimilarity Measure into our search scheme in order to improve response time and accuracy. We explore this integration in two ways: (i) off-line (the structure of image neighborhoods is corrected off-line) and (ii) on the fly (the structure of image neighborhoods is adapted during the search). Evaluation tests were performed on a large image database (up to 1 million images).
Computer Analysis of Images and Patterns | 2009
Nguyen-Khang Pham; Annie Morin; Patrick Gros
We are interested in the intensive use of Factorial Correspondence Analysis (FCA) for large-scale content-based image retrieval. Factorial correspondence analysis is a useful method for analyzing textual data, and we adapt it to images using SIFT local descriptors. FCA is used to reduce dimensionality and to limit the number of images considered during the search. Graphics Processing Units (GPUs) are fast emerging as inexpensive parallel processors thanks to their high computational power and low price. The G80 family of Nvidia GPUs provides the CUDA programming model, which treats the GPU as a SIMD processor array. We present two very fast GPU algorithms for image retrieval using FCA: the first is a parallel incremental algorithm for FCA, and the second is an extension of the filtering algorithm from our previous work for the filtering step. Our implementation scales up the FCA computation by a factor of 30 compared to the CPU version. For retrieval tasks, the parallel GPU version performs 10 times faster than the CPU one. Retrieving images in a database of 1 million images takes about 8 milliseconds.
International Conference on Multimedia and Expo | 2004
Anicet Kouomou-Choupo; Laure Berti-Equille; Annie Morin
The administration of very large image collections accentuates the classical problems of indexing and of querying information efficiently. This paper describes a new method, applied to very large still-image databases, that combines two data mining techniques, clustering and association rule mining, in order to better organize image collections and to improve query performance. The objective of our work is to exploit association rules discovered by mining global MPEG-7 feature data and to adapt query processing accordingly. In our experiment, we use five MPEG-7 features to describe several thousand still images. For each feature, we first determine several clusters of images using a k-means algorithm. Then, we generate association rules between clusters of different features and exploit these rules to rewrite queries and optimize query-by-content processing.
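The per-feature clustering step can be sketched with a standard Lloyd's k-means; this is our own minimal version, not the paper's implementation, and the rows of X stand in for any numeric descriptor vectors such as global MPEG-7 features:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's k-means over the rows of X (one row per image)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every point to its nearest center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
    return centers, labels
```

Running this once per descriptor yields, for every image, one cluster label per feature; the association rules are then mined over those per-feature labels to predict which clusters a query should be routed to.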