Patrice Bertrand
Paris Dauphine University
Publication
Featured research published by Patrice Bertrand.
Archive | 2000
Patrice Bertrand; Françoise Goupil
The intention of this chapter is to extend the concept of frequency distribution, and the standard definitions of descriptive statistics for real-valued data, such as the empirical mean, the empirical standard deviation and the median, to the general framework of symbolic variables. We denote by \(E = \{1, \ldots, n\}\) the set of units that are described by p symbolic variables \(Y_1, \ldots, Y_p\). The domain of each variable \(Y_j\), for \(j = 1, \ldots, p\), is denoted by \(\mathcal{Y}_j\), and \(S = \times_{j=1}^{p} \mathcal{Y}_j\) denotes the whole domain space. We will examine different types of symbolic variables, namely cases where each symbolic variable \(Y_j\) is multi-valued or interval-valued, including the case where logical rules may exist among the values taken by \(Y_1, \ldots, Y_p\).
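As a rough illustration of what extending descriptive statistics to interval-valued variables can look like, here is a minimal Python sketch. It assumes that values are uniformly spread within each interval, which is an assumption of this example and is not stated in the abstract; the function names are hypothetical.

```python
def interval_mean(intervals):
    """Empirical mean of an interval-valued variable.

    Assumes a uniform spread inside each interval [a, b], so the mean
    reduces to the average of the interval midpoints.
    """
    n = len(intervals)
    return sum(a + b for a, b in intervals) / (2 * n)


def interval_variance(intervals):
    """Empirical variance under the same uniform-spread assumption.

    The second moment of a uniform distribution on [a, b] is
    (a^2 + a*b + b^2) / 3; averaging over units and subtracting the
    squared mean gives the variance.
    """
    n = len(intervals)
    mean = interval_mean(intervals)
    second_moment = sum(a * a + a * b + b * b for a, b in intervals) / (3 * n)
    return second_moment - mean * mean


# Three units, each described by one interval (lower bound, upper bound).
data = [(1.0, 3.0), (2.0, 4.0), (0.0, 2.0)]
print(interval_mean(data), interval_variance(data))
```

When every interval is degenerate (a = b), both functions fall back to the classical mean and population variance of real-valued data.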
Archive | 2007
Paula Brito; Patrice Bertrand; Guy Cucumel; Francisco de A. T. de Carvalho
This volume presents recent methodological developments in data analysis and classification. A wide range of topics is covered that includes methods for classification and clustering, dissimilarity analysis, graph analysis, consensus methods, conceptual analysis of data, analysis of symbolic data, statistical multivariate methods, data mining and knowledge discovery in databases. Besides structural and theoretical results, the book presents a wide variety of applications, in fields such as biology, micro-array analysis, cyber traffic, bank fraud detection, and text analysis. Combining new methodological advances with a wide variety of real applications, this volume is certainly of special value for researchers and practitioners, providing new analytical tools that are useful in theoretical research and daily practice in classification and data analysis.
Discrete Applied Mathematics | 2003
Patrice Bertrand; Melvin F. Janowitz
Several approaches have been proposed for the purpose of proving that different classes of dissimilarities (e.g. ultrametrics) can be represented by certain types of stratified clusterings which are easily visualized (e.g. indexed hierarchies). These approaches differ in the choice of the clusters that are used to represent a dissimilarity coefficient. More precisely, the clusters may be defined as the maximal linked subsets, also called ML-sets; equally they may be defined as a particular type of 2-ball. In this paper, we first introduce the notion of a k-ball, thereby extending the notion of a 2-ball. For an arbitrary dissimilarity coefficient, we establish some properties of the k-balls that pinpoint the connection between them and the ML-sets. We also introduce the (2,k)-point condition (k ≥ 1), which is an extension of the Bandelt four-point condition. For k ≥ 2, we prove that the dissimilarities satisfying the (2,k)-point condition are in one-one correspondence with a class of stratified clusterings, called k-weak hierarchical representations, whose main characteristic is that the intersection of (k + 1) arbitrary clusters may be reduced to the intersection of some k of these clusters.
Computational Statistics & Data Analysis | 2006
Patrice Bertrand; G. Bel Mufti
A method is developed for measuring clustering stability under the removal of a few objects from a set of objects to be partitioned. Measures of stability of an individual cluster are defined as Loevinger's measures of rule quality. The stability of an individual cluster can be interpreted as a weighted mean of the inherent stabilities in the isolation and cohesion, respectively, of the examined cluster. The design of the method also enables us to measure the stability of a partition, which can be viewed as a weighted mean of the stability measures of all clusters in the partition. As a consequence, an approach is derived for determining the optimal number of clusters of a partition. Furthermore, using a Monte Carlo test, a significance probability is computed in order to assess how likely any stability measure is, under a null model that specifies the absence of cluster stability. In order to illustrate the potential of the method, stability measures that were obtained by using the batch K-Means algorithm on artificial data sets and on Iris Data are presented.
European Journal of Combinatorics | 2000
Patrice Bertrand
Different one-one correspondences exist between classes of indexed clustering structures (such as indexed hierarchies) and classes of dissimilarities (such as ultrametrics). Following the line developed in previous works (i.e., Johnson (1967), Diday (1984, 1986), Bertrand and Diday (1991), Batbedat (1988), Bandelt and Dress (1994)), we associate each pseudo-dissimilarity defined on a finite set E with the set system of all maximal linked subsets of E. This provides a one-one correspondence that maps the set of pseudo-dissimilarities onto the collection of all the valued set systems (S, f) that satisfy two conditions. One of these conditions was introduced by Batbedat (1988) and is related to the characterization of 2-conformity in hypergraphs (see also Bandelt and Dress (1994)). We introduce the other condition, which requires the index f to be a weak index. Our approach includes a characterization of the valued set systems (S, f) such that S contains (respectively is contained in) the set of all finite nonempty intersections of maximal linked subsets with respect to the pseudo-dissimilarity induced by the pair (S, f).
Discrete Applied Mathematics | 2002
Patrice Bertrand; Melvin F. Janowitz
There are several well known bijections between classes of dissimilarity coefficients and structures such as indexed or weakly indexed pyramids, as well as indexed closed weak hierarchies. Our goal will be to approach these results from the viewpoint developed by Jardine and Sibson (Mathematical Taxonomy, Wiley, New York, 1971). Properties of dissimilarity coefficients will be related to properties of the maximal linked subsets defined by the family of relations associated with the underlying dissimilarity coefficient. Our approach also involves a close study of the inclusion and diameter conditions introduced by Diatta and Fichet (in: E. Diday et al. (Eds.), New Approaches in Classification and Data Analysis, Springer, Berlin, 1994, p. 111). Typical results include showing that the diameter condition is equivalent to a weakening of the Bandelt four-point characterization that appears in Bandelt (Mathematisches Seminar, Universitat Hamburg, Germany, 1992) as well as Bandelt and Dress (Discrete Math. 136 (1994) 21), and this in turn is equivalent to the maximal linked subsets being closed under nonempty intersections; the inclusion condition is equivalent to the 2-balls coinciding with the weak clusters; the Bandelt four-point characterization is equivalent to the maximal linked subsets coinciding with the weak clusters; and a Robinsonian dissimilarity coefficient is strongly Robinsonian (in the sense of Fichet (in: Y.A. Prohorov, V.V. Sazonov (Eds.), Proceedings of the First World Congress of the BERNOULLI SOCIETY, Tachkent, 1986, V.N.U. Science Press, Vol. 2, 1987, p. 123)) if and only if it satisfies the inclusion condition.
Neurocomputing | 2016
Francisco de A. T. de Carvalho; Patrice Bertrand; Eduardo C. Simões
Interval-valued data are most often used to represent either the uncertainty related to a single measurement, or the variability of the information inherent to a group rather than an individual. In this paper, we focus on Kohonen self-organizing maps (SOMs) for interval-valued data, and design a new batch SOM algorithm that optimizes an explicit objective function. This algorithm can use suitable City-Block, Euclidean and Hausdorff distances to compare interval-valued data during the training of the SOM. Moreover, conventional batch SOM algorithms most often assume that all variables are equally important in the training of the SOM. However, in real situations, some variables may be more or less important, or even irrelevant, for this task. Thanks to a parameterized definition of the above-mentioned distances, we also propose an adaptive version of the new algorithm that tackles this problem with an additional step in which a relevance weight is automatically learned for each interval-valued variable. Several examples with synthetic and real interval-valued data sets illustrate the usefulness of the two novel batch SOM algorithms.
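To make the three distances concrete, the following Python sketch compares two objects described by interval-valued variables. The exact parameterization used in the paper is not given in the abstract, so the per-variable `weights` argument and the function names are assumptions of this example, standing in for the learned relevance weights.

```python
from math import sqrt


def _weights(weights, p):
    # With no weights given, every variable is equally important.
    return [1.0] * p if weights is None else list(weights)


def city_block(x, y, weights=None):
    """Sum over variables of |a - c| + |b - d| for intervals [a, b], [c, d]."""
    w = _weights(weights, len(x))
    return sum(wj * (abs(a - c) + abs(b - d))
               for wj, (a, b), (c, d) in zip(w, x, y))


def euclidean(x, y, weights=None):
    """Square root of the weighted sum of squared bound differences."""
    w = _weights(weights, len(x))
    return sqrt(sum(wj * ((a - c) ** 2 + (b - d) ** 2)
                    for wj, (a, b), (c, d) in zip(w, x, y)))


def hausdorff(x, y, weights=None):
    """Sum over variables of max(|a - c|, |b - d|), the Hausdorff
    distance between two closed intervals of the real line."""
    w = _weights(weights, len(x))
    return sum(wj * max(abs(a - c), abs(b - d))
               for wj, (a, b), (c, d) in zip(w, x, y))


# Two objects, each described by two interval-valued variables.
x = [(0.0, 2.0), (1.0, 5.0)]
y = [(1.0, 4.0), (1.0, 3.0)]
print(city_block(x, y), euclidean(x, y), hausdorff(x, y))
```

Setting a weight to zero removes the corresponding variable from the comparison, which mirrors the idea of treating some variables as irrelevant.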
Discrete Applied Mathematics | 2008
Patrice Bertrand
A set X is said to properly intersect a set Y if none of the sets \(X \cap Y\), \(X \setminus Y\) and \(Y \setminus X\) is empty. In this paper, we consider collections of subsets such that each member of the collection properly intersects at most one other member. Such collections are hereafter called paired hierarchical collections. The two following combinatorial properties are investigated. First, any paired hierarchical collection is a set of intervals of at least one linear order defined on the ground set. Next, the maximum size of a paired hierarchical collection defined on an n-element set is \(\lfloor \tfrac{5}{2}(n-1) \rfloor\). The properties of these collections are also investigated from the cluster analysis point of view. In the framework of the general bijection defined by Batbedat [Les isomorphismes HTS et HTE (apres la bijection de Benzecri-Johnson), Metron 46 (1988) 47-59] and Bertrand [Set systems and dissimilarities, European J. Combin. 21 (2000) 727-743], we characterize the dissimilarities that are induced by weakly indexed paired hierarchical collections. Finally, we propose a proof of the so-called agglomerative paired hierarchical clustering (APHC) algorithm, which extends the well-known AHC algorithm to allow some clusters to be merged twice. An implementation and some illustrations of this algorithm and of a variant of it were presented by Chelcea et al. [A new agglomerative 2-3 hierarchical clustering algorithm, in: D. Baier, K.-D. Wernecke (Eds.), Innovations in Classification, Data Science, and Information Systems (GfKL 2003), Springer, Berlin, 2004, pp. 3-10 and Un Nouvel Algorithme de Classification Ascendante 2-3 Hierarchique, in: Reconnaissance des Formes et Intelligence Artificielle (RFIA 2004), vol. 3, Toulouse, France, 2004, pp. 1471-1480. Available at ].
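The definition of a paired hierarchical collection can be checked directly on small examples. The sketch below is only an illustration of the definition, not the APHC algorithm; it reads "properly intersect" as meaning that the two sets overlap and neither contains the other, and the function names are hypothetical.

```python
from itertools import combinations


def properly_intersect(x, y):
    """True if the intersection and both set differences are nonempty."""
    x, y = frozenset(x), frozenset(y)
    return bool(x & y) and bool(x - y) and bool(y - x)


def is_paired_hierarchical(collection):
    """True if each member properly intersects at most one other member.

    The members of `collection` are assumed to be pairwise distinct sets.
    """
    sets = [frozenset(s) for s in collection]
    counts = [0] * len(sets)
    for i, j in combinations(range(len(sets)), 2):
        if properly_intersect(sets[i], sets[j]):
            counts[i] += 1
            counts[j] += 1
    return all(c <= 1 for c in counts)


# {1,2} and {2,3} properly intersect each other and nothing else.
print(is_paired_hierarchical([{1, 2}, {2, 3}, {1, 2, 3}]))
# {2,3} properly intersects both {1,2} and {3,4}.
print(is_paired_hierarchical([{1, 2}, {2, 3}, {3, 4}]))
```

A classical hierarchy, where any two clusters are either nested or disjoint, passes the check trivially because no pair of members properly intersects.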
Archive | 2005
Sergiu Chelcea; Patrice Bertrand; Brigitte Trousse
We studied a new general clustering procedure, which we call here Agglomerative 2–3 Hierarchical Clustering (2–3 AHC), and which was proposed in Bertrand (2002a, 2002b). The three main contributions of this paper are as follows. First, the theoretical study led to a reduction of the complexity of the algorithm from \(\mathcal{O}(n^3)\) to \(\mathcal{O}(n^2 \log n)\). Secondly, we proposed a new 2–3 AHC algorithm that simplifies the one proposed in 2002 (its principle is closer to the principle of the classical AHC). Finally, we proposed a first implementation of a 2–3 AHC algorithm.
Electronic Notes in Discrete Mathematics | 1999
Patrice Bertrand; Melvin F. Janowitz
Several approaches have been proposed for the purpose of proving that different classes of dissimilarities (e.g. ultrametrics) can be represented by certain types of stratified clusterings which are easily visualized (e.g. indexed hierarchies). These approaches differ in the choice of the clusters that are used to represent a dissimilarity coefficient. More precisely, the clusters may be defined as the maximal linked subsets, also called ML-sets; equally they may be defined as a particular type of 2-ball. In this communication, we first introduce the notion of a k-ball, thereby extending the notion of a 2-ball. For an arbitrary dissimilarity coefficient, we then establish some properties of k-balls that establish the connection between them and the ML-sets. We also introduce the k-point inequality (k ≥ 3) which is an extension of the Bandelt four-point inequality. Moreover, we define a new class of clusterings, called k-weak hierarchies, whose main characteristic is that the intersection of (k + 1) arbitrary clusters may be reduced to the intersection of some k of these clusters. When k ≥ 4, we prove that the dissimilarities satisfying the k-point inequality are in one-one correspondence with the k-weak hierarchical representations.
Collaboration
Francisco de A. T. de Carvalho
Universidade Federal Rural de Pernambuco