Hans-Georg Bartel
Humboldt University of Berlin
Publication
Featured research published by Hans-Georg Bartel.
GfKl | 2008
Hans-Joachim Mucha; Hans-Georg Bartel; Jens Dolata
In archaeometry, the focus is mainly on chemical analysis of archaeological artefacts such as glass objects or pottery. Usually the artefacts are characterized by their chemical composition. Here the focus is on cluster analysis of compositional data. When Euclidean distances are used, cluster analysis is closely related to principal component analysis (PCA), a multivariate projection technique frequently used in archaeometry. Since PCA and cluster analysis based on Euclidean distances are scale dependent, some kind of “appropriate” data transformation is necessary. Several techniques of data preparation are presented. We consider the log-ratio transformation of Aitchison and the transformation into ranks in more detail. From the statistical point of view, the latter is a robust method.
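The two transformations named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the centered log-ratio (clr) form of Aitchison's transformation and a rank transformation applied per variable with average ranks for ties; the function names and the sample compositions are hypothetical.

```python
import math

def clr(row):
    """Centered log-ratio of one strictly positive composition (assumed clr form)."""
    logs = [math.log(x) for x in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def ranks(column):
    """Ranks 1..n of one variable; tied values share their average rank."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    result = [0.0] * len(column)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the block while the next sorted value is tied with the current one
        while end + 1 < len(order) and column[order[end + 1]] == column[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result

def euclidean(a, b):
    """Plain Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical compositions: b is a rescaled by 1/10 (same relative composition).
a = [60.0, 30.0, 10.0]
b = [6.0, 3.0, 1.0]
raw_distance = euclidean(a, b)            # large: raw values are scale dependent
clr_distance = euclidean(clr(a), clr(b))  # numerically zero: clr removes the scale
```

The comparison of `raw_distance` and `clr_distance` shows why a transformation is needed before Euclidean clustering: the two rows describe the same relative composition, yet only the transformed data treat them as identical.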
Ecotoxicology | 2002
Stefan Pudenz; Rainer Brüggemann; Hans-Georg Bartel
A rather small data matrix of seven chemicals and 17 different ecotoxicological end points is examined by methods of Discrete Mathematics. In particular, lattice theory and its variant, Formal Concept Analysis, may be an attractive tool for analyzing Quantitative Structure Activity Relationships when a numerical functional approach is not at hand. The central item is the so-called concept, which is a pair of subsets: a subset of molecules and a subset of properties which correspond to each other. The concepts are partially ordered by a subset relation. From this subset relation, if–then rules are derived, which aim to relate the structure of molecules to their ecotoxicological properties. For example, the following chemical rule is found: Cl ⇒ (2A,2C,2M). That means all substances considered here having a “–Cl” as structural code have a medium ecotoxicological effect on Daphnia magna, Orconectes immunis (Crustacea) and on Photobacterium phosphoreum, at least within the training set.
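How concepts and if–then rules arise from a binary object-attribute table can be sketched as follows. The context below is a hypothetical toy example, not the data of the paper; only the attribute labels mimic the notation of the rule quoted above. The brute-force enumeration is an assumption made for brevity and is only practical for very small attribute sets.

```python
from itertools import combinations

# Hypothetical toy context: object -> set of attributes it possesses.
context = {
    "chem1": {"Cl", "2A", "2C", "2M"},
    "chem2": {"Cl", "2A", "2C", "2M", "NO2"},
    "chem3": {"NO2", "2A"},
    "chem4": {"OH"},
}
all_attributes = set().union(*context.values())

def extent(attrs):
    """All objects that possess every attribute in attrs."""
    return frozenset(g for g, has in context.items() if set(attrs) <= has)

def intent(objs):
    """All attributes shared by every object in objs."""
    if not objs:
        return frozenset(all_attributes)
    return frozenset(set.intersection(*(context[g] for g in objs)))

def concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    found = set()
    attrs = sorted(all_attributes)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            e = extent(subset)
            found.add((e, intent(e)))
    return found

def implies(premise, conclusion):
    """premise => conclusion holds if every object with premise has conclusion."""
    return set(conclusion) <= intent(extent(premise))

rule_holds = implies({"Cl"}, {"2A", "2C", "2M"})  # True in this toy context
converse_holds = implies({"2A"}, {"Cl"})          # False: chem3 has 2A but no Cl
```

As in the abstract, such a rule is valid only within the set of objects it was derived from; adding a chlorinated compound without the three effects to `context` would invalidate it.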
Archive | 2014
Hans-Georg Bartel; Hans-Joachim Mucha
Usually, there are only two stages of comparability between two objects: they are comparable or incomparable (see, for instance, the theory of partially ordered sets). The same holds with respect to equality/inequality. In this publication, measures of incomparability \( u_{ij} \) and of inequality \( v_{ij} \) between two objects \( g_i \) and \( g_j \) with \( m \) attributes with respect to the relation ≤ are introduced. Based on these definitions the (non-metric) distance measure \( {a}_{ij}=\frac{1}{2}\left({u}_{ij}+{v}_{ij}\right) \) with maximal possible values \( m+1+\left[\frac{m}{2}\right]\cdot \left(m-\left[\frac{m}{2}\right]\right) \) is proposed. The distance matrix \( A = (a_{ij}) \) will be used for clustering starting from the corresponding complete graph 〈g〉 (g – number of objects), whose edges \( g_i \)–\( g_j \) are valued by \( a_{ij} \). The result of the classification consists of a set of complete subgraphs, where, for instance, the objective function of compactness of a cluster is based on all pairwise distances of its members. The same edge-valued graph is used to construct a transitive-directed tournament. Thus, a unique seriation of the objects can be obtained which can also be used for further interpretation of the data. For illustrative purposes, an application to environmental chemistry with only a small data set is considered.
Archive | 2002
H.-J. Mucha; Hans-Georg Bartel; J. Dolata
Mathematical methods of classification are suited to support archaeological interpretation ((1978)). Especially clustering techniques are frequently used data mining tools for finding structure in the data set. They aim at the partition of a generally huge set of unarranged high-dimensional objects (observations) into homogeneous subsets (clusters). Usually the starting point is data without hypotheses about the data. Here a sample of Roman brick and tile is collected which is characterized by 19 chemical elements (variables). The sample size is 613 ((1999a)). An adaptive clustering algorithm is applied which can handle different scaling of the variables automatically. The number of clusters is investigated by using a simulation technique. Furthermore the importance of the variables is quantified in the same way. Afterwards this knowledge is used for both hierarchical clustering and the visualization of data and clusters by principal component analysis.
GfKl | 2005
Hans-Joachim Mucha; Hans-Georg Bartel; Jens Dolata
Chemical analysis of ancient ceramics has been used frequently to support archaeological interpretation. Often the dimensionality of the data has been high. Therefore multivariate statistical techniques like cluster analysis have been applied. Successful applications of simple model-based Gaussian clustering of Roman bricks and tiles have been reported by Mucha et al. (2001). Now, more complex Gaussian models can be investigated because the sample size has increased through new findings excavated in Boppard. Additionally, these and previous successful simple models will be applied in a very local fashion, considering two supposed brickyards only. Here, after giving a brief history of clustering Roman bricks and tiles, some cluster analysis models including different data transformations will be investigated in order to answer questions like: Is it possible to differentiate between the brickyards of Rheinzabern and Worms on the basis of chemical analysis? Do the bricks and tiles found in Boppard belong to the brickyards of Worms or Rheinzabern?
GfKl | 2014
Hans-Joachim Mucha; Hans-Georg Bartel
The bootstrap approach is resampling taken with replacement from the original data. Here we consider sampling from the empirical distribution of a given data set in order to investigate the stability of results of cluster analysis. Concretely, the original bootstrap technique can be formulated by choosing the following weights of observations: \( m_i = n \), if the corresponding object \( i \) is drawn \( n \) times, and \( m_i = 0 \), otherwise. We call the weights of observations masses. In this paper, we present another bootstrap method, called soft bootstrapping, that consists of random change of the “bootstrap masses” to some degree. Soft bootstrapping can be applied to any cluster analysis method that makes (directly or indirectly) use of weights of observations. This resampling scheme is especially appropriate for small sample sizes because no object is totally excluded from the soft bootstrap sample. At the end we compare different resampling techniques with respect to cluster analysis.
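The formulation of the ordinary bootstrap as integer masses can be sketched as below. The function `bootstrap_masses` follows the definition in the abstract directly. The abstract does not specify how soft bootstrapping changes the masses, so `soft_masses` is purely an illustrative assumption (a blend of the bootstrap masses with the uniform mass 1); it only demonstrates the stated property that no object is totally excluded. All names and the `softness` parameter are hypothetical.

```python
import random
from collections import Counter

def bootstrap_masses(n_objects, rng):
    """Mass of object i = number of times i is drawn in n_objects draws with replacement."""
    draws = [rng.randrange(n_objects) for _ in range(n_objects)]
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(n_objects)]

def soft_masses(n_objects, rng, softness=0.5):
    """ASSUMPTION, not the published scheme: blend bootstrap masses with uniform mass 1.

    For 0 < softness <= 1 every mass is strictly positive and the masses
    still sum to n_objects.
    """
    hard = bootstrap_masses(n_objects, rng)
    return [(1.0 - softness) * m + softness for m in hard]

def weighted_mean(values, masses):
    """Example of a statistic that uses masses instead of a resampled data set."""
    return sum(v * m for v, m in zip(values, masses)) / sum(masses)

hard = bootstrap_masses(10, random.Random(0))
soft = soft_masses(10, random.Random(0), softness=0.5)
example_mean = weighted_mean(list(range(10)), soft)
```

Any clustering method that accepts observation weights can take `hard` or `soft` in place of a physically resampled data set, which is the point made in the abstract.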
Monatshefte für Chemie | 1979
Winfried Mientus; Werner Haberditzl; Hans-Georg Bartel
Based on the combination of Pars-Orbitals (PO), the LCPO-MO method is described for the quantum chemical treatment of large molecules which are divided into reasonable fragments. The secular matrix constructed from the eigenvalues and parameters of the molecular fragments has a characteristic block form. The dimension of the secular problem can be reduced and depends on the approximations used. The resulting method is demonstrated for naphthalene.
GfKl | 2014
Hans-Joachim Mucha; Hans-Georg Bartel; Jens Dolata
We consider binary classification based on the dual scaling technique. In the case of more than two classes many binary classifiers can be considered. The proposed approach goes back to Mucha (An intelligent clustering technique based on dual scaling. In: S. Nishisato, Y. Baba, H. Bozdogan, K. Kanefuji (eds.) Measurement and multivariate analysis, pp. 37–46. Springer, Tokyo, 2002) and it is based on the pioneering book of Nishisato (Analysis of categorical data: Dual scaling and its applications. The University of Toronto Press, Toronto, 1980). It is applicable to mixed data the statistician is often faced with. First, numerical variables have to be discretized into bins to become ordinal variables (data preprocessing). Second, the ordinal variables are converted into categorical ones. Then the data is ready for dual scaling of each individual variable based on the given two classes: each category is transformed into a score. Then a classifier can be derived from the scores simply in an additive manner over all variables. It will be compared with the simple Bayesian classifier (SBC). Examples and applications to archaeometry (provenance studies of Roman ceramics) are presented.
Classification and Data Mining | 2013
Hans-Joachim Mucha; Hans-Georg Bartel; Carlos Morales-Merino
We present some methods for (multivariate) visualisation of cluster analysis results and cluster validation results. Visualisation is essential for a better understanding of results because it operates at the interface between statisticians and researchers. Without loss of generality, we focus on visualisation of clustering based on pairwise distances. Here, usually one can start with “dimensionless” heatmaps (fingerprints) of proximity matrices. The Excel “Big Grid” spreadsheet is both a distinguished depository for data/proximities and a plotting board for multivariate graphics such as dendrograms, plot-dendrograms, informative dendrograms and discriminant projection plots. Informative dendrograms are ordered binary trees that show additional information such as stability values of the clusters. In this way, graphics can be a very useful and much simpler aid for the reader.
GfKl | 2007
Jens Dolata; Hans-Joachim Mucha; Hans-Georg Bartel
During the past few years, a complex model of history and relations of Roman brick and tile production in south-west Germany has been developed by archaeologists. However, open questions remain concerning the brickyard of Frankfurt-Nied. From the statistical point of view the set of bricks and tiles of this location is divided into two clusters. These clusters can be confirmed by cluster validation. As a result of these validations, archaeologists can now modify and consolidate their ideas about the internal structures of Roman brick and tile making in Frankfurt-Nied.