
Publications


Featured research published by Dankmar Böhning.


Biometrics | 1992

Computer-assisted analysis of mixtures (C.A.MAN): statistical algorithms

Dankmar Böhning; Peter Schlattmann; Bruce G. Lindsay

This paper presents various algorithmic approaches for computing the maximum likelihood estimator of the mixing distribution of a one-parameter family of densities and provides a unifying, computer-oriented concept for the statistical analysis of unobserved heterogeneity (i.e., observations stemming from different subpopulations) in a univariate sample. Both the case with an unknown number of population subgroups and the case with a known number, with emphasis on the first, are considered in the computer package C.A.MAN (Computer Assisted Mixture ANalysis). It includes an algorithmic menu with choices of the EM algorithm, the vertex exchange algorithm, a combination of both, and the vertex direction method. To ensure reliable convergence, a step-length menu is provided for the latter three methods, each achieving monotonicity for the direction of choice. C.A.MAN has the option to work with restricted support size, that is, the case where the number of components is known a priori; in that case the EM algorithm is used. Applications of mixture modelling to medical problems are discussed.
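The abstract notes that the EM algorithm is used when the support size (number of components) is fixed in advance. A minimal illustrative sketch of EM for a finite Poisson mixture with a known number of components; this is not the C.A.MAN implementation, and all function names are hypothetical:

```python
import math

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def em_poisson_mixture(data, lams, weights, n_iter=200):
    """EM for a finite Poisson mixture with a fixed number of components
    (the restricted-support-size case described in the abstract)."""
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for each observation
        resp = []
        for x in data:
            comp = [w * poisson_pmf(x, l) for w, l in zip(weights, lams)]
            s = sum(comp)
            resp.append([c / s for c in comp])
        # M-step: update mixing weights and component means
        n = len(data)
        weights = [sum(r[j] for r in resp) / n for j in range(len(lams))]
        lams = [sum(r[j] * x for r, x in zip(resp, data)) /
                sum(r[j] for r in resp) for j in range(len(lams))]
    return lams, weights
```

Each iteration increases the likelihood monotonically, which is the convergence property the package relies on.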


Statistical Methods in Medical Research | 2008

Revisiting Youden's index as a useful measure of the misclassification error in meta-analysis of diagnostic studies

Dankmar Böhning; Walailuck Böhning; Heinz Holling

The paper considers meta-analysis of diagnostic studies that use a continuous score for classification of study participants into healthy or diseased groups. Classification is often done on the basis of a threshold or cut-off value, which might vary between studies. Consequently, conventional meta-analysis methodology focusing solely on separate analysis of sensitivity and specificity might be confounded by a potentially unknown variation of the cut-off value. To cope with this phenomenon, it is suggested to use instead an overall estimate of the misclassification error, previously suggested and known as Youden's index; furthermore, it is argued that this index is less prone to between-study variation of cut-off values. A simple Mantel-Haenszel estimator is suggested as a summary measure of the overall misclassification error, which adjusts for a potential study effect. The measure of the misclassification error based on Youden's index is advantageous in that it easily allows an extension to a likelihood approach, which can then cope with unobserved heterogeneity via a nonparametric mixture model. All methods are illustrated with an example of a diagnostic meta-analysis on duplex Doppler ultrasound, with angiography as the reference standard, for stroke prevention.
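Youden's index summarises a 2x2 diagnostic table in a single prevalence-free number, J = sensitivity + specificity - 1, and (1 - J)/2 is the average of the false negative and false positive rates, which is the link to the misclassification error. A minimal sketch with hypothetical counts:

```python
def youden_from_2x2(tp, fn, fp, tn):
    """Youden's index J = sensitivity + specificity - 1 from a 2x2 table.
    (1 - J)/2 equals the average of the false negative and false positive
    rates, so J is a cut-off-robust summary of the misclassification error."""
    se = tp / (tp + fn)   # sensitivity
    sp = tn / (tn + fp)   # specificity
    return se + sp - 1
```

A perfect test gives J = 1 and a useless one J = 0; the paper's Mantel-Haenszel estimator pools such per-study tables while adjusting for a study effect.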


Biometrics | 1998

Recent Developments in Computer-Assisted Analysis of Mixtures

Dankmar Böhning; Ekkehart Dietz; Peter Schlattmann

This paper reviews recent developments in the area of computer-assisted analysis of mixture distributions (C.A.MAN). Given a biometric situation of interest in which, under homogeneity assumptions, a certain parametric density occurs, such as the Poisson, the binomial, the geometric, the normal, and so forth, then it is argued that this situation can easily be enlarged to allow a variation of the scalar parameter in the population. This situation is called unobserved heterogeneity. This naturally leads to a specific form of nonparametric mixture distribution that can then be assumed to be the standard model in the biometric application of interest (since it also incorporates the homogeneous situations as a special case). Besides developments in theory and algorithms, the work focuses on developments in biometric applications such as meta-analysis, fertility studies, estimation of prevalence under clustering, and estimation of the distribution function of survival time under interval censoring. The approach is nonparametric for the mixing distribution, including leaving the number of components (subpopulations) of the mixing distribution unknown.
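The nonparametric mixing-distribution estimate described above is characterised by a gradient (directional derivative) condition due to Lindsay: a candidate mixing distribution is the NPMLE exactly when the directional derivative of the log-likelihood toward every one-point mass is non-positive, and the vertex direction and vertex exchange methods exploit this. A minimal numeric sketch for a Poisson mixture, with illustrative helper names (not C.A.MAN code):

```python
import math

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def mixture_density(x, support, weights):
    """Density of a finite mixing distribution with given support points."""
    return sum(w * poisson_pmf(x, s) for w, s in zip(weights, support))

def gradient(lam, data, support, weights):
    """Directional derivative of the log-likelihood toward a point mass at
    lam. The NPMLE is characterised by gradient(lam) <= 0 for all lam, with
    equality at the support points of the estimated mixing distribution."""
    return sum(poisson_pmf(x, lam) / mixture_density(x, support, weights)
               for x in data) - len(data)
```

A vertex direction step would scan a grid of lam values and move mass toward the point with the largest positive gradient, stopping once the supremum is (numerically) non-positive.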


Computational Statistics & Data Analysis | 2007

Editorial: Advances in Mixture Models

Dankmar Böhning; Wilfried Seidel; Marco Alfò; Bernard Garel; Valentin Patilea; Günther Walther

The importance of mixture distributions is remarked not only by a number of recent books on mixtures, including Lindsay (1995), Böhning (2000), McLachlan and Peel (2000) and Frühwirth-Schnatter (2006), which update previous books by Everitt and Hand (1981), Titterington et al. (1985) and McLachlan and Basford (1988), but also by the diversity of publications on mixtures that have appeared in this journal since 2003 (which we take here as a milestone, with the appearance of the first special issue on mixtures), including Hazan et al. (2003), Benton and Krishnamoorthy (2003), Woodward and Sain (2003), Besbeas and Morgan (2004), Jamshidian (2004), Hürlimann (2004), Bohacek and Rozovskii (2004), Tao et al. (2004), Vaz de Melo Mendes and Lopes (2006), Agresti et al. (2006), Bartolucci and Scaccia (2006), D’Elia and Piccolo (2005), Neerchal and Morel (2005), Klar and Meintanis (2005), Bocci et al. (2006), Hu and Sung (2006), Seidel et al. (2006), Nadarajah (2006), Almhana et al. (2006), Congdon (2006), Priebe et al. (2006), and Li and Zha (2006). In the following we give a brief introduction to the papers contributing novel aspects in this Special Issue. These come from areas as diverse as capture–recapture modelling, likelihood-based cluster analysis, semiparametric mixture modelling in microarray data, latent class analysis and integer lifetime data analysis, to mention a few. Mixture models are frequently used in capture–recapture studies for estimating population size (Chao, 1987; Link, 2003; Böhning and Schön, 2005; Böhning et al., 2005; Böhning and Kuhnert, 2006). In this issue, Mao (2007) highlights a variety of sources of difficulties in statistical inference using mixture models and uses a binomial mixture model as an illustration. Random intercept models for binary data, useful tools for addressing between-subject heterogeneity, are discussed by Caffo et al. (2007).
The nonlinearity of link functions for binary data is blurred in probit models with a normally distributed random intercept, because the resulting model implies a probit marginal link as well. Caffo et al. (2007) explore another family of random intercept models where the distributions associated with the marginal and conditional link functions, as well as the random effect distribution, are all of the same family. Formann (2007) extends the latent class approach (as a specific discrete multivariate mixture model) to situations where the discrete outcome variables (such as longitudinal binary data) experience nonignorable associations and, in addition and most importantly, have missing entries, as is rather typical for repeated observations in longitudinal studies. The modelling also incorporates potential covariates. This is illustrated using data from the Muscatine Coronary Risk Factor Study. The contribution by Grün and Leisch (2007) introduces the R package flexmix, which provides flexible modelling of finite mixtures of regression models using the EM algorithm. Alfò et al. (2007) consider a semiparametric mixture model for detecting differentially expressed genes in microarray experiments. An important goal of microarray studies is the detection of genes that show significant changes in observed expressions when two or more classes of biological samples (e.g. treatment and control) are compared. With the c-fold rule, a gene is declared to be differentially expressed if its average expression level varies by more than a constant (typically 2). Instead, Alfò et al. (2007) introduce a gene-specific random term to control for both dependence among genes and variability with respect to the probability of yielding a fold change over a threshold c. Likelihood-based inference is accomplished with a two-level finite mixture model, while nonparametric Bayesian estimation is performed through the counting distribution of exceedances.
Mixtures-of-experts models (Jacobs et al., 1991) and their generalization, hierarchical mixtures-of-expert models (Jordan and Jacobs, 1994) have been introduced to account for nonlinearities and other complexities in the data.



Statistical Methods in Medical Research | 2011

A limitation of the diagnostic-odds ratio in determining an optimal cut-off value for a continuous diagnostic test

Dankmar Böhning; Heinz Holling; Valentin Patilea

The article considers the diagnostic odds ratio, a special summarising function of the specificity and sensitivity of a given diagnostic test, which has been suggested as a measure of diagnostic discriminatory power. For a continuous diagnostic test a cut-off value has to be chosen, and it is common practice to choose the cut-off value on the basis of the maximised diagnostic odds ratio. We show that this strategy is not to be recommended, since it might easily lead to cut-off values on the boundary of the parameter range. This is illustrated by means of some examples. The source of the deficient behaviour of the diagnostic odds ratio lies in the convexity of the log-diagnostic odds ratio as a function of the cut-off value. This can easily be seen in practice by plotting a non-parametric estimate of the log-DOR against the cut-off value. In fact, it is shown for the case of a normally distributed diseased and a normally distributed non-diseased population with equal variances that the log-DOR is a convex function of the cut-off value. It is also shown that these problems are not present for the Youden index, which appears to be a better choice.
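The boundary behaviour is easy to check numerically. For non-diseased scores ~ N(mu0, 1) and diseased scores ~ N(mu1, 1), sensitivity and specificity at cut-off c are Phi(mu1 - c) and Phi(c - mu0), and log DOR(c) = logit(Se) + logit(Sp). A sketch (the values mu0 = 0, mu1 = 2 are illustrative, not from the paper):

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def logit(p):
    return math.log(p / (1 - p))

def log_dor(c, mu0=0.0, mu1=2.0):
    """Log diagnostic odds ratio at cut-off c for equal-variance normals."""
    se = Phi(mu1 - c)    # sensitivity: P(score > c | diseased)
    sp = Phi(c - mu0)    # specificity: P(score <= c | non-diseased)
    return logit(se) + logit(sp)

def youden(c, mu0=0.0, mu1=2.0):
    """Youden index J(c) = Se(c) + Sp(c) - 1."""
    return Phi(mu1 - c) + Phi(c - mu0) - 1
```

Evaluating both over a grid of cut-offs shows the contrast the paper describes: the Youden index peaks at the midpoint (mu0 + mu1)/2, while the convex log-DOR keeps growing toward either boundary, so maximising it pushes the cut-off to an extreme.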


Journal of Agricultural Biological and Environmental Statistics | 2008

Estimating the hidden number of scrapie affected holdings in Great Britain using a simple, truncated count model allowing for heterogeneity

Dankmar Böhning; Victor J. Del Rio Vilas

None of the current surveillance streams monitoring the presence of scrapie in Great Britain provides a comprehensive and unbiased estimate of the prevalence of the disease at the holding level. Previous work to estimate the under-ascertainment-adjusted prevalence of scrapie in Great Britain applied multiple-list capture-recapture methods. The enforcement of new control measures on scrapie-affected holdings in 2004 has stopped the overlap between surveillance sources and, hence, the application of multiple-list capture-recapture models. Alternative methods, still within the capture-recapture methodology but relying on repeated entries in one single list, have been suggested for these situations. In this article, we apply one-list capture-recapture approaches to data held on the Scrapie Notifications Database to estimate the undetected population of scrapie-affected holdings with clinical disease in Great Britain for the years 2002, 2003, and 2004. To do so, we develop a new diagnostic tool for indicating heterogeneity, as well as a new understanding of the Zelterman and Chao lower-bound estimators, to account for potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a special, locally truncated Poisson likelihood equivalent to a binomial likelihood. This understanding allows the extension of the Zelterman approach by means of logistic regression to include observed heterogeneity in the form of covariates, in the case studied here the holding size and country of origin. Our results confirm the presence of substantial unobserved heterogeneity, supporting the application of our two estimators. The total scrapie-affected holding population in Great Britain is around 300 holdings per year. None of the covariates appears to inform the model significantly.
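Both estimators mentioned above have simple closed forms. Writing n for the number of observed units and f1, f2 for the frequencies of units recorded exactly once and twice, a sketch (the frequencies in the test are illustrative, not the scrapie data):

```python
import math

def chao_lower_bound(n, f1, f2):
    """Chao's lower-bound estimator of total population size:
    the observed n plus f1^2 / (2 * f2) estimated unobserved units."""
    return n + f1**2 / (2 * f2)

def zelterman(n, f1, f2):
    """Zelterman's estimator: the Poisson rate is estimated locally from
    the ones and twos, lambda = 2 * f2 / f1, and the observed n is scaled
    up by the inverse probability 1 - exp(-lambda) of being seen at all."""
    lam = 2 * f2 / f1
    return n / (1 - math.exp(-lam))
```

Because Zelterman's lambda uses only f1 and f2, it is robust to outlying large counts, which is what makes the logistic-regression extension in the article natural.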


Psychological Methods | 2012

A mixed model approach to meta-analysis of diagnostic studies with binary test outcome.

Philipp Doebler; Heinz Holling; Dankmar Böhning

We propose 2 related models for the meta-analysis of diagnostic tests. Both models are based on the bivariate normal distribution for transformed sensitivities and false-positive rates. Instead of using the logit as a transformation for these proportions, we employ the tα family of transformations that contains the log, logit, and (approximately) the complementary log. A likelihood ratio test for the cutoff value problem is developed, and summary receiver operating characteristic (SROC) curves are discussed. Worked examples showcase the methodology. We compare the models to the hierarchical SROC model, which in contrast employs a logit transformation. Data from various meta-analyses are reanalyzed, and the reanalysis indicates a better performance of the models based on the tα transformation.


The Annals of Applied Statistics | 2011

Population size estimation based upon ratios of recapture probabilities

Irene Rocchetti; John Bunge; Dankmar Böhning

Estimating the size of an elusive target population is of prominent interest in many areas in the life and social sciences. Our aim is to provide an efficient and workable method to estimate the unknown population size, given the frequency distribution of counts of repeated identifications of units of the population of interest. This counting variable is necessarily zero-truncated, since units that have never been identified are not in the sample. We consider several applications: clinical medicine, where interest is in estimating patients with adenomatous polyps that have been overlooked by the diagnostic procedure; drug user studies, where interest is in estimating the number of hidden drug users who are not identified; veterinary surveillance of scrapie in the UK, where interest is in estimating the hidden amount of scrapie; and entomology and microbial ecology, where interest is in estimating the number of unobserved species of organisms. In all these examples, simple models such as the homogeneous Poisson are not appropriate, since they do not account for present and latent heterogeneity. The Poisson–Gamma (negative binomial) model provides a flexible alternative and often leads to well-fitting models. It has a long history and was recently used in the development of the Chao–Bunge estimator. Here we use a different property of the Poisson–Gamma model: if we consider ratios of neighboring Poisson–Gamma probabilities, then these are linearly related to the counts of repeated identifications. Ratios also have the useful property that they are identical for the truncated and untruncated distributions. In this paper we propose a weighted logarithmic regression model to estimate the zero frequency count, assuming a Poisson–Gamma distribution for the counts. A detailed explanation of the chosen weights and a goodness-of-fit index are presented, along with extensions to other distributions.
To evaluate the proposed estimator, we applied it to the benchmark examples mentioned above, and we compared the results with those obtained through the Chao–Bunge and other estimators. The major benefits of the proposed estimator are that it is defined under mild conditions, whereas the Chao–Bunge estimator fails to be well defined in several of the examples presented; in cases where the Chao–Bunge estimator is defined, its behavior is comparable to the proposed estimator in terms of bias and MSE, as a simulation study shows. Furthermore, the proposed estimator is relatively insensitive to inclusion or exclusion of large outlying frequencies, while sensitivity to outliers is characteristic of most other methods. The implications and limitations of such methods are discussed.
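The ratio property lends itself to a short illustration. Under the Poisson–Gamma model the ratios r_x = (x + 1) * p_{x+1} / p_x are linear in x and are unchanged by zero-truncation, so fitting a line to the empirical ratios and extrapolating to x = 0 estimates r_0 = f_1 / f_0 and hence the missing zero frequency. The sketch below uses plain OLS on the raw ratios purely for illustration; the paper's actual estimator is a weighted logarithmic regression, which is not reproduced here:

```python
def ratio_regression_estimate(freqs):
    """Illustrative ratio-regression population size estimate.
    freqs: dict mapping observed count x (>= 1) to frequency f_x."""
    xs = sorted(x for x in freqs if x + 1 in freqs)
    # empirical ratios r_x = (x + 1) * f_{x+1} / f_x
    rx = {x: (x + 1) * freqs[x + 1] / freqs[x] for x in xs}
    # ordinary least squares fit r_x = a + b * x
    n, sx = len(xs), sum(xs)
    sy = sum(rx[x] for x in xs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * rx[x] for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    # extrapolate: r_0 = f_1 / f_0  =>  f_0 = f_1 / r_0
    f0_hat = freqs[1] / a
    return sum(freqs.values()) + f0_hat
```

On frequencies that follow an exactly linear ratio pattern, the line is recovered exactly and the estimate is the observed total plus the extrapolated zero count.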


Statistical Modelling | 2012

Meta-analysis of diagnostic studies based upon SROC-curves: a mixed model approach using the Lehmann family

Heinz Holling; Walailuck Böhning; Dankmar Böhning

Meta-analysis of diagnostic studies faces the common problem that different studies might not be comparable, since they may have used different cut-off values for the continuous or ordered categorical diagnostic test value, defining different regions for which the diagnostic test is considered positive. Hence specificities and sensitivities arising from different studies might vary simply because the underlying cut-off value was different. To cope with the cut-off value problem, interest is usually directed towards the receiver operating characteristic (ROC) curve, which consists of pairs of sensitivities and false positive rates (1 - specificity). In the context of meta-analysis, one pair represents one study, and the associated diagram is called the SROC curve, where the S stands for 'summary'. As a novel approach, the paper considers modelling SROC curves with the Lehmann family, which assumes that log-sensitivity is proportional to the log-false positive rate across studies. The approach allows for study-specific false positive rates, which are treated as (infinitely many) nuisance parameters and eliminated by means of the profile likelihood. The adjusted profile likelihood turns out to have a simple univariate Gaussian structure, which is ultimately used for building inference for the parameter of the Lehmann family. The Lehmann model is further extended by allowing the constant of proportionality to vary across studies to cope with unobserved heterogeneity. The simple Gaussian form of the adjusted profile likelihood allows this extension easily, in the form of a mixed model in which unobserved heterogeneity is incorporated by means of a normal random effect.
Some meta-analytic applications on diagnostic studies including brain natriuretic peptides for heart failure, alcohol use disorder identification test (AUDIT) and the consumption part of AUDIT for detection of unhealthy alcohol use as well as the mini-mental state examination for cognitive disorders are discussed to illustrate the methodology.
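The Lehmann-family assumption that log-sensitivity is proportional to the log-false positive rate gives the SROC curve a one-parameter form: sensitivity = FPR^theta with 0 < theta < 1, smaller theta meaning better discrimination. A minimal sketch (function names are illustrative, not from the paper's software):

```python
import math

def lehmann_sroc(fpr, theta):
    """Lehmann-family SROC curve: log(sens) = theta * log(fpr),
    i.e. sens = fpr ** theta, with 0 < theta < 1."""
    return fpr ** theta

def theta_hat(sens, fpr):
    """Naive per-study estimate of the proportionality constant;
    the paper instead eliminates study-specific FPRs via the
    profile likelihood and adds a normal random effect on theta."""
    return math.log(sens) / math.log(fpr)
```

For example, a study reporting sensitivity 0.2 at false positive rate 0.04 lies on the theta = 0.5 curve, since log(0.2)/log(0.04) = 0.5.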

Collaboration


Co-authors of Dankmar Böhning with shared research outputs include:

Nishan Guha (John Radcliffe Hospital)
P. H. Sönksen (University of Southampton)
Victor J. Del Rio Vilas (Veterinary Laboratories Agency)