David Siegmund | Researchain

Publication

Featured research published by David Siegmund.

Biometrics | 1982

Maximally Selected Chi Square Statistics

Rupert Miller; David Siegmund

Two samples can be compared by selecting a cut point and then forming a 2 x 2 table of the numbers of observations above and below the cut point in each sample. When the cut point is selected so as to maximize the standard chi square statistic, the chi square percentile points are inappropriate. Actual significance levels are computed for large samples, and correct percentile points are tabulated.
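As an illustration of the statistic described in this abstract, the following is a minimal sketch, not the authors' procedure. It scans every observed value as a candidate cut point, forms the 2 x 2 table, and keeps the largest Pearson chi square value. The function names and the choice to scan all cut points are assumptions made for the example; as the abstract notes, the maximized value must not be referred to the ordinary chi square percentile points.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no information
    return n * (a * d - b * c) ** 2 / denom


def max_selected_chi_square(x, y):
    """Return (largest chi square, cut point) over all observed cut points.

    Hypothetical helper: an observation counts as 'below' when it is <= cut.
    """
    cuts = sorted(set(x) | set(y))
    best_stat, best_cut = 0.0, None
    for cut in cuts[:-1]:  # the largest value would leave nothing above the cut
        a = sum(v <= cut for v in x)  # sample x, at or below the cut
        b = len(x) - a                # sample x, above the cut
        c = sum(v <= cut for v in y)  # sample y, at or below the cut
        d = len(y) - c                # sample y, above the cut
        stat = chi_square_2x2(a, b, c, d)
        if stat > best_stat:
            best_stat, best_cut = stat, cut
    return best_stat, best_cut


stat, cut = max_selected_chi_square([1, 2, 3], [4, 5, 6])
print(stat, cut)
```

Because the cut point is chosen after looking at the data, the returned value is systematically larger than a chi square statistic at a prespecified cut point, which is why adjusted percentile points are needed.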

American Journal of Pathology | 2003

Gene Expression Patterns and Gene Copy Number Changes in Dermatofibrosarcoma Protuberans

Sabine C. Linn; Robert B. West; Jonathan R. Pollack; Shirley Zhu; Tina Hernandez-Boussard; Torsten O. Nielsen; Brian P. Rubin; Rajiv M. Patel; John R. Goldblum; David Siegmund; David Botstein; Patrick O. Brown; C. Blake Gilks; Matt van de Rijn

Dermatofibrosarcoma protuberans (DFSP) is an aggressive spindle cell neoplasm. It is associated with the chromosomal translocation, t(17;22), which fuses the COL1A1 and PDGFbeta genes. We determined the characteristic gene expression profile of DFSP and characterized DNA copy number changes in DFSP by array-based comparative genomic hybridization (array CGH). Fresh frozen and formalin-fixed, paraffin-embedded samples of DFSP were analyzed by array CGH (four cases) and DNA microarray analysis of global gene expression (nine cases). The nine DFSPs were readily distinguished from 27 other diverse soft tissue tumors based on their gene expression patterns. Genes characteristically expressed in the DFSPs included PDGF beta and its receptor, PDGFRB, APOD, MEOX1, PLA2R, and PRKCA. Array CGH of DNA extracted either from frozen tumor samples or from paraffin blocks yielded equivalent results. Large areas of chromosomes 17q and 22q, bounded by COL1A1 and PDGF beta, respectively, were amplified in DFSP. Expression of genes in the amplified regions was significantly elevated. Our data show that: 1) DFSP has a distinctive gene expression profile; 2) array CGH can be applied successfully to frozen or formalin-fixed, paraffin-embedded tumor samples; 3) a characteristic amplification of sequences from chromosomes 17q and 22q, demarcated by the COL1A1 and PDGF beta genes, respectively, was associated with elevated expression of the amplified genes.

International Statistical Review | 1989

On Hotelling's Approach to Testing for a Nonlinear Parameter in Regression

Mark Knowles; David Siegmund

The method suggested by Hotelling (1939) to test for a nonlinear parameter in a regression model is reviewed. Using the method of Weyl (1939), we derive a simple expression for the volume of a tube about a two-dimensional manifold with boundary embedded in the unit sphere in R^n. Applications to testing for a single harmonic of undetermined frequency and phase and to testing for a change-point in linear regression are discussed. Keywords: Differential geometry.

Biometrics | 1998

Multipoint linkage analysis using affected relative pairs and partially informative markers.

Jun Teng; David Siegmund

Linkage analysis is a method of identifying regions of the human genome harboring genes affecting the risk for a particular disease. It works by finding chromosomal segments inherited by affected relatives from a common ancestor (i.e., identical by descent or IBD) in excess of that expected by chance. Two complicating factors are that only a relatively small number of genomic locations (marker loci) are examined and the number of distinct realizations (alleles) at each marker is not large. Hence, unambiguous determination of IBD is impossible for any genomic location without additional information. Assuming data from a set of mapped, partially informative markers, we evaluate the effectiveness of a method that analyzes the array of markers on each chromosome jointly (multipoint methods) as a function of the informativeness and density of the markers. For the special case of pairs of half siblings whose parents are also typed, a combination of analysis and simulation is used to obtain insight into the problem of setting thresholds to control the false-positive error rate. Approximations are given for the power, and guidelines are developed to help describe the trade-offs between marker density and informativeness.

Journal of Applied Probability | 1986

Convergence of quasi-stationary to stationary distributions for stochastically monotone Markov processes

Moshe Pollak; David Siegmund

It is shown that if a stochastically monotone Markov process on (0, infinity) with stationary distribution H has its state space truncated by making all states in (B, infinity) absorbing, then the quasi-stationary distribution of the new process converges to H as B approaches infinity.

Genetic Epidemiology | 2010

Joint Testing of Genotype and Ancestry Association in Admixed Families

Hua Tang; David Siegmund; Nicholas A. Johnson; Isabelle Romieu; Stephanie J. London

Current genome‐wide association studies (GWAS) often involve populations that have experienced recent genetic admixture. Genotype data generated from these studies can be used to test for association directly, as in a non‐admixed population. As an alternative, these data can be used to infer chromosomal ancestry, and thus allow for admixture mapping. We quantify the contribution of allele‐based and ancestry‐based association testing under a family‐design, and demonstrate that the two tests can provide non‐redundant information. We propose a joint testing procedure, which efficiently integrates the two sources of information. The efficiencies of the allele, ancestry and combined tests are compared in the context of a GWAS. We discuss the impact of population history and provide guidelines for future design and analysis of GWAS in admixed populations. Genet. Epidemiol. 34:783‐791, 2010.

Statistical Science | 2004

A Report on the Future of Statistics

Bruce G. Lindsay; Jon Kettenring; David Siegmund

In May 2002 a workshop was held at the National Science Foundation to discuss the future challenges and opportunities for the statistics community. After the workshop the scientific committee produced an extensive report that described the general consensus of the community. This article is an abridgment of the full report.

Archive | 1985

The Sequential Probability Ratio Test

David Siegmund

We begin by recalling the Neyman-Pearson Lemma for testing a simple hypothesis against a simple alternative. Let x denote a (discrete or continuous) random variable (or vector) with probability density function f.
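For orientation, here is a minimal sketch of the test named in the title, applied to Bernoulli observations. It is an illustration under stated assumptions, not the chapter's own development: the function name, the Bernoulli model, and the use of Wald's approximate stopping boundaries log((1 - beta) / alpha) and log(beta / (1 - alpha)) are choices made for the example. The quantity accumulated is the log of the likelihood ratio that appears in the Neyman-Pearson Lemma.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Sequential probability ratio test of H0: p = p0 against H1: p = p1.

    Hypothetical helper. Returns (decision, number of observations used).
    The boundaries are Wald's approximations, assumed for this sketch.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: reject H0
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0                              # running log likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "continue", len(observations)   # data ran out before a decision


print(sprt_bernoulli([1] * 10, p0=0.5, p1=0.9))
print(sprt_bernoulli([0] * 10, p0=0.5, p1=0.9))
```

Unlike a fixed-sample Neyman-Pearson test, the sample size here is random: sampling stops as soon as the accumulated evidence crosses either boundary.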

Journal of Computational Biology | 2001

Approximate P-Values for Local Sequence Alignments: Numerical Studies

John D. Storey; David Siegmund

Siegmund and Yakir (2000) have given an approximate p-value when two independent, identically distributed sequences from a finite alphabet are optimally aligned based on a scoring system that rewards similarities according to a general scoring matrix and penalizes gaps (insertions and deletions). The approximation involves an infinite sequence of difficult-to-compute parameters. In this paper, it is shown by numerical studies that these reduce to essentially two numerically distinct parameters, which can be computed as one-dimensional numerical integrals. For an arbitrary scoring matrix and affine gap penalty, this modified approximation is easily evaluated. Comparison with published numerical results shows that it is reasonably accurate.

Sequential Analysis | 2013

Change-Points: From Sequential Detection to Biology and Back

David Siegmund

Modern change-point detection had its origins about 50 years ago in the work of Page, Shiryaev, and Lorden, who focused on sequential detection of a change-point in a sequence of observations. Motivation often arose from sequential quality control: to detect a disruption in the quality of a continuous production process. More recently, motivation from a broad range of applications has led to a variety of different problem formulations. In this article I will review this history with particular attention to a selected subset of applications arising in biology and to common features of different likelihood-based formulations.
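To make the idea of sequential detection concrete, the following is a minimal sketch of a CUSUM-type rule of the kind associated with Page. It is an illustration only, not taken from the article: the function name, the Gaussian model with unit variance and a known shift `delta` in the mean, and the threshold `h` are all assumptions made for the example.

```python
def cusum_alarm(observations, delta, h):
    """Return the 1-based index of the first alarm, or None if there is none.

    Hypothetical helper. Assumes N(0, 1) observations before the change and
    N(delta, 1) after it, so the log likelihood ratio of one observation x
    is delta * (x - delta / 2).
    """
    w = 0.0  # CUSUM statistic: evidence for a change, reset at zero
    for n, x in enumerate(observations, start=1):
        w = max(0.0, w + delta * (x - delta / 2))
        if w >= h:
            return n
    return None


# Five in-control observations followed by a shift in the mean.
data = [0.0] * 5 + [2.0] * 5
print(cusum_alarm(data, delta=1.0, h=4.0))
```

The threshold trades detection delay against the rate of false alarms: a larger `h` makes false alarms rarer but postpones the alarm after a real change.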
