Publication


Featured research published by Staal A. Vinterbo.


International Journal of Approximate Reasoning | 2000

Minimal approximate hitting sets and rule templates

Staal A. Vinterbo; Aleksander Øhrn

A set S that has a non-empty intersection with every set in a collection of sets C is called a hitting set of C. If no elements can be removed from S without violating the hitting set property, then we say that S is minimal. Several interesting problems can in part be formulated as that of having to find one or more minimal hitting sets. Many of these problems require proper solutions, but sometimes approximate solutions suffice. We define an r-approximate hitting set as a set that intersects at least a fraction r of the sets in C. This notion is extended to the case where C is a weighted multiset, and properties of r are explored with respect to simplification of C by absorption of supersets. Also, approximations of reducts from rough set theory are defined by means of minimal r-approximate hitting sets, and some links to the notion of dynamic reducts are established.

The most common use of reducts is as templates for the generation of minimal classification rules from empirical data. A genetic algorithm is devised to compute r-approximate hitting sets, and applied to induce classifiers from selected real-world databases. These classifiers are then compared to those generated from proper reducts and dynamic reducts, respectively. The improvement in discriminatory ability yielded by the r-approximate reduct based classifiers was statistically significant (p<0.05). Furthermore, they were smaller, i.e., had fewer rules.
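The definitions in this abstract can be illustrated with a minimal Python sketch. It covers only the unweighted case and only checks the r-approximate hitting set and minimality properties; it does not reproduce the genetic algorithm from the paper. The function names are hypothetical and chosen for illustration.

```python
from typing import Hashable, Sequence, Set


def hit_fraction(s: Set[Hashable], collection: Sequence[Set[Hashable]]) -> float:
    """Fraction of the sets in `collection` that share an element with `s`."""
    if not collection:
        return 1.0  # assumption: an empty collection is trivially hit
    hits = sum(1 for c in collection if s & c)
    return hits / len(collection)


def is_r_approximate_hitting_set(
    s: Set[Hashable], collection: Sequence[Set[Hashable]], r: float
) -> bool:
    """True if `s` intersects at least a fraction `r` of the sets."""
    return hit_fraction(s, collection) >= r


def is_minimal(
    s: Set[Hashable], collection: Sequence[Set[Hashable]], r: float
) -> bool:
    """True if `s` is r-approximate and no single element can be removed."""
    if not is_r_approximate_hitting_set(s, collection, r):
        return False
    return all(
        not is_r_approximate_hitting_set(s - {x}, collection, r) for x in s
    )
```

With C = [{1, 2}, {2, 3}, {4}], the set {2} hits two of the three sets, so it is an r-approximate hitting set for r = 0.6 but not a proper hitting set (r = 1).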


Journal of the American Medical Informatics Association | 2008

SMART—An Integrated Wireless System for Monitoring Unattended Patients

Dorothy Curtis; Esteban J. Pino; Jacob Bailey; Eugene I. Shih; Jason Waterman; Staal A. Vinterbo; Thomas O. Stair; John V. Guttag; Robert A. Greenes; Lucila Ohno-Machado

Monitoring vital signs and locations of certain classes of ambulatory patients can be useful in overcrowded emergency departments and at disaster scenes, both on-site and during transportation. To be useful, such monitoring needs to be portable and low cost, and have minimal adverse impact on emergency personnel, e.g., by not raising an excessive number of alarms. The SMART (Scalable Medical Alert Response Technology) system integrates wireless patient monitoring (ECG, SpO2), geo-positioning, signal processing, targeted alerting, and a wireless interface for caregivers. A prototype implementation of SMART was piloted in the waiting area of an emergency department and evaluated with 145 post-triage patients. System deployment aspects were also evaluated during a small-scale disaster-drill exercise.


Artificial Intelligence in Medicine | 2000

A genetic algorithm approach to multi-disorder diagnosis.

Staal A. Vinterbo; Lucila Ohno-Machado

One of the common limitations of expert systems for medical diagnosis is that they make an implicit assumption that multiple disorders do not co-occur in a single patient. The need for this simplifying assumption stems from the fact that finding minimal sets of disorders that cover all symptoms for a given patient is generally computationally intractable (NP-hard). In this paper, we explain the need for performing multi-disorder diagnosis, review previous approaches, formulate the problem using set theory notation, and propose the use of a search method based on a genetic algorithm. We test the algorithm and compare it to another approach using a simple example. The genetic algorithm performs well independently of the order of symptoms, and has the potential to perform multi-disorder diagnosis using existing or newly developed knowledge bases.


International Conference on Neural Information Processing | 2002

Two applications of the LSA machine

Andreas Alexander Albrecht; G. Lappas; Staal A. Vinterbo; C. Wong; Lucila Ohno-Machado

We present two applications of a learning algorithm that combines logarithmic simulated annealing with the perceptron algorithm. The implementation of the learning algorithm is called LSA machine and has been successfully applied already to the classification of liver tissue from CT images. We investigate the performance of the LSA machine on two sets of numerical data: the Wisconsin breast cancer diagnosis (WBCD) database and microarray data published by Golub et al. (1999). The WBCD data consist of 683 samples with 9 input values that are divided into 444 benign cases (positive examples) and 239 malignant cases (negative examples). The LSA machine has been trained on 50% and 75% of the entire sample set, and the test has been performed on the remaining samples. In both cases, we obtain a correct classification close to 99% which is comparable to the best results published on WBCD data. The training set of the microarray data consists of 11 samples of acute myeloid leukemia (AML) and 27 samples of acute lymphoblastic leukemia (ALL), each of them with 7129 input values (gene-expression data). For the test, 14 AML samples and 20 ALL samples are used. We obtain a single classification error (which is an ALL test sample) on seven genes only, which improves on the results published by Golub et al. (1999) by using the model of self-organising maps. Our result is competitive with the best results published for support vector machines.


Artificial Intelligence in Medicine | 2003

An Epicurean learning approach to gene-expression data classification

Andreas Alexander Albrecht; Staal A. Vinterbo; Lucila Ohno-Machado

We investigate the use of perceptrons for classification of microarray data where we use two datasets that were published in [Nat. Med. 7 (6) (2001) 673] and [Science 286 (1999) 531]. The classification problem studied by Khan et al. is related to the diagnosis of small round blue cell tumours (SRBCT) of childhood which are difficult to classify both clinically and via routine histology. Golub et al. study acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). We used a simulated annealing-based method in learning a system of perceptrons, each obtained by resampling of the training set. Our results are comparable to those of Khan et al. and Golub et al., indicating that there is a role for perceptrons in the classification of tumours based on gene-expression data. We also show that it is critical to perform feature selection in this type of models, i.e. we propose a method for identifying genes that might be significant for the particular tumour types. For SRBCTs, zero error on test data has been obtained for only 13 out of 2308 genes; for the ALL/AML problem, we have zero error for 9 out of 7129 genes that are used for the classification procedure. Furthermore, we provide evidence that Epicurean-style learning and simulated annealing-based search are both essential for obtaining the best classification results.


Journal of the American Medical Informatics Association | 2002

Effects of Data Anonymization by Cell Suppression on Descriptive Statistics and Predictive Modeling Performance

Lucila Ohno-Machado; Staal A. Vinterbo; Stephan Dreiseitl

Protecting individual data in disclosed databases is essential. Data anonymization strategies can produce table ambiguation by suppression of selected cells. Using table ambiguation, different degrees of anonymization can be achieved, depending on the number of individuals that a particular case must become indistinguishable from. This number defines the level of anonymization. Anonymization by cell suppression does not necessarily prevent inferences from being made from the disclosed data. Preventing inferences may be important to preserve confidentiality. We show that anonymized data sets can preserve descriptive characteristics of the data, but might also be used for making inferences on particular individuals, which is a feature that may not be desirable. The degradation of predictive performance is directly proportional to the degree of anonymity. As an example, we report the effect of anonymization on the predictive performance of a model constructed to estimate the probability of disease given clinical findings.


BMC Bioinformatics | 2006

Approximation properties of haplotype tagging

Staal A. Vinterbo; Stephan Dreiseitl; Lucila Ohno-Machado

Background: Single nucleotide polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties.

Results: It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n^2 - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ DTIME(n^(log log n)). A simple, very easily implementable algorithm that exhibits the above upper bound on solution quality is presented. This algorithm has running time O((2m - p + 1)) ≤ O(m(n^2 - n)/2) where p ≤ min(n, m) for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound.

Conclusion: The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computational efforts expended can only be expected if the computational effort is distributed and done in parallel.
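One way to picture the tagging problem is as set cover over the (n^2 - n)/2 pairs of haplotypes, where each SNP position "covers" the pairs it distinguishes. The sketch below is a generic greedy set-cover heuristic written under that assumption; it may differ from the algorithm in the paper, and the function name and the string encoding of haplotypes are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def greedy_tag_snps(haplotypes: Sequence[str]) -> List[int]:
    """Greedily pick SNP positions until every pair of distinct
    haplotypes differs on at least one chosen position."""
    if not haplotypes:
        return []
    m = len(haplotypes[0])
    if any(len(h) != m for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    # Pairs of distinct haplotypes that are not yet told apart.
    unresolved = {
        (i, j)
        for i, j in combinations(range(len(haplotypes)), 2)
        if haplotypes[i] != haplotypes[j]
    }
    tags: List[int] = []
    while unresolved:
        # Choose the position that separates the most unresolved pairs.
        best = max(
            range(m),
            key=lambda k: sum(
                1 for i, j in unresolved if haplotypes[i][k] != haplotypes[j][k]
            ),
        )
        tags.append(best)
        unresolved = {
            (i, j)
            for i, j in unresolved
            if haplotypes[i][best] == haplotypes[j][best]
        }
    return sorted(tags)
```

For the four haplotypes "000", "011", "101", "110", any two positions are enough to tell all of them apart, so the third SNP is redundant as a tag.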


International Conference on Information Security | 2018

A Simple Algorithm for Estimating Distribution Parameters from n-Dimensional Randomized Binary Responses

Staal A. Vinterbo

Randomized response for privacy protection is attractive as provided disclosure control can be quantified by means such as differential privacy. However, recovering statistics involving multiple dependent binary attributes can be difficult, posing a barrier to the use of randomized response for privacy protection. In this work, we identify a family of randomizers for which we are able to present a simple and efficient algorithm for obtaining unbiased maximum likelihood estimates for k-way marginal distributions from the randomized data. We also provide theoretical bounds on the statistical efficiency of these estimates, allowing the assessment of sample sizes for these randomizers. The identified family consists of randomizers generated by an iterated Kronecker product of an invertible and bisymmetric 2 x 2 matrix. This family includes modes of Google's RAPPOR randomizer, as well as applications of two well-known classical randomized response methods: Warner's original method, and Simmons' unrelated question method. We find that randomizers in this family can also be considered to be equivalent to each other with respect to the efficiency -- differential privacy tradeoff. Importantly, the estimation algorithm is simple to implement, an aspect critical to technologies for privacy protection and security.
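The Kronecker structure mentioned in this abstract can be illustrated with a small Python sketch. It assumes a Warner-style 2 x 2 matrix [[p, 1-p], [1-p, p]] applied independently to each of n bits, and recovers the original distribution by applying the Kronecker power of the inverse matrix. This is plain matrix inversion shown for illustration, not the estimation algorithm from the paper, and all function names are hypothetical. With finite samples, an estimate obtained this way can fall outside [0, 1].

```python
from typing import List

Matrix = List[List[float]]


def warner_matrix(p: float) -> Matrix:
    """Bisymmetric 2 x 2 randomizer: a bit is kept with probability p."""
    return [[p, 1.0 - p], [1.0 - p, p]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def kron_power(m: Matrix, n: int) -> Matrix:
    """n-fold Kronecker product of m with itself."""
    result: Matrix = [[1.0]]
    for _ in range(n):
        result = kron(result, m)
    return result


def inverse_2x2(m: Matrix) -> Matrix:
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (p = 0.5?)")
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def estimate_distribution(observed: List[float], p: float, n: int) -> List[float]:
    """Undo the randomizer: the inverse of a Kronecker power is the
    Kronecker power of the inverse, so only a 2 x 2 inverse is needed."""
    return matvec(kron_power(inverse_2x2(warner_matrix(p)), n), observed)
```

Here `observed` is the distribution over the 2^n randomized response patterns, ordered the same way as the rows of the Kronecker power.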


Archive | 2003

Epicurean-style Learning Applied to the Classification of Gene-Expression Data

Andreas Alexander Albrecht; Staal A. Vinterbo; Lucila Ohno-Machado

We investigate the use of perceptrons for classification of microarray data where we use two datasets that were published in Khan et al., Nature [Medicine], vol. 7, 2001, and Golub et al., Science, vol. 286, 1999. The classification problem studied by Khan et al. is related to the diagnosis of small round blue cell tumours of childhood (SRBCT) which are difficult to classify both clinically and via routine histology. Golub et al. study acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). We used a simulated annealing-based method in learning a system of perceptrons, each obtained by resampling of the training set. Our results are comparable to those of Khan et al. and Golub et al., indicating that there is a role for perceptrons in the classification of tumours based on gene expression data. We also show that it is critical to perform feature selection in this type of models, i.e., we propose a method for identifying genes that might be significant for the particular tumour types. For SRBCTs, zero error on test data has been obtained for only 10 out of 2308 genes; for the ALL/AML problem, our results are competitive to the best results published in the literature, and we obtain 6 genes out of 7129 genes that are used for the classification procedure. Furthermore, we provide evidence that Epicurean-style learning is essential for obtaining the best classification results.


International Conference on Artificial Neural Networks | 2002

A Simulated Annealing and Resampling Method for Training Perceptrons to Classify Gene-Expression Data

Andreas Alexander Albrecht; Staal A. Vinterbo; C. K. Wong; Lucila Ohno-Machado

We investigate the use of perceptrons for classification of microarray data. Small round blue cell tumours of childhood are difficult to classify both clinically and via routine histology. Khan et al. [10] showed that a system of artificial neural networks can utilize gene expression measurements from microarrays and classify these tumours into four different categories. We used a simulated annealing-based method in learning a system of perceptrons, each obtained by resampling of the training set. Our results are comparable to those of Khan et al., indicating that there is a role for perceptrons in the classification of tumours based on gene expression data. We also show that it is critical to perform feature selection in this type of models.

Collaboration


Top Co-Authors

Stephan Dreiseitl

University of Health Sciences Antigua

Aleksander Øhrn

Norwegian University of Science and Technology

C. K. Wong

The Chinese University of Hong Kong

Harald Kittler

Medical University of Vienna

Michael Binder

Medical University of Vienna
