Perry Groot
Radboud University Nijmegen
Publication
Featured research published by Perry Groot.
Web Information Systems Engineering | 2005
Holger Wache; Perry Groot; Heiner Stuckenschmidt
Approximation has been identified as a potential way of reducing the complexity of logical reasoning. Here we explore approximation for speeding up instance retrieval in a Semantic Web context. For OWL ontologies, i.e., Description Logic (DL) Knowledge Bases, it is known that reasoning is a hard problem, especially in instance retrieval when the number of instances that need to be retrieved becomes very large. We discuss two approximation methods for retrieving instances to conjunctive queries over DL T-Boxes and the results of experiments carried out with a modified version of the Instance Store System.
European Semantic Web Conference | 2005
Perry Groot; Heiner Stuckenschmidt; Holger Wache
In many application scenarios, the use of the Web ontology language OWL is hampered by the complexity of the underlying logic that makes reasoning in OWL intractable in the worst case. In this paper, we address the question of whether approximation techniques known from the knowledge representation literature can help to simplify OWL reasoning. In particular, we carry out experiments with approximate deduction techniques on the problem of classifying new concept expressions into an existing OWL ontology using existing ontologies on the Web. Our experiments show that a direct application of approximate deduction techniques as proposed in the literature in most cases does not lead to an improvement and that these methods also suffer from some fundamental problems.
International Conference on Artificial Neural Networks | 2011
Perry Groot; Adriana Birlutiu; Tom Heskes
In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, possibly noisy, labels from multiple annotators. Typically, annotators have different levels of expertise (e.g., novice, expert) and there is considerable disagreement among annotators. We present a Gaussian process (GP) approach to regression with multiple labels but no absolute gold standard. The GP framework provides a principled non-parametric framework that can automatically estimate the reliability of individual annotators from data without the need of prior knowledge. Experimental results show that the proposed GP multi-annotator model outperforms models that either average the training data or weigh individually learned single-annotator models.
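A minimal sketch of the idea behind annotator-specific noise in GP regression may help here. It is not the authors' model: the abstract says annotator reliability is estimated from data, whereas this sketch assumes the per-annotator noise variances are already known. The function names (`rbf_kernel`, `multi_annotator_gp_mean`) and the toy data are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def multi_annotator_gp_mean(x_train, y_train, annotator, noise_var, x_test):
    """GP posterior mean where each label carries its annotator's noise variance.

    noise_var[j] is the noise variance of annotator j (assumed known here;
    the paper estimates annotator reliability from the data instead).
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    # Look up the noise variance of the annotator who produced each label.
    noise = np.asarray(noise_var, dtype=float)[np.asarray(annotator)]
    K = rbf_kernel(x_train, x_train) + np.diag(noise)
    K_star = rbf_kernel(x_test, x_train)
    return K_star @ np.linalg.solve(K, y_train)

# Two annotators label the same input x = 0 and disagree:
# annotator 0 is reliable (variance 0.01), annotator 1 is noisy (variance 1.0).
mean = multi_annotator_gp_mean(
    x_train=[0.0, 0.0], y_train=[1.0, -1.0],
    annotator=[0, 1], noise_var=[0.01, 1.0], x_test=[0.0],
)
```

In this toy case the prediction at x = 0 lands close to the reliable annotator's label (+1) rather than at the plain average of the two labels (0), which is the behaviour that averaging the training data cannot reproduce.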
Neurocomputing | 2010
Adriana Birlutiu; Perry Groot; Tom Heskes
We present an EM-algorithm for the problem of learning preferences with semiparametric models derived from Gaussian processes in the context of multi-task learning. We validate our approach on an audiological data set and show that predictive results for sound quality perception of hearing-impaired subjects, in the context of pairwise comparison experiments, can be improved using a hierarchical model.
Machine Learning | 2013
Adriana Birlutiu; Perry Groot; Tom Heskes
This paper presents a framework for optimizing the preference learning process. In many real-world applications in which preference learning is involved, the available training data is scarce and obtaining labeled training data is expensive. Fortunately, in many preference learning situations, data is available from multiple subjects. We use the multi-task formalism to enhance the individual training data by making use of the preference information learned from other subjects. Furthermore, since obtaining labels is expensive, we optimally choose which data to ask a subject to label in order to obtain the most information about her/his preferences. This paradigm, called active learning, has hardly been studied in a multi-task formalism. We propose an alternative for the standard criteria in active learning which actively chooses queries by making use of the available preference data from other subjects. The advantages of this alternative are reduced computation costs and reduced time that subjects are involved. We empirically validate our approach on three real-world data sets involving the preferences of people.
PLOS ONE | 2016
Elena Sokolova; Perry Groot; Tom Claassen; Kimm J. E. van Hulzen; Jeffrey C. Glennon; Barbara Franke; Tom Heskes; Jan K. Buitelaar
Background: Numerous factor analytic studies consistently support a distinction between two symptom domains of attention-deficit/hyperactivity disorder (ADHD), inattention and hyperactivity/impulsivity. Both dimensions show high internal consistency and moderate to strong correlations with each other. However, it is not clear what drives this strong correlation. The aim of this paper is to address this issue. Method: We applied a sophisticated approach for causal discovery on three independent data sets of scores of the two ADHD dimensions in NeuroIMAGE (total N = 675), ADHD-200 (N = 245), and IMpACT (N = 164), assessed by different raters and instruments, and further used information on gender or a genetic risk haplotype. Results: In all data sets we found strong statistical evidence for the same pattern: the clear dependence between hyperactivity/impulsivity symptom level and an established genetic factor (either gender or risk haplotype) vanishes when one conditions upon inattention symptom level. Under reasonable assumptions, e.g., that phenotypes do not cause genotypes, a causal model that is consistent with this pattern contains a causal path from inattention to hyperactivity/impulsivity. Conclusions: The robust dependency cancellation observed in three different data sets suggests that inattention is a driving factor for hyperactivity/impulsivity. This causal hypothesis can be further validated in intervention studies. Our model suggests that interventions that affect inattention will also have an effect on the level of hyperactivity/impulsivity. On the other hand, interventions that affect hyperactivity/impulsivity would not change the level of inattention. This causal model may explain earlier findings on heritable factors causing ADHD reported in the study of twins with learning difficulties.
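The dependency-cancellation pattern described in this abstract can be illustrated on simulated data. The sketch below is only an illustration under stated assumptions: it uses a linear partial correlation rather than the causal discovery method applied in the paper, and the data are synthetic, generated from an assumed chain genetic factor → inattention → hyperactivity/impulsivity with arbitrary effect sizes. All variable names are hypothetical.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

# Synthetic data from an assumed causal chain (not the study's data):
# genetic factor -> inattention -> hyperactivity/impulsivity.
rng = np.random.default_rng(0)
n = 20000
genetic = rng.integers(0, 2, size=n).astype(float)
inattention = 0.8 * genetic + rng.normal(size=n)
hyperactivity = 0.8 * inattention + rng.normal(size=n)

# Marginal dependence between the genetic factor and hyperactivity ...
marginal = np.corrcoef(genetic, hyperactivity)[0, 1]
# ... which should largely vanish once inattention is conditioned upon.
conditional = partial_corr(genetic, hyperactivity, inattention)
```

Under this generating model, `marginal` is clearly nonzero while `conditional` is close to zero, mirroring the pattern the abstract reports. Had the chain run through hyperactivity/impulsivity instead, conditioning on inattention would not remove the dependence.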
Knowledge and Information Systems | 2005
Perry Groot; Annette ten Teije; Frank van Harmelen
The overall aim of this paper is to provide a general setting for quantitative quality measures of knowledge-based system behaviour that is applicable to many knowledge-based systems. We propose a general approach that we call degradation studies: an analysis of how system output changes as a function of degrading system input, such as incomplete or incorrect data or knowledge. To show the feasibility of our approach, we have applied it in a case study. We have taken a large and realistic vegetation-classification system, and have analysed its behaviour under several kinds of incomplete and incorrect input. This case study shows that degradation studies can reveal interesting and surprising properties of the system under study.
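As a rough sketch of what a degradation study could look like in code, the example below measures the accuracy of a toy rule-based classifier as a growing fraction of its input observations is removed. The rule base, the classifier, and the quality measure are all hypothetical stand-ins; they are not the vegetation-classification system or the measures used in the paper.

```python
import random

# Hypothetical rule base: class -> observations that characterise it.
RULES = {
    "heath": {"acidic_soil", "dry", "low_shrubs"},
    "marsh": {"wet", "reeds", "peat"},
    "dune": {"sandy", "dry", "grasses"},
}

def classify(observations):
    """Return the class whose rule overlaps most with the observations."""
    return max(sorted(RULES), key=lambda c: len(RULES[c] & observations))

def degrade(observations, fraction, rng):
    """Drop roughly the given fraction of observations at random."""
    keep = round(len(observations) * (1.0 - fraction))
    return set(rng.sample(sorted(observations), keep))

def degradation_curve(fractions, trials=200, seed=0):
    """Map each degradation level to the classifier's accuracy at that level."""
    rng = random.Random(seed)
    curve = {}
    for fraction in fractions:
        correct = 0
        for _ in range(trials):
            truth = rng.choice(sorted(RULES))
            degraded = degrade(RULES[truth], fraction, rng)
            correct += classify(degraded) == truth
        curve[fraction] = correct / trials
    return curve

curve = degradation_curve([0.0, 0.34, 0.67, 1.0])
```

Plotting `curve` gives output quality as a function of input degradation. With complete input the toy classifier is always correct; once observations shared between classes (here `dry`) are all that remain, it starts to confuse classes, which is the kind of behaviour such a study is meant to expose.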
American Journal of Medical Genetics | 2015
Elena Sokolova; Martine Hoogman; Perry Groot; Tom Claassen; Alejandro Arias Vasquez; Jan K. Buitelaar; Barbara Franke; Tom Heskes
Attention‐deficit/hyperactivity disorder (ADHD) is a common and highly heritable disorder affecting both children and adults. One of the candidate genes for ADHD is DAT1, encoding the dopamine transporter. In an attempt to clarify its mode of action, we assessed brain activity during the reward anticipation phase of the Monetary Incentive Delay (MID) task in a functional MRI paradigm in 87 adult participants with ADHD and 77 controls (average age 36.5 years). The MID task activates the ventral striatum, where DAT1 is most highly expressed. A previous analysis based on standard statistical techniques did not show any significant dependencies between a variant in the DAT1 gene and brain activation [Hoogman et al. (2013); Neuropsychopharm 23:469–478]. Here, we used an alternative method for analyzing the data, that is, causal modeling. The Bayesian Constraint‐based Causal Discovery (BCCD) algorithm [Claassen and Heskes (2012); Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence] is able to find direct and indirect dependencies between variables, determines the strength of the dependencies, and provides a graphical visualization to interpret the results. Through BCCD one gets an opportunity to consider several variables together and to infer causal relations between them. Application of the BCCD algorithm confirmed that there is no evidence of a direct link between DAT1 genetic variability and brain activation, but suggested an indirect link mediated through inattention symptoms and diagnostic status of ADHD. Our finding of an indirect link of DAT1 with striatal activity during reward anticipation might explain existing discrepancies in the current literature. Further experiments should confirm this hypothesis.
Probabilistic Graphical Models | 2014
Elena Sokolova; Perry Groot; Tom Claassen; Tom Heskes
Bayesian Constraint-based Causal Discovery (BCCD) is a state-of-the-art method for robust causal discovery in the presence of latent variables. It combines probabilistic estimation of Bayesian networks over subsets of variables with a causal logic to infer causal statements. Currently BCCD is limited to discrete or Gaussian variables. Most real-world data, however, contain a mixture of discrete and continuous variables. Here we extend BCCD to handle combinations of discrete and continuous variables, under the assumption that the relations between the variables are monotonic. To this end, we propose a novel method for the efficient computation of BIC scores for hybrid Bayesian networks. We demonstrate the accuracy and efficiency of our approach for causal discovery on simulated data as well as on real-world data from the ADHD-200 competition.
International Journal of Remote Sensing | 2007
Maurice Samulski; Nico Karssemeijer; Peter J. F. Lucas; Perry Groot
In this paper, we compare two state-of-the-art classification techniques characterizing masses as either benign or malignant, using a dataset consisting of 271 cases (131 benign and 140 malignant), each containing both an MLO and a CC view. For suspect regions in a digitized mammogram, 12 out of 81 calculated image features have been selected for investigating the classification accuracy of support vector machines (SVMs) and Bayesian networks (BNs). Additional techniques for improving their performance were included in their comparison: the Manly transformation for achieving a normal distribution of image features and principal component analysis (PCA) for reducing our high-dimensional data. The performance of the classifiers was evaluated with Receiver Operating Characteristics (ROC) analysis. The classifiers were trained and tested using a k-fold cross-validation test method (k=10). It was found that the area under the ROC curve (Az) of the BN increased significantly (p=0.0002) using the Manly transformation, from Az = 0.767 to Az = 0.795. The Manly transformation did not result in a significant change for SVMs. Also, the difference between SVMs and BNs using the transformed dataset was not statistically significant (p=0.78). Applying PCA resulted in an improvement in classification accuracy of the naive Bayesian classifier, from Az = 0.767 to Az = 0.786. The difference in classification performance between BNs and SVMs after applying PCA was small and not statistically significant (p=0.11).
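The evaluation protocol mentioned in this abstract (area under the ROC curve, k-fold cross-validation with k = 10) can be sketched generically. The code below is an assumption-laden illustration, not the paper's pipeline: it computes the area under the ROC curve by pairwise comparison of scores and splits sample indices into folds, without any of the classifiers, features, or transformations from the study. The helper names and toy scores are hypothetical.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Fraction of (positive, negative) pairs ranked in the right order.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal test folds."""
    order = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(order, k)

# Toy scores: three of the four (positive, negative) pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75

# Ten test folds over a dataset of 271 cases, the size quoted in the abstract.
folds = k_fold_indices(271, k=10)
```

In a full evaluation, each fold would serve once as the test set while a classifier is trained on the remaining nine, and the test-fold scores would then be passed to `roc_auc`.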