Christian P. Mol
Utrecht University
Publications
Featured research published by Christian P. Mol.
Medical Image Analysis | 2015
Ivana Išgum; Manon J.N.L. Benders; Brian B. Avants; M. Jorge Cardoso; Serena J. Counsell; Elda Fischi Gomez; Laura Gui; Petra S. Hűppi; Karina J. Kersbergen; Antonios Makropoulos; Andrew Melbourne; Pim Moeskops; Christian P. Mol; Maria Kuklisova-Murgasova; Daniel Rueckert; Julia A. Schnabel; Vedran Srhoj-Egekher; Jue Wu; Siying Wang; Linda S. de Vries; Max A. Viergever
A number of algorithms for brain segmentation in preterm born infants have been published, but a reliable comparison of their performance is lacking. The NeoBrainS12 study (http://neobrains12.isi.uu.nl) was set up to provide such a comparison, based on three different image sets of preterm born infants: (i) axial scans acquired at 40 weeks corrected age, (ii) coronal scans acquired at 30 weeks corrected age and (iii) coronal scans acquired at 40 weeks corrected age. Each of these three sets consists of three T1- and T2-weighted MR images of the brain acquired with a 3T MRI scanner. The task was to segment cortical grey matter, non-myelinated and myelinated white matter, brainstem, basal ganglia and thalami, cerebellum, and cerebrospinal fluid in the ventricles and in the extracerebral space separately. Any team could upload its results, and all segmentations were evaluated in the same way. This paper presents the results of eight participating teams. The results demonstrate that the participating methods were able to segment all tissue classes well, except myelinated white matter.
PLOS ONE | 2013
Petronella Anbeek; Ivana Išgum; Britt J. van Kooij; Christian P. Mol; Karina J. Kersbergen; Floris Groenendaal; Max A. Viergever; Linda S. de Vries; Manon J.N.L. Benders
Purpose Volumetric measurements of neonatal brain tissues may be used as a biomarker for later neurodevelopmental outcome. We propose an automatic method for probabilistic brain segmentation in neonatal MRIs. Materials and Methods In an IRB-approved study, axial T1- and T2-weighted MR images were acquired at term-equivalent age for a preterm cohort of 108 neonates. A method for automatic probabilistic segmentation of the images into eight cerebral tissue classes was developed: cortical and central grey matter, unmyelinated and myelinated white matter, cerebrospinal fluid in the ventricles and in the extracerebral space, brainstem and cerebellum. Segmentation is based on supervised pixel classification using intensity values and spatial positions of the image voxels. The method was trained and evaluated using leave-one-out experiments on seven images, for which an expert had set a reference standard manually. Subsequently, the method was applied to the remaining 101 scans, and the resulting segmentations were evaluated visually by three experts. Finally, volumes of the eight segmented tissue classes were determined for each patient. Results The Dice similarity coefficients of the segmented tissue classes, except myelinated white matter, ranged from 0.75 to 0.92. Myelinated white matter was difficult to segment and the achieved Dice coefficient was 0.47. Visual analysis of the results demonstrated accurate segmentations of the eight tissue classes. The probabilistic segmentation method produced volumes that compared favorably with the reference standard. Conclusion The proposed method provides accurate segmentation of neonatal brain MR images into all given tissue classes, except myelinated white matter. This is one of the first methods that distinguishes cerebrospinal fluid in the ventricles from cerebrospinal fluid in the extracerebral space.
This method might be helpful in predicting neurodevelopmental outcome and useful for evaluating neuroprotective clinical trials in neonates.
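The Dice similarity coefficient used to evaluate the segmentations above measures the overlap between an automatic and a reference mask. A minimal sketch (not the authors' implementation; the toy masks are invented for illustration):

```python
import numpy as np

def dice_coefficient(seg_a, seg_b):
    """Dice similarity coefficient between two binary segmentation masks:
    2*|A ∩ B| / (|A| + |B|)."""
    seg_a = np.asarray(seg_a, dtype=bool)
    seg_b = np.asarray(seg_b, dtype=bool)
    total = seg_a.sum() + seg_b.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(seg_a, seg_b).sum() / total

# Toy 1D masks: 3 overlapping voxels, 4 voxels labelled in each mask
print(dice_coefficient([1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 1, 0]))  # 2*3/(4+4) = 0.75
```

In practice the inputs would be 3D label volumes, with one binary mask per tissue class.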
PLOS ONE | 2014
Richard A. P. Takx; Pim A. de Jong; Tim Leiner; Matthijs Oudkerk; Harry J. de Koning; Christian P. Mol; Max A. Viergever; Ivana Išgum
Objective To determine the agreement and reliability of fully automated coronary artery calcium (CAC) scoring in a lung cancer screening population. Materials and Methods 1793 low-dose chest CT scans were analyzed (non-contrast-enhanced, non-gated). To establish the reference standard for CAC, automated calcium scoring was first performed using a preliminary version of a method employing a coronary calcium atlas and a machine learning approach. Thereafter, each scan was inspected by one of four trained raters. When needed, the raters corrected the initially automatically identified results. In addition, an independent observer subsequently inspected the manually corrected results and discarded scans with gross segmentation errors. Subsequently, fully automatic coronary calcium scoring was performed. Agatston score, CAC volume and number of calcifications were computed. Agreement was determined by calculating the proportion of agreement and examining Bland-Altman plots. Reliability was determined by calculating linearly weighted kappa (κ) for Agatston strata and the intraclass correlation coefficient (ICC) for continuous values. Results 44 (2.5%) scans were excluded due to metal artifacts or gross segmentation errors. In the remaining 1749 scans, the median Agatston score was 39.6 (P25–P75: 0–345.9), the median volume score was 60.4 mm³ (P25–P75: 0–361.4) and the median number of calcifications was 2 (P25–P75: 0–4) for the automated scores. The κ demonstrated very good reliability (0.85) for Agatston risk categories between the automated and reference scores. The Bland-Altman plots showed underestimation of calcium score values by automated quantification. The median difference was 2.5 (P25–P75: 0.0–53.2) for Agatston score, 7.6 (P25–P75: 0.0–94.4) for CAC volume and 1 (P25–P75: 0–5) for number of calcifications. The ICC was very good for Agatston score (0.90), very good for calcium volume (0.88) and good for number of calcifications (0.64).
Discussion Fully automated coronary calcium scoring in a lung cancer screening setting is feasible with acceptable reliability and agreement despite an underestimation of the amount of calcium when compared to reference scores.
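The linearly weighted κ used for the Agatston risk strata weights each disagreement by how many ordinal categories apart the two scores are. A minimal sketch, assuming categories are coded 0..n_cat-1 (not the study's code; the example ratings are invented):

```python
import numpy as np

def linearly_weighted_kappa(r1, r2, n_cat):
    """Cohen's kappa with linear disagreement weights for ordinal
    categories coded 0..n_cat-1 (e.g. Agatston risk strata)."""
    conf = np.zeros((n_cat, n_cat))
    for a, b in zip(r1, r2):
        conf[a, b] += 1
    conf /= conf.sum()                        # observed joint proportions
    i, j = np.indices((n_cat, n_cat))
    w = np.abs(i - j) / (n_cat - 1)           # linear disagreement weights
    expected = np.outer(conf.sum(axis=1), conf.sum(axis=0))
    return 1.0 - (w * conf).sum() / (w * expected).sum()

print(linearly_weighted_kappa([0, 0, 1, 1], [0, 1, 1, 1], n_cat=2))  # 0.5
```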
PLOS ONE | 2013
Constantinus F. Buckens; Pim A. de Jong; Christian P. Mol; Eric Bakker; Hein P. Stallman; Willem P. Th. M. Mali; Yolanda van der Graaf; Helena M. Verkooijen
Objectives To evaluate the reliability of semiquantitative Vertebral Fracture Assessment (VFA) on chest Computed Tomography (CT). Methods Four observers performed VFA twice upon sagittal reconstructions of 50 routine clinical chest CTs. Intra- and interobserver agreement (absolute agreement or 95% Limits of Agreement) and reliability (Cohen's kappa or intraclass correlation coefficient (ICC)) were calculated for the visual VFA measures (fracture present, worst fracture grade, cumulative fracture grade on patient level) and for percentage height loss of each fractured vertebra compared to the adjacent vertebrae. Results Observers classified 24–38% of patients as having at least one vertebral fracture, giving rise to kappas of 0.73–0.84 (intraobserver) and 0.56–0.81 (interobserver). For worst fracture grade we found good intraobserver (76–88%) and interobserver (74–88%) agreement, and excellent reliability with square-weighted kappas of 0.84–0.90 (intraobserver) and 0.84–0.94 (interobserver). For cumulative fracture grade the 95% Limits of Agreement were maximally ±1.99 (intraobserver) and ±2.69 (interobserver), and the reliability (ICC) varied from 0.84–0.94 (intraobserver) and 0.74–0.94 (interobserver). For percentage height loss on a vertebral level the 95% Limits of Agreement were maximally ±11.75% (intraobserver) and ±12.53% (interobserver). The ICC was 0.59–0.90 (intraobserver) and 0.53–0.82 (interobserver). Further investigation is needed to evaluate the prognostic value of this approach. Conclusion In conclusion, these results demonstrate acceptable reproducibility of VFA on CT.
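The 95% Limits of Agreement reported here follow the standard Bland-Altman construction: mean paired difference ± 1.96 standard deviations of the differences. A minimal sketch with invented example data (not the study's measurements):

```python
import numpy as np

def limits_of_agreement(x, y):
    """Bland-Altman 95% limits of agreement for paired measurements:
    bias ± 1.96 * SD of the differences."""
    d = np.asarray(x, float) - np.asarray(y, float)
    bias = d.mean()
    half_width = 1.96 * d.std(ddof=1)
    return bias - half_width, bias + half_width

# Invented repeated gradings by the same observer
lo, hi = limits_of_agreement([10, 12, 9, 11], [9, 13, 8, 12])
print(lo, hi)  # symmetric around a bias of 0, roughly ±2.26
```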
pacific rim international conference on multi agents | 2008
Mehdi Dastani; Christian P. Mol; Bas R. Steunebrink
This paper discusses a module-based vision for designing BDI-based multi-agent programming languages. The introduced concept of modules is generic and facilitates the implementation of different agent concepts such as agent roles and agent profiles, and enables common programming techniques such as encapsulation and information hiding for BDI-based agents. This vision is applied to 2APL, which is an existing BDI-based agent programming language. Specific programming constructs are added to 2APL to allow the implementation of modules. The syntax and intuitive meaning of these programming constructs are provided as well as the operational semantics of one of the programming constructs. Some informal properties of the programming constructs are discussed and it is explained how these modules can be used to implement agent roles, agent profiles, or the encapsulation of BDI concepts.
Proceedings of SPIE | 2010
Laurens Hogeweg; Christian P. Mol; Pim A. de Jong; Bram van Ginneken
The computer aided diagnosis (CAD) of abnormalities on chest radiographs is difficult due to the presence of overlapping normal anatomy. Suppression of the normal anatomy is expected to improve performance of a CAD system, but such a method has not yet been applied to the computer detection of interstitial abnormalities such as occur in tuberculosis (TB). The aim of this research is to evaluate the effect of rib suppression on a CAD system for TB. Profiles of pixel intensities sampled perpendicular to segmented ribs were used to create a local PCA-based shape model of the rib. The model was normalized to the local background intensity and corrected for gradients perpendicular to the rib. Subsequently, rib-suppressed images were created by subtracting the models for each rib from the original image. The effect of rib suppression was evaluated using a CAD system for TB detection. Small square image patches were sampled randomly from 15 normal and 35 TB-affected images containing textural abnormalities. Abnormalities were outlined by a radiologist and were given a subtlety rating from 1 to 5. Features based on moments of intensity distributions of Gaussian derivative filtered images were extracted. A supervised learning approach was used to discriminate between normal and diseased image patches. The use of rib-suppressed images increased the overall performance of the system, as measured by the area under the receiver operating characteristic (ROC) curve, from 0.75 to 0.78. For the more subtly rated patches (rated 1-3) the performance increased from 0.62 to 0.70.
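The area under the ROC curve used as the performance measure above equals the probability that a randomly chosen diseased patch receives a higher classifier score than a randomly chosen normal one (the Mann-Whitney interpretation). A minimal sketch with made-up scores, not the study's classifier output:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs in which the positive patch scores
    higher; ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up classifier scores for abnormal vs. normal patches
print(roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # 8/9 ≈ 0.889
```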
Academic Radiology | 2015
Cécile J. Ravesloot; Anouk van der Gijp; Marieke van der Schaaf; Josephine C.B.M. Huige; Koen L. Vincken; Christian P. Mol; Ronald L. A. W. Bleys; Olle Tj ten Cate; Jan P.J. van Schaik
RATIONALE AND OBJECTIVES Radiology practice has become increasingly based on volumetric images (VIs), but tests in medical education still mainly involve two-dimensional (2D) images. We created a novel, digital, VI test and hypothesized that scores on this test would better reflect radiological anatomy skills than scores on a traditional 2D image test. To evaluate external validity we correlated VI and 2D image test scores with anatomy cadaver-based test scores. MATERIALS AND METHODS In 2012, 246 medical students completed one of two comparable versions (A and B) of a digital radiology test, each containing 20 2D image and 20 VI questions. Thirty-three of these participants also took a human cadaver anatomy test. Mean scores and reliabilities of the 2D image and VI subtests were compared and correlated with human cadaver anatomy test scores. Participants received a questionnaire about perceived representativeness and difficulty of the radiology test. RESULTS Human cadaver test scores were not correlated with 2D image scores, but significantly correlated with VI scores (r = 0.44, P < .05). Cronbach's α reliability was 0.49 (A) and 0.65 (B) for the 2D image subtests and 0.65 (A) and 0.71 (B) for the VI subtests. Mean VI scores (74.4%, standard deviation 2.9) were significantly lower than 2D image scores (83.8%, standard deviation 2.4) in version A (P < .001). VI questions were considered more representative of clinical practice and education than 2D image questions and less difficult (both P < .001). CONCLUSIONS VI tests show higher reliability, a significant correlation with human cadaver test scores, and are considered more representative for clinical practice than tests with 2D images.
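Cronbach's α, the subtest reliability reported above, compares the sum of per-item score variances with the variance of the total score. A minimal sketch on a toy score matrix (not the study data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, float)
    k = scores.shape[1]                          # number of items
    item_var = scores.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Perfectly correlated toy items give the maximum reliability of 1.0
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # 1.0
```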
Simulation in Healthcare | 2017
Anouk van der Gijp; Cécile J. Ravesloot; Corinne A. Tipker; Kim de Crom; Dik R. Rutgers; Marieke van der Schaaf; Irene C. van der Schaaf; Christian P. Mol; Koen L. Vincken; Olle ten Cate; Mario Maas; Jan P.J. van Schaik
Introduction Clinical reasoning in diagnostic imaging professions is a complex skill that requires processing of visual information and image manipulation skills. We developed a digital simulation-based test method to increase authenticity of image interpretation skill assessment. Methods A digital application, allowing volumetric image viewing and manipulation, was used for three test administrations of the national Dutch Radiology Progress Test for residents. This study describes the development and implementation process in three phases. To assess authenticity of the digital tests, perceived image quality and correspondence to clinical practice were evaluated and compared with previous paper-based tests (PTs). Quantitative and qualitative evaluation results were used to improve subsequent tests. Results Authenticity of the first digital test was not rated higher than the PTs. Test characteristics and environmental conditions, such as image manipulation options and ambient lighting, were optimized based on participants’ comments. After adjustments in the third digital test, participants favored the image quality and clinical correspondence of the digital image questions over paper-based image questions. Conclusions Digital simulations can increase authenticity of diagnostic radiology assessments compared with paper-based testing. However, authenticity does not necessarily increase with higher fidelity. It can be challenging to simulate the image interpretation task of clinical practice in a large-scale assessment setting, because of technological limitations. Optimizing image manipulation options, the level of ambient light, time limits, and question types can help improve authenticity of simulation-based radiology assessments.
Diagnosis | 2017
Cécile J. Ravesloot; Anouk van der Gijp; Marieke van der Schaaf; Josephine C.B.M. Huige; Olle ten Cate; Koen L. Vincken; Christian P. Mol; Jan P.J. van Schaik
Abstract Background: Misinterpretation of medical images is an important source of diagnostic error. Errors can occur in different phases of the diagnostic process. Insight into the error types made by learners is crucial for training and giving effective feedback. Most diagnostic skill tests, however, penalize diagnostic mistakes without regard to the diagnostic process or the type of error. A radiology test with stepwise reasoning questions was used to distinguish error types in the visual diagnostic process. We evaluated the additional value of a stepwise question-format, in comparison with only diagnostic questions, in radiology tests. Methods: Medical students in a radiology elective (n=109) took a radiology test including 11–13 cases in stepwise question-format: marking an abnormality, describing the abnormality and giving a diagnosis. Errors were coded by two independent researchers as perception, analysis, diagnosis, or undefined. Erroneous cases were further evaluated for the presence of latent errors or partial knowledge. Inter-rater reliabilities and percentages of cases with latent errors and partial knowledge were calculated. Results: The stepwise question-format procedure applied to 1351 cases completed by 109 medical students revealed 828 errors. Mean inter-rater reliability of error type coding was Cohen’s κ=0.79. Six hundred and fifty errors (79%) could be coded as perception, analysis or diagnosis errors. The stepwise question-format revealed latent errors in 9% and partial knowledge in 18% of cases. Conclusions: A stepwise question-format can reliably distinguish error types in the visual diagnostic process, and reveals latent errors and partial knowledge.
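The inter-rater reliability of the error-type coding (Cohen's κ = 0.79) corrects the raters' observed agreement for the chance agreement implied by their marginal code frequencies. A minimal sketch with invented codes ('p' perception, 'a' analysis, 'd' diagnosis), not the study's coding data:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters' categorical codes."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)  # chance agreement
    return (observed - expected) / (1 - expected)

print(cohens_kappa(["p", "a", "d", "p"], ["p", "a", "d", "d"]))  # 7/11 ≈ 0.64
```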
Academic Radiology | 2017
D. R. Rutgers; Fleur van Raamt; Anouk van der Gijp; Christian P. Mol; Olle ten Cate
RATIONALE AND OBJECTIVES The psychometric characteristics of image-based test items in radiological written examinations are not well known. In this study, we explored the difficulty and discriminating power of these test items in postgraduate radiological digital examinations. MATERIALS AND METHODS We reviewed test items of seven Dutch Radiology Progress Tests (DRPTs) that were taken from October 2013 to April 2017. The DRPT is a semiannual formative examination, required for all Dutch radiology residents. We assessed several stimulus and response characteristics of test items. The response format of test items included true or false, single right multiple choice with 2, 3, 4, or ≥5 answer options, pick-N multiple-choice, drag-and-drop, and long-list-menu formats. We calculated item P values and item-rest correlation (Rir) values to assess difficulty and discriminating power. We performed linear regression analysis in image-based test items to investigate whether P and Rir values were significantly related to stimulus and response characteristics. Also, we compared psychometric indices between image-based test items and text-alone items. RESULTS P and Rir values of image-based items (n = 369) were significantly related to the type of response format (P < .001), and not to which of the seven DRPTs the item was obtained from, radiological subspecialty domain, nonvolumetric or volumetric character of images, or context-rich or context-free character of the stimulus. When accounting for the type of response format, the difficulty and discriminating power of image-based items did not differ significantly from text-alone items (n = 881). Test items with a relatively large number of answer options were generally more difficult, and discriminated better among high- and low-performing candidates.
CONCLUSION In postgraduate radiological written examinations, difficulty and discriminating power of image-based test items are related to the type of response format and are comparable to those of text-alone items. We recommend a response format with a relatively large number of answer options to optimize psychometric indices of radiological image-based test items.
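The two psychometric indices used throughout, the item P value (proportion of candidates answering correctly) and the item-rest correlation Rir (correlation of an item with the total score on the remaining items), can be sketched as follows (toy 0/1 score matrix, not DRPT data):

```python
import numpy as np

def item_stats(scores):
    """Item difficulty (P value) and item-rest correlation (Rir)
    for an (examinees x items) matrix of 0/1 scores."""
    scores = np.asarray(scores, float)
    p_values = scores.mean(axis=0)                 # proportion correct per item
    rir = []
    for j in range(scores.shape[1]):
        rest = scores.sum(axis=1) - scores[:, j]   # total score without item j
        rir.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return p_values, np.array(rir)

p, rir = item_stats([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 1]])
print(p)  # P values: 0.75, 0.75, 0.5
```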