William Gray Roncal
Johns Hopkins University
Publications
Featured research published by William Gray Roncal.
Statistical and Scientific Database Management | 2013
Randal C. Burns; Kunal Lillaney; Daniel R. Berger; Logan Grosenick; Karl Deisseroth; R. Clay Reid; William Gray Roncal; Priya Manavalan; Davi Bock; Narayanan Kasthuri; Michael M. Kazhdan; Stephen J. Smith; Dean M. Kleissas; Eric Perlman; Kwanghun Chung; Nicholas C. Weiler; Jeff W. Lichtman; Alexander S. Szalay; Joshua T. Vogelstein; R. Jacob Vogelstein
We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes---neural connectivity maps of the brain---using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems---reads to parallel disk arrays and writes to solid-state storage---to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
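The paper distributes data by partitioning a spatial index. A common way to build such an index is a Morton (Z-order) code, which interleaves coordinate bits so nearby voxels get nearby keys; the sketch below illustrates that idea and is an assumption, not the system's actual partitioning scheme.

```python
# Sketch of spatial-index partitioning: interleave the bits of (x, y, z)
# into a Morton (Z-order) code, then assign code ranges to cluster nodes.
# A common technique for spatial databases; the production system's
# scheme may differ.

def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

def node_for_voxel(x, y, z, n_nodes, bits=10):
    """Map a voxel to a cluster node by partitioning the Z-order range."""
    return morton3(x, y, z, bits) % n_nodes
```

Because the code preserves spatial locality, voxels that are close in 3D space tend to land in the same partition, which keeps cutout reads local to few nodes.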
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013
Joshua T. Vogelstein; William Gray Roncal; R.J. Vogelstein; Carey E. Priebe
This manuscript considers the following “graph classification” question: Given a collection of graphs and associated classes, how can one predict the class of a newly observed graph? To address this question, we propose a statistical model for graph/class pairs. This model naturally leads to a set of estimators to identify the class-conditional signal, or “signal-subgraph,” defined as the collection of edges that are probabilistically different between the classes. The estimators admit classifiers which are asymptotically optimal and efficient, but which differ by their assumption about the “coherency” of the signal-subgraph (coherency is the extent to which the signal-edges “stick together” around a common subset of vertices). Via simulation, the best estimator is shown to be not just a function of the coherency of the model, but also the number of training samples. These estimators are employed to address a contemporary neuroscience question: Can we classify “connectomes” (brain-graphs) according to sex? The answer is yes, and significantly better than all benchmark algorithms considered. Synthetic data analysis demonstrates that even when the model is correct, given the relatively small number of training samples, the estimated signal-subgraph should be taken with a grain of salt. We conclude by discussing several possible extensions.
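An incoherent signal-subgraph estimator of the kind described can be sketched as: score every edge by how differently it occurs across the two classes, then keep the s highest-scoring edges. The paper ranks edges by a test statistic; the simpler difference-of-rates score and the function names below are our illustrative assumptions.

```python
# Sketch of an "incoherent" signal-subgraph estimator: rank each edge
# by the absolute difference in its empirical occurrence rate between
# the two classes, and keep the s most discriminative edges. The paper
# uses a significance-test ranking; this difference-of-rates variant
# conveys the idea.

def incoherent_signal_subgraph(graphs, labels, s):
    """graphs: list of V x V binary adjacency lists; labels: list of 0/1.
    Returns the s undirected edges whose rates differ most across classes."""
    V = len(graphs[0])
    n0 = labels.count(0)
    n1 = labels.count(1)
    scores = []
    for i in range(V):
        for j in range(i + 1, V):
            p0 = sum(g[i][j] for g, y in zip(graphs, labels) if y == 0) / n0
            p1 = sum(g[i][j] for g, y in zip(graphs, labels) if y == 1) / n1
            scores.append((abs(p0 - p1), (i, j)))
    scores.sort(reverse=True)
    return [edge for _, edge in scores[:s]]
```

A coherent estimator would additionally constrain the selected edges to concentrate around a small set of vertices, which is the "stick together" property the abstract describes.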
IEEE Global Conference on Signal and Information Processing | 2013
William Gray Roncal; Zachary H. Koterba; Disa Mhembere; Dean M. Kleissas; Joshua T. Vogelstein; Randal C. Burns; Anita R. Bowles; Dimitrios K. Donavos; Sephira G. Ryman; Rex E. Jung; Lei Wu; Vince D. Calhoun; R. Jacob Vogelstein
Currently, connectomes (e.g., functional or structural brain graphs) can be estimated in humans at ≈1 mm³ scale using a combination of diffusion weighted magnetic resonance imaging, functional magnetic resonance imaging and structural magnetic resonance imaging scans. This manuscript summarizes a novel, scalable implementation of open-source algorithms to rapidly estimate magnetic resonance connectomes, using both anatomical regions of interest (ROIs) and voxel-size vertices. To assess the reliability of our pipeline, we develop a novel non-parametric non-Euclidean reliability metric. Here we provide an overview of the methods used, demonstrate our implementation, and discuss available user extensions. We conclude with results showing the efficacy and reliability of the pipeline over previous state-of-the-art.
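One way to build a non-parametric, non-Euclidean reliability score in the spirit described is to ask, for any user-supplied distance function, how often a repeated scan of the same subject is closer than a scan of a different subject. The statistic below is an illustrative assumption, not necessarily the paper's exact metric.

```python
# Sketch of a non-parametric reliability score: given repeated
# measurements per subject and ANY symmetric distance function (hence
# non-Euclidean), count how often a same-subject pair is closer than a
# cross-subject pair. 1.0 means within-subject repeats are always
# closer; the paper's exact statistic may differ.

def reliability(scans, subject_ids, dist):
    """scans: list of observations; subject_ids: parallel list of labels;
    dist: any symmetric distance function."""
    hits, trials = 0, 0
    n = len(scans)
    for i in range(n):
        for j in range(n):
            if i == j or subject_ids[i] != subject_ids[j]:
                continue
            within = dist(scans[i], scans[j])
            for k in range(n):
                if subject_ids[k] == subject_ids[i]:
                    continue
                trials += 1
                hits += within < dist(scans[i], scans[k])
    return hits / trials
```

Because only a distance function is required, the score applies directly to graphs, matrices, or any other connectome representation.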
Frontiers in Neuroinformatics | 2015
William Gray Roncal; Dean M. Kleissas; Joshua T. Vogelstein; Priya Manavalan; Kunal Lillaney; Michael Pekala; Randal C. Burns; R. Jacob Vogelstein; Carey E. Priebe; Mark A. Chevillet; Gregory D. Hager
Reconstructing a map of neuronal connectivity is a critical challenge in contemporary neuroscience. Recent advances in high-throughput serial section electron microscopy (EM) have produced massive 3D image volumes of nanoscale brain tissue for the first time. The resolution of EM allows for individual neurons and their synaptic connections to be directly observed. Recovering neuronal networks by manually tracing each neuronal process at this scale is unmanageable, and therefore researchers are developing automated image processing modules. Thus far, state-of-the-art algorithms focus only on the solution to a particular task (e.g., neuron segmentation or synapse identification). In this manuscript we present the first fully-automated images-to-graphs pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue and produces a brain graph without any human interaction). To evaluate overall performance and select the best parameters and methods, we also develop a metric to assess the quality of the output graphs. We evaluate a set of algorithms and parameters, searching possible operating points to identify the best available brain graph for our assessment metric. Finally, we deploy a reference end-to-end version of the pipeline on a large, publicly available data set. This provides a baseline result and framework for community analysis and future algorithm development and testing. All code and data derivatives have been made publicly available in support of eventually unlocking new biofidelic computational primitives and understanding of neuropathologies.
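A metric for assessing output graph quality, as described above, can be sketched as a normalized count of edge disagreements between the pipeline's estimated graph and a reference graph. The function below is a minimal illustrative version; the paper's actual assessment metric may weight errors differently.

```python
# Sketch of a graph-quality metric: symmetric difference between the
# estimated and reference edge sets (missed edges plus spurious edges),
# normalized by the reference's edge count. 0.0 is a perfect match.

def graph_error(estimated, reference):
    """Both arguments are collections of undirected edges given as
    vertex pairs; orientation is normalized before comparison."""
    est = {tuple(sorted(e)) for e in estimated}
    ref = {tuple(sorted(e)) for e in reference}
    return len(est ^ ref) / max(len(ref), 1)

# One missed edge plus one spurious edge, against two true edges:
print(graph_error({(1, 2), (3, 4)}, {(1, 2), (2, 3)}))  # 1.0
```

Searching the pipeline's operating points then reduces to minimizing this error over candidate algorithms and parameter settings.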
arXiv: Quantitative Methods | 2017
Eva L. Dyer; William Gray Roncal; Judy A. Prasad; Hugo L. Fernandes; Doga Gursoy; Vincent De Andrade; Kamel Fezzaa; Xianghui Xiao; Joshua T. Vogelstein; Chris Jacobsen; Konrad P. Körding; Narayanan Kasthuri
Methods for resolving the three-dimensional (3D) microstructure of the brain typically start by thinly slicing and staining the brain, followed by imaging numerous individual sections with visible light photons or electrons. In contrast, X-rays can be used to image thick samples, providing a rapid approach for producing large 3D brain maps without sectioning. Here we demonstrate the use of synchrotron X-ray microtomography (µCT) for producing mesoscale (∼1 µm³ resolution) brain maps from millimeter-scale volumes of mouse brain. We introduce a pipeline for µCT-based brain mapping that develops and integrates methods for sample preparation, imaging, and automated segmentation of cells, blood vessels, and myelinated axons, in addition to statistical analyses of these brain structures. Our results demonstrate that X-ray tomography achieves rapid quantification of large brain volumes, complementing other brain mapping and connectomics efforts.
IEEE Global Conference on Signal and Information Processing | 2013
Disa Mhembere; William Gray Roncal; Daniel L. Sussman; Carey E. Priebe; Rex E. Jung; Sephira G. Ryman; R. Jacob Vogelstein; Joshua T. Vogelstein; Randal C. Burns
Graphs are quickly emerging as a leading abstraction for the representation of data. One important application domain originates from an emerging discipline called “connectomics”. Connectomics studies the brain as a graph; vertices correspond to neurons (or collections thereof) and edges correspond to structural or functional connections between them. To explore the variability of connectomes (to address both basic science questions regarding the structure of the brain and clinical questions about psychiatry and neurology), one can study the topological properties of these brain-graphs. We define multivariate glocal graph invariants: these are features of the graph that capture various local and global topological properties of the graphs. We show that the collection of features can collectively be computed via a combination of daisy-chaining, sparse matrix representation and computations, and efficient approximations. Our custom open-source Python package serves as a back-end to a Web service that we have created to enable researchers to upload graphs and download the corresponding invariants in a number of different formats. Moreover, we built this package to support distributed processing on multicore machines. It is therefore an enabling technology for network science, lowering the barrier to entry by providing tools to biologists and analysts who otherwise lack these capabilities. As a demonstration, we run our code on 120 brain-graphs, each with approximately 16M vertices and up to 90M edges.
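Two of the invariants described can be illustrated concretely: vertex degree is purely local, while the scan statistic (the edge count of a vertex's closed 1-hop neighborhood) mixes local and global structure, which is the "glocal" idea. The sketch below uses a sparse adjacency-set representation; the package's actual implementation differs in detail.

```python
# Sketch of two "glocal" graph invariants on a sparse adjacency-set
# representation: degree (local) and the scan statistic -- the number
# of edges induced by each vertex's closed 1-hop neighborhood.

def degrees(adj):
    """adj: dict mapping vertex -> set of neighbors."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def scan_statistic(adj):
    """Edges induced by each vertex's closed neighborhood."""
    scan = {}
    for v, nbrs in adj.items():
        hood = nbrs | {v}
        # each induced edge is seen from both endpoints, so halve the sum
        scan[v] = sum(len(adj[u] & hood) for u in hood) // 2
    return scan

# Triangle {0, 1, 2} plus a pendant vertex 3 attached to 2:
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degrees(adj))         # {0: 2, 1: 2, 2: 3, 3: 1}
print(scan_statistic(adj))  # vertex 2's neighborhood induces all 4 edges
```

At the scale reported (16M vertices, 90M edges), the set intersections would be replaced by sparse matrix operations and approximations, as the abstract notes.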
GigaScience | 2017
Gregory Kiar; Krzysztof J. Gorgolewski; Dean M. Kleissas; William Gray Roncal; Brian Litt; Brian A. Wandell; Russel A. Poldrack; Martin Wiener; R. Jacob Vogelstein; Randal C. Burns; Joshua T. Vogelstein
Abstract Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift from data collection to data analysis. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called ‘science in the cloud’ (SIC). Exploiting scientific containers, cloud computing, and cloud data services, we show the capability to compute in the cloud and run a web service that enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended.
bioRxiv | 2018
Gregory Kiar; Eric Bridgeford; Vikram Chandrashekhar; Disa Mhembere; Randal C. Burns; William Gray Roncal; Joshua T. Vogelstein
Modern scientific discovery depends on collecting large heterogeneous datasets with many sources of variability, and applying domain-specific pipelines from which one can draw insight or clinical utility. For example, macroscale connectomics studies require complex pipelines to process raw functional or diffusion data and estimate connectomes. Individual studies tend to customize pipelines to their needs, raising concerns about their reproducibility, which add to a longer list of factors that may differ across studies and result in failures to replicate (including sampling, experimental design, and data acquisition protocols). Mitigating these issues requires multi-study datasets and the development of pipelines that can be applied across them. We developed NeuroData’s MRI to Graphs (NDMG) pipeline using several functional and diffusion studies, including the Consortium for Reliability and Reproducibility, to estimate connectomes. Without any manual intervention or parameter tuning, NDMG ran on 25 different studies (≈6,000 scans) from 19 sites, with each scan resulting in a biologically plausible connectome (as assessed by multiple quality assurance metrics at each processing stage). For each study, the connectomes from NDMG are more similar within than across individuals, indicating that NDMG is preserving biological variability. Moreover, the connectomes exhibit near perfect consistency for certain connectional properties across every scan, individual, study, site and modality; these include stronger ipsilateral than contralateral connections and stronger homotopic than heterotopic connections. Yet, the magnitude of the differences varied across individuals and studies—much more so when pooling data across sites, even after controlling for study, site, and basic demographic variables (i.e., age, sex, and ethnicity).
This indicates that other experimental variables (possibly those not measured or reported) are contributing to this variability, which if not accounted for can limit the value of aggregate datasets, as well as expectations regarding the accuracy of findings and likelihood of replication. We therefore provide a set of principles to guide the development of pipelines capable of pooling data across studies while maintaining biological variability and minimizing measurement error. This open science approach provides us with an opportunity to understand and eventually mitigate spurious results for both past and future studies.

The connectivity of the human brain is fundamental to understanding the principles of cognitive function, and the mechanisms by which it can go awry. To that extent, tools for estimating human brain networks are required for single participant, group level, and cross-study analyses. We have developed an open-source, cloud-enabled, turn-key pipeline that operates on (groups of) raw diffusion and structural magnetic resonance imaging data, estimating brain networks (connectomes) across 24 different spatial scales, with quality assurance visualizations at each stage of processing. Running a harmonized analysis on 10 different datasets comprising 2,295 subjects and 2,861 scans reveals that the connectomes across datasets are similar on coarse scales, but quantitatively different on fine scales. Our framework therefore illustrates that while general principles of human brain organization may be preserved across experiments, obtaining reliable p-values and clinical biomarkers from connectomics will require further harmonization efforts.
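One of the consistency checks reported (stronger ipsilateral than contralateral connections) can be sketched as a simple statistic over a weighted connectome whose vertices carry hemisphere labels. The function below is our illustrative assumption; NDMG's actual quality-assurance statistics may be computed differently.

```python
# Sketch of an ipsilateral-vs-contralateral consistency check: compare
# mean same-hemisphere edge weight to mean cross-hemisphere edge weight.
# A ratio > 1 matches the pattern reported across scans and studies.

def ipsi_contra_ratio(weights, hemisphere):
    """weights: dict {(i, j): w}; hemisphere: dict {vertex: 'L' or 'R'}."""
    ipsi, contra = [], []
    for (i, j), w in weights.items():
        (ipsi if hemisphere[i] == hemisphere[j] else contra).append(w)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(ipsi) / mean(contra)

weights = {(0, 1): 9.0, (2, 3): 7.0, (0, 2): 2.0, (1, 3): 2.0}
hemi = {0: "L", 1: "L", 2: "R", 3: "R"}
print(ipsi_contra_ratio(weights, hemi))  # 4.0
```

An analogous check with homotopic versus heterotopic vertex pairings covers the second property the abstract mentions.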
Nature Methods | 2018
Joshua T. Vogelstein; Eric S. Perlman; Benjamin Falk; Alex Baden; William Gray Roncal; Vikram Chandrashekhar; Forrest Collman; Sharmishtaa Seshamani; Jesse L. Patsolic; Kunal Lillaney; Michael M. Kazhdan; Robert Hider; Derek Pryor; Jordan Matelsky; Timothy Gion; Priya Manavalan; Brock A. Wester; Mark A. Chevillet; Eric T. Trautman; Khaled Khairy; Eric Bridgeford; Dean M. Kleissas; Daniel J. Tward; Ailey K. Crow; Brian Hsueh; Matthew Wright; Michael I. Miller; Stephen J. Smith; R. Jacob Vogelstein; Karl Deisseroth
Big imaging data is becoming more prominent in brain sciences across spatiotemporal scales and phylogenies. We have developed a computational ecosystem that enables storage, visualization, and analysis of these data in the cloud, thus far spanning 20+ publications and 100+ terabytes including nanoscale ultrastructure, microscale synaptogenetic diversity, and mesoscale whole brain connectivity, making NeuroData the largest and most diverse open repository of brain data.
bioRxiv | 2017
Dean M. Kleissas; Robert Hider; Derek Pryor; Timothy Gion; Priya Manavalan; Jordan Matelsky; Alex Baden; Kunal Lillaney; Randal C. Burns; Denise D'Angelo; William Gray Roncal; Brock A. Wester
Large volumetric neuroimaging datasets have grown in size over the past ten years from gigabytes to terabytes, with petascale data becoming available and more common over the next few years. Current approaches to store and analyze these emerging datasets are insufficient in their ability to scale in both cost-effectiveness and performance. Additionally, enabling large-scale processing and annotation is critical as these data grow too large for manual inspection. We propose a new cloud-native managed service for large and multi-modal experiments, providing support for data ingest, storage, visualization, and sharing through a RESTful Application Programming Interface (API) and web-based user interface. Our project is open source and can be easily and cost-effectively used for a variety of modalities and applications.