Publication


Featured research published by John Ionides.


Proteins | 2005

The CCPN data model for NMR spectroscopy: Development of a software pipeline

Wim F. Vranken; Wayne Boucher; Tim J. Stevens; Rasmus H. Fogh; Anne Pajon; Miguel Llinás; Eldon L. Ulrich; John L. Markley; John Ionides; Ernest D. Laue

To address data management and data exchange problems in the nuclear magnetic resonance (NMR) community, the Collaborative Computing Project for the NMR community (CCPN) created a “Data Model” that describes all the different types of information needed in an NMR structural study, from molecular structure and NMR parameters to coordinates. This paper describes the development of a set of software applications that use the Data Model and its associated libraries, thus validating the approach. These applications are freely available and provide a pipeline for high‐throughput analysis of NMR data. Three programs work directly with the Data Model: CcpNmr Analysis, an entirely new analysis and interactive display program, the CcpNmr FormatConverter, which allows transfer of data from programs commonly used in NMR to and from the Data Model, and the CLOUDS software for automated structure calculation and assignment (Carnegie Mellon University), which was rewritten to interact directly with the Data Model. The ARIA 2.0 software for structure calculation (Institut Pasteur) and the QUEEN program for validation of restraints (University of Nijmegen) were extended to provide conversion of their data to the Data Model. During these developments the Data Model has been thoroughly tested and used, demonstrating that applications can successfully exchange data via the Data Model. The software architecture developed by CCPN is now ready for new developments, such as integration with additional software applications and extensions of the Data Model into other areas of research. Proteins 2005.


Nucleic Acids Research | 2007

Remediation of the protein data bank archive

Kim Henrick; Zukang Feng; Wolfgang F. Bluhm; Dimitris Dimitropoulos; Jurgen F. Doreleijers; Shuchismita Dutta; Judith L. Flippen-Anderson; John Ionides; Chisa Kamada; Eugene Krissinel; Catherine L. Lawson; John L. Markley; Haruki Nakamura; Richard Newman; Yukiko Shimizu; Jawahar Swaminathan; Sameer Velankar; Jeramia Ory; Eldon L. Ulrich; Wim F. Vranken; John D. Westbrook; Reiko Yamashita; Huanwang Yang; Jasmine Young; Muhammed Yousufuddin; Helen M. Berman

The Worldwide Protein Data Bank (wwPDB; wwpdb.org) is the international collaboration that manages the deposition, processing and distribution of the PDB archive. The online PDB archive at ftp://ftp.wwpdb.org is the repository for the coordinates and related information for more than 47 000 structures, including proteins, nucleic acids and large macromolecular complexes that have been determined using X-ray crystallography, NMR and electron microscopy techniques. The members of the wwPDB, namely RCSB PDB (USA), MSD-EBI (Europe), PDBj (Japan) and BMRB (USA), have remediated this archive to address inconsistencies that have been introduced over the years. The scope and methods used in this project are presented.


Nucleic Acids Research | 2003

E-MSD: the European Bioinformatics Institute Macromolecular Structure Database

Harry Boutselakis; Dimitris Dimitropoulos; Joël Fillon; Adel Golovin; Kim Henrick; A. Hussain; John Ionides; Melford John; Peter A. Keller; Evgeny B. Krissinel; P. McNeil; Avi Naim; Richard Newman; Thomas J. Oldfield; Jorge Pineda; Abdel-Krim Rachedi; J. Copeland; Andrey Sitnov; Siamak Sobhany; Antonio Suarez-Uruena; Jawahar Swaminathan; Mohammed Tagari; John G. Tate; Swen Tromm; Samir S. Velankar; Wim F. Vranken

The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.


Nature Structural & Molecular Biology | 2002

The CCPN project: an interim report on a data model for the NMR community.

Rasmus H. Fogh; John Ionides; Eldon L. Ulrich; Wayne Boucher; Wim F. Vranken; Jens P. Linge; Michael Habeck; Wolfgang Rieping; Talapady N. Bhat; John D. Westbrook; Kim Henrick; Gary L. Gilliland; Helen M. Berman; Janet M. Thornton; Michael Nilges; John L. Markley; Ernest D. Laue

A recent workshop discusses the progress toward integrating NMR data into a unifying data model.


Glycobiology | 2011

EUROCarbDB: An open-access platform for glycoinformatics.

Claus Wilhelm Von Der Lieth; Ana Ardá Freire; Dennis Blank; Matthew Campbell; Alessio Ceroni; David Damerell; Anne Dell; Raymond A. Dwek; Beat Ernst; Rasmus H. Fogh; Martin Frank; Hildegard Geyer; Rudolf Geyer; Mathew J. Harrison; Kim Henrick; Stefan Herget; William E. Hull; John Ionides; Hiren J. Joshi; Johannis P. Kamerling; Bas R. Leeflang; Thomas Lütteke; Magnus Lundborg; Kai Maass; Anthony Merry; René Ranzinger; Jimmy Rosen; Louise Royle; Pauline M. Rudd; Siegfried Schloissnig

The EUROCarbDB project is a design study for a technical framework, which provides sophisticated, freely accessible, open-source informatics tools and databases to support glycobiology and glycomic research. EUROCarbDB is a relational database containing glycan structures, their biological context and, when available, primary and interpreted analytical data from high-performance liquid chromatography, mass spectrometry and nuclear magnetic resonance experiments. Database content can be accessed via a web-based user interface. The database is complemented by a suite of glycoinformatics tools, specifically designed to assist the elucidation and submission of glycan structure and experimental data when used in conjunction with contemporary carbohydrate research workflows. All software tools and source code are licensed under the terms of the Lesser General Public License, and publicly contributed structures and data are freely accessible. The public test version of the web interface to the EUROCarbDB can be found at http://www.ebi.ac.uk/eurocarb.


Bioinformatics | 2005

A framework for scientific data modeling and automated software development

Rasmus H. Fogh; Wayne Boucher; Wim F. Vranken; Anne Pajon; Tim J. Stevens; Talapady N. Bhat; John D. Westbrook; John Ionides; Ernest D. Laue

MOTIVATION: The lack of standards for storage and exchange of data is a serious hindrance for the large-scale data deposition, data mining and program interoperability that is becoming increasingly important in bioinformatics. The problem lies not only in defining and maintaining the standards, but also in convincing scientists and application programmers with a wide variety of backgrounds and interests to adhere to them.

RESULTS: We present a UML-based programming framework for the modeling of data and the automated production of software to manipulate that data. Our approach allows one to make an abstract description of the structure of the data used in a particular scientific field and then use it to generate fully functional computer code for data access and input/output routines for data storage, together with accompanying documentation. This code can be generated simultaneously for different programming languages from a single model, together with format descriptions and I/O libraries for, for example, XML and various relational databases. The framework is entirely general and could be applied in any subject area. We have used this approach to generate a data exchange standard for structural biology and analysis software for macromolecular NMR spectroscopy.

AVAILABILITY: The framework is available under the GPL license, the data exchange standard with generated subroutine libraries under the LGPL license. Both may be found at http://www.ccpn.ac.uk; http://sourceforge.net/projects/ccpn

CONTACT: [email protected].
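As a rough illustration of the idea in this abstract, the sketch below generates Python source for a data-access class from a small abstract model. It is not the CCPN framework or its API: the `MODEL` dictionary, the `Peak` class and both function names are hypothetical, and a plain dictionary stands in for a UML description.

```python
# Hypothetical miniature "model": class name -> {attribute name: type}.
# A real framework would read this from a UML description instead.
MODEL = {"Peak": {"position": float, "height": float, "label": str}}


def generate_class_source(name, attributes):
    """Return Python source for a class whose setters check attribute types."""
    args = ", ".join(attributes)
    lines = [f"class {name}:", f"    def __init__(self, {args}):"]
    for attr in attributes:
        # Assigning through the property runs the generated type check.
        lines.append(f"        self.{attr} = {attr}")
    for attr, typ in attributes.items():
        lines += [
            "",
            "    @property",
            f"    def {attr}(self):",
            f"        return self._{attr}",
            "",
            f"    @{attr}.setter",
            f"    def {attr}(self, value):",
            f"        if not isinstance(value, {typ.__name__}):",
            f"            raise TypeError('{name}.{attr} must be {typ.__name__}')",
            f"        self._{attr} = value",
        ]
    return "\n".join(lines) + "\n"


def build_classes(model):
    """Execute the generated source and return the resulting classes by name."""
    namespace = {}
    for name, attributes in model.items():
        exec(generate_class_source(name, attributes), namespace)
    return {name: namespace[name] for name in model}


Peak = build_classes(MODEL)["Peak"]
peak = Peak(position=8.21, height=1.5e6, label="HN")
```

Because the class is derived entirely from `MODEL`, adding an attribute to the model changes the generated constructor and its checks without any hand-written code, which is the property the abstract emphasizes.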


PLOS Genetics | 2013

Rearrangements of 2.5 Kilobases of Noncoding DNA from the Drosophila even-skipped Locus Define Predictive Rules of Genomic cis-Regulatory Logic

Ah-Ram Kim; Carlos Martinez; John Ionides; Alexandre F. Ramos; Michael Ludwig; Nobuo Ogawa; David H. Sharp; John Reinitz

Rearrangements of about 2.5 kilobases of regulatory DNA located 5′ of the transcription start site of the Drosophila even-skipped locus generate large-scale changes in the expression of even-skipped stripes 2, 3, and 7. The most radical effects are generated by juxtaposing the minimal stripe enhancers MSE2 and MSE3 for stripes 2 and 3 with and without small “spacer” segments less than 360 bp in length. We placed these fusion constructs in a targeted transformation site and obtained quantitative expression data for these transformants together with their controlling transcription factors at cellular resolution. These data demonstrated that the rearrangements can alter expression levels in stripe 2 and the 2–3 interstripe by a factor of more than 10. We reasoned that this behavior would place tight constraints on possible rules of genomic cis-regulatory logic. To find these constraints, we confronted our new expression data together with previously obtained data on other constructs with a computational model. The model contained representations of thermodynamic protein–DNA interactions including steric interference and cooperative binding, short-range repression, direct repression, activation, and coactivation. The model was highly constrained by the training data, which it described within the limits of experimental error. The model, so constrained, was able to correctly predict expression patterns driven by enhancers for other Drosophila genes; even-skipped enhancers not included in the training set; stripe 2, 3, and 7 enhancers from various Drosophilid and Sepsid species; and long segments of even-skipped regulatory DNA that contain multiple enhancers. The model further demonstrated that elevated expression driven by a fusion of MSE2 and MSE3 was a consequence of the recruitment of a portion of MSE3 to become a functional component of MSE2, demonstrating that cis-regulatory “elements” are not elementary objects.
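The thermodynamic ingredients named in this abstract (binding, cooperativity, steric interference) can be illustrated with a generic two-site equilibrium model. This is a textbook-style sketch, not the model used in the paper: the function name, the statistical weights `q1` and `q2`, and the interaction factor `omega` are assumptions made for illustration.

```python
def site_occupancies(q1, q2, omega=1.0):
    """Fractional occupancy of two binding sites at equilibrium.

    q1, q2: statistical weights (affinity times concentration) of each site.
    omega:  interaction factor for the doubly bound state
            (1 = independent, >1 = cooperative, 0 = mutually exclusive).
    """
    both = omega * q1 * q2
    # Partition function over the four states: empty, site 1, site 2, both.
    z = 1.0 + q1 + q2 + both
    return (q1 + both) / z, (q2 + both) / z


independent = site_occupancies(1.0, 1.0)
cooperative = site_occupancies(1.0, 1.0, omega=10.0)
exclusive = site_occupancies(1.0, 1.0, omega=0.0)
```

With equal weights, setting `omega` above 1 raises the occupancy of both sites relative to the independent case, while `omega = 0` models overlapping sites that cannot be bound at the same time.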


Proteins | 2004

Design of a data model for developing laboratory information management and analysis systems for protein production

Anne Pajon; John Ionides; Jon Diprose; Joël Fillon; Rasmus H. Fogh; Alun Ashton; Helen M. Berman; Wayne Boucher; Miroslaw Cygler; Emeline Deleury; Robert M. Esnouf; Joël Janin; Rosalind Kim; Isabelle Krimm; Catherine L. Lawson; Eric Oeuillet; Anne Poupon; Stéphane Raymond; Tim J. Stevens; Herman van Tilbeurgh; John D. Westbrook; Peter A. Wood; Eldon L. Ulrich; Wim F. Vranken; Li Xueli; Ernest D. Laue; David I. Stuart; Kim Henrick

Data management has emerged as one of the central issues in the high‐throughput processes of taking a protein target sequence through to a protein sample. To simplify this task, and following extensive consultation with the international structural genomics community, we describe here a model of the data related to protein production. The model is suitable for both large and small facilities for use in tracking samples, experiments, and results through the many procedures involved. The model is described in Unified Modeling Language (UML). In addition, we present relational database schemas derived from the UML. These relational schemas are already in use in a number of data management projects. Proteins 2005.


Current protocols in human genetics | 2006

Using MSDchem to search the PDB ligand dictionary.

Dimitris Dimitropoulos; John Ionides; Kim Henrick

The PDB ligand dictionary is the chemical reference database of all the small building block molecules (e.g., amino acids, nucleic acids, and bound ligands) in the Protein Data Bank (PDB) referenced by a distinct three-letter code identifier. Since PDB files have only three-dimensional coordinate data, the role of the dictionary is that of a reference resource for the actual chemical properties of small molecules, shared consistently across all PDB entries. The ligand dictionary is maintained in all sites of the Worldwide Protein Data Bank (wwPDB), the Research Collaboratory for Structural Bioinformatics (RCSB) in the U.S., the Macromolecular Structure Database (MSD) in Europe, and the Protein Data Bank in Japan (PDBj), and it is exchanged on a regular basis. The MSD group at the European BioInformatics Institute (EBI) extends the dictionary into the MSDchem ligand database, which utilizes chemo-informatics packages and incorporates additional curation work. MSDchem is publicly available on the Web through the MSDchem search system, the functionality of which is described in more detail in this unit.


Journal of Integrative Bioinformatics | 2010

MEMOPS: data modelling and automatic code generation.

Rasmus H. Fogh; Wayne Boucher; John Ionides; Wim F. Vranken; Tim J. Stevens; Ernest D. Laue

In recent years the amount of biological data has exploded to the point where much useful information can only be extracted by complex computational analyses. Such analyses are greatly facilitated by metadata standards, both in terms of the ability to compare data originating from different sources, and in terms of exchanging data in standard forms, e.g. when running processes on a distributed computing infrastructure. However, standards thrive on stability whereas science tends to constantly move, with new methods being developed and old ones modified. Therefore maintaining both metadata standards, and all the code that is required to make them useful, is a non-trivial problem. Memops is a framework that uses an abstract definition of the metadata (described in UML) to generate internal data structures and subroutine libraries for data access (application programming interfaces, or APIs, currently in Python, C and Java) and data storage (in XML files or databases). For the individual project these libraries obviate the need for writing code for input parsing, validity checking or output. Memops also ensures that the code is always internally consistent, massively reducing the need for code reorganisation. Across a scientific domain a Memops-supported data model makes it easier to support complex standards that can capture all the data produced in a scientific area, share them among all programs in a complex software pipeline, and carry them forward to deposition in an archive. The principles behind the Memops generation code will be presented, along with example applications in Nuclear Magnetic Resonance (NMR) spectroscopy and structural biology.
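To make the "input parsing, validity checking or output" point concrete, the sketch below drives XML storage from a single metadata definition. It is not Memops code: the `METADATA` dictionary, the `Resonance` record and the two function names are hypothetical, and only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata definition: element name -> {attribute name: type}.
METADATA = {"Resonance": {"serial": int, "shift": float, "isotope": str}}


def to_xml(kind, record):
    """Validate a record against the metadata and serialize it to XML text."""
    spec = METADATA[kind]
    if set(record) != set(spec):
        raise ValueError(f"{kind} requires exactly the fields {sorted(spec)}")
    element = ET.Element(kind)
    for name, typ in spec.items():
        value = record[name]
        if not isinstance(value, typ):
            raise TypeError(f"{kind}.{name} must be {typ.__name__}")
        element.set(name, str(value))
    return ET.tostring(element, encoding="unicode")


def from_xml(text):
    """Parse XML text back into a typed record using the same metadata."""
    element = ET.fromstring(text)
    spec = METADATA[element.tag]
    return {name: typ(element.attrib[name]) for name, typ in spec.items()}


record = {"serial": 1, "shift": 8.21, "isotope": "1H"}
restored = from_xml(to_xml("Resonance", record))
```

Both directions read the same `METADATA` entry, so the writer and the parser cannot drift apart when a field is added or retyped.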

Collaboration


Dive into John Ionides's collaborations.

Top Co-Authors

Kim Henrick

European Bioinformatics Institute

Wim F. Vranken

Radboud University Nijmegen Medical Centre

Dimitris Dimitropoulos

European Bioinformatics Institute

Eldon L. Ulrich

University of Wisconsin-Madison

Anne Pajon

European Bioinformatics Institute

Tim J. Stevens

Laboratory of Molecular Biology
