Da Qi | Researchain

Publication

Featured researches published by Da Qi.

Molecular & Cellular Proteomics | 2013

The mzQuantML data standard for mass spectrometry-based quantitative studies in proteomics

Mathias Walzer; Da Qi; Gerhard Mayer; Julian Uszkoreit; Martin Eisenacher; Timo Sachsenberg; Faviel F. Gonzalez-Galarza; Jun Fan; Conrad Bessant; Eric W. Deutsch; Florian Reisinger; Juan Antonio Vizcaíno; J. Alberto Medina-Aunon; Juan Pablo Albar; Oliver Kohlbacher; Andrew R. Jones

The range of heterogeneous approaches available for quantifying protein abundance via mass spectrometry (MS) leads to considerable challenges in modeling, archiving, exchanging, or submitting experimental data sets as supplemental material to journals. To date, there has been no widely accepted format for capturing the evidence trail of how quantitative analysis has been performed by software, for transferring data between software packages, or for submitting to public databases. In the context of the Proteomics Standards Initiative, we have developed the mzQuantML data standard. The standard can represent quantitative data about regions in two-dimensional retention time versus mass/charge space (called features), peptides, and proteins and protein groups (where there is ambiguity regarding peptide-to-protein inference), and it offers limited support for small molecule (metabolomic) data. The format has structures for representing replicate MS runs, grouping of replicates (for example, as study variables), and capturing the parameters used by software packages to arrive at these values. The format has the capability to reference other standards such as mzML and mzIdentML, and thus the evidence trail for the MS workflow as a whole can now be described. Several software implementations are available, and we encourage other bioinformatics groups to use mzQuantML as an input, internal, or output format for quantitative software and for structuring local repositories. All project resources are available in the public domain from the HUPO Proteomics Standards Initiative at http://www.psidev.info/mzquantml.

Proteomics | 2014

How to submit MS proteomics data to ProteomeXchange via the PRIDE database

Tobias Ternent; Attila Csordas; Da Qi; Guadalupe Gómez-Baena; Robert J. Beynon; Andrew R. Jones; Henning Hermjakob; Juan Antonio Vizcaíno

The ProteomeXchange (PX) consortium has been established to standardize and facilitate submission and dissemination of MS‐based proteomics data in the public domain. In the consortium, the PRIDE database at the European Bioinformatics Institute acts as the initial submission point of MS/MS data sets. In this manuscript, we explain step by step the submission process of MS/MS data sets to PX via PRIDE. We describe in detail the two available workflows, ‘complete’ and ‘partial’ submissions, together with the available tools to streamline the process. Throughout the manuscript, we use one example data set containing identification and quantification data, which has been deposited in PRIDE/ProteomeXchange with the accession number PXD000764 (http://proteomecentral.proteomexchange.org/dataset/PXD000764).

Omics A Journal of Integrative Biology | 2012

A Software Toolkit and Interface for Performing Stable Isotope Labeling and Top3 Quantification Using Progenesis LC-MS

Da Qi; Philip Brownridge; Dong Xia; Katherine Mackay; Faviel F. Gonzalez-Galarza; Jenna Kenyani; Victoria M. Harman; Robert J. Beynon; Andrew R. Jones

Numerous software packages exist to provide support for quantifying peptides and proteins from mass spectrometry (MS) data. However, many support only a subset of experimental methods or instrument types, meaning that laboratories often have to use multiple software packages. The Progenesis LC-MS software package from Nonlinear Dynamics is a software solution for label-free quantitation. However, many laboratories using Progenesis also wish to employ stable isotope-based methods that are not natively supported in Progenesis. We have developed a Java programming interface that can use the output files produced by Progenesis, allowing the basic MS features quantified across replicates to be used in a range of different experimental methods. We have developed post-processing software (the Progenesis Post-Processor) to embed Progenesis in the analysis of stable isotope labeling data and top3 pseudo-absolute quantitation. We have also created export ability to the new data standard, mzQuantML, produced by the Proteomics Standards Initiative to facilitate the development and standardization process. The software is provided to users with a simple graphical user interface for accessing the different features. The underlying programming interface may also be used by Java developers to develop other routines for analyzing data produced by Progenesis.
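The top3 quantitation mentioned in this abstract is commonly understood as estimating a protein's abundance from the mean signal of its three most intense peptides. The following is a minimal Python sketch of that idea only; it is not the Progenesis Post-Processor code, and the function name `top3_abundance`, the protein labels, and the intensity values are hypothetical.

```python
from statistics import mean


def top3_abundance(peptide_intensities, n=3):
    """Average the n most intense peptide signals for one protein.

    Assumption: proteins with fewer than n peptides are averaged over
    whatever peptides they have; real tools may discard them instead.
    """
    strongest = sorted(peptide_intensities, reverse=True)[:n]
    if not strongest:
        raise ValueError("no peptide intensities supplied")
    return mean(strongest)


# Hypothetical peptide intensities, keyed by a made-up protein label.
example = {
    "PROT_A": [5.0e6, 1.2e7, 8.0e6, 3.0e5],
    "PROT_B": [2.0e6, 1.0e6],
}

abundances = {protein: top3_abundance(values) for protein, values in example.items()}
```

Converting such averages into absolute amounts additionally needs a spiked standard of known quantity, which this sketch leaves out.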

Proteomics | 2014

The jmzQuantML programming interface and validator for the mzQuantML data standard.

Da Qi; Ritesh Krishna; Andrew R. Jones

The mzQuantML standard from the HUPO Proteomics Standards Initiative has recently been released, capturing quantitative data about peptides and proteins, following analysis of MS data. We present a Java application programming interface (API) for mzQuantML called jmzQuantML. The API provides robust bridges between Java classes and elements in mzQuantML files and allows random access to any part of the file. The API provides read and write capabilities, and is designed to be embedded in other software packages, enabling mzQuantML support to be added to proteomics software tools (http://code.google.com/p/jmzquantml/). The mzQuantML standard is designed around a multilevel validation system to ensure that files are structurally and semantically correct for different proteomics quantitative techniques. In this article, we also describe a Java software tool (http://code.google.com/p/mzquantml‐validator/) for validating mzQuantML files, which is a formal part of the data standard.

Biochimica et Biophysica Acta | 2014

A tutorial for software development in quantitative proteomics using PSI standard formats

Faviel F. Gonzalez-Galarza; Da Qi; Jun Fan; Conrad Bessant; Andrew R. Jones

The Human Proteome Organisation — Proteomics Standards Initiative (HUPO-PSI) has been working for ten years on the development of standardised formats that facilitate data sharing and public database deposition. In this article, we review three HUPO-PSI data standards — mzML, mzIdentML and mzQuantML, which can be used to design a complete quantitative analysis pipeline in mass spectrometry (MS)-based proteomics. In this tutorial, we briefly describe the content of each data model, sufficient for bioinformaticians to devise proteomics software. We also provide guidance on the use of recently released application programming interfaces (APIs) developed in Java for each of these standards, which makes it straightforward to read and write files of any size. We have produced a set of example Java classes and a basic graphical user interface to demonstrate how to use the most important parts of the PSI standards, available from http://code.google.com/p/psi-standard-formats-tutorial. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.

Proteomics | 2015

Representation of selected-reaction monitoring data in the mzQuantML data standard

Da Qi; Craig Lawless; Johan Teleman; Fredrik Levander; Stephen W. Holman; Simon J. Hubbard; Andrew R. Jones

The mzQuantML data standard was designed to capture the output of quantitative software in proteomics, to support submissions to public repositories, development of visualization software and pipeline/modular approaches. The standard is designed around a common core that can be extended to support particular types of technique through the release of semantic rules that are checked by validation software. The first release of mzQuantML supported four quantitative proteomics techniques via four sets of semantic rules: (i) intensity‐based (MS1) label free, (ii) MS1 label‐based (such as SILAC or N15), (iii) MS2 tag‐based (iTRAQ or tandem mass tags), and (iv) spectral counting. We present an update to mzQuantML for supporting SRM techniques. The update includes representing the quantitative measurements, and associated meta‐data, for SRM transitions, the mechanism for inferring peptide‐level or protein‐level quantitative values, and support for both label‐based and label‐free SRM protocols, through the creation of semantic rules and controlled vocabulary terms. We have updated the specification document for mzQuantML (version 1.0.1) and the mzQuantML validator to ensure that consistent files are produced by different exporters. We also report the capabilities for production of mzQuantML files from popular SRM software packages, such as Skyline and Anubis.

International Conference on Digital Mammography | 2006

Linking image structures with medical ontology information

Da Qi; Erika R. E. Denton; Reyer Zwiggelaar

Medical ontologies are being developed, some of them specifically for mammographic computer-aided diagnosis (CAD) systems. However, to provide full functionality for such mammographic CAD systems, it is essential that the ontology information is fully linked to the image information. This linking can be through problem-specific image attributes. However, such an approach tends to be non-generic. Here, we propose a framework that will use generic image structures and the topology that links the image structures. In the process we describe a comparison approach which takes the classes, attributes and semantics into account.

Proteomics | 2015

The mzqLibrary - An open source Java library supporting the HUPO-PSI quantitative proteomics standard

Da Qi; Huaizhong Zhang; Jun Fan; Simon Perkins; Addolorata Pisconti; Deborah M. Simpson; Conrad Bessant; Simon J. Hubbard; Andrew R. Jones

The mzQuantML standard has been developed by the Proteomics Standards Initiative for capturing, archiving and exchanging quantitative proteomic data, derived from mass spectrometry. It is a rich XML‐based format, capable of representing data about two‐dimensional features from LC‐MS data, and peptides, proteins or groups of proteins that have been quantified from multiple samples. In this article we report the development of an open source Java‐based library of routines for mzQuantML, called the mzqLibrary, and associated software for visualising data called the mzqViewer. The mzqLibrary contains routines for mapping (peptide) identifications on quantified features, inference of protein (group)‐level quantification values from peptide‐level values, normalisation and basic statistics for differential expression. These routines can be accessed via the command line, via a Java programming interface or via a basic graphical user interface. The mzqLibrary also contains several file format converters, including import converters (to mzQuantML) from OpenMS, Progenesis LC‐MS and MaxQuant, and exporters (from mzQuantML) to other standards or useful formats (mzTab, HTML, csv). The mzqViewer contains in‐built routines for viewing the tables of data (about features, peptides or proteins), and connects to the R statistical library for more advanced plotting options. The mzqLibrary and mzqViewer packages are available from https://code.google.com/p/mzq‐lib/.

Journal of Proteomics | 2018

Comparative qualitative phosphoproteomics analysis identifies shared phosphorylation motifs and associated biological processes in evolutionary divergent plants

Shireen Al-Momani; Da Qi; Zhe Ren; Andrew R. Jones

Phosphorylation is one of the most prevalent post-translational modifications and plays a key role in regulating cellular processes. We carried out a bioinformatics analysis of pre-existing phosphoproteomics data to profile two model species representing the largest subclasses in flowering plants, the dicot Arabidopsis thaliana and the monocot Oryza sativa, to understand the extent to which phosphorylation signaling and function are conserved across evolutionarily divergent plants. We identified 6537 phosphopeptides from 3189 phosphoproteins in Arabidopsis and 2307 phosphopeptides from 1613 phosphoproteins in rice. We identified phosphorylation motifs, finding nineteen pS motifs and two pT motifs shared in rice and Arabidopsis. The majority of shared motif-containing proteins were mapped to the same biological processes with similar patterns of fold enrichment, indicating high functional conservation. We also identified shared patterns of crosstalk between phosphoserines with enrichment for motifs pSXpS, pSXXpS and pSXXXpS, where X is any amino acid. Lastly, our results identified several pairs of motifs that are significantly enriched to co-occur in Arabidopsis proteins, indicating crosstalk between different sites, but this was not observed in rice. Significance: Our results demonstrate that there are evolutionarily conserved mechanisms of phosphorylation-mediated signaling in plants, via analysis of high-throughput phosphorylation proteomics data from key monocot and dicot species: rice and Arabidopsis thaliana. The results also suggest that there is increased crosstalk between phosphorylation sites in A. thaliana compared with rice. The results are important for our general understanding of cell signaling in plants, and the ability to use A. thaliana as a general model for plant biology.
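The spacing motifs named in this abstract (pSXpS, pSXXpS and pSXXXpS) describe two phosphoserines separated by one, two or three residues. As a rough illustration of how such pairs could be located in a single protein, here is a minimal Python sketch; it is not the authors' pipeline, and the function name `phosphoserine_pairs`, the 0-based position convention, and the toy sequence are assumptions made for the example.

```python
def phosphoserine_pairs(sequence, phospho_positions):
    """Return (motif, first_site, second_site) for nearby phosphoserine pairs.

    Assumptions: positions are 0-based indices into `sequence`, and any
    reported site that is not a serine is ignored.
    """
    # Index distance between the two serines -> motif label.
    motifs = {2: "pSXpS", 3: "pSXXpS", 4: "pSXXXpS"}
    sites = sorted(
        p for p in phospho_positions
        if 0 <= p < len(sequence) and sequence[p] == "S"
    )
    site_set = set(sites)
    hits = []
    for first in sites:
        for gap, name in motifs.items():
            second = first + gap
            if second in site_set:
                hits.append((name, first, second))
    return hits


# Toy sequence with hypothetical phosphosites at indices 2, 4 and 7.
pairs = phosphoserine_pairs("MASPSRRSLK", {2, 4, 7})
```

Testing whether such pairs are enriched would further require a background expectation, for example from serine pairs that are not both phosphorylated, which is outside the scope of this sketch.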

bioRxiv | 2017

Comparative Qualitative Phosphoproteomics Analysis Identifies Shared Phosphorylation Motifs and Associated Biological Processes in Flowering Plants

Shireen Al-Momani; Da Qi; Zhe Ren; Andrew Jones

Phosphorylation is regarded as one of the most prevalent post-translational modifications and plays a key role in regulating cellular processes. In this work we carried out a comparative bioinformatics analysis of phosphoproteomics data to profile two model species representing the largest subclasses in flowering plants, the dicot Arabidopsis thaliana and the monocot Oryza sativa, to understand the extent to which phosphorylation signaling and function are conserved across evolutionarily divergent plants. Using pre-existing mass spectrometry phosphoproteomics datasets and bioinformatic tools and resources, we identified 6,537 phosphopeptides from 3,189 phosphoproteins in Arabidopsis and 2,307 phosphopeptides from 1,613 phosphoproteins in rice. The relative abundance ratios of serine, threonine, and tyrosine phosphorylation sites in rice and Arabidopsis were highly similar: 88.3:11.4:0.4 and 86.7:12.8:0.5, respectively. Tyrosine phosphorylation shows features different from serine and threonine phosphorylation and was found to be more frequent in doubly-phosphorylated peptides in Arabidopsis. We identified phosphorylation sequence motifs in the two species to explore the similarities, finding nineteen pS motifs and two pT motifs that are shared in rice and Arabidopsis; among them are five novel motifs that have not previously been described in both species. The majority of shared motif-containing proteins were mapped to the same biological processes with similar patterns of fold enrichment, indicating high functional conservation. We also identified shared patterns of crosstalk between phosphoserines with motifs pSXpS, pSXXpS and pSXXXpS, where X is any amino acid, in both species, indicating this is an evolutionarily conserved signaling mechanism in flowering plants. However, our results suggest that there is greater co-occurrence of crosstalk between phosphorylation sites in Arabidopsis, and we were able to identify several pairs of motifs that are statistically significantly enriched to co-occur in Arabidopsis proteins, but not in rice.