Publication


Featured research published by Junjun Zhang.


Nature | 2006

Global variation in copy number in the human genome

Richard Redon; Shumpei Ishikawa; Karen R. Fitch; Lars Feuk; George H. Perry; T. Daniel Andrews; Heike Fiegler; Michael H. Shapero; Andrew R. Carson; Wenwei Chen; Eun Kyung Cho; Stephanie Dallaire; Jennifer L. Freeman; Juan R. González; Mònica Gratacòs; Jing Huang; Dimitrios Kalaitzopoulos; Daisuke Komura; Jeffrey R. MacDonald; Christian R. Marshall; Rui Mei; Lyndal Montgomery; Kunihiro Nishimura; Kohji Okamura; Fan Shen; Martin J. Somerville; Joelle Tchinda; Armand Valsesia; Cara Woodwark; Fengtang Yang

Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single-nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. A total of 1,447 copy number variable regions (CNVRs), which can encompass overlapping or adjacent gains or losses, covering 360 megabases (12% of the genome) were identified in these populations. These CNVRs contained hundreds of genes, disease loci, functional elements and segmental duplications. Notably, the CNVRs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal marked variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.


Nucleic Acids Research | 2009

BioMart Central Portal—unified access to biological data

Syed Haider; Benoit Ballester; Damian Smedley; Junjun Zhang; Peter A. Rice; Arek Kasprzyk

BioMart Central Portal (www.biomart.org) offers a one-stop shop solution to access a wide array of biological databases. These include major biomolecular sequence, pathway and annotation databases such as Ensembl, UniProt, Reactome, HGNC, WormBase and PRIDE; for a complete list, visit http://www.biomart.org/biomart/martview. Moreover, the web server features seamless data federation, making it possible to cross-query these data sources in a user-friendly and unified way. The web server not only provides access through a web interface (MartView) but also supports programmatic access through a Perl API as well as RESTful and SOAP-oriented web services. The website is free and open to all users, and there is no login requirement.


Cytogenetic and Genome Research | 2006

Development of bioinformatics resources for display and analysis of copy number and other structural variants in the human genome

Junjun Zhang; Lars Feuk; G.E. Duggan; Razi Khaja; Stephen W. Scherer

The discovery of an abundance of copy number variants (CNVs; gains and losses of DNA sequences >1 kb) and other structural variants in the human genome is influencing the way research and diagnostic analyses are being designed and interpreted. As such, comprehensive databases with the most relevant information will be critical to fully understand the results and have impact in a diverse range of disciplines ranging from molecular biology to clinical genetics. Here, we describe the development of bioinformatics resources to facilitate these studies. The Database of Genomic Variants (http://projects.tcag.ca/variation/) is a comprehensive catalogue of structural variation in the human genome. The database currently contains 1,267 regions reported to contain copy number variation or inversions in apparently healthy human cases. We describe the current contents of the database and how it can serve as a resource for interpretation of array comparative genomic hybridization (array CGH) and other DNA copy imbalance data. We also present the structure of the database, which was built using a new data modeling methodology termed Cross-Referenced Tables (XRT). This is a generic and easy-to-use platform, which is strong in handling textual data and complex relationships. Web-based presentation tools have been built allowing publication of XRT data to the web immediately along with rapid sharing of files with other databases and genome browsers. We also describe a novel tool named eFISH (electronic fluorescence in situ hybridization) (http://projects.tcag.ca/efish/), a BLAST-based program that was developed to facilitate the choice of appropriate clones for FISH and CGH experiments, as well as interpretation of results in which genomic DNA probes are used in hybridization-based experiments.


Database | 2011

International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data

Junjun Zhang; Joachim Baran; Anthony Cros; Jonathan M. Guberman; Syed Haider; Jack Hsu; Yong Liang; Elena Rivkin; Jianxin Wang; Brett Whitty; Marie Wong-Erasmus; Long Yao; Arek Kasprzyk

The International Cancer Genome Consortium (ICGC) is a collaborative effort to characterize genomic abnormalities in 50 different cancer types. To make this data available, the ICGC has created the ICGC Data Portal. Powered by the BioMart software, the Data Portal allows each ICGC member institution to manage and maintain its own databases locally, while seamlessly presenting all the data in a single access point for users. The Data Portal currently contains data from 24 cancer projects, including ICGC, The Cancer Genome Atlas (TCGA), Johns Hopkins University, and the Tumor Sequencing Project. It consists of 3478 genomes and 13 cancer types and subtypes. Available open access data types include simple somatic mutations, copy number alterations, structural rearrangements, gene expression, microRNAs, DNA methylation and exon junctions. Additionally, simple germline variations are available as controlled access data. The Data Portal uses a web-based graphical user interface (GUI) to offer researchers multiple ways to quickly and easily search and analyze the available data. The web interface can assist in constructing complicated queries across multiple data sets. Several application programming interfaces are also available for programmatic access. Here we describe the organization, functionality, and capabilities of the ICGC Data Portal. Database URL: http://dcc.icgc.org


Nature Genetics | 2006

Genome assembly comparison identifies structural variants in the human genome

Razi Khaja; Junjun Zhang; Jeffrey R. MacDonald; Yongshu He; Ann M Joseph-George; John Wei; Muhammad A Rafiq; Cheng Qian; Mary Shago; Lorena Pantano; Hiroyuki Aburatani; Keith W. Jones; Richard Redon; Lluís Armengol; Xavier Estivill; Richard J. Mural; Charles Lee; Stephen W. Scherer; Lars Feuk

Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.


Database | 2011

BioMart Central Portal: an open database network for the biological community

Jonathan M. Guberman; J. Ai; Olivier Arnaiz; Joachim Baran; Andrew Blake; Richard Baldock; Claude Chelala; David Croft; Anthony Cros; Rosalind J. Cutts; A. Di Génova; Simon A. Forbes; T. Fujisawa; Emanuela Gadaleta; David Goodstein; Gunes Gundem; Bernard Haggarty; Syed Haider; Matthew Hall; Todd W. Harris; Robin Haw; Songnian Hu; Simon J. Hubbard; Jack Hsu; Vivek Iyer; Philip Jones; Toshiaki Katayama; Rhoda Kinsella; Lei Kong; Daniel Lawson

BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org.


Database | 2011

BioMart: a data federation framework for large collaborative projects

Junjun Zhang; Syed Haider; Joachim Baran; Anthony Cros; Jonathan M. Guberman; Jack Hsu; Yong Liang; Long Yao; Arek Kasprzyk

BioMart is a freely available, open source, federated database system that provides a unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, such that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to efficiently manage large data sets and offers a diverse selection of graphical user interfaces and application programming interfaces to ensure that queries can be performed in whatever manner is most convenient for the user. The software has now been adopted by a large number of different biological databases spanning a wide range of data types and providing a rich source of annotation available to bioinformaticians and biologists alike. Database URL: http://www.biomart.org


Methods in Molecular Biology | 2006

Methods for identifying and mapping recent segmental and gene duplications in eukaryotic genomes

Razi Khaja; Jeffrey R. MacDonald; Junjun Zhang; Stephen W. Scherer

The aim of this chapter is to provide instruction for analyzing and mapping recent segmental and gene duplications in eukaryotic genomes. We describe a bioinformatics-based approach utilizing computational tools to manage eukaryotic genome sequences to characterize and understand the evolutionary fates and trajectories of duplicated genes. An introduction to bioinformatics tools and programs such as BLAST, Perl, BioPerl, and the GFF specification provides the necessary background to complete this analysis for any eukaryotic genome of interest.


Proceedings of the National Academy of Sciences of the United States of America | 2006

Hotspots for copy number variation in chimpanzees and humans

George H. Perry; Joelle Tchinda; Sean McGrath; Junjun Zhang; Simon R. Picker; Angela M. Cáceres; A. John Iafrate; Chris Tyler-Smith; Stephen W. Scherer; Evan E. Eichler; Anne C. Stone; Charles Lee


Science | 2003

Human Chromosome 7: DNA Sequence and Biology

Stephen W. Scherer; Joseph Cheung; Jeffrey R. MacDonald; Lucy R. Osborne; Kazuhiko Nakabayashi; Jo Anne Herbrick; Andrew R. Carson; Layla Parker-Katiraee; Jennifer Skaug; Razi Khaja; Junjun Zhang; Alexander K. Hudek; Martin Li; May Haddad; Gavin E. Duggan; Bridget A. Fernandez; Emiko Kanematsu; Simone Gentles; Constantine C. Christopoulos; Sanaa Choufani; Dorota Kwasnicka; Xiangqun H. Zheng; Zhongwu Lai; Deborah Nusskern; Qing Zhang; Zhiping Gu; Fu Lu; Susan Zeesman; Małgorzata J.M. Nowaczyk; Ikuko Teshima

Collaboration

Top co-authors of Junjun Zhang.

Stephen W. Scherer
The Centre for Applied Genomics

Syed Haider
Ontario Institute for Cancer Research

Jeffrey R. MacDonald
The Centre for Applied Genomics

Razi Khaja
The Centre for Applied Genomics

Arek Kasprzyk
Ontario Institute for Cancer Research

Lars Feuk
The Centre for Applied Genomics

Anthony Cros
University of Cambridge

Jack Hsu
University of Cambridge