Publication


Featured research published by Vadim Zalunin.


Nucleic Acids Research | 2011

The European Nucleotide Archive

Rasko Leinonen; Ruth Akhtar; Ewan Birney; Lawrence Bower; Ana Cerdeño-Tárraga; Ying Cheng; Iain Cleland; Nadeem Faruque; Neil Goodgame; Richard Gibson; Gemma Hoad; Mikyung Jang; Nima Pakseresht; Sheila Plaister; Rajesh Radhakrishnan; Kethi Reddy; Siamak Sobhany; Petra ten Hoopen; Robert Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe’s primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimental research platform by providing data submission, archive, search and download services. In this article, we outline these services and describe major changes and improvements introduced during 2010. These include extended EMBL-Bank and SRA-data submission services, extended ENA Browser functionality, support for submitting data to the European Genome-phenome Archive (EGA) through SRA, and the launch of a new sequence similarity search service.


Nucleic Acids Research | 2009

Petabyte-scale innovations at the European Nucleotide Archive

Guy Cochrane; Ruth Akhtar; James K. Bonfield; Lawrence Bower; Fehmi Demiralp; Nadeem Faruque; Richard Gibson; Gemma Hoad; Tim Hubbard; Chris Hunter; Mikyung Jang; Szilveszter Juhos; Rasko Leinonen; Steven Leonard; Quan Lin; Rodrigo Lopez; Dariusz Lorenc; Hamish McWilliam; Gaurab Mukherjee; Sheila Plaister; Rajesh Radhakrishnan; Stephen Robinson; Siamak Sobhany; Petra ten Hoopen; Robert Vaughan; Vadim Zalunin; Ewan Birney

Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next generation sequence data, a brief summary of contents of the ENA and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches.


F1000Research | 2015

MinION Analysis and Reference Consortium: Phase 1 data release and analysis

Camilla L. C. Ip; Matthew Loose; John R. Tyson; Mariateresa de Cesare; Bonnie L. Brown; Miten Jain; Richard M. Leggett; David Eccles; Vadim Zalunin; John M. Urban; Paolo Piazza; Rory Bowden; Benedict Paten; Solomon Mwaigwisya; Elizabeth M. Batty; Jared T. Simpson; Terrance P. Snutch; Ewan Birney; David Buck; Sara Goodwin; Hans J. Jansen; Justin O'Grady; Hugh E. Olsen; MinION Analysis

The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional experiments in Phase 2 of E. coli from MARC are already underway to identify ways to improve and enhance MinION performance.


Nucleic Acids Research | 2012

Facing growth in the European Nucleotide Archive

Guy Cochrane; Blaise T. F. Alako; Clara Amid; Lawrence Bower; Ana Cerdeño-Tárraga; Iain Cleland; Richard Gibson; Neil Goodgame; Mikyung Jang; Simon Kay; Rasko Leinonen; Xiu Lin; Rodrigo Lopez; Hamish McWilliam; Arnaud Oisel; Nima Pakseresht; Swapna Pallreddy; Youngmi Park; Sheila Plaister; Rajesh Radhakrishnan; Stéphane Rivière; Marc Rossello; Alexander Senf; Nicole Silvester; Dimitriy Smirnov; Petra ten Hoopen; Ana Luisa Toribio; Daniel Vaughan; Vadim Zalunin

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.


Nucleic Acids Research | 2010

Improvements to services at the European Nucleotide Archive

Rasko Leinonen; Ruth Akhtar; Ewan Birney; James K. Bonfield; Lawrence Bower; Matthew Corbett; Ying Cheng; Fehmi Demiralp; Nadeem Faruque; Neil Goodgame; Richard Gibson; Gemma Hoad; Chris Hunter; Mikyung Jang; Steven Leonard; Quan Lin; Rodrigo Lopez; Michael Maguire; Hamish McWilliam; Sheila Plaister; Rajesh Radhakrishnan; Siamak Sobhany; Guy Slater; Petra ten Hoopen; Franck Valentin; Robert Vaughan; Vadim Zalunin; Daniel R. Zerbino; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe’s primary nucleotide sequence archival resource, safeguarding open nucleotide data access, engaging in worldwide collaborative data exchange and integrating with the scientific publication process. ENA has made significant contributions to the collaborative nucleotide archival arena as an active proponent of extending the traditional collaboration to cover capillary and next-generation sequencing information. We have continued to co-develop data and metadata representation formats with our collaborators for both data exchange and public data dissemination. In addition to the DDBJ/EMBL/GenBank feature table format, we share metadata formats for capillary and next-generation sequencing traces and are using and contributing to the NCBI SRA Toolkit for the long-term storage of the next-generation sequence traces. During the course of 2009, ENA has significantly improved sequence submission, search and access functionalities provided at EMBL–EBI. In this article, we briefly describe the content and scope of our archive and introduce major improvements to our services.


Nucleic Acids Research | 2015

Content discovery and retrieval services at the European Nucleotide Archive

Nicole Silvester; Blaise T. F. Alako; Clara Amid; Ana Cerdeño-Tárraga; Iain Cleland; Richard Gibson; Neil Goodgame; Petra ten Hoopen; Simon Kay; Rasko Leinonen; Weizhong Li; Xin Liu; Rodrigo Lopez; Nima Pakseresht; Swapna Pallreddy; Sheila Plaister; Rajesh Radhakrishnan; Marc Rossello; Alexander Senf; Dimitriy Smirnov; Ana Luisa Toribio; Daniel Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe’s primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.


Nucleic Acids Research | 2017

European Nucleotide Archive in 2016

Ana Luisa Toribio; Blaise T. F. Alako; Clara Amid; Ana Cerdeño-Tárraga; Laura Clarke; Iain Cleland; Susan Fairley; Richard Gibson; Neil Goodgame; Petra ten Hoopen; Suran Jayathilaka; Simon Kay; Rasko Leinonen; Xin Liu; Josué Martínez-Villacorta; Nima Pakseresht; Jeena Rajan; Kethi Reddy; Marc Rosello; Nicole Silvester; Dmitriy Smirnov; Daniel Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.


Nucleic Acids Research | 2014

Assembly information services in the European Nucleotide Archive

Nima Pakseresht; Blaise T. F. Alako; Clara Amid; Ana Cerdeño-Tárraga; Iain Cleland; Richard Gibson; Neil Goodgame; Tamer Gur; Mikyung Jang; Simon Kay; Rasko Leinonen; Weizhong Li; Xin Liu; Rodrigo Lopez; Hamish McWilliam; Arnaud Oisel; Swapna Pallreddy; Sheila Plaister; Rajesh Radhakrishnan; Stéphane Rivière; Marc Rossello; Alexander Senf; Nicole Silvester; Dimitriy Smirnov; Silvano Squizzato; Petra ten Hoopen; Ana Luisa Toribio; Daniel Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.


Nucleic Acids Research | 2012

Major submissions tool developments at the European Nucleotide Archive

Clara Amid; Ewan Birney; Lawrence Bower; Ana Cerdeño-Tárraga; Ying Cheng; Iain Cleland; Nadeem Faruque; Richard Gibson; Neil Goodgame; Chris Hunter; Mikyung Jang; Rasko Leinonen; Xin Liu; Arnaud Oisel; Nima Pakseresht; Sheila Plaister; Rajesh Radhakrishnan; Kethi Reddy; Stéphane Rivière; Marc Rossello; Alexander Senf; Dimitriy Smirnov; Petra ten Hoopen; Daniel Vaughan; Robert Vaughan; Vadim Zalunin; Guy Cochrane

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), Europe’s primary nucleotide sequence resource, captures and presents globally comprehensive nucleic acid sequence and associated information. Covering the spectrum from raw data to assembled and functionally annotated genomes, the ENA has witnessed a dramatic growth resulting from advances in sequencing technology and ever broadening application of the methodology. During 2011, we have continued to operate and extend the broad range of ENA services. In particular, we have released major new functionality in our interactive web submission system, Webin, through developments in template-based submissions for annotated sequences and support for raw next-generation sequence read submissions.


F1000Research | 2017

MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry

Miten Jain; John R. Tyson; Matthew Loose; Camilla L. C. Ip; David Eccles; Justin O'Grady; Sunir Malla; Richard M. Leggett; Ola Wallerman; Hans J. Jansen; Vadim Zalunin; Ewan Birney; Bonnie L. Brown; Terrance P. Snutch; Hugh E. Olsen

Background: Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of Escherichia coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry. Methods: We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads. Results: Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional (“2D”) N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of “pass” template reads from 1-dimensional (“1D”) experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion) decreased for 2D “pass” reads from 9.1% in R7.3 to 7.5% in R9.0 and for template “pass” reads from 26.7% in R7.3 to 14.5% in R9.0. Conclusions: These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology. Abbreviations: K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
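
The arithmetic behind two figures in the abstract above can be illustrated with a minimal sketch. The abstract does not state the denominator used for the total error rate; dividing by alignment length is an assumption here, and the function names and per-read counts are hypothetical, chosen only to reproduce a 7.5% rate. Only the 30 bps and 250 bps values come from the abstract.

```python
def total_error_rate(mismatches, insertions, deletions, aligned_length):
    """Sum of miscalls, insertions and deletions over alignment length.

    The choice of denominator is an assumption; the abstract does not state it.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (mismatches + insertions + deletions) / aligned_length


def fold_change(old, new):
    """Ratio of a new value to an old one."""
    return new / old


# Hypothetical counts for a single alignment (illustration only, not paper data).
rate = total_error_rate(mismatches=30, insertions=20, deletions=25,
                        aligned_length=1000)
print(f"total error rate: {rate:.1%}")

# Per-pore speeds quoted in the abstract: 30 bps (R7.3) and 250 bps (R9.0).
speedup = fold_change(30, 250)
print(f"speed increase: {speedup:.1f}-fold")
```

The speed ratio comes out slightly above 8, consistent with the rounded "8-fold" figure in the abstract.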

Collaboration


Top Co-Authors

Guy Cochrane, European Bioinformatics Institute
Rasko Leinonen, European Bioinformatics Institute
Petra ten Hoopen, European Bioinformatics Institute
Richard Gibson, European Bioinformatics Institute
Ana Cerdeño-Tárraga, European Bioinformatics Institute
Iain Cleland, European Bioinformatics Institute
Neil Goodgame, European Bioinformatics Institute
Nima Pakseresht, European Bioinformatics Institute
Clara Amid, European Bioinformatics Institute
Daniel Vaughan, European Bioinformatics Institute