Publication


Featured research published by Daniel Arend.


BMC Bioinformatics | 2014

e!DAL - a framework to store, share and publish research data

Daniel Arend; Matthias Lange; Jinbo Chen; Christian Colmsee; Steffen Flemming; Denny Hecht; Uwe Scholz

Background: The life-science community faces a major challenge in handling “big data”, highlighting the need for high-quality infrastructures capable of sharing and publishing research data. Data preservation, analysis, and publication are the three pillars in the “big data life cycle”. The infrastructures currently available for managing and publishing data are often designed to meet domain-specific or project-specific requirements, resulting in the repeated development of proprietary solutions and lower quality data publication and preservation overall. Results: e!DAL is a lightweight software framework for publishing and sharing research data. Its main features are version tracking, metadata management, information retrieval, registration of persistent identifiers (DOI), an embedded HTTP(S) server for public data access, access as a network file system, and a scalable storage backend. e!DAL is available as an API for local non-shared storage and as a remote API featuring distributed applications. It can be deployed “out-of-the-box” as an on-site repository. Conclusions: e!DAL was developed based on experiences coming from decades of research data management at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK). Initially developed as a data publication and documentation infrastructure for the IPK’s role as a data center in the DataCite consortium, e!DAL has grown towards being a general data archiving and publication infrastructure. The e!DAL software has been deployed into the Maven Central Repository. Documentation and software are also available at: http://edal.ipk-gatersleben.de.


Plant Methods | 2016

Measures for interoperability of phenotypic data: minimum information requirements and formatting

Hanna Ćwiek-Kupczyńska; Thomas Altmann; Daniel Arend; Elizabeth Arnaud; Dijun Chen; Guillaume Cornut; Fabio Fiorani; Wojciech Frohmberg; Astrid Junker; Christian Klukas; Matthias Lange; Cezary Mazurek; Anahita Nafissi; Pascal Neveu; Jan van Oeveren; Cyril Pommier; Hendrik Poorter; Philippe Rocca-Serra; Susanna-Assunta Sansone; Uwe Scholz; Marco van Schriek; Ümit Seren; Björn Usadel; Stephan Weise; Paul J. Kersey; Paweł Krajewski

Background: Plant phenotypic data shrouds a wealth of information which, when accurately analysed and linked to other data types, brings to light the knowledge about the mechanisms of life. As phenotyping is a field of research comprising manifold, diverse and time-consuming experiments, the findings can be fostered by reusing and combining existing datasets. Their correct interpretation, and thus replicability, comparability and interoperability, is possible provided that the collected observations are equipped with an adequate set of metadata. So far there have been no common standards governing phenotypic data description, which has hampered data exchange and reuse. Results: In this paper we propose guidelines for the proper handling of information about plant phenotyping experiments, in terms of both the recommended content of the description and its formatting. We provide a document called “Minimum Information About a Plant Phenotyping Experiment”, which specifies what information about each experiment should be given, and a Phenotyping Configuration for the ISA-Tab format, which makes it possible to organise this information in practice within a dataset. We provide examples of ISA-Tab-formatted phenotypic data, and a general description of a few systems where the recommendations have been implemented. Conclusions: Acceptance of the rules described in this paper by the plant phenotyping community will help to achieve findable, accessible, interoperable and reusable data.


Database | 2016

PGP repository: a plant phenomics and genomics data publication infrastructure.

Daniel Arend; Astrid Junker; Uwe Scholz; Danuta Schüler; Juliane Wylie; Matthias Lange

Plant genomics and phenomics represent the most promising tools for accelerating yield gains and overcoming emerging crop productivity bottlenecks. However, accessing this wealth of plant diversity requires the characterization of this material using state-of-the-art genomic, phenomic and molecular technologies and the release of subsequent research data via a long-term stable, open-access portal. Although several international consortia and public resource centres offer services for plant research data management, valuable digital assets remain unpublished and thus inaccessible to the scientific community. Recently, the Leibniz Institute of Plant Genetics and Crop Plant Research and the German Plant Phenotyping Network have jointly initiated the Plant Genomics and Phenomics Research Data Repository (PGP) as an infrastructure to comprehensively publish plant research data. This covers in particular cross-domain datasets that are not being published in central repositories because of their volume or unsupported data scope, like image collections from plant phenotyping and microscopy, unfinished genomes, genotyping data, visualizations of morphological plant models, data from mass spectrometry as well as software and documents. The repository is hosted at the Leibniz Institute of Plant Genetics and Crop Plant Research using e!DAL as software infrastructure and a Hierarchical Storage Management System as data archival backend. A newly developed data submission tool was made available for the consortium that features a high level of automation to lower the barriers of data publication. After an internal review process, data are published with citable digital object identifiers and a core set of technical metadata is registered at DataCite. The e!DAL-embedded web frontend generates a landing page for each dataset and supports interactive exploration. PGP is registered as a research data repository at BioSharing.org and re3data.org, and at OpenAIRE as a valid EU Horizon 2020 open data archive. These features, together with the programmatic interface and the support of standard metadata formats, enable PGP to fulfil the FAIR data principles: findable, accessible, interoperable, reusable. Database URL: http://edal.ipk-gatersleben.de/repos/pgp/


Scientific Data | 2016

Quantitative monitoring of Arabidopsis thaliana growth and development using high-throughput plant phenotyping

Daniel Arend; Matthias Lange; Jean-Michel Pape; Kathleen Weigelt-Fischer; Fernando Arana-Ceballos; Ingo Mücke; Christian Klukas; Thomas Altmann; Uwe Scholz; Astrid Junker

With the implementation of novel automated, high-throughput methods and facilities in recent years, plant phenomics has developed into a highly interdisciplinary research domain integrating biology, engineering and bioinformatics. Here we present a dataset of a non-invasive high-throughput plant phenotyping experiment, which uses image-based and image-analysis-based approaches to monitor the growth and development of 484 Arabidopsis thaliana plants (thale cress). The result is a comprehensive dataset of images and extracted phenotypical features. Such datasets require detailed documentation, standardized description of experimental metadata as well as sustainable data storage and publication in order to ensure the reproducibility of experiments, data reuse and comparability among the scientific community. Therefore, the dataset presented here has been annotated using the standardized ISA-Tab format and considering the recently published recommendations for the semantic description of plant phenotyping experiments.


GigaScience | 2018

Predicting plant biomass accumulation from image-derived parameters

Dijun Chen; Rongli Shi; Jean-Michel Pape; Kerstin Neumann; Daniel Arend; Andreas Graner; Ming Chen; Christian Klukas

Background: Image-based high-throughput phenotyping technologies have been rapidly developed in plant science recently, and they provide a great potential to gain more valuable information than traditionally destructive methods. Predicting plant biomass is regarded as a key purpose for plant breeders and ecologists. However, it is a great challenge to find a predictive biomass model across experiments. Results: In the present study, we constructed 4 predictive models to examine the quantitative relationship between image-based features and plant biomass accumulation. Our methodology has been applied to 3 consecutive barley (Hordeum vulgare) experiments with control and stress treatments. The results proved that plant biomass can be accurately predicted from image-based parameters using a random forest model. The high prediction accuracy based on this model will contribute to relieving the phenotyping bottleneck in biomass measurement in breeding applications. The prediction performance is still relatively high across experiments under similar conditions. The relative contribution of individual features for predicting biomass was further quantified, revealing new insights into the phenotypic determinants of the plant biomass outcome. Furthermore, methods could also be used to determine the most important image-based features related to plant biomass accumulation, which would be promising for subsequent genetic mapping to uncover the genetic basis of biomass. Conclusions: We have developed quantitative models to accurately predict plant biomass accumulation from image data. We anticipate that the analysis results will be useful to advance our views of the phenotypic determinants of plant biomass outcome, and the statistical methods can be broadly used for other plant species.


Journal of Biotechnology | 2017

From plant genomes to phenotypes

Marie E. Bolger; Rainer Schwacke; Heidrun Gundlach; Thomas Schmutzer; Jinbo Chen; Daniel Arend; Markus Oppermann; Stephan Weise; Matthias Lange; Fabio Fiorani; Manuel Spannagl; Uwe Scholz; Klaus F. X. Mayer

Recent advances in sequencing technologies have greatly accelerated the rate of plant genome and applied breeding research. Despite this advancing trend, plant genomes continue to present numerous difficulties to the standard tools and pipelines not only for genome assembly but also gene annotation and downstream analysis. Here we give a perspective on tools, resources and services necessary to assemble and analyze plant genomes and link them to plant phenotypes.


Bioinformatics and Biomedicine | 2012

The e!DAL JAVA-API: Store, share and cite primary data in life sciences

Daniel Arend; Matthias Lange; Christian Colmsee; Steffen Flemming; Jinbo Chen; Uwe Scholz

The paper presents e!DAL-API, a comprehensive storage backend for primary data management. e!DAL stands for Electronical Data Archive Library and implements a primary data storage infrastructure with the intuitive usability of a classical file system.


The Plant Genome | 2016

transPLANT Resources for Triticeae Genomic Data

Manuel Spannagl; Michael Alaux; Matthias Lange; Daniel M. Bolser; Kai Christian Bader; Thomas Letellier; Erik Kimmel; Raphael Flores; Cyril Pommier; Arnaud Kerhornou; Brandon Walts; Thomas Nussbaumer; Christoph Grabmüller; Jinbo Chen; Christian Colmsee; Sebastian Beier; Martin Mascher; Thomas Schmutzer; Daniel Arend; Anil Thanki; Ricardo H. Ramirez-Gonzalez; Martin Ayling; Sarah Ayling; Mario Caccamo; Klaus F. X. Mayer; Uwe Scholz; Delphine Steinbach; Hadi Quesneville; Paul J. Kersey

The genome sequences of many important Triticeae species, including bread wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), remained uncharacterized for a long time because of their high repeat content, large sizes, and polyploidy. As a result of improvements in sequencing technologies and novel analysis strategies, several of these have recently been deciphered. These efforts have generated new insights into Triticeae biology and genome organization and have important implications for downstream usage by breeders, experimental biologists, and comparative genomicists. transPLANT (http://www.transplantdb.eu) is an EU-funded project aimed at constructing hardware, software, and data infrastructure for genome-scale research in the life sciences. Since the Triticeae data are intrinsically complex, heterogeneous, and distributed, the transPLANT consortium has undertaken efforts to develop common data formats and tools that enable the exchange and integration of data from distributed resources. Here we present an overview of the individual Triticeae genome resources hosted by transPLANT partners, introduce the objectives of transPLANT, and outline common developments and interfaces supporting integrated data access.


Journal of Biotechnology | 2017

Bioinformatics in the plant genomic and phenomic domain: The German contribution to resources, services and perspectives

Thomas Schmutzer; Marie E. Bolger; Stephen Rudd; Jinbo Chen; Heidrun Gundlach; Daniel Arend; Markus Oppermann; Stephan Weise; Matthias Lange; Manuel Spannagl; Klaus Mayer; Uwe Scholz

Plant genetic resources are a substantial opportunity for plant breeding, preservation and maintenance of biological diversity. As part of the German Network for Bioinformatics Infrastructure (de.NBI), the German Crop BioGreenformatics Network (GCBN) focuses mainly on crop plants and provides both data and software infrastructure tailored to the needs of the plant research community. Our mission and key objectives include: (1) provision of transparent access to germplasm seeds, (2) the delivery of improved workflows for plant gene annotation, and (3) implementation of bioinformatics services that link genotypes and phenotypes. This review introduces the GCBN's spectrum of web services and integrated data resources that address common research problems in the plant genomics community.


Data Integration in the Life Sciences | 2014

Data Management Experiences and Best Practices from the Perspective of a Plant Research Institute

Daniel Arend; Christian Colmsee; H. Knüpffer; Markus Oppermann; Uwe Scholz; Danuta Schüler; Stephan Weise; Matthias Lange

Research in life sciences faces increasing amounts of cross-domain data, also known as “big data”. This has notable effects on IT departments and the dry lab desk alike. In this paper, we report on experiences from a decade of data management in a plant research institute. We explain the switch from personally managed files and heterogeneous information systems towards a centrally organised storage management. In particular, we discuss lessons that were learned within the last decade of productive research, data generation and software development from the perspective of a modern plant research institute and present the results of a strategic realignment of the data management infrastructure. Finally, we summarise the challenges which were solved and the questions which are still open.
