Andrius Merkys
Vilnius University
Publication
Featured research published by Andrius Merkys.
Nucleic Acids Research | 2012
Saulius Gražulis; Adriana Daškevič; Andrius Merkys; D. Chateigner; Luca Lutterotti; Miguel Quirós; Nadezhda R. Serebryanaya; Peter Moeck; Robert T. Downs; Armel Le Bail
Using an open-access distribution model, the Crystallography Open Database (COD, http://www.crystallography.net) collects all known ‘small molecule / small to medium sized unit cell’ crystal structures and makes them available freely on the Internet. As of today, the COD has aggregated ∼150 000 structures, offering basic search capabilities and the possibility to download the whole database, or parts thereof using a variety of standard open communication protocols. A newly developed website provides capabilities for all registered users to deposit published and so far unpublished structures as personal communications or pre-publication depositions. Such a setup enables extension of the COD database by many users simultaneously. This increases the possibilities for growth of the COD database, and is the first step towards establishing a world wide Internet-based collaborative platform dedicated to the collection and curation of structural knowledge.
Acta Crystallographica Section D Structural Biology | 2017
Fei Long; Robert A. Nicholls; Paul Emsley; Saulius Gražulis; Andrius Merkys; Antanas Vaitkus; Garib N. Murshudov
The program AceDRG generates accurate stereochemical descriptions, and one or more conformations, of a given ligand. The program also analyses entries and extracts local environment-dependent atom types, bonds and angles from the Crystallography Open Database.
Journal of Applied Crystallography | 2015
Saulius Gražulis; Andrius Merkys; Antanas Vaitkus; Mykolas Okulič-Kazarinas
An algorithm to compute stoichiometrically correct molecular formulae from crystal structures is proposed. The algorithm’s output is suitable for high-volume automated searches in chemical databases and for linking crystallographic and chemical information.
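The abstract does not describe the algorithm itself, so the following is only a minimal sketch of its last step: turning a list of atoms already assigned to one molecule into a canonical formula string that can be compared in automated searches. It assumes Hill ordering (carbon first, then hydrogen, then the remaining elements alphabetically; purely alphabetical when no carbon is present) and a space-separated output; the function name `hill_formula` is hypothetical and not taken from the paper.

```python
from collections import Counter


def hill_formula(elements):
    """Return a Hill-ordered formula string for a list of element symbols.

    Hypothetical helper: it only counts and orders atoms; it does not
    handle symmetry, disorder or partial occupancies.
    """
    counts = Counter(elements)
    if "C" in counts:
        # Carbon first, hydrogen second, everything else alphabetically.
        rest = sorted(el for el in counts if el not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # Without carbon, all elements (including H) are alphabetical.
        order = sorted(counts)
    parts = [el if counts[el] == 1 else f"{el}{counts[el]}" for el in order]
    return " ".join(parts)


print(hill_formula(["C", "H", "H", "H", "O", "H"]))  # C H4 O
```

Because the ordering is deterministic, two structures with the same composition yield identical strings, which is what makes plain string matching usable for linking records.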
Journal of Applied Crystallography | 2016
Andrius Merkys; Antanas Vaitkus; Justas Butkus; Mykolas Okulič-Kazarinas; Visvaldas Kairys; Saulius Gražulis
A syntax-correcting CIF parser, COD::CIF::Parser, is described that can parse CIF 1.1 files and accurately report the position and nature of the discovered syntactic problems while automatically correcting the most common and the most obvious syntactic deficiencies.
Acta Crystallographica Section D Structural Biology | 2017
Fei Long; Robert A. Nicholls; Paul Emsley; Saulius Gražulis; Andrius Merkys; Antanas Vaitkus; Garib N. Murshudov
The entries from a freely available small-molecule database, the Crystallography Open Database, have been validated and a reliable subset of molecules has been selected for the extraction of molecular-geometry information. The atom types and corresponding bond and angle classes derived from this database have been subjected to validation, the results of which are used by AceDRG in the derivation of new ligand descriptions.
Journal of Cheminformatics | 2017
Andrius Merkys; Nicolas Mounet; Andrea Cepellotti; Nicola Marzari; Saulius Gražulis; Giovanni Pizzi
In order to make results of computational scientific research findable, accessible, interoperable and re-usable, it is necessary to decorate them with standardised metadata. However, there are a number of technical and practical challenges that make this process difficult to achieve in practice. Here the implementation of a protocol is presented to tag crystal structures with their computed properties, without the need for human intervention to curate the data. This protocol leverages the capabilities of AiiDA, an open-source platform to manage and automate scientific computational workflows, and the TCOD, an open-access database storing computed materials properties using a well-defined and exhaustive ontology. Based on these, the complete procedure to deposit computed data in the TCOD database is automated. All relevant metadata are extracted from the full provenance information that AiiDA tracks and stores automatically while managing the calculations. Such a protocol also enables reproducibility of scientific data in the field of computational materials science. As a proof of concept, the AiiDA–TCOD interface is used to deposit 170 theoretical structures together with their computed properties and their full provenance graphs, consisting of over 4600 AiiDA nodes.
Acta Crystallographica Section A | 2014
Saulius Gražulis; Andrius Merkys; Antanas Vaitkus; Armel Le Bail; D. Chateigner; Linas Vilčiauskas; Stefaan Cottenier; Torbjörn Björkman; Peter Murray-Rust
As computational chemistry methods enjoy unprecedented growth, computer power increases and the price/performance ratio drops, a large number of crystal structures can today be refined and their properties computed using modern theoretical approaches (DFT, post-HF, QM/MM, etc.). We thus increasingly feel that an open collection of theoretically computed chemical structures would be a valuable resource for the scientific community. To address this need, we have launched a Theoretical Crystallography Open Database (TCOD, [4]). The TCOD aims to collect a comprehensive set of computed crystal structures that would be made available under an Open Data license and invites all scientists to deposit their published results or pre-publication data. Accompanied by a large set of experimental structures in the COD database [3], the TCOD opens immediate possibilities for experimental and theoretical data cross-validation. The property results can now be tested against the Material Properties Open Database [6, 1].
Journal of Cheminformatics | 2018
Miguel Quirós; Saulius Gražulis; Saulė Girdzijauskaitė; Andrius Merkys; Antanas Vaitkus
Computer descriptions of chemical molecular connectivity are necessary for searching chemical databases and for predicting chemical properties from molecular structure. In this article, the ongoing work to describe the chemical connectivity of entries contained in the Crystallography Open Database (COD) in SMILES format is reported. This collection of SMILES is publicly available for chemical (substructure) search or for any other purpose on an open-access basis, as is the COD itself. The conventions that have been followed for the representation of compounds that do not fit into the valence bond theory are outlined for the most frequently found cases. The procedure for getting the SMILES out of the CIF files starts with checking whether the atoms in the asymmetric unit are a chemically acceptable image of the compound. When they are not (molecule on a symmetry element, disorder, polymeric species, etc.), the previously published cif_molecule program is used to get such an image in many cases. The program package Open Babel is then applied to get SMILES strings from the CIF files (either those directly taken from the COD or those produced by cif_molecule when applicable). The results are then checked and/or fixed by a human editor, in a computer-aided task that at present still consumes a great deal of human time. Even if the procedure still needs to be improved to make it more automatic (and hence faster), it has already yielded more than 160,000 curated chemical structures, and the purpose of this article is to announce the existence of this work to the chemical community as well as to spread the use of its results.
Acta Crystallographica Section A | 2017
Saulius Gražulis; Andrius Merkys; Antanas Vaitkus
In today’s swiftly changing world, it is especially important that students get access to comprehensive volumes of up-to-date information. Developing skills for making inferences from data, as well as for handling the data itself, is mandatory for a modern curriculum. Open databases play a crucial part in this process, since they permit both unlimited re-use and sharing of data in the classroom and participation in real-life data gathering and curation tasks. The Crystallography Open Database (COD, [1]) provides an excellent tool for such a teaching approach. After an initial introduction to crystallographic data handling, motivated students get individual assignments for data analysis and curation of the Crystallography Open Database. While working on these problems, students learn in practice how to perform various tasks using live crystallographic data. Using the COD makes it possible to present students with real-life, up-to-date problems. Initial solutions are checked by a supervisor and, if found good, are committed to the COD under mutual agreement with the students. As students gain experience, they receive progressively more challenging tasks. After a period of initial training, the most motivated students can perform studies on their own and use the COD as a research tool for their bachelor, MSc and PhD theses. The skills gained during initial assignments allow students to quickly join the research team, participate in projects and co-author joint publications [2–3]. For students, such an arrangement gives rich work experience and early participation in research; the open nature of the COD allows students to continue their work using the same tools even if they later decide to leave the group. For the group, a large benefit arises from the skilled work force, which makes it possible to participate in various national and international projects.
For example, the current COD team recruited some former MSc and current PhD students to participate in three calls from the Research Council of Lithuania (two were granted financing), and in the currently running SOLSA project from the European Union under the Horizon 2020 program. The report will present several personal success stories and emphasize the strengths of this approach; we will also discuss the challenges that we met while implementing it and expose what could be considered weaknesses of our current teaching method.
Acta Crystallographica Section A | 2014
Fei Long; Saulius Gražulis; Andrius Merkys; Garib N. Murshudov
The use of prior chemical knowledge, such as bond lengths and bond angles of the constituent blocks of macromolecules and ligands, is an essential part of macromolecular crystal structure analysis. One of the reliable sources of such chemical knowledge is a small-molecule database, where small-molecule crystal structures have been analysed against high-resolution, high-quality experimental data. Furthermore, the vast amount of data in a small-molecule database provides comprehensive coverage of flexible chemical environments and enables proper statistical analysis to avoid biased representation of those chemical properties. This presentation describes our work on organizing the data from an open-access, daily updated small-molecule database, the Crystallography Open Database (COD) [1], into a new generation of the CCP4 monomer library (Dictionary), a container of prior chemical knowledge [2]. In order to describe the specific environment atoms are in, they are classified into different atomic types based on local graphs and some basic chemical properties of atoms. This scheme can be applied to any small-molecule database. The atom types, and the values of bond lengths and bond angles associated with them, are further clustered into a hierarchical tree, and an isomorphism-mapping algorithm is implemented to facilitate fast search among a large number of atom types (typically several million). This also provides a mechanism to derive reliable values for bond lengths and angles of novel ligands. Metal and non-organic atoms are treated differently from organic ones. The original data in the COD are curated using several criteria, and further statistical analysis of the derived values of bond lengths and angles allows reliable chemical information to be extracted from such databanks as the COD.
There are several software tools associated with the new dictionary, including tools to 1) generate “ideal” bond lengths and angles for an unknown ligand; 2) generate starting coordinates representing one of the conformations of the ligand under consideration.
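The idea of classifying atoms by their local environment and attaching bond-length statistics to the resulting types can be illustrated with a minimal sketch. It assumes a far simpler type definition than the one in the abstract (an element plus the sorted elements of its direct neighbours); the function names `atom_types` and `mean_bond_lengths` and the toy bond lengths are hypothetical and are not taken from the CCP4 tools or the COD.

```python
from collections import defaultdict
from statistics import mean


def atom_types(elements, bonds):
    """Label each atom by its element and the sorted elements of its neighbours.

    Hypothetical simplification: the scheme in the abstract uses richer
    local graphs and additional chemical properties.
    """
    neighbours = defaultdict(list)
    for i, j, _length in bonds:
        neighbours[i].append(elements[j])
        neighbours[j].append(elements[i])
    return [
        f"{el}({''.join(sorted(neighbours[idx]))})"
        for idx, el in enumerate(elements)
    ]


def mean_bond_lengths(elements, bonds):
    """Average the observed bond lengths per unordered pair of atom types."""
    types = atom_types(elements, bonds)
    samples = defaultdict(list)
    for i, j, length in bonds:
        key = tuple(sorted((types[i], types[j])))
        samples[key].append(length)
    return {key: mean(values) for key, values in samples.items()}


# Toy methanol-like fragment with made-up bond lengths in angstroms.
elements = ["C", "O", "H", "H", "H", "H"]
bonds = [(0, 1, 1.43), (0, 2, 1.09), (0, 3, 1.09), (0, 4, 1.10), (1, 5, 0.96)]
types = atom_types(elements, bonds)
table = mean_bond_lengths(elements, bonds)
```

Looking up the type pair of a bond in `table` then gives an "ideal" length for a ligand that was never observed itself, provided its atoms fall into types that were.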