Heorhiy Byelas
University of Groningen
Publication
Featured research published by Heorhiy Byelas.
Nature Genetics | 2014
Laurent C. Francioli; Androniki Menelaou; Sara L. Pulit; Freerk van Dijk; Pier Francesco Palamara; Clara C. Elbers; Pieter B. T. Neerincx; Kai Ye; Victor Guryev; Wigard P. Kloosterman; Patrick Deelen; Abdel Abdellaoui; Elisabeth M. van Leeuwen; Mannis van Oven; Martijn Vermaat; Mingkun Li; Jeroen F. J. Laros; Lennart C. Karssen; Alexandros Kanterakis; Najaf Amin; Jouke-Jan Hottenga; Eric-Wubbo Lameijer; Mathijs Kattenberg; Martijn Dijkstra; Heorhiy Byelas; Jessica van Setten; Barbera D. C. van Schaik; Jan Bot; Isaac J. Nijman; Ivo Renkens
Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring families and constructed a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions. The intermediate coverage (∼13×) and trio design enabled extensive characterization of structural variation, including midsize events (30–500 bp) previously poorly catalogued and de novo mutations. We demonstrate that the quality of the haplotypes boosts imputation accuracy in independent samples, especially for lower frequency alleles. Population genetic analyses demonstrate fine-scale structure across the country and support multiple ancient migrations, consistent with historical changes in sea level and flooding. The GoNL Project illustrates how single-population whole-genome sequencing can provide detailed characterization of genetic variation and may guide the design of future population studies.
European Journal of Human Genetics | 2014
Dorret I. Boomsma; Cisca Wijmenga; Eline Slagboom; Morris A. Swertz; Lennart C. Karssen; Abdel Abdellaoui; Kai Ye; Victor Guryev; Martijn Vermaat; Freerk van Dijk; Laurent C. Francioli; Jouke-Jan Hottenga; Jeroen F. J. Laros; Qibin Li; Yingrui Li; Hongzhi Cao; Ruoyan Chen; Yuanping Du; Ning Li; Sujie Cao; Jessica van Setten; Androniki Menelaou; Sara L. Pulit; Jayne Y. Hehir-Kwa; Marian Beekman; Clara C. Elbers; Heorhiy Byelas; Anton J. M. de Craen; Patrick Deelen; Martijn Dijkstra
Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
European Journal of Human Genetics | 2014
Patrick Deelen; Androniki Menelaou; Elisabeth M. van Leeuwen; Alexandros Kanterakis; Freerk van Dijk; Carolina Medina-Gomez; Laurent C. Francioli; Jouke-Jan Hottenga; Lennart C. Karssen; Karol Estrada; Eskil Kreiner-Møller; Fernando Rivadeneira; Jessica van Setten; Javier Gutierrez-Achury; Lude Franke; David van Enckevort; Martijn Dijkstra; Heorhiy Byelas; Paul I. W. de Bakker; Cisca Wijmenga; Morris A. Swertz
Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputed genotypes with ‘true’ genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05–0.5%) compared with 1000G. In Dutch samples, the mean observed squared Pearson correlation, r², increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r² improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r² increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
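The accuracy measure quoted in this abstract, the squared Pearson correlation between imputed and directly typed genotypes, can be illustrated with a minimal sketch. The function name `imputation_r2` and the toy inputs are hypothetical and not taken from the paper; the sketch assumes one biallelic variant, imputed dosages in the range 0 to 2, and true genotypes coded as alternate-allele counts.

```python
def imputation_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Both inputs hold one value per sample for a single variant
    (hypothetical layout chosen for illustration).
    """
    n = len(imputed_dosages)
    if n != len(true_genotypes) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")

    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n

    # Sums of centred products; the 1/n factors cancel in the ratio below.
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_genotypes))
    var_x = sum((x - mean_x) ** 2 for x in imputed_dosages)
    var_y = sum((y - mean_y) ** 2 for y in true_genotypes)

    if var_x == 0 or var_y == 0:
        # Correlation is undefined when either side shows no variation.
        return float("nan")
    return (cov * cov) / (var_x * var_y)


if __name__ == "__main__":
    # Toy data only: five samples at one variant.
    print(imputation_r2([0.1, 0.9, 1.8, 0.0, 1.1], [0, 1, 2, 0, 1]))
```

A per-variant value like this would then be averaged over the variants in a frequency bin to obtain a mean r² of the kind the abstract reports; the binning itself is not shown here.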
IEEE Pacific Visualization Symposium | 2009
Heorhiy Byelas; Alexandru Telea
We present a new method for the combined visualization of software architecture diagrams, such as UML class diagrams or component diagrams, and software metrics defined on groups of diagram elements. Our method extends an existing rendering technique for the so-called areas of interest in system architecture diagrams to visualize several metrics, possibly having missing values, defined on overlapping areas of interest. For this, we use a solution that combines texturing, blending, and smooth scattered-data point interpolation. Our new method simplifies the task of visually correlating the distribution and outlier values of a multivariate metric dataset with a system's structure. We demonstrate the application of our method on component and class diagrams extracted from real-world systems.
Journal of Visual Languages and Computing | 2009
Heorhiy Byelas; Alexandru Telea
Areas of interest (AOIs) are defined as groups of elements of system architecture diagrams that share some common property. Visualizing AOIs is a useful addition to plain diagrams, such as UML diagrams. Some methods have been proposed to automatically draw AOIs on UML diagrams. However, it is not clear whether actual users perceive the results of such methods to be better or worse than human-drawn AOIs, and what needs to be improved. We present here a process of studying and improving the perceived quality of computer-drawn AOIs. For this, we conducted a qualitative evaluation that delivered insight into how users perceive the quality of computer-drawn AOIs as compared to hand-drawn diagrams. Following these results, we derived and implemented several improvements to an existing algorithm for computer-drawn AOIs. Next, we designed a distance metric to quantitatively compare different AOI drawings, and used this metric to show that our improved rendering algorithm creates drawings which are closer to (good) human drawings than the original rendering algorithm. We present here the results of the user evaluation, our improved algorithm for drawing AOIs, and the quantitative analysis performed to compare different drawings. The combined user evaluation, algorithmic improvements, and quantitative comparison method support our claim of having improved the perceived quality and understandability of AOIs rendered on architecture diagrams.
Electronic Notes in Theoretical Computer Science | 2009
Alexandru Telea; Heorhiy Byelas; Lucian Voinea
When assessing the quality and maintainability of large C++ code bases, tools are needed for extracting several facts from the source code, such as architecture, structure, code smells, and quality metrics. Moreover, these facts should be presented in ways that let one correlate them and find outliers and anomalies. We present SolidFX, an integrated reverse-engineering environment (IRE) for C and C++. SolidFX was specifically designed to support code parsing, fact extraction, metric computation, and interactive visual analysis of the results in much the same way IDEs and design tools offer for the forward engineering pipeline. In the design of SolidFX, we adapted and extended several existing code analysis and data visualization techniques to render them scalable for handling code bases of millions of lines. In this paper, we detail several design decisions taken to construct SolidFX. We also illustrate the application of our tool and our lessons learnt in using it in several types of analyses of real-world industrial code bases, including maintainability and modularity assessments, detection of coding patterns, and complexity analyses.
European Journal of Human Genetics | 2017
Laurent C. Francioli; Mircea Cretu-Stancu; Kiran Garimella; Menachem Fromer; Wigard P. Kloosterman; Cisca Wijmenga; Morris A. Swertz; Cornelia M. van Duijn; Dorret I. Boomsma; P. Eline Slagboom; Gert-Jan B. van Ommen; Paul I. W. de Bakker; Freerk van Dijk; Androniki Menelaou; Pieter B. T. Neerincx; Sara L. Pulit; Patrick Deelen; Clara C. Elbers; Pier Francesco Palamara; Itsik Pe'er; Abdel Abdellaoui; Mannis van Oven; Martijn Vermaat; Mingkun Li; Jeroen F. J. Laros; Mark Stoneking; Peter de Knijff; Manfred Kayser; Jan H. Veldink
Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father’s age at conception and the number of DNMs in female offspring’s X chromosome, consistent with previous literature reports.
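The idea of combining per-individual genotype likelihoods, the trio relationship, and a mutation-rate prior into a posterior probability can be sketched as below. This is a simplified illustration and not the PhaseByTransmission implementation: it assumes a single biallelic autosomal site, Hardy-Weinberg priors on the parental genotypes, and a per-transmission allele flip with probability `mu`. All function names, parameters, and default values are hypothetical.

```python
def transmit_alt(genotype, mu):
    """Probability that a parent carrying `genotype` alt alleles (0, 1 or 2)
    passes on an alt allele, when the transmitted allele flips with
    probability `mu` (assumed mutation model)."""
    p = genotype / 2.0
    return p * (1.0 - mu) + (1.0 - p) * mu


def child_given_parents(g_mother, g_father, mu):
    """Distribution over the child's genotype (0, 1, 2 alt alleles)."""
    a = transmit_alt(g_mother, mu)
    b = transmit_alt(g_father, mu)
    return [(1 - a) * (1 - b), a * (1 - b) + (1 - a) * b, a * b]


def de_novo_posterior(lik_mother, lik_father, lik_child,
                      mu=1e-8, alt_freq=0.01):
    """Posterior probability that at least one transmitted allele mutated.

    Each `lik_*` argument is a list of three genotype likelihoods
    P(reads | genotype) for 0, 1 and 2 alt alleles.
    """
    q = alt_freq
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]  # Hardy-Weinberg (assumed)

    total = 0.0        # joint probability of the data, mutations allowed
    no_mutation = 0.0  # part of that mass in which neither allele flipped
    for gm in range(3):
        for gf in range(3):
            weight = (prior[gm] * lik_mother[gm] *
                      prior[gf] * lik_father[gf])
            with_mu = child_given_parents(gm, gf, mu)
            mendel = child_given_parents(gm, gf, 0.0)
            for gc in range(3):
                total += weight * lik_child[gc] * with_mu[gc]
                no_mutation += (weight * lik_child[gc] *
                                mendel[gc] * (1 - mu) ** 2)

    if total == 0.0:
        raise ValueError("likelihoods give zero probability to every genotype")
    # Clamp to guard against tiny negative values from rounding.
    return max(0.0, 1.0 - no_mutation / total)


if __name__ == "__main__":
    hom_ref = [1.0, 1e-12, 1e-12]
    het = [1e-12, 1.0, 1e-12]
    # Both parents look homozygous reference, the child looks heterozygous.
    print(de_novo_posterior(hom_ref, hom_ref, het))
```

Ranking candidate sites by a posterior of this kind is what allows them to be prioritized for validation. A real caller would additionally need to handle multi-allelic sites, sex chromosomes, and calibrated likelihoods, none of which this sketch attempts.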
Parallel, Distributed and Network-Based Processing | 2011
Heorhiy Byelas; Alexandros Kanterakis; Morris A. Swertz
High-throughput bioinformatics research is complex and requires the combination of multiple experimental approaches, each producing large amounts of diverse data. The analysis and evaluation of these data are equally complex, requiring specific integrations of various software components into complex workflows. The challenge is to provide less technically involved bioinformaticians with simple interfaces to specify the workflow of commands they need while at the same time scaling up to hundreds of jobs so that terabytes of genetic data can be processed by recent methods. Here, we present a computational framework for bioinformatics which enables data and workflow management in a distributed computational environment. Firstly, we propose a new data model to specify workflow execution logic on available network resources and components. Our model extends existing generic workflow and bioinformatics models to describe workflows compactly and unambiguously. Secondly, we present the implementation of our computational framework, which is constructed as a computational cloud for bioinformatics using open source off-the-shelf components. Finally, we demonstrate applications of the framework on complex real-world bioinformatics tasks.
Working Conference on Reverse Engineering | 2008
Heorhiy Byelas; Alexandru Telea
We present the metric lens, a new visualization of method level code metrics atop UML class diagrams, which allows performing metric-metric and metric-structure correlations on large diagrams. We demonstrate an interactive visualization tool in which users can quickly specify a wide palette of analyses, based on color-mapping, scaling, and sorting metric tables on UML diagrams. We illustrate our technique and tool by a sample complexity assessment analysis of a real-world C++ software system.
Software Visualization | 2008
Heorhiy Byelas; Alexandru Telea
We present a method that combines textures, blending, and scattered-data interpolation to visualize several metrics defined on overlapping areas-of-interest on UML class diagrams. We aim to simplify the task of visually correlating the distribution and outlier values of a multivariate metric dataset with a system's structure. We illustrate our method on a class diagram of a real-world system.