Aleksandra Swiercz
Poznań University of Technology
Publication
Featured research published by Aleksandra Swiercz.
Operations Research | 2006
Jacek Blazewicz; Ceyda Oguz; Aleksandra Swiercz; Jan Węglarz
An innovative approach to DNA sequencing by hybridization utilizes isothermic oligonucleotide libraries. In this paper, we demonstrate the utility of a genetic algorithm for the combinatorial portion of this new approach by incorporating characteristics of DNA sequencing by hybridization in addition to isothermic oligonucleotide libraries. Specialized crossover and mutation operators were developed for this purpose. After initial experiments for parameter adjustment, the performance of the genetic algorithm approach was evaluated with respect to previous methods in the literature. The results indicate that the proposed new approach is superior to previous approaches. The proposed new crossover operator that inherits some features of the structured weighted combinations might also be of value for some other combinatorial problems, including the traveling salesman problem.
Computational Biology and Chemistry | 2006
Jacek Blazewicz; Fred Glover; Marta Kasprzak; Wojciech T. Markiewicz; Ceyda Oguz; Dietrich Rebholz-Schuhmann; Aleksandra Swiercz
DNA sequencing by hybridization (SBH) induces errors in the biochemical experiment. Some of them are random and disappear when the experiment is repeated. Others are systematic, involving repetitions in the probes of the target sequence. A good method for solving SBH problems must deal with both types of errors. In this work we propose a new hybrid genetic algorithm for isothermic and standard sequencing that incorporates the concept of structured combinations. The algorithm is then compared with other methods designed for handling errors that arise in standard and isothermic SBH approaches. DNA sequences used for testing are taken from GenBank. The set of test instances was divided into two groups. The first group consisted of sequences containing positive and negative errors in the spectrum, at a rate of up to 20%, excluding errors coming from repetitions. The second group consisted of sequences containing repeated oligonucleotides, with additional errors of up to 5% added to the spectra. Our new method outperforms the best alternative procedures for both data sets. Moreover, the method produces solutions exhibiting an extremely high degree of similarity to the target sequences in the cases without repetitions, which is an important outcome for biologists. The spectra prepared from the sequences taken from GenBank are available on our website http://bio.cs.put.poznan.pl/.
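The notions of spectrum, positive errors, and negative errors used in this abstract can be illustrated with a minimal Python sketch. It covers only the standard (fixed probe length) case, not isothermic libraries. The function names, the way the error rate is applied (an equal number of dropped and spurious probes), and the toy sequences are assumptions made for illustration and are not taken from the paper.

```python
import random


def ideal_spectrum(sequence: str, probe_len: int) -> set[str]:
    """Return the set of all probe_len-mers that occur in the sequence.

    Because the result is a set, a probe that occurs twice is stored once,
    which is how repetitions turn into negative errors.
    """
    return {sequence[i:i + probe_len]
            for i in range(len(sequence) - probe_len + 1)}


def add_errors(spectrum: set[str], probe_len: int, error_rate: float,
               rng: random.Random) -> set[str]:
    """Simulate a noisy spectrum (hypothetical error model).

    Drops a fraction of true probes (negative errors) and adds the same
    number of probes that are not in the ideal spectrum (positive errors).
    """
    n_errors = int(len(spectrum) * error_rate)
    kept = set(rng.sample(sorted(spectrum), len(spectrum) - n_errors))
    noisy = set(kept)
    while len(noisy) < len(kept) + n_errors:
        candidate = "".join(rng.choice("ACGT") for _ in range(probe_len))
        if candidate not in spectrum:
            noisy.add(candidate)
    return noisy


if __name__ == "__main__":
    spectrum = ideal_spectrum("ACGTAC", 3)
    print(sorted(spectrum))
    print(sorted(add_errors(spectrum, 3, 0.5, random.Random(0))))
```

In this sketch the sequence `ACGACG` yields only three distinct 3-mers for four positions, which is the kind of repetition-induced error the second test group contains.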
Plant Methods | 2016
Michal Goralski; Paula Sobieszczanska; Aleksandra Obrępalska-Stęplowska; Aleksandra Swiercz; Agnieszka Zmienko; Marek Figlerowicz
Background: Nicotiana benthamiana has been widely used in laboratories around the world for studying plant-pathogen interactions and posttranscriptional gene expression silencing. Yet the exploration of its transcriptome has lagged behind due to the lack of both adequate sequence information and genome-wide analysis tools, such as DNA microarrays. Despite the increasing use of high-throughput sequencing technologies, DNA microarrays still remain a popular gene expression tool, because they are cheaper and less demanding regarding bioinformatics skills and computational effort. Results: We designed a gene expression microarray with 103,747 60-mer probes, based on two recently published versions of N. benthamiana transcriptome (v.3 and v.5). Both versions were reconstructed from RNA-Seq data of non-strand-specific pooled-tissue libraries, so we defined the sense strand of the contigs prior to designing the probes. To accomplish this, we combined a homology search against Arabidopsis thaliana proteins and hybridization to a test 244k microarray containing pairs of probes, which represented individual contigs. We identified the sense strand in 106,684 transcriptome contigs and used this information to design an Nb-105k microarray on an Agilent eArray platform. Following hybridization of RNA samples from N. benthamiana roots and leaves we demonstrated that the new microarray had high specificity and sensitivity for detection of differentially expressed transcripts. We also showed that the data generated with the Nb-105k microarray may be used to identify incorrectly assembled contigs in the v.5 transcriptome, by detecting inconsistency in the gene expression profiles, which is indicated using multiple microarray probes that match the same v.5 primary transcripts. Conclusions: We provided a complete design of an oligonucleotide microarray that may be applied to the research of N. benthamiana transcriptome. This, in turn, will allow the N. benthamiana research community to take full advantage of microarray capabilities for studying gene expression in this plant. Additionally, by defining the sense orientation of over 106,000 contigs, we substantially improved the functional information on the N. benthamiana transcriptome. The simple hybridization-based approach for detecting the sense orientation of computationally assembled sequences can be used for updating the transcriptomes of other non-model organisms, including cases where no significant homology to known proteins exists.
Foundations of Computing and Decision Sciences | 2013
Jacek Blazewicz; Wojciech Frohmberg; Piotr Gawron; Marta Kasprzak; Michal Kierzynka; Aleksandra Swiercz; Paweł T. Wojciechowski
The problem of DNA sequence assembly is well known for its high complexity. Experimental errors of different kinds present in the data and the huge sizes of the problem instances make this problem very hard to solve. In order to deal with such data, advanced efficient heuristics must be constructed. Here, we propose a new approach to the sequence assembly problem, modeled as the problem of searching for paths in an acyclic digraph. Since the graph representing an assembly instance is not acyclic in general, it is heuristically transformed into the acyclic form. This approach reduces the computation time significantly while maintaining the high quality of the produced solutions.
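Searching for paths in an acyclic digraph is cheap because the vertices can be processed in topological order. The Python sketch below finds a maximum-weight path in that way. It is a generic textbook technique, not the heuristic from the paper: the edge weights (read here as overlap lengths), the function name, and the toy graph are assumptions, and the step that transforms a cyclic graph into an acyclic one is not shown.

```python
from collections import defaultdict, deque


def longest_path(edges: dict[tuple[str, str], int]) -> tuple[int, list[str]]:
    """Return (total_weight, path) of a maximum-weight path in a DAG.

    edges maps (source, target) to a non-negative weight.
    Raises ValueError if the graph contains a cycle.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (u, v), weight in edges.items():
        successors[u].append((v, weight))
        indegree[v] += 1
        nodes.update((u, v))
    if not nodes:
        return 0, []

    # Kahn's algorithm: repeatedly process vertices with no unprocessed
    # predecessors, relaxing the best path weight of each successor.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    best = {n: 0 for n in nodes}
    previous = {n: None for n in nodes}
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v, weight in successors[u]:
            if best[u] + weight > best[v]:
                best[v] = best[u] + weight
                previous[v] = u
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")

    # Walk back from the vertex with the best total weight.
    node = max(sorted(nodes), key=lambda n: best[n])
    total = best[node]
    path = []
    while node is not None:
        path.append(node)
        node = previous[node]
    return total, path[::-1]


if __name__ == "__main__":
    toy = {("r1", "r2"): 5, ("r2", "r3"): 4, ("r1", "r3"): 6}
    print(longest_path(toy))
```

The cycle check makes explicit why a general assembly graph would first have to be made acyclic before this kind of search applies.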
Oncotarget | 2016
Katarzyna Klonowska; Luiza Handschuh; Aleksandra Swiercz; Marek Figlerowicz; Piotr Kozlowski
Although currently available strategies for the preparation of exome-enriched libraries are well established, a final validation of the libraries in terms of exome enrichment efficiency prior to the sequencing step is of considerable importance. Here, we present a strategy for the evaluation of exome enrichment, i.e., the Multipoint Test for Targeted-enrichment Efficiency (MTTE), a PCR-based approach utilizing multiplex ligation-dependent probe amplification with capillary electrophoresis separation. We used MTTE for the analysis of subsequent steps of the Illumina TruSeq Exome Enrichment procedure. The calculated values of enrichment-associated parameters (i.e., relative enrichment, relative clearance, overall clearance, and fold enrichment) and the comparison of MTTE results with the actual enrichment revealed the high reliability of our assay. Additionally, the MTTE assay enabled the determination of the sequence-associated features that may confer bias in the enrichment of different targets. Importantly, the MTTE is a low-cost method that can be easily adapted to the region of interest important for a particular project. Thus, the MTTE strategy is attractive for post-capture validation in a variety of targeted/exome enrichment NGS projects.
European Journal of Operational Research | 2018
Jacek Blazewicz; Marta Kasprzak; Michal Kierzynka; Wojciech Frohmberg; Aleksandra Swiercz; Pawel Wojciechowski; Piotr Zurkowski
With the ubiquitous presence of next-generation sequencing in modern biological, genetic, pharmaceutical and medical research, not everyone pays attention to the underlying computational methods. Even fewer researchers know the origins of the current models for DNA assembly. We present original graph models used in DNA sequencing by hybridization, and discuss their properties and the connections between them. We also explain how these graph models evolved to adapt to the characteristics of next-generation sequencing. Moreover, we present a practical comparison of state-of-the-art DNA de novo assembly tools representing these transformed models, i.e. overlap and decomposition-based graphs. Even though the competition is tough, some assemblers perform better, and large differences may be observed in hardware resource utilization. Finally, we outline the most important trends in the sequencing field, and try to predict their impact on the computational models in the future.
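A decomposition-based graph can be illustrated with a de Bruijn graph, in which every read is decomposed into k-mers and each k-mer becomes an edge between its (k-1)-mer prefix and suffix. Reading "decomposition-based" as "de Bruijn" is an interpretation, and the function name and toy reads in the Python sketch below are assumptions, not material from the paper.

```python
from collections import defaultdict


def de_bruijn_graph(reads: list[str], k: int) -> dict[str, list[str]]:
    """Build a de Bruijn graph from reads.

    Vertices are (k-1)-mers; each k-mer found in a read adds an edge from
    its prefix of length k-1 to its suffix of length k-1.
    Requires k >= 2. Reverse complements and sequencing errors are ignored.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(targets) for node, targets in graph.items()}


if __name__ == "__main__":
    for node, targets in de_bruijn_graph(["ACGTAC", "GTACGG"], 3).items():
        print(node, "->", targets)
```

Unlike an overlap graph, this structure never compares reads pairwise, which is one reason the two families of assemblers differ in hardware use.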
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing | 2008
Jacek Blazewicz; Marta Kasprzak; Aleksandra Swiercz; Marek Figlerowicz; Piotr Gawron; Darren Platt; Lukasz Szajkowski
The DNA assembly problem is well known for its high complexity on both the biological and computational levels. The traditional laboratory approach to the problem, which involves DNA sequencing by hybridization or by gel electrophoresis, entails many errors coming from the experimental and algorithmic stages. DNA sequences constituting the traditional assembly input are a few hundred nucleotides long and cover each other rather sparsely. A new biochemical approach to DNA sequencing gives highly reliable output at relatively low cost and in a short time. This is 454 sequencing, based on the pyrosequencing protocol and owned by 454 Life Sciences Corporation. The produced sequences are shorter (about 100-200 nucleotides) but their coverage of the assembled sequence is very dense. In the paper, we propose a parallel implementation of an algorithm that deals well with such data and outperforms other assembly algorithms used in practice. The algorithm is a heuristic based on a graph model, the graph being built on the set of input sequences. Computational tests were performed on real data obtained from the 454 sequencer during sequencing of the genome of the bacterium Prochlorococcus marinus.
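A graph built on the set of input sequences is commonly an overlap graph, in which an arc joins two reads when a suffix of one matches a prefix of the other. The Python sketch below builds such a graph with exact matching. It is a simplified illustration, not the paper's algorithm: the function names, the `min_overlap` threshold, and the toy reads are assumptions, and it ignores sequencing errors, reverse complements, and parallel execution.

```python
def overlap_length(a: str, b: str, min_overlap: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b.

    Returns 0 when the longest such overlap is shorter than min_overlap.
    """
    for length in range(min(len(a), len(b)), min_overlap - 1, -1):
        if length > 0 and a.endswith(b[:length]):
            return length
    return 0


def build_overlap_graph(reads: list[str],
                        min_overlap: int) -> dict[tuple[int, int], int]:
    """Map (i, j) to the overlap length between reads[i] and reads[j].

    Compares every ordered pair of reads, so it is quadratic in the number
    of reads; real assemblers use indexing to avoid this.
    """
    graph = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                length = overlap_length(a, b, min_overlap)
                if length:
                    graph[(i, j)] = length
    return graph


if __name__ == "__main__":
    print(build_overlap_graph(["ACGTAC", "TACGGA", "GGATTT"], 3))
```

The all-pairs comparison is the part that grows fastest with dense coverage, which is where a parallel implementation would be expected to help.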
International Conference on Information Technology | 2008
Jacek Blazewicz; Marcin Bryja; Marek Figlerowicz; Piotr Gawron; Marta Kasprzak; Darren Platt; Jakub Przybytek; Aleksandra Swiercz; Lukasz Szajkowski
Progress in bioengineering has brought a new approach to DNA sequencing that aims to give highly reliable output at low cost and in a short time. This is 454 sequencing, based on the pyrosequencing protocol and owned by 454 Life Sciences Corporation. Because of the reliability of its sequences, this method is much better suited to assembly than others. However, the produced sequences are much shorter and far more numerous, which makes the problem harder. The presented algorithm was created to process data from the 454 sequencing method. Its usefulness has been demonstrated in tests on raw data generated during sequencing of the whole 1.84 Mbp genome of the bacterium Prochlorococcus marinus.
PLOS ONE | 2018
Aleksandra Swiercz; Wojciech Frohmberg; Michal Kierzynka; Paweł T. Wojciechowski; Piotr Zurkowski; Jan Badura; Artur Laskowski; Marta Kasprzak; Jacek Blazewicz
Next-generation sequencers produce billions of short DNA sequences in a massively parallel manner, which makes accurately reconstructing a genome sequence de novo from these short sequences a great computational challenge. Here, we propose the GRASShopPER assembler, which follows the overlap-layout-consensus approach. It uses an efficient GPU implementation for the sequence alignment during the graph construction stage and a greedy hyper-heuristic algorithm at the fork detection stage. A two-part fork detection method allows us to identify repeated fragments of a genome and to reconstruct them without misassemblies. The assemblies of data sets of the bacterium Candidatus Microthrix, the nematode Caenorhabditis elegans, and human chromosome 14 were evaluated with the gold-standard tool QUAST. In comparison with other assemblers, GRASShopPER provided contigs that covered the largest part of the genomes and, at the same time, kept good values of other metrics, e.g., NG50 and misassembly rate.
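NG50, one of the metrics mentioned above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the reference genome length. The Python sketch below computes it from a list of contig lengths. It is a standalone illustration of the usual definition and does not call QUAST; the function names and the sample lengths are assumptions.

```python
def ng50(contig_lengths: list[int], genome_length: int) -> int:
    """Return NG50 for the given contigs and reference genome length.

    Contigs are taken from longest to shortest until they cover at least
    half of genome_length; the length of the last contig taken is NG50.
    Returns 0 when the contigs never reach half of the genome.
    """
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered * 2 >= genome_length:
            return length
    return 0


def n50(contig_lengths: list[int]) -> int:
    """N50 is the same statistic relative to the total assembly length."""
    return ng50(contig_lengths, sum(contig_lengths))


if __name__ == "__main__":
    lengths = [50, 40, 30, 20, 10]
    print("NG50:", ng50(lengths, 200))
    print("N50:", n50(lengths))
```

Using the reference length rather than the assembly length keeps the metric comparable between assemblers that produce different total amounts of sequence.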
Parallel Processing and Applied Mathematics | 2011
Jacek Blazewicz; Bartosz Bosak; Piotr Gawron; Marta Kasprzak; Krzysztof Kurowski; Tomasz Piontek; Aleksandra Swiercz
Thanks to rapid technological development, next-generation sequencers can produce a huge number of short DNA fragments covering the genomic sequence of an organism in a short time. There is a need for time-efficient algorithms that can assemble these fragments and reconstruct the examined DNA sequence. The previously proposed algorithm for de novo assembly, SR-ASM, produced high-quality results but required long computation times. The proposed hybrid parallel programming strategy uses a two-level hierarchy: computations in threads (on a single node with many cores) and computations on different nodes in a cluster. Tests carried out on real Prochlorococcus marinus data from a Roche sequencer showed that the algorithm was sped up 20 times compared with the sequential approach while maintaining high accuracy and outperforming other algorithms.