Adam Gudyś
Silesian University of Technology
Publications
Featured research published by Adam Gudyś.
BMC Bioinformatics | 2013
Adam Gudyś; Michał Wojciech Szcześniak; Marek Sikora; Izabela Makalowska
Background: Machine learning techniques are known to be a powerful way of distinguishing microRNA hairpins from pseudo hairpins and have been applied in a number of recognised miRNA search tools. However, many current methods based on machine learning suffer from drawbacks, including a failure to address the class imbalance problem properly. This may lead to overlearning the majority class and/or incorrect assessment of classification performance. Moreover, those tools are effective for a narrow range of species, usually the model ones. This study aims at improving the performance of the miRNA classification procedure, extending its usability and reducing computational time. Results: We present HuntMi, a stand-alone machine learning miRNA classification tool. We developed a novel method of dealing with the class imbalance problem called ROC-select, which is based on thresholding the score function produced by traditional classifiers. We also introduced new features to the data representation. Several classification algorithms in combination with ROC-select were tested, and random forest was selected for the best balance between sensitivity and specificity. Reliable assessment of classification performance is guaranteed by using large, strongly imbalanced, and taxon-specific datasets in a 10-fold cross-validation procedure. As a result, HuntMi achieves considerably better performance than any other miRNA classification tool and can be applied in miRNA search experiments in a wide range of species. Conclusions: Our results indicate that HuntMi represents an effective and flexible tool for identification of new microRNAs in animals, plants and viruses. The ROC-select strategy proves to be superior to other methods of dealing with the class imbalance problem and can possibly be used in other machine learning classification tasks. The HuntMi software as well as the datasets used in the research are freely available at http://lemur.amu.edu.pl/share/HuntMi/.
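The abstract does not spell out how ROC-select chooses its threshold, so the following is only a sketch of the general idea of thresholding classifier scores on imbalanced data. The selection criterion (geometric mean of sensitivity and specificity) and the name `pick_threshold` are assumptions made for illustration, not the published method.

```python
from math import sqrt


def pick_threshold(scores, labels):
    """Return (threshold, g_mean) maximising sqrt(sensitivity * specificity).

    A sample is predicted positive when its score is >= threshold.
    The g-mean criterion is an assumption, not the ROC-select rule itself.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_g = None, -1.0
    # Every distinct score is a candidate operating point on the ROC curve.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        g = sqrt((tp / positives) * (tn / negatives))
        if g > best_g:
            best_threshold, best_g = t, g
    return best_threshold, best_g


# Toy data: four negatives (majority class) and two positives.
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.8]
labels = [0, 0, 0, 0, 1, 1]
threshold, g_mean = pick_threshold(scores, labels)
```

Because sensitivity and specificity are each computed within a single class, a criterion of this kind is not dominated by the majority class, which is the motivation the abstract gives for avoiding plain accuracy.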
Plant and Cell Physiology | 2013
Michał Wojciech Szcześniak; Rafał Pokrzywa; Adam Gudyś; Izabela Makalowska
Splicing is one of the major contributors to the observed spatiotemporal diversification of transcripts and proteins in metazoans. Numerous factors affect the process, but the splice sites themselves, along with the adjacent splicing signals, are critical. Unfortunately, little is still known about splicing in plants and, consequently, further research in some fields of plant molecular biology will encounter difficulties. Keeping this in mind, we performed a large-scale analysis of splice sites in eight plant species, using novel algorithms and tools developed by us. The analyses included identification of orthologous splice sites, polypyrimidine tracts and branch sites. Additionally, we identified putative intronic and exonic cis-regulatory motifs, U12 introns, as well as splice sites in 45 microRNA genes in five plant species. We also provide experimental evidence for plant splice sites in the form of expressed sequence tag and RNA-Seq data. All the data are stored in a novel database called ERISdb and are freely available at http://lemur.amu.edu.pl/share/ERISdb/.
PLOS ONE | 2014
Adam Gudyś; Sebastian Deorowicz
Multiple sequence alignment is a crucial task in a number of biological analyses such as secondary structure prediction, domain searching, and phylogeny. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In the paper we present QuickProbs, a variant of MSAProbs customised for graphics processors. We selected the two most time-consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on a quad-core PC equipped with a high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than the original CPU-parallel MSAProbs. Additional tests performed on several protein families from the Pfam database give an overall speed-up of 6.7. Compared to other algorithms like MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally, we introduce a tuned variant of QuickProbs which is significantly more accurate than MSAProbs on sets of distantly related sequences without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, so the package is suitable for graphics processors produced by all major vendors.
Fundamenta Informaticae | 2013
Marek Sikora; Adam Gudyś
In the paper we present CHIRA, an algorithm performing decision rule aggregation. New elementary conditions, which are linear combinations of attributes, may appear in rule premises during the aggregation, leading to so-called oblique rules. The algorithm merges rules iteratively, in pairs, according to an order specified in advance. It applies the procedure of determining convex hulls for the regions of the feature space covered by the aggregated rules. CHIRA can be treated as a generalization of rule shortening and joining algorithms which, unlike them, allows the rule representation language to be changed. Applying the presented algorithm reduces the number of rules, especially for data in which decision classes are separated by hyperplanes not perpendicular to the attribute axes. The efficiency of CHIRA has been verified on rules obtained by two known rule induction algorithms, RIPPER and q-ModLEM, run on 18 benchmark data sets. Additionally, the algorithm has been applied to synthetic data as well as to a real-life set concerning classification of natural hazards in hard-coal mines.
ICMMI | 2014
Sebastian Deorowicz; Agnieszka Debudaj-Grabysz; Adam Gudyś
Determining similarities between species is a crucial issue in life sciences. This task is usually done by comparing fragments of genomic or proteomic sequences of the organisms under analysis. The basic procedure which facilitates these comparisons is called multiple sequence alignment. Many algorithms address this problem, but they are either accurate or fast. We present Kalign-LCS, a variant of the fast Kalign2 algorithm that addresses the accuracy vs. speed trade-off. It employs the longest common subsequence measure and was thoroughly optimized. Experiments show that it is faster than Kalign2 and produces noticeably more accurate alignments.
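For readers unfamiliar with the longest common subsequence (LCS) measure mentioned above, here is a minimal sketch of the textbook dynamic-programming computation. It is not the optimized routine used in Kalign-LCS; the function names and the normalisation by the shorter sequence length are assumptions chosen for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Classic O(len(a) * len(b)) dynamic programme that keeps one row at a time.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Hypothetical similarity in [0, 1]: LCS length over the shorter length."""
    shorter = min(len(a), len(b))
    return lcs_length(a, b) / shorter if shorter else 0.0


example = lcs_length("AGGTAB", "GXTXAYB")  # common subsequence "GTAB"
```

The measure only counts matching symbols in order, so it needs no substitution matrix or gap penalties, which is one reason it can be computed quickly for many sequence pairs.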
Scientific Reports | 2016
Sebastian Deorowicz; Agnieszka Debudaj-Grabysz; Adam Gudyś
Rapid development of modern sequencing platforms has enabled unprecedented growth of protein family databases. The abundance of sets composed of hundreds of thousands of sequences is a great challenge for multiple sequence alignment algorithms. In the article we introduce FAMSA, a new progressive algorithm designed for fast and accurate alignment of thousands of protein sequences. Its features include the use of the longest common subsequence measure for determining pairwise similarities, a novel method of gap cost evaluation, and a new iterative refinement scheme. Importantly, its implementation is highly optimised and parallelised to make the most of modern computer platforms. As a result, quality indicators, namely sum-of-pairs and total-column scores, show FAMSA to be superior to competing algorithms like Clustal Omega or MAFFT for datasets exceeding a few thousand sequences. The quality does not come at the cost of time and memory requirements, which are an order of magnitude lower than those of existing solutions. For example, a family of 415 519 sequences was analysed in less than two hours and required only 8 GB of RAM. FAMSA is freely available at this http URL
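The two quality indicators named above can be illustrated with a small sketch. It follows the common benchmark definitions (sum-of-pairs: the fraction of residue pairs aligned in a reference alignment that are also aligned in the tested one; total-column: the fraction of reference columns reproduced exactly). All function names are hypothetical, gaps are assumed to be written as `-`, and real benchmark tools apply extra rules (for example, scoring only core regions) that are omitted here.

```python
def column_residues(alignment):
    """Describe each column as a tuple of residue indices (None for a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for i, ch in enumerate(col):
            if ch == "-":
                entry.append(None)
            else:
                entry.append(counters[i])
                counters[i] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Set of residue pairs that share a column, as ((seq, pos), (seq, pos))."""
    pairs = set()
    for col in columns:
        present = [(i, r) for i, r in enumerate(col) if r is not None]
        for x in range(len(present)):
            for y in range(x + 1, len(present)):
                pairs.add((present[x], present[y]))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs also aligned in the test alignment."""
    ref_pairs = aligned_pairs(column_residues(reference))
    test_pairs = aligned_pairs(column_residues(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs)


def tc_score(test, reference):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    ref_cols = column_residues(reference)
    test_cols = set(column_residues(test))
    return sum(c in test_cols for c in ref_cols) / len(ref_cols)


reference = ["AC-G", "ACTG"]
test = ["ACG-", "ACTG"]
sp = sp_score(test, reference)
tc = tc_score(test, reference)
```

In this toy case the tested alignment misplaces the final `G` of the first sequence, so one of the three reference pairs and two of the four reference columns are lost.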
International Conference on Computer Vision and Graphics | 2014
Adam Gudyś; Jakub Rosner; Jakub Segen; Konrad Wojciechowski; Marek Kulbacki
Methods of tracking human motion in video sequences can be used to count people, identify pedestrian traffic patterns, analyze behavior statistics of shoppers, or serve as a preliminary step in the analysis and recognition of a person’s actions and behavior. A novel method for tracking multiple people in a video sequence is presented, based on clustering the motion paths of local features in images. It extends and improves an earlier tracking method based on clustering motion paths by using the SURF detector and descriptor to identify, compare, and link the local features between video frames, instead of the characteristic points in bounding contours. Special care was taken in the implementation to minimize the time and memory requirements of the procedure, which allows it to process a 1080p video sequence in real time on a dual-processor workstation. The correctness of the procedure has been confirmed by experiments on synthetic and real video data.
Asian Conference on Intelligent Information and Database Systems | 2014
Marek Kulbacki; Jakub Segen; Kamil Wereszczyński; Adam Gudyś
Expanding the capabilities of intelligent surveillance systems and research in human motion analysis require massive amounts of video data for training learning methods and classifiers and for testing solutions under realistic conditions. While many publicly available video sequences are meant for training and testing, the existing video datasets are not adequate for real-world problems, due to the low realism of scenes and acted-out human behaviors, relatively small dataset sizes, low resolution, and sometimes low video quality.
ICMMI | 2011
Adam Gudyś; Sebastian Deorowicz
Modern graphics processing units (GPUs) offer much more computational power than modern CPUs, so it is natural that GPUs are often used for solving computationally intensive problems. One of the tasks of huge importance in bioinformatics is sequence alignment. We investigate a variant introduced a few years ago in which additional requirements are imposed on the alignment. As a result, we propose a parallel version of the Center-Star algorithm computing the constrained multiple sequence alignment on the GPU. The obtained speedup over the serial CPU counterpart ranges from 20 to 200.
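As background for the Center-Star algorithm named above, the sketch below shows only its first step in the classic, unconstrained, serial form: choosing as the center the sequence with the smallest total distance to all others, after which the remaining sequences are aligned to it. Using unit-cost edit distance and the names `edit_distance` and `pick_center` are assumptions for illustration; the constraint handling and GPU parallelisation described in the abstract are not reproduced.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def pick_center(seqs):
    """Index of the sequence minimising the summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


sequences = ["ACGT", "ACGA", "TCGT", "ACG"]
center = pick_center(sequences)
```

The all-pairs distance computation is quadratic in the number of sequences and each pair is independent of the others, which makes this kind of step a natural candidate for parallel execution.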
Asian Conference on Intelligent Information and Database Systems | 2015
Adam Gudyś; Kamil Wereszczyński; Jakub Segen; Marek Kulbacki; Aldona Drabik
Camera calibration is one of the basic problems in intelligent video analysis in networks of multiple cameras with changeable pan and tilt (PT). Traditional calibration methods give satisfactory results, but are labour-intensive. In this paper we introduce a method of camera calibration and navigation based on continuous tracking, which requires minimal human involvement. After the initial pre-calibration, it allows the camera pose to be calculated recursively in real time on the basis of the current and previous camera images and the previous pose. The method is suitable if multiple coplanar points are shared between views from neighbouring cameras, which is often the case in video surveillance systems.