Bogumil M. Konopka | Researchain

Publication

Featured research published by Bogumil M. Konopka.

BMC Bioinformatics | 2012

Quality assessment of protein model-structures based on structural and functional similarities

Bogumil M. Konopka; Jean-Christophe Nebel; Malgorzata Kotulska

Background: Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists can automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology.
Results: GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests.
Conclusions: The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8- and CASP9-based validation sets, respectively. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
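
The validation metric mentioned in this abstract, a mean Pearson correlation between predicted scores and observed model quality, can be illustrated with a minimal sketch. This is not the GOBA code: averaging the correlation per prediction target is an assumption, and the function names and the numbers in `example` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mean_per_target_correlation(targets):
    """Average the Pearson correlation over targets.

    `targets` maps a target id to (predicted scores, observed quality).
    Per-target averaging is an assumption made for this sketch.
    """
    values = [pearson(pred, obs) for pred, obs in targets.values()]
    return sum(values) / len(values)


# Hypothetical scores, for illustration only.
example = {
    "target_A": ([0.2, 0.5, 0.9], [0.30, 0.55, 0.80]),
    "target_B": ([0.1, 0.4, 0.7, 0.8], [0.20, 0.35, 0.75, 0.70]),
}
print(round(mean_per_target_correlation(example), 3))
```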

ICCCI (SCI Volume) | 2009

Accuracy in Predicting Secondary Structure of Ionic Channels

Bogumil M. Konopka; Witold Dyrka; Jean-Christophe Nebel; Malgorzata Kotulska

Ionic channels are among the most difficult proteins for experimental structure determination; very few of them have been resolved. Bioinformatics tools have not been tested for this specific protein group. In this paper, the prediction quality of ionic channel secondary structure is evaluated. The tests were carried out with general protein predictors and with predictors dedicated to transmembrane segments. Predictor performance was measured by the per-residue accuracy (Q) and the per-segment accuracy (SOV). The evaluation comparing ionic channels and other transmembrane proteins shows that ionic channels are only slightly more difficult objects for modeling than transmembrane proteins; the modeling quality is comparable with a general set of all proteins. Prediction quality showed dependence on the ratio of secondary structures in the ionic channel. Surprisingly, the general-purpose PSIPRED predictor outperformed both the other general predictors and the dedicated transmembrane predictors under evaluation.
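
As a rough illustration of the per-residue accuracy Q mentioned above, the sketch below computes the percentage of residues whose predicted secondary structure state matches the observed one. The function name and the H/E/C state alphabet are assumptions for this example; the segment-based SOV measure is more involved and is not shown.

```python
def q_accuracy(predicted, observed):
    """Percentage of residues with a correctly predicted state.

    Both arguments are strings of per-residue state labels,
    e.g. 'H' (helix), 'E' (strand), 'C' (coil); the alphabet is an
    assumption of this sketch.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Five of the six residues agree in this made-up example.
print(q_accuracy("HHHEEC", "HHHECC"))
```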

Journal of Computational Biology | 2014

Evaluating the Significance of Protein Functional Similarity Based on Gene Ontology

Bogumil M. Konopka; Tomasz Golda; Malgorzata Kotulska

Gene ontology is among the most successful ontologies in the biomedical domain. It is used to describe, unambiguously, protein molecular functions, cellular localizations, and processes in which proteins participate. The hierarchical structure of gene ontology allows quantifying protein functional similarity by application of algorithms that calculate semantic similarities. The scores, however, are meaningless without a given context. Here, we propose how to evaluate the significance of protein function semantic similarity scores by comparing them to reference distributions calculated for randomly chosen proteins. In the study, thresholds for significant functional semantic similarity, in four representative annotation corpuses, were estimated. We also show that the score significance is influenced by the number and specificity of gene ontology terms that are annotated to compared proteins. While proteins with a greater number of terms tend to yield higher similarity scores, proteins with more specific terms produce lower scores. The estimated significance thresholds were validated using protein sequence-function and structure-function relationships. Taking into account the term number and term specificity improves the distinction between significant and insignificant semantic similarity comparisons.
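
The idea of judging a similarity score against a reference distribution built from randomly chosen proteins can be sketched as an empirical p-value. This is a generic illustration, not the authors' procedure: `similarity` stands for any hypothetical scoring function, and the add-one correction is a common convention assumed here.

```python
import random


def reference_distribution(proteins, similarity, n_pairs, seed=0):
    """Score `n_pairs` randomly drawn protein pairs.

    `similarity` is a hypothetical callable (a, b) -> float.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_pairs):
        a, b = rng.sample(proteins, 2)  # two distinct proteins
        scores.append(similarity(a, b))
    return scores


def empirical_p_value(score, reference_scores):
    """Fraction of reference scores at least as high as `score`.

    The +1 in numerator and denominator avoids reporting p = 0.
    """
    n_at_least = sum(1 for r in reference_scores if r >= score)
    return (n_at_least + 1) / (len(reference_scores) + 1)


# Made-up reference scores, for illustration only.
reference = [0.1, 0.2, 0.3, 0.4]
print(empirical_p_value(0.35, reference))
```

A score would then be called significant when its empirical p-value falls below a chosen level; building separate reference distributions for proteins with similar term counts and term specificity would follow the observation in the abstract that both properties influence the scores.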

Bioinformatics | 2017

Forecasting residue-residue contact prediction accuracy

Pawel P. Wozniak; Bogumil M. Konopka; Jinbo Xu; Gert Vriend; Malgorzata Kotulska

Motivation: Apart from meta‐predictors, most of today's methods for residue‐residue contact prediction are based entirely on Direct Coupling Analysis (DCA) of correlated mutations in multiple sequence alignments (MSAs). These methods are on average ~40% correct for the 100 strongest predicted contacts in each protein. The end‐user who works on a single protein of interest will not know whether predictions are much more or much less correct than 40%, which is especially a problem if contacts are predicted to steer experimental research on that protein.
Results: We designed a regression model that forecasts the accuracy of residue‐residue contact prediction for individual proteins with an average error of 7 percentage points. Contacts were predicted with two DCA methods (gplmDCA and PSICOV). The models were built on parameters that describe the MSA, the predicted secondary structure, the predicted solvent accessibility and the contact prediction scores for the target protein. Results show that our models can also be applied to meta‐methods, which was tested on RaptorX.
Availability and implementation: All data and scripts are available from http://comprec‐lin.iiar.pwr.edu.pl/dcaQ/.
Supplementary information: Supplementary data are available at Bioinformatics online.
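
The quantity being forecast here, the fraction of correct contacts among the strongest predictions, can be sketched as follows. The data layout (tuples of residue indices and a score) and the function name are assumptions for this example and are not taken from the published scripts.

```python
def top_n_precision(predicted, true_contacts, n=100):
    """Fraction of the `n` highest-scoring predicted contacts that are real.

    `predicted` is an iterable of (i, j, score) tuples and
    `true_contacts` an iterable of (i, j) residue index pairs;
    both layouts are assumptions of this sketch.
    """
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:n]
    if not ranked:
        raise ValueError("no predicted contacts given")
    # Store pairs in sorted order so (i, j) and (j, i) are the same contact.
    truth = {tuple(sorted(pair)) for pair in true_contacts}
    hits = sum(1 for i, j, _ in ranked if tuple(sorted((i, j))) in truth)
    return hits / len(ranked)


# Made-up predictions, for illustration only.
predictions = [(1, 5, 0.9), (2, 8, 0.8), (3, 4, 0.1)]
native = {(1, 5), (3, 4)}
print(top_n_precision(predictions, native, n=2))
```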

Proteins | 2016

Fast assessment of structural models of ion channels based on their predicted current-voltage characteristics.

Witold Dyrka; Monika Kurczynska; Bogumil M. Konopka; Malgorzata Kotulska

Computational prediction of protein structures is a difficult task, which involves fast and accurate evaluation of candidate model structures. We propose to enhance single‐model quality assessment with a functionality evaluation phase for proteins whose quantitative functional characteristics are known. In particular, this idea can be applied to evaluation of structural models of ion channels, whose main function ‐ conducting ions ‐ can be quantitatively measured with the patch‐clamp technique providing the current–voltage characteristics. The study was performed on a set of KcsA channel models obtained from complete and incomplete contact maps. A fast continuous electrodiffusion model was used for calculating the current–voltage characteristics of structural models. We found that the computed charge selectivity and total current were sensitive to structural and electrostatic quality of models. In practical terms, we show that evaluating predicted conductance values is an appropriate method to eliminate models with an occluded pore or with multiple erroneously created pores. Moreover, filtering models on the basis of their predicted charge selectivity results in a substantial enrichment of the candidate set in highly accurate models. Tests on three other ion channels indicate that, in addition to being a proof of the concept, our function‐oriented single‐model quality assessment method can be directly applied to evaluation of structural models of some classes of protein channels. Finally, our work raises an important question whether a computational validation of functionality should be included in the evaluation process of structural models, whenever possible. Proteins 2016; 84:217–231.

Scientific Reports | 2018

Distinguishing mirtrons from canonical miRNAs with data exploration and machine learning methods

Grzegorz Rorbach; Olgierd Unold; Bogumil M. Konopka

Mirtrons are non-canonical microRNAs encoded in introns, whose biogenesis starts with splicing. They are not processed by Drosha and enter the canonical pathway at the Exportin-5 level. Mirtrons are much less evolutionarily conserved than canonical miRNAs. Due to these differences, canonical miRNA predictors are not applicable to mirtron prediction. Identification of the differences is important for designing mirtron prediction algorithms and may help to improve the understanding of mirtron functioning. So far, only simple, single-feature comparisons have been reported. These are insensitive to complex feature relations. We quantified miRNAs with 25 features and showed that it is impossible to distinguish the two miRNA species using simple thresholds on any single feature. However, when Principal Component Analysis is used, mirtrons and canonical miRNAs are grouped separately. Moreover, several methodologically diverse machine learning classifiers delivered high classification performance. Using feature selection algorithms, we found features (e.g. bulges in the stem region), previously reported as divergent between the two classes, that did not contribute to improving classification accuracy, which suggests that they are not biologically meaningful. Finally, we proposed a combination of the most important features (including guanine content, hairpin free energy and hairpin length) which convey a specific pattern, crucial for identifying mirtrons.

Archive | 2017

Role of Bioinformatics in the Study of Ionic Channels

Monika Kurczynska; Bogumil M. Konopka; Malgorzata Kotulska

Ionic channels belong to the group of the most important proteins. Not only do they enable transmembrane transport but they are also the key factors for proper cell function. Mutations changing their structure and functionality often lead to severe diseases called channelopathies. On the other hand, transmembrane channels are very difficult objects for experimental studies. Only 2% of experimentally identified structures are transmembrane proteins, while genomic studies show that transmembrane proteins make up 30% of all coded proteins. This gap could be diminished by bioinformatical methods which enable modeling unknown protein structures, functions, transmembrane location, and ligand binding. Several in silico methods dedicated to transmembrane proteins have been developed; some general methods could also be used. They provide the information unavailable from experiments. Current modeling tools use a variety of computational methods, which provide results of surprisingly high quality.

BMC Bioinformatics | 2017

Quantiprot - a Python package for quantitative analysis of protein sequences

Bogumil M. Konopka; Marta Marciniak; Witold Dyrka

Background: The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted.
Results: Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient.
Conclusions: We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
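
To make two of the measures named in this abstract concrete, the sketch below counts n-grams in a sequence and estimates a Zipf-style coefficient as the least-squares slope of log frequency against log rank. It uses only the Python standard library and deliberately does not call Quantiprot; the function names are hypothetical, and the package's own definition of the coefficient may differ.

```python
import math
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count overlapping n-grams (substrings of length n) in a sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 corresponds to the classic Zipf's law; this
    fitting procedure is an assumption of the sketch.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up peptide sequence, for illustration only.
counts = ngram_counts("MKVLAAGIVGLMKVLAAG", n=2)
print(counts.most_common(3))
print(zipf_slope(counts))
```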

The Journal of Membrane Biology | 2014