Dariusz Plewczynski
University of Warsaw
Publication
Featured research published by Dariusz Plewczynski.
Cell | 2015
Zhonghui Tang; Oscar Junhong Luo; Xingwang Li; Meizhen Zheng; Przemysław Szałaj; Paweł Trzaskoma; Adriana Magalska; Jakub Wlodarczyk; Blazej Ruszczycki; Paul Michalski; Emaly Piecuch; Ping Wang; Danjuan Wang; Simon Zhongyuan Tian; May Penrad-Mobayed; Laurent M. Sachs; Xiaoan Ruan; Chia-Lin Wei; Edison T. Liu; Grzegorz M. Wilczynski; Dariusz Plewczynski; Guoliang Li; Yijun Ruan
Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases.
Journal of Computational Chemistry | 2011
Dariusz Plewczynski; Michał Łaźniewski; Rafał Augustyniak; Krzysztof Ginalski
Docking is one of the most commonly used techniques in drug design. It is used both for identifying correct poses of a ligand in the binding site of a protein and for estimating the strength of the protein–ligand interaction. Because millions of compounds must be screened before a suitable target for biological testing can be identified, all calculations should be done in a reasonable time frame. Thus, all programs currently in use exploit empirically based algorithms, avoiding a systematic search of the conformational space. Similarly, the scoring is done using simple equations, which makes it possible to speed up the entire process. Therefore, docking results have to be verified by subsequent in vitro studies. The purpose of our work was to evaluate seven popular docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) on an extensive dataset composed of 1300 protein–ligand complexes from the PDBbind 2007 database, where experimentally measured binding affinity values were also available. We compared independently the ability of proper posing [according to the root mean square deviation (or root mean square distance) of predicted conformations versus the corresponding native one] and scoring (by calculating the correlation between docking score and ligand binding strength). To our knowledge, it is the first large‐scale docking evaluation that covers both aspects of docking programs, that is, predicting ligand conformation and calculating the strength of its binding. More than 1000 protein–ligand pairs cover a wide range of different protein families and inhibitor classes. Our results clearly showed that the ligand binding conformation could be identified in most cases by using the existing software, yet we still observed the lack of a universal scoring function for all types of molecules and protein families.
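The two evaluation measures named in the abstract above (RMSD of a predicted pose against the native one, and the correlation between docking score and binding strength) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, atoms are assumed to be listed in the same order in both poses, no superposition or symmetry correction is applied, and Pearson correlation is assumed because the abstract does not say which correlation coefficient was used.

```python
import math


def rmsd(predicted, native):
    """RMSD between two poses given as equally ordered lists of (x, y, z) atoms."""
    if len(predicted) != len(native) or not predicted:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    # Sum of squared coordinate differences over all atoms and all three axes.
    squared = sum(
        (p - n) ** 2
        for atom_p, atom_n in zip(predicted, native)
        for p, n in zip(atom_p, atom_n)
    )
    return math.sqrt(squared / len(predicted))


def pearson(scores, affinities):
    """Pearson correlation between docking scores and measured affinities."""
    if len(scores) != len(affinities) or len(scores) < 2:
        raise ValueError("need two equally long series with at least two values")
    n = len(scores)
    mean_s = sum(scores) / n
    mean_a = sum(affinities) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(scores, affinities))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_a = sum((a - mean_a) ** 2 for a in affinities)
    if var_s == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_s * var_a)
```

Posing and scoring are measured separately here, mirroring the abstract's point that a program can place a ligand well and still rank its affinity poorly.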
Current protocols in human genetics | 2006
Liisa Holm; Sakari Kääriäinen; Chris Wilton; Dariusz Plewczynski
The Dali program is widely used for carrying out automatic comparisons of protein structures determined by X-ray crystallography or NMR. The most familiar version is the Dali server, which performs a database search comparing a query structure supplied by the user against the database of known structures (PDB) and returns the list of structural neighbors by e-mail. The more recently introduced DaliLite server compares two structures against each other and visualizes the result interactively. The Dali database is a structural classification based on precomputed all-against-all structural similarities within the PDB. The resulting hierarchical classification can be browsed on the Web and is linked to protein sequence classification resources. All Dali resources use an identical algorithm for structure comparison. Users may run Dali using the Web, or the program may be downloaded to be run locally on Linux computers.
Journal of Chemical Information and Modeling | 2006
Dariusz Plewczynski; Stéphane A. H. Spieser; Uwe Koch
How well do different classification methods perform in selecting the ligands of a protein target out of large compound collections not used to train the model? Support vector machines, random forest, artificial neural networks, k-nearest-neighbor classification with genetic-algorithm-optimized feature selection, trend vectors, naïve Bayesian classification, and decision tree were used to divide databases into molecules predicted to be active and those predicted to be inactive. Training and predicted activities were treated as binary. The database was generated for the ligands of five different biological targets which have been the object of intense drug discovery efforts: HIV-reverse transcriptase, COX2, dihydrofolate reductase, estrogen receptor, and thrombin. We report significant differences in the performance of the methods independent of the biological target and compound class. Different methods can have different applications; some provide particularly high enrichment, others are strong in retrieving the maximum number of actives. We also show that these methods do surprisingly well in predicting recently published ligands of a target on the basis of initial leads and that a combination of the results of different methods in certain cases can improve results compared to the most consistent method.
Journal of Computational Chemistry | 2011
Dariusz Plewczynski; Michał Łaźniewski; Marcin von Grotthuss; Leszek Rychlewski; Krzysztof Ginalski
Molecular recognition plays a fundamental role in all biological processes, and that is why great efforts have been made to understand and predict protein–ligand interactions. Finding a molecule that can potentially bind to a target protein is particularly essential in drug discovery and still remains an expensive and time‐consuming task. In silico tools are frequently used to screen molecular libraries to identify new lead compounds, and if the protein structure is known, various protein–ligand docking programs can be used. The aim of the docking procedure is to predict correct poses of the ligand in the binding site of the protein as well as to score them according to the strength of interaction in a reasonable time frame. The purpose of our studies was to present a novel consensus approach to predict both the protein–ligand complex structure and its corresponding binding affinity. Our method used as input the results from seven docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) that are widely used for docking of ligands. We evaluated it on an extensive benchmark dataset of 1300 protein–ligand pairs from the refined PDBbind database for which the structural and affinity data were available. We compared independently its ability of proper scoring and posing to the previously proposed methods. In most cases, our method is able to dock properly approximately 20% of pairs more than docking methods on average, and over 10% of pairs more than the best single program. The RMSD value of the predicted complex conformation versus its native one is reduced by a factor of 0.5 Å. Finally, we were able to increase the Pearson correlation of the predicted binding affinity in comparison with the experimental value up to 0.5.
Bioinformatics | 2005
Dariusz Plewczynski; Adrian Tkacz; Lucjan S. Wyrwicz; Leszek Rychlewski
The AutoMotif Server allows for identification of post-translational modification (PTM) sites in proteins based only on local sequence information. The local sequence preferences of short segments around PTM residues are described here as linear functional motifs (LFMs). Sequence models for all types of PTMs are trained by support vector machine on short-sequence fragments of proteins in the current release of Swiss-Prot database (phosphorylation by various protein kinases, sulfation, acetylation, methylation, amidation, etc.). The accuracy of the identification is estimated using the standard leave-one-out procedure. The sensitivities for all types of short LFMs are in the range of 70%. Availability: The AutoMotif Server is available free for academic use at http://automotif.bioinfo.pl/
Amino Acids | 2012
Indrajit Saha; Ujjwal Maulik; Sanghamitra Bandyopadhyay; Dariusz Plewczynski
In this article, we categorize presently available experimental and theoretical knowledge of various physicochemical and biochemical features of amino acids, as collected in the AAindex database of 544 known amino acid (AA) indices. The 402 previously reported indices were categorized into six groups using a hierarchical clustering technique, and 142 were left unclustered. However, due to the increasing diversity of the database these indices overlap, so a crisp clustering method may not provide optimal results. Moreover, in various large-scale bioinformatics analyses of whole proteomes, the proper selection of amino acid indices representing their biological significance is crucial for efficient and error-free encoding of the short functional sequence motifs. In most cases, researchers perform exhaustive manual selection of the most informative indices. These two facts motivated us to analyse the widely used AA indices. The main goal of this article is twofold. First, we present a novel method of partitioning the bioinformatics data using consensus fuzzy clustering, where the recently proposed fuzzy clustering techniques are exploited. Second, we prepare three high quality subsets of all available indices. Superiority of the consensus fuzzy clustering method is demonstrated quantitatively, visually and statistically by comparing it with the previously proposed hierarchical clustered results. The processed AAindex1 database, supplementary material and the software are available at http://sysbio.icm.edu.pl/aaindex/.
Briefings in Bioinformatics | 2011
Tomas Klingström; Dariusz Plewczynski
The amount of information regarding protein-protein interactions (PPI) at a proteomic scale is constantly increasing. This is paralleled by an increase in the number of databases making information available. Consequently, there are diverse ways of delivering information not only about PPIs but also about the databases themselves. This creates a time-consuming obstacle for many researchers working in the field. Our survey provides a valuable tool for researchers to reduce the time necessary to gain a broad overview of PPI databases and is supported by a graphical representation of data exchange. The graphical representation is made available in cooperation with the team maintaining www.pathguide.org and can be accessed at http://www.pathguide.org/interactions.php in a new Cytoscape web implementation. A local copy of the Cytoscape cys file can be downloaded from the http://bio.icm.edu.pl/~darman/ppi web page.
Applied Soft Computing | 2011
Indrajit Saha; Ujjwal Maulik; Dariusz Plewczynski
This article presents a new multiobjective differential evolution based fuzzy clustering technique. Recent research has shown that clustering techniques that optimize a single objective may not provide satisfactory results because no single validity measure works well on different kinds of data sets. This fact motivated us to present a new multiobjective differential evolution based fuzzy clustering technique that encodes the cluster centres in its vectors and optimizes multiple validity measures simultaneously. In the final generation, it produces a set of non-dominated solutions, from which the user can compare and pick the most promising one according to the problem requirements. Superiority of the proposed method over its single objective versions, the multiobjective versions of classical differential evolution and the genetic algorithm, and the well-known fuzzy C-means and average linkage clustering algorithms has been demonstrated quantitatively and visually for several synthetic and real life data sets. A statistical significance test has been conducted to establish the statistical superiority of the proposed multiobjective clustering approach. Finally, the proposed algorithm has been applied to the segmentation of a remote sensing image to show its effectiveness in pixel classification.
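For readers unfamiliar with fuzzy (as opposed to crisp) clustering, the sketch below shows how a point receives graded memberships in several clusters once the cluster centres are fixed. It uses the standard fuzzy C-means membership formula, which is an assumption here: the abstract states only that cluster centres are encoded in the vectors and does not give the membership rule or the validity measures. The function name and the fuzzifier default `m=2.0` are illustrative.

```python
import math


def fcm_memberships(point, centres, m=2.0):
    """Standard fuzzy C-means memberships of one point for fixed cluster centres."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    distances = [math.dist(point, centre) for centre in centres]
    # A point lying exactly on one or more centres belongs fully to those centres.
    on_centre = [d == 0 for d in distances]
    if any(on_centre):
        share = 1.0 / sum(on_centre)
        return [share if hit else 0.0 for hit in on_centre]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the memberships sum to 1.
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]
```

For example, `fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])` gives the nearer centre the larger share of the point's membership rather than assigning the point to it outright.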
BMC Bioinformatics | 2010
Subhadip Basu; Dariusz Plewczynski
Background: We present here the recent update of the AMS algorithm for identification of post-translational modification (PTM) sites in proteins based only on sequence information, using the artificial neural network (ANN) method. The query protein sequence is dissected into overlapping short sequence segments. Ten different physicochemical features describe each amino acid; therefore a nine-residue segment is represented as a point in a 90-dimensional space. The database of sequence segments with experimentally confirmed post-translational modification sites is used for training a set of ANNs. Results: The efficiency of the classification for each type of modification and the prediction power of the method are estimated here using recall (sensitivity), precision values, the area under receiver operating characteristic (ROC) curves and leave-one-out tests (LOOCV). Significant differences in performance are observed for differently optimized neural networks, yet the AMS 3.0 tool integrates those heterogeneous classification schemes into a single consensus scheme, and it is able to boost the precision and recall values independent of the PTM type in comparison with the currently available state-of-the-art methods. Conclusions: The standalone version of AMS 3.0 presents an efficient way to identify post-translational modifications for whole proteomes. The training datasets, precompiled binaries for the AMS 3.0 tool and the source code are available at http://code.google.com/p/automotifserver under the Apache 2.0 license scheme.
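The encoding described in the abstract above (overlapping nine-residue segments, ten features per residue, hence a 90-dimensional point) can be sketched as follows. This is an illustration under stated assumptions, not the AMS 3.0 code: the function names are hypothetical, and `placeholder_features` returns arbitrary deterministic numbers standing in for the ten physicochemical features, which the abstract does not list.

```python
SEGMENT_LENGTH = 9   # residues per segment, as stated in the abstract
N_FEATURES = 10      # features per residue, as stated in the abstract


def placeholder_features(residue):
    """Ten stand-in numbers per residue; NOT real physicochemical values."""
    return [((ord(residue) * (k + 1)) % 7) / 7.0 for k in range(N_FEATURES)]


def overlapping_segments(sequence, length=SEGMENT_LENGTH):
    """All windows of the given length, each shifted by one residue."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def encode_segment(segment):
    """Concatenate per-residue features into one flat vector."""
    vector = []
    for residue in segment:
        vector.extend(placeholder_features(residue))
    return vector


def encode_sequence(sequence):
    """One 90-dimensional vector per overlapping nine-residue segment."""
    return [encode_segment(seg) for seg in overlapping_segments(sequence)]
```

A sequence of length N yields N - 8 segments, so a sequence shorter than nine residues produces no vectors at all; each resulting vector would be one input to the trained classifiers.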