Wojciech Marian Czarnecki

Publication

Featured research published by Wojciech Marian Czarnecki.

IEEE Computational Intelligence Magazine | 2015

Weighted Tanimoto Extreme Learning Machine with Case Study in Drug Discovery

Wojciech Marian Czarnecki

Machine learning methods are becoming increasingly popular in the field of computer-aided drug design. The specific characteristics of the data, including sparse binary representations and noisy, imbalanced datasets, present a challenging binary classification problem. Currently, two of the most successful models in such tasks are the Support Vector Machine (SVM) and Random Forest (RF). In this paper, we introduce the Weighted Tanimoto Extreme Learning Machine (T-WELM), an extremely simple and fast method for predicting the biological activity of chemical compounds and possibly other data with a discrete, binary representation. We show some theoretical properties of the proposed model, including the ability to learn arbitrary sets of examples. Further analysis shows numerous advantages of T-WELM over SVMs, RFs and traditional Extreme Learning Machines (ELM) in this particular task. Experiments performed on 40 large datasets of thousands of chemical compounds show that T-WELMs achieve much better classification results and are at the same time faster, in terms of both training time and subsequent classification, than both ELM models and other state-of-the-art methods in the field.
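
The idea in the abstract above can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that the hidden layer consists of Tanimoto similarities to randomly chosen training fingerprints and that the output weights come from a class-weighted ridge regression. The function names `tanimoto`, `fit_tanimoto_elm` and `predict`, and the parameters `n_hidden` and `C`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def tanimoto(A, B):
    """Pairwise Tanimoto similarity between rows of binary matrices A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = A @ B.T                                         # bits set in both
    union = A.sum(1)[:, None] + B.sum(1)[None, :] - inter   # bits set in either
    # Assumption: two all-zero fingerprints count as identical (similarity 1).
    return np.where(union > 0, inter / np.maximum(union, 1.0), 1.0)

def fit_tanimoto_elm(X, y, n_hidden=50, C=1e3, seed=0):
    """Fit a Tanimoto-based ELM sketch; labels y must be +1 / -1."""
    X = np.asarray(X)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n_hidden = min(n_hidden, len(X))
    # Hidden units: similarity to randomly selected training fingerprints.
    centers = X[rng.choice(len(X), size=n_hidden, replace=False)]
    H = tanimoto(X, centers)
    # Per-sample weights inversely proportional to class size (imbalance handling).
    w = np.where(y == 1, 1.0 / np.sum(y == 1), 1.0 / np.sum(y == -1))
    HtW = H.T * w
    # Closed-form weighted ridge solution for the output weights.
    beta = np.linalg.solve(HtW @ H + np.eye(n_hidden) / C, HtW @ y)
    return centers, beta

def predict(model, X):
    """Return +1 / -1 predictions for binary fingerprints X."""
    centers, beta = model
    return np.where(tanimoto(X, centers) @ beta >= 0, 1, -1)
```

Training reduces to one similarity matrix and one linear solve, which is consistent with the speed argument in the abstract; the exact formulation in the paper may differ from this sketch.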

Expert Systems With Applications | 2015

Multithreshold Entropy Linear Classifier

Wojciech Marian Czarnecki; Jacek Tabor

We propose a new entropy-based multithreshold linear classifier with an adaptive kernel density estimation. The proposed classifier maximizes multiple margins, while being conceptually similar in nature to SVM. This method gives good classification results and is especially designed for unbalanced datasets. It achieves significantly better results than SVM as part of an expert system designed for drug discovery. The resulting model provides insight into the internal data geometry and can detect multiple clusters. This paper proposes a new multithreshold linear classifier (MELC) based on Rényi's quadratic entropy and the Cauchy-Schwarz divergence, combined with adaptive kernel density estimation in the space of one-dimensional projections. Due to its nature, MELC is especially well adapted to deal with unbalanced data. As a consequence of both the model used and the applied density regularization technique, it shows strong regularization properties and is therefore almost unable to overfit. Moreover, contrary to SVM, in its basic form it has no free parameters; however, this comes at the cost of being a non-convex optimization problem, which results in the existence of local optima and the possible need for multiple initializations. In practice, MELC obtained scores similar to or higher than those given by SVM on both synthetic and real data from the UCI repository. We also perform an experimental evaluation of the proposed method as part of an expert system designed for a drug discovery problem. It appears that MELC not only achieves better results than SVM but also gives some additional insights into the data structure, resulting in a more complex decision support system.
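
The quantity named in the abstract above, the Cauchy-Schwarz divergence between kernel density estimates of the two classes after projection onto a line, has a closed form for Gaussian kernels. The sketch below only evaluates that divergence for a given direction `w`; it does not perform the optimization. The bandwidth choice (Silverman's rule per class) and all function names are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def silverman_bandwidth(z):
    """Silverman's rule-of-thumb bandwidth for a 1D sample (assumed choice)."""
    z = np.asarray(z, dtype=float)
    return (4.0 / (3.0 * len(z))) ** 0.2 * z.std() + 1e-12

def cross_information_potential(a, b, var):
    """Integral of the product of two Gaussian KDEs; var is the summed kernel variance."""
    d = a[:, None] - b[None, :]
    return np.mean(np.exp(-d ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var))

def cauchy_schwarz_divergence(a, b):
    """D_CS = log int p^2 + log int q^2 - 2 log int p*q for 1D samples a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = silverman_bandwidth(a) ** 2
    vb = silverman_bandwidth(b) ** 2
    ip_aa = cross_information_potential(a, a, 2.0 * va)
    ip_bb = cross_information_potential(b, b, 2.0 * vb)
    ip_ab = cross_information_potential(a, b, va + vb)
    return np.log(ip_aa) + np.log(ip_bb) - 2.0 * np.log(ip_ab)

def projected_divergence(X_pos, X_neg, w):
    """Divergence between the two classes after projecting onto direction w."""
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)
    return cauchy_schwarz_divergence(np.asarray(X_pos) @ w, np.asarray(X_neg) @ w)
```

A direction along which the classes separate yields a large divergence, while a direction along which the projected classes coincide yields a value close to zero. Because the objective is non-convex in `w`, a search over directions would typically be restarted from several initial points, as the abstract notes.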

Expert Systems With Applications | 2014

Two ellipsoid Support Vector Machines

Wojciech Marian Czarnecki; Jacek Tabor

In classification problems, classes usually have different geometrical structures, and therefore it seems natural for each class to have its own margin type. Existing methods using this principle lead to optimization problems that differ from that of SVM. Although they outperform the standard model, they also prevent the use of existing SVM libraries. We propose an approach, named 2eSVM, which allows such a method to be used within the classical SVM framework. This enables a detailed comparison with the standard SVM. It turns out that classes in the resulting feature space are geometrically easier to separate and the trained model has better generalization properties. Moreover, based on evaluation on standard datasets, 2eSVM brings considerable benefits for the linear classification process in terms of training time and quality. We also construct the 2eSVM kernelization and perform the evaluation on the 5-HT2A ligand activity prediction problem (real, fingerprint-based data from the cheminformatics domain), which shows increased classification quality as well as reduced training time and complexity of the resulting model.

Journal of Cheminformatics | 2015

Robust optimization of SVM hyperparameters in the classification of bioactive compounds

Wojciech Marian Czarnecki; Sabina Podlewska; Andrzej J. Bojarski

Background: Support Vector Machine has become one of the most popular machine learning tools used in virtual screening campaigns aimed at finding new drug candidates. Although it can be extremely effective in finding new potentially active compounds, its application requires the optimization of the hyperparameters with which the assessment is being run, particularly the C and γ values. The optimization requirement, in turn, establishes the need to develop fast and effective approaches to the optimization procedure, providing the best predictive power of the constructed model. Results: In this study, we investigated the Bayesian and random search optimization of Support Vector Machine hyperparameters for classifying bioactive compounds. The effectiveness of these strategies was compared with the most popular optimization procedures: grid search and heuristic choice. We demonstrated that Bayesian optimization not only provides better, more efficient classification but is also much faster: the number of iterations it required for reaching optimal predictive performance was the lowest of all the tested optimization methods. Moreover, for the Bayesian approach, the choice of parameters in subsequent iterations is directed and justified; therefore, the results obtained by using it are constantly improved and the range of hyperparameters tested provides the best overall performance of Support Vector Machine. Additionally, we showed that a random search optimization of hyperparameters leads to significantly better performance than grid search and heuristic-based approaches. Conclusions: The Bayesian approach to the optimization of Support Vector Machine parameters was demonstrated to outperform other optimization methods for tasks concerned with the bioactivity assessment of chemical compounds. This strategy not only provides a higher accuracy of classification, but is also much faster and more directed than other approaches for optimization. It appears that, despite its simplicity, the random search optimization strategy should be used as a second choice if application of the Bayesian approach is not feasible. Graphical abstract: The improvement of classification accuracy obtained after the application of the Bayesian approach to the optimization of Support Vector Machine parameters.

Pattern Analysis and Applications | 2017

Extreme entropy machines: robust information theoretic classification

Wojciech Marian Czarnecki; Jacek Tabor

Most existing classification methods are aimed at minimization of empirical risk (through some simple point-based error measured with a loss function) with added regularization. We propose to approach the classification problem by applying entropy measures as a model objective function. We focus on Rényi's quadratic entropy and the related Cauchy–Schwarz divergence, which lead to the construction of extreme entropy machines (EEM). The main contribution of this paper is a model based on information theoretic concepts which, on the one hand, offers a new, entropic perspective on known linear classifiers and, on the other, leads to the construction of a very robust method competitive with state-of-the-art non-information-theoretic ones (including Support Vector Machines and Extreme Learning Machines). Evaluation on numerous problems, ranging from small, simple ones from the UCI repository to large (hundreds of thousands of samples), extremely unbalanced (class ratios of up to 100:1) datasets, shows the wide applicability of the EEM in real-life problems. Furthermore, it scales better than all considered competitive methods.

Computer Recognition Systems | 2016

Online Extreme Entropy Machines for Streams Classification and Active Learning

Wojciech Marian Czarnecki; Jacek Tabor

When dealing with large evolving datasets, one needs machine learning models able to adapt to the growing amount of information. In particular, stream classification is a research topic where classifiers need the ability to rapidly change their solutions and behave stably after many changes in the training set structure. In this paper we show how the recently proposed Extreme Entropy Machine can be trained in an online fashion, supporting not only adding points to and removing points from the model but even changing the size of the internal representation on demand. In particular, we show how one can build a well-conditioned covariance estimator in an online scenario. All these operations are guaranteed to converge to the optimal solutions given by their offline counterparts.

Journal of Chemical Information and Modeling | 2017

Creating the New from the Old: Combinatorial Libraries Generation with Machine-Learning-Based Compound Structure Optimization

Sabina Podlewska; Wojciech Marian Czarnecki; Rafał Kafel; Andrzej J. Bojarski

Bioorganic & Medicinal Chemistry Letters | 2017

Quo vadis G protein-coupled receptor ligands? A tool for analysis of the emergence of new groups of compounds over time

Damian Leśniak; Stanisław Jastrzębski; Sabina Podlewska; Wojciech Marian Czarnecki; Andrzej J. Bojarski

European Conference on Machine Learning | 2015

Maximum Entropy Linear Manifold for learning discriminative low-dimensional representation

Wojciech Marian Czarnecki; Rafal Jozefowicz; Jacek Tabor

Schedae Informaticae | 2015

Fast Optimization of Multithreshold Entropy Linear Classifier

Rafał Józefowicz; Wojciech Marian Czarnecki
