Dariusz Czerski
Polish Academy of Sciences
Publication
Featured research published by Dariusz Czerski.
intelligent information systems | 2006
Krzysztof Ciesielski; Michał Dramiński; Mieczyslaw A. Klopotek; Dariusz Czerski; Slawomir T. Wierzchon
Document map creation algorithms like WebSOM are computationally expensive, and their results are hard to reproduce even from the same set of documents. A new methodology is therefore urgently needed for constructing document maps that can handle streams of new documents entering a document collection. This paper addresses that challenge: incrementality of the document map is ensured through a multi-stage process. The quality of the map generation process has been investigated using a number of clustering and classification measures. Conclusions are drawn concerning the impact of the incremental, topic-sensitive approach on map quality.
Challenging Problems and Solutions in Intelligent Systems | 2016
Dariusz Czerski; Krzysztof Ciesielski; Michał Dramiński; Mieczyslaw A. Klopotek; Paweł Łoziński; Slawomir T. Wierzchon
We introduce a new semantic search engine developed at our institute. Its unique feature is the automatic construction of semantic resources, such as the discovery of millions of facts and IS-A relations, and the automated generation of sentiment analysis dictionaries. We also developed a new method of document categorization. The engine can be queried in natural language and provides interfaces for use not only by humans but also by machines.
international conference on human system interactions | 2010
Szymon Chojnacki; Dariusz Czerski; Mieczyslaw A. Klopotek
The purpose of this article is to describe the performance-related features of our tag recommending web service. The service is one of several systems currently integrated with the multiplexing recommender framework for the Bibsonomy.org bookmarking portal. The framework was introduced during the ECML PKDD Discovery Challenge 2009. Thanks to a good balance between the speed and the quality of our recommendations, we managed to outperform other systems according to one of the three most important evaluation measures, i.e. the proportion of correctly recommended tags that have been clicked by a user.
E-Service Intelligence | 2007
Mieczyslaw A. Klopotek; Slawomir T. Wierzchon; Krzysztof Ciesielski; Michał Dramiński; Dariusz Czerski
The increasing number of documents returned by search engines for typical requests makes it necessary to look for new ways of representing the contents of the results. Simple ranked lists, or even hierarchies of results, no longer seem adequate for some applications. Within a broad stream of novel approaches, we concentrate on the well-known WebSOM project (research by Kohonen and co-workers), which produces two-dimensional maps of documents. A pixel on such a map represents a cluster of documents. The document clusters are arranged on the two-dimensional map in such a way that clusters closer together on the map contain documents more similar in content. The WebSOM-like document map representation is regrettably time- and space-consuming, and it also raises questions of scaling and updating of document maps. In this chapter we describe some approaches we found useful in coping with these challenges. Among others, techniques such as Bayesian networks, growing neural gas, SVD analysis and artificial immune systems are discussed. Based on these techniques, we created a full-fledged search engine for collections of documents (up to a million) that can present replies to queries on-line in graphical form on a document map. It supports the exploration of free-text documents through a navigational two-dimensional document map in which geometrical vicinity reflects the conceptual closeness of the documents. We extended WebSOM’s goals with a multilingual approach and new forms of geometrical representation, and we also experimented with various modifications to the clustering process itself. The crucial issue for the user's understanding of the two-dimensional map is the clustering of its contents and the appropriate labelling of the clustered map areas. It has long been recognized that clustering techniques are vital in information retrieval on the Web.
We discuss several important issues that need to be resolved in order to present the user with an understandable map of documents. The first issue is the way the documents are clustered. In domains such as biomedical texts, where concepts are not sharply separated, a fuzzy-set theoretic approach to clustering appears promising. The second issue is the initialization of topical maps. Our experiments showed that the random initialization performed in the original WebSOM may not lead to a meaningful map structure. We therefore proposed several methods for topical map initialization, based on SVD, PHITS and Bayesian network techniques, which we explain in the chapter. We also consider selected optimization approaches to dictionary reduction, so-called reference vector optimization, and new approaches to map visualization. Finally, we report on an experimental study of the impact of various parameters of the map creation process on the quality of the final map.
international conference on artificial neural networks | 2005
Mieczyslaw A. Klopotek; Slawomir T. Wierzchon; Krzysztof Ciesielski; Michał Dramiński; Dariusz Czerski
Search engines based on SOM document maps require initial document clustering in order to present results in a meaningful way. This paper reports on our ongoing research into applications of Bayesian networks for document map creation at various stages of document processing. Modifications to the original algorithms are proposed, motivated by our experience that a crisp boundary between classes or groups of documents gives better results.
federated conference on computer science and information systems | 2016
Paweł Łoziński; Dariusz Czerski; Mieczyslaw A. Klopotek
Pattern-based methods of IS-A relation extraction rely heavily on so-called Hearst patterns. These are ways of expressing instance enumerations of a class in natural language. While these lexico-syntactic patterns prove quite useful, they may not capture all taxonomical relations expressed in text. Therefore, in this paper we describe a novel method of IS-A relation extraction from patterns, which uses morpho-syntactic annotations along with the grammatical case of the noun phrases that constitute the entities participating in an IS-A relation. We also describe a method for increasing the number of extracted relations, which we call pseudo-subclass boosting and which has potential application in any pattern-based relation extraction method. Experiments were conducted on a corpus of about 0.5 billion web documents in Polish.
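To make the notion of a Hearst pattern concrete, here is a minimal sketch of one classic English pattern ("class such as A, B and C") written as a regular expression. It is an illustration only and not the method of the paper above, which works on Polish text and uses morpho-syntactic annotations and grammatical case. The names `SUCH_AS` and `extract_is_a` are hypothetical, and single words stand in for full noun phrases.

```python
import re

# One classic Hearst pattern: "<class> such as <A>, <B> and <C>".
# Assumption: every entity is a single word; a real system would match
# noun phrases using part-of-speech or morpho-syntactic tags instead.
SUCH_AS = re.compile(
    r"(\w+),? such as "
    r"(\w+(?:, (?!(?:and|or)\b)\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_is_a(sentence):
    """Return (instance, class) pairs matched by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        # Split the enumeration on ", ", " and ", " or ", ", and " ...
        instances = re.split(r",? (?:and|or) |, ", match.group(2))
        pairs.extend((inst, hypernym) for inst in instances if inst)
    return pairs


if __name__ == "__main__":
    print(extract_is_a("He plays instruments such as guitar, piano and violin."))
```

A single surface pattern like this misses relations expressed in other ways, which is the gap the abstract points to.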
intelligent information systems | 2013
Mieczyslaw A. Klopotek; Slawomir T. Wierzchon; Dariusz Czerski; Krzysztof Ciesielski; Michał Dramiński
This paper proposes a calculus for computing personalized PageRank for complex categories given a precomputed set of primitive categories. This is work in progress aimed at reducing the number of PageRanks that must be precomputed for a set of (nearly disjoint) categories.
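One well-known property that any such calculus can build on is that personalized PageRank is linear in its preference (teleport) vector. The sketch below checks this on a toy graph for the union of two disjoint categories. It is not the calculus proposed in the paper; the graph, the category sets, and the helper names `personalized_pagerank` and `uniform_on` are all hypothetical.

```python
def personalized_pagerank(out_links, preference, damping=0.85, iterations=100):
    """Power iteration for PageRank with a preference (teleport) vector.

    out_links[i] lists the nodes that node i links to. Assumptions: every
    node has at least one outgoing link, and preference sums to 1.
    """
    rank = list(preference)
    for _ in range(iterations):
        new_rank = [(1.0 - damping) * p for p in preference]
        for src, targets in enumerate(out_links):
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def uniform_on(nodes, n):
    """Preference vector that teleports uniformly to the given nodes."""
    return [1.0 / len(nodes) if i in nodes else 0.0 for i in range(n)]


# Toy 4-node graph; categories A and B are disjoint sets of nodes.
graph = [[1, 2], [2], [0, 3], [0]]
cat_a, cat_b = {0, 1}, {2}

# Precomputed PageRanks for the two primitive categories.
rank_a = personalized_pagerank(graph, uniform_on(cat_a, 4))
rank_b = personalized_pagerank(graph, uniform_on(cat_b, 4))

# PageRank for the complex category "A or B", computed directly ...
rank_union = personalized_pagerank(graph, uniform_on(cat_a | cat_b, 4))

# ... and combined from the precomputed vectors, weighted by category size.
w_a = len(cat_a) / len(cat_a | cat_b)
combined = [w_a * a + (1.0 - w_a) * b for a, b in zip(rank_a, rank_b)]

if __name__ == "__main__":
    print(max(abs(u - c) for u, c in zip(rank_union, combined)))
```

The two results agree up to floating-point error, so the union never needs its own power iteration. Categories that overlap need more care, since the weights can no longer be read off the category sizes alone.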
Archive | 2007
Dariusz Czerski; Krzysztof Ciesielski; Michał Dramiński; Mieczyslaw A. Klopotek; Slawomir T. Wierzchon
In this paper we present a new approach to the compression of inverted lists in the indexes of information retrieval systems. The technique exploits contextual information obtained from an unsupervised clustering process run on the document collection. A substantial improvement in the compression factor is achieved.
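The abstract does not spell out the technique, so the following is only a sketch of the standard baseline it would be measured against: storing a sorted posting list as variable-byte encoded gaps. It also hints at why clustering information can help, since documents of one cluster that receive nearby identifiers produce small gaps. The function names and the two posting lists are hypothetical.

```python
def varbyte_encode(number):
    """Encode a non-negative integer in 7-bit groups, lowest group first."""
    out = bytearray()
    while number >= 128:
        out.append((number & 0x7F) | 0x80)  # high bit set: more bytes follow
        number >>= 7
    out.append(number)
    return bytes(out)


def compress_postings(doc_ids):
    """Store a sorted posting list as variable-byte encoded gaps."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        out += varbyte_encode(doc_id - previous)
        previous = doc_id
    return bytes(out)


def decompress_postings(data):
    """Invert compress_postings and return the list of document IDs."""
    doc_ids, current, value, shift = [], 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value
            doc_ids.append(current)
            value, shift = 0, 0
    return doc_ids


# Made-up posting lists for one term, five documents each.
scattered = [1000, 51000, 102000, 150500, 203000]  # IDs assigned arbitrarily
clustered = [70000, 70003, 70004, 70010, 70011]    # IDs assigned cluster by cluster

if __name__ == "__main__":
    for name, postings in (("scattered", scattered), ("clustered", clustered)):
        print(name, len(compress_postings(postings)), "bytes")
```

With the same number of postings, the clustered list compresses to fewer bytes because all gaps after the first fit in a single byte.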
intelligent information systems | 2005
Guillermo Ch. Bali; Dariusz Czerski; Mieczyslaw A. Klopotek; Andrzej Matuszewski
The problem of testing statistical hypotheses of independence of two multiresponse variables is considered. This is a specific inferential setting for analyzing certain patterns, particularly in questionnaire data. A data analyst normally looks for certain combinations of responses that are chosen more frequently than others. Based on an experimental study, we formulate some practical advice and suggest areas of further research.
Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2014
Mieczyslaw A. Klopotek; Slawomir T. Wierzchon; Krzysztof Ciesielski; Dariusz Czerski; Michał Dramiński
This paper reflects on the relationships among random walk with back step, lazy walk and traditional PageRank. It is demonstrated that although all of them differ semantically, they can still be computed using the very same algorithm.