Publication


Featured research published by Dariusz Ceglarek.


Advances in Intelligent Information and Database Systems | 2010

Semantic Compression for Specialised Information Retrieval Systems

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

The aim of this work is to present some of the ongoing research done as part of the development of the Semantically Enhanced Intellectual Property Protection System (SEIPro2S). The main focus is on describing methods that allow for the creation of more concise documents that preserve the meaning of their originals. These compacting methods are therefore referred to as semantic compression.


International Conference on Computational Collective Intelligence | 2009

Semantically Enhanced Intellectual Property Protection System - SEIPro2S

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

The aim of this work is to present some of the capabilities of the Semantically Enhanced Intellectual Property Protection System (SEIPro2S). The system has reached a prototype phase where experiments are possible. It uses extensive semantic net algorithms for the Polish language that enable it to detect similarities between two compared documents at a level far beyond simple text matching. SEIPro2S benefits both from using a local document repository and from Web-based resources. The main focus of this work is to give the reader an overview of the architecture and some actual results.


International Conference on Computational Collective Intelligence | 2010

Quality of semantic compression in classification

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

This article presents the results of implementing semantic compression for English. The idea of semantic compression is reintroduced with examples, and the steps taken to perform the experiment are given. The task of re-engineering available structures so that they can be applied to the existing project infrastructure for experiments is described. The experiment demonstrates the validity of the research, along with real examples of semantically compressed documents.


Asian Conference on Intelligent Information and Database Systems | 2011

Towards Knowledge Acquisition with WiSENet

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

This article continues the research that started with the idea of semantic compression. Having shown that semantic compression is a viable concept for English, the authors decided to focus on potential applications. An algorithm is presented that employs WiSENet to allow knowledge acquisition with flexible rules that yield high-precision results. A detailed discussion is given, with a description of the devised algorithm, usage examples and results of experiments.


International Conference on Artificial Intelligence and Soft Computing | 2012

Fast plagiarism detection by sentence hashing

Dariusz Ceglarek; Konstanty Haniewicz

This work presents the Sentence Hashing Algorithm for Plagiarism Detection (SHAPD). To present the user with the best results, the algorithm exploits a particular trait of written texts, their natural fragmentation into sentences, and then employs a set of special techniques for text representation. The results obtained demonstrate that the algorithm delivers a solution faster than the alternatives. Its algorithmic complexity is logarithmic, so it performs better than most algorithms that use dynamic programming to find the longest common subsequence.
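The general idea of comparing documents through hashes of their sentences can be illustrated with a minimal sketch. This is not the SHAPD algorithm itself, whose details the abstract does not give; the naive sentence splitter, the normalization step, the choice of SHA-1, and the names `sentence_hashes` and `shared_sentences` are all assumptions made for illustration.

```python
import hashlib
import re


def sentence_hashes(text):
    """Map a hash of each normalized sentence to the original sentence."""
    hashes = {}
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Normalize (assumption): lowercase and keep only word characters.
        words = re.findall(r"\w+", sentence.lower())
        if not words:
            continue
        digest = hashlib.sha1(" ".join(words).encode("utf-8")).hexdigest()
        hashes[digest] = sentence
    return hashes


def shared_sentences(doc_a, doc_b):
    """Return sentences of doc_a whose normalized form also occurs in doc_b."""
    a = sentence_hashes(doc_a)
    b = sentence_hashes(doc_b)
    return [a[h] for h in a if h in b]


doc_a = "Semantic nets help retrieval. The weather was fine."
doc_b = "It rained all day. Semantic nets help retrieval!"
common = shared_sentences(doc_a, doc_b)
print(common)
```

Because each sentence is reduced to a hash, matching becomes a dictionary lookup rather than a character-by-character comparison of the two texts.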


Archive | 2007

Automated Acquisition of Semantic Relations for Information Retrieval Systems

Dariusz Ceglarek; Wojciech Rutkowski

Given the continuous growth of the world’s information resources, companies and other organizations need to obtain, aggregate, process and utilize them appropriately in order to maximize the effectiveness of their activities. Since document libraries (or the number of sources to filter from) are becoming bigger and bigger, it is crucial to provide a trusted system that can find the resources relevant to a user’s needs. This is the main goal of information retrieval (IR) systems (Daconta et al. 2003). Traditionally, the effectiveness of IR systems is measured by two basic factors: recall and precision. Both are expressed as a percentage or as a value between 0 and 1. Suppose we have a set of documents. A user has specific information needs, represented by a query. The task of an IR system is to provide the user with the relevant documents from the set. Recall is the ratio of the relevant documents returned by the IR system to the number of all relevant documents, and precision is the ratio of the returned relevant documents to all returned documents. In a perfect system, both recall and precision would reach 100%. This is the goal of developing and improving retrieval systems.
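The definitions of recall and precision given above can be written down as a short sketch. The function name `precision_recall` and the sample document identifiers are hypothetical; returning 0.0 when a set is empty is an assumption made here to avoid division by zero.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    returned: identifiers of the documents the IR system returned
    relevant: identifiers of all documents relevant to the query
    """
    returned = set(returned)
    relevant = set(relevant)
    hits = returned & relevant  # relevant documents that were returned
    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 documents returned, 3 relevant overall.
returned_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]
precision, recall = precision_recall(returned_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this example two of the four returned documents are relevant, so precision is 2/4, and two of the three relevant documents were found, so recall is 2/3.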


International Conference on Computational Collective Intelligence | 2011

Domain based semantic compression for automatic text comprehension augmentation and recommendation

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

This work presents an application of semantic compression in which domain frequency dictionaries are used to augment the comprehension of documents. This is achieved by incorporating user feedback into the proposed solution. Experiments and examples of actual output are given. Moreover, a measure that allows changes in the structure of the available groups to be evaluated is defined and presented.


ADBIS Workshops | 2013

A Detection of the Most Influential Documents

Dariusz Ceglarek; Konstanty Haniewicz

This work is a result of ongoing research on semantic compression and robust algorithms applicable to plagiarism detection. The article includes a brief description of the Sentence Hashing Algorithm for Plagiarism Detection (SHAPD), along with a comparison with other available alternatives that use frame structures for subsequence detection. The core of this publication is devoted to applying SHAPD to the task of discovering the most influential documents in a corpus. The experiments were carried out on multiple datasets that differ in structure and content. The observations gathered during the experiments are summarised in the article. The experiments allowed the authors to verify their initial hypothesis that it is possible to single out the most important documents in a corpus capturing the citation relations among them.


Information Management, Innovation Management and Industrial Engineering | 2011

Methods of Detecting Rules for Discovery of New Concepts for Semantic Networks

Dariusz Ceglarek; Konstanty Haniewicz

Successful experiments in extracting previously uncharted concepts using specialized finite state automata and data already stored in a semantic network encouraged the authors to further extend the capabilities of semi-automatic concept acquisition. This paper focuses on methods of extracting viable propositions that can be included in a rule repository used to retrieve candidates for semantic network extension.


International Conference on Computational Collective Intelligence | 2012

Robust plagiary detection using semantic compression augmented SHAPD

Dariusz Ceglarek; Konstanty Haniewicz; Wojciech Rutkowski

Collaboration


Dive into Dariusz Ceglarek's collaborations.

Top Co-Authors

Konstanty Haniewicz

Poznań University of Economics
