Przemysław Skibiński

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Przemysław Skibiński is active.

Explore More

Publication

Featured researches published by Przemysław Skibiński.

advances in databases and information systems | 2007

Combining efficient XML compression with query processing

Przemysław Skibiński; Jakub Swacha

This paper describes a new XML compression scheme that offers both high compression ratios and short query response time. Its core is a fully reversible transform featuring substitution of every word in an XML document using a semi-dynamic dictionary, effective encoding of dictionary indices, as well as numbers, dates and times found in the document, and grouping data within the same structural context in individual containers. The results of conducted tests show that the proposed scheme attains compression ratios rivaling the best available algorithms, and fast compression, decompression, and query processing.

data compression conference | 2004

Variable-length contexts for PPM

Przemysław Skibiński; Szymon Grabowski

This paper presents a PPM variation which combines traditional character based processing with string matching. Such an approach can effectively handle repetitive data and can be used with practically any algorithm from the PPM family. The algorithm, inspired by its predecessors, PPM/sup */ and PPMZ, searches for matching sequences in arbitrarily long, variable-length, deterministic contexts. The experimental results show that the proposed technique may be very useful, especially in combination with relatively low order (up to 8) models, where the compression gains are often significant and the additional memory requirements are moderate.

international conference on experience of designing and applications of cad systems in microelectronics | 2007

Fast Transform for Effective XML Compression

Przemysław Skibiński; Szymon Grabowski; Jakub Swacha

The main drawback of the XML format seems to be its verbosity, a key problem especially in case of large documents. Therefore, efficient encoding of XML constitutes an important research issue. In this work, we describe a preprocessing transform meant to be used with popular LZ77-style compressors. We show experimentally that our transform, albeit quite simple, leads to better compression ratios than existing XML-aware compressors. Moreover, it offers high decoding speed, which often is of utmost priority.

Information Sciences | 2006

PPM with the extended alphabet

Przemysław Skibiński

In the following paper we propose modification of Prediction by Partial Matching (PPM)-a lossless data compression algorithm, which extends an alphabet, used in the PPM method, to long repeated strings. Usually the PPM algorithms alphabet consists of 256 characters only. We show, on the basis of the Calgary corpus [T.C. Bell, J. Cleary, I.H. Witten, Text compression. Advanced Reference Series, Prentice Hall, Englewood Cliffs, New Jersey, 1990], that for ordinary files such a modification improves the compression performance in lower, but not greater than 10, orders. However, for some kind of files, this modification gives much better compression performance than any known lossless data compression algorithm.

conference on current trends in theory and practice of informatics | 2008

A highly efficient XML compression scheme for the web

Przemysław Skibiński; Jakub Swacha; Szymon Grabowski

Contemporary XML documents can be tens of megabytes long, and reducing their size, thus allowing to transfer them faster, poses a significant advantage for their users. In this paper, we describe a new XML compression scheme which outperforms the previous state-of-the-art algorithm, SCMPPM, by over 9% on average in compression ratio, having the practical feature of streamlined decompression and being almost twice faster in the decompression. Applying the scheme can significantly reduce transmission time/bandwidth usage for XML documents published on the Web. The proposed scheme is based on a semi-dynamic dictionary of the most frequent words in the document (both in the annotation and contents), automatic detection and compact encoding of numbers and specific patterns (like dates or IP addresses), and a back-end PPM coding variant tailored to efficiently handle long matching sequences. Moreover, we show that the compression ratio can be improved by additional 9% for the price of a significant slow-down.

data compression conference | 2005

Two-level directory based compression

Przemysław Skibiński

Summary form only given. The basic idea of preprocessing is to transform the text into some intermediate form which can be used as input of any existing general-purpose compressor and compressed more efficiently. Dictionary-based preprocessing is based on the notion of replacing whole words with shorter codes. We present a dictionary-based preprocessing technique and its implementation called TWRT (two-level word replacing transformation). Our preprocessor uses several dictionaries and divides files into various kinds. The first level dictionaries (small dictionaries) are specific for some kind of data (e.g., programming language, references). The second level dictionaries (large dictionaries) are specific for natural languages (e.g., English, Russian, French). On the Calgary corpus, TWRT improves the compression performance of bzip2 by over 7% and PPMonstr by about 6% on average. Even for the top compressor nowadays, PAQ6, the gain is significant - 5%. On multilingual text files, TWRT improves the compression performance of bzip2, PPMonstr, and PAQ6 by about 8%. Moreover, TWRT improves the compression speed with PAQ6 and on larger files with PPMonstr.

web information systems engineering | 2009

Visually Lossless HTML Compression

Przemysław Skibiński

The verbosity of the Hypertext Markup Language (HTML) remains one of its main weaknesses. This problem can be solved with the aid of HTML specialized compression algorithms. In this work, we describe a visually lossless HTML transform that, combined with generally used compression algorithms, allows to attain high compression ratios. Its core is a transform featuring substitution of words in an HTML document using a static English dictionary, effective encoding of dictionary indexes, numbers, and specific patterns. Visually lossless compression means that the HTML document layout will be modified, but the document displayed in a browser will provide the exact fidelity with the original. The experimental results show that the proposed transform improves the HTML compression efficiency of general purpose compressors on average by 21% in the case of gzip, achieving comparable processing speed. Moreover, we show that the compression ratio of gzip can be improved by up to 32% for the price of higher memory requirements and much slower processing.

Software - Practice and Experience | 2008

Effective asymmetric XML compression

Przemysław Skibiński; Szymon Grabowski; Jakub Swacha

advances in databases and information systems | 2007

Fast and efficient log file compression

Przemysław Skibiński; Jakub Swacha

Information Technology and Libraries | 2009

The Efficient Storage of Text Documents in Digital Libraries

Przemysław Skibiński; Jakub Swacha

Explore More

Collaboration

Dive into the Przemysław Skibiński's collaboration.

Top Co-Authors

Jakub Swacha

University of Szczecin

View shared research outputs

Top Co-Authors

Szymon Grabowski

Lodz University of Technology

View shared research outputs

Top Co-Authors

Sebastian Deorowicz

Silesian University of Technology

View shared research outputs

Explore More

Przemysław Skibiński

University of Wrocław

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot

Dive into the research topics where Przemysław Skibiński is active.

Publication

Featured researches published by Przemysław Skibiński.

Combining efficient XML compression with query processing

Variable-length contexts for PPM

Fast Transform for Effective XML Compression

PPM with the extended alphabet

A highly efficient XML compression scheme for the web

Two-level directory based compression

Visually Lossless HTML Compression

Effective asymmetric XML compression

Fast and efficient log file compression

The Efficient Storage of Text Documents in Digital Libraries

Collaboration

Dive into the Przemysław Skibiński's collaboration.