Piotr Wierzchoń
Adam Mickiewicz University in Poznań
Publication
Featured research published by Piotr Wierzchoń.
language and technology conference | 2015
Filip Graliński; Piotr Wierzchoń
We present a corpus for training and evaluating systems for the dating of Polish texts. A number of baselines (using year references, knowledge of spelling reforms and birth years) are given for the temporal classification task. We also show that the problem can be viewed as a regression problem and that a standard supervised learning tool (Vowpal Wabbit) can be applied. So far, the best result has been achieved by supervised learning with word tokens and character 5-grams as features. In addition, an error analysis of the results obtained with the best solution is presented in this paper.
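As a rough illustration of two ideas named in this abstract, the sketch below shows a year-reference baseline and the extraction of character 5-grams. It is not the authors' code: the function names, the regular expression, the fallback year and the choice of "latest year mentioned" as the guess are all assumptions made for this example, and the regression step with Vowpal Wabbit is not shown.

```python
import re

# Assumption: a "year reference" is any standalone four-digit number
# between 1500 and 2099 that appears in the text.
YEAR_RE = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def year_reference_baseline(text, default=1900):
    """Guess the publication year as the latest year mentioned in the text.

    Hypothetical baseline: a text cannot mention a year that has not
    happened yet, so the latest reference is a lower bound on its date.
    Falls back to `default` (an arbitrary value) when no year is found.
    """
    years = [int(match) for match in YEAR_RE.findall(text)]
    return max(years) if years else default


def char_ngrams(text, n=5):
    """Return all overlapping character n-grams of `text` (5-grams by default)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    sample = "W roku 1918 Polska odzyskała niepodległość, a w 1920 trwała wojna."
    print(year_reference_baseline(sample))
    print(char_ngrams("niepodległość")[:3])
```

In a regression setup such as the one the abstract mentions, the n-grams would serve as sparse features and the publication year as the numeric target.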
computational methods in science and technology | 2018
Daniel Dzienisiewicz; Łukasz Borchmann; Piotr Wierzchoń; Filip Graliński
The article discusses selected projects from the field of digital humanities realised by the Re-research.pl group. The group consists of researchers from the Institute of Linguistics and the Department of Natural Language Processing at Adam Mickiewicz University, Poznań, Poland. The projects discussed include National Photocorpus of Polish, Discovermat, Korea, Koreans and ‘Koreanity’ in the digitised Polish press of the 20th century, Biography of the Nation, 100,000 ministories, Gonito.net and 50,000 words. Domain and chronologisation index. However, the main focus of the article is the interdisciplinary popular-scientific blog Re-research.pl. The daily blog posts include texts on a variety of subjects, ranging from linguistics, history and folklore to computer science. Selected posts and categories of posts are discussed, such as chronologisational challenges, texts devoted to folklore and materials on the structure of text files. Apart from providing daily analyses, the blog promotes other projects and serves as a dialogue platform for representatives of various fields.
asian conference on intelligent information and database systems | 2017
Piotr Wierzchoń
Lexicography is the science and practice of making dictionaries. Its development has led to new techniques for the visual presentation of lexicographic entries. This article focuses on the technique of photodocumentation, which enables a textual quotation to be shown in its natural context. We aim to present a technological system which will make it possible, relatively cheaply, to produce a monolingual dictionary together with quotations and chronologisation—that is, the date at which a given word first appears. We consider the example of Vietnamese. As a preliminary database of material we selected just over 100 books, which we scanned and from which we excerpted quotations to illustrate the natural use of the headwords.
Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage | 2017
Filip Graliński; Rafał Jaworski; Łukasz Borchmann; Piotr Wierzchoń
This article describes research in automatic content-based temporal classification of texts. Experiments are carried out on a set of texts coming from Polish digital libraries, dating between the years 1814 and 2013. Following successful research in the field of temporal classification, this work aims at creating an automatic dating mechanism to be used in situations where the publication date of the text is unknown. Automatic publication date assessment by a computer system can prove useful for researchers from various fields of the humanities, such as history (incl. history of language), culture-historical archaeology, sociology or anthropology.
text speech and dialogue | 2016
Filip Graliński; Rafał Jaworski; Łukasz Borchmann; Piotr Wierzchoń
This article describes a series of experiments on gender attribution of Polish texts. The research was conducted on the publicly available corpus called “He Said She Said”, consisting of a large number of short texts from the Polish version of Common Crawl. As opposed to other experiments on gender attribution, this research takes on the task of classifying relatively short texts authored by many different people.
asian conference on intelligent information and database systems | 2016
Piotr Wierzchoń
The paper will concern the theoretical and practical problems of analysing the mass of linguistic data which has arisen in conjunction with the development of many fields of life. Moreover, the universe of texts is growing every day – both forwards and backwards. Forwards because every new article, book, blog, e-mail or text message expands the set of existing texts; and backwards because the same set is also expanded whenever a scan is made of another historical text. Our knowledge about past times is growing by leaps and bounds. We are therefore particularly interested in the analysis of historical texts that can be carried out in the second decade of the 21st century.
Investigationes Linguisticae | 2018
Krzysztof Jassem; Filip Graliński; Tomasz Obrębski; Piotr Wierzchoń
Electronic lexicography in the 21st century: Proceedings of eLex 2017 conference, pp. 680-702 | 2017
Łukasz Borchmann; Daniel Dzienisiewicz; Piotr Wierzchoń
text, speech and dialogue | 2016
Filip Graliński; Rafał Jaworski; Łukasz Borchmann; Piotr Wierzchoń
language resources and evaluation | 2016
Filip Graliński; Łukasz Borchmann; Piotr Wierzchoń