Publication


Featured research published by Petr Sojka.


document engineering | 2011

The art of mathematics retrieval

Petr Sojka; Martin Líška

The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, is presented, and design decisions are discussed. We argue for an approach based on Presentation MathML using a similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene. Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
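The idea of indexing subformulae can be illustrated with a minimal sketch. This is not the MIaS implementation: the function names `canonical` and `subformulae`, the string encoding of subtrees, and the sample formula are all assumptions made for illustration, and the sample MathML carries no XML namespace to keep the tags short.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: Presentation MathML for a^2 + b, without a namespace.
SAMPLE = (
    "<math><mrow>"
    "<msup><mi>a</mi><mn>2</mn></msup>"
    "<mo>+</mo><mi>b</mi>"
    "</mrow></math>"
)

def canonical(node):
    """Serialize a MathML subtree into a compact string usable as an index term."""
    children = list(node)
    if not children:
        # Leaf element: tag plus its text content.
        return f"{node.tag}({(node.text or '').strip()})"
    # Inner element: tag plus the serialized children, in document order.
    return f"{node.tag}[" + ",".join(canonical(c) for c in children) + "]"

def subformulae(mathml):
    """Count every subtree below the <math> root as one candidate subformula."""
    root = ET.fromstring(mathml)
    counts = Counter()
    for node in root.iter():
        if node is root:
            continue  # the <math> wrapper itself is not treated as a subformula
        counts[canonical(node)] += 1
    return counts

if __name__ == "__main__":
    for term, count in subformulae(SAMPLE).items():
        print(count, term)
```

Each serialized subtree could then be stored as a term in a full-text index, so that a query formula matches documents sharing some of its subformulae. How subformulae are normalized and weighted in the real system is not stated here and is left out of the sketch.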


International Conference on Intelligent Computer Mathematics | 2011

Indexing and Searching Mathematics in Digital Libraries

Petr Sojka; Martin Líška

This paper surveys approaches and systems for searching mathematical formulae in mathematical corpora and on the web. The design and architecture of our MIaS (Math Indexer and Searcher) system is presented, and our design decisions are discussed in detail. An approach based on Presentation MathML using a similarity of math subformulae is suggested and verified by implementing it as a math-aware search engine based on the state-of-the-art system, Apache Lucene. Scalability issues were checked based on 324,000 real scientific documents from the arXiv archive with 112 million mathematical formulae. More than two billion MathML subformulae were indexed using our Solr-compatible Lucene extension.


artificial intelligence and symbolic computation | 2008

Automated Classification and Categorization of Mathematical Knowledge

Radim Řehůřek; Petr Sojka

There is a common Mathematics Subject Classification (MSC) system used for categorizing mathematical papers and knowledge. We present results of machine learning of the MSC on full texts of papers in the mathematical digital libraries DML-CZ and NUMDAM. The F1-measure achieved on the classification task of top-level MSC categories exceeds 89%. We describe and evaluate our methods for measuring the similarity of papers in the digital library based on paper full texts.
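For readers unfamiliar with the F1-measure quoted above, the following sketch shows the standard definition for a single category, computed from counts of true positives, false positives, and false negatives. The function name `f1_score` and the example counts are illustrative assumptions; how the paper aggregates scores across MSC categories is not stated in the abstract.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall for one category.

    Uses the algebraically equivalent form 2*TP / (2*TP + FP + FN).
    """
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        return 0.0  # no positives predicted or present; define F1 as 0
    return 2 * true_pos / denominator

if __name__ == "__main__":
    # Made-up counts: 8 papers correctly labeled, 2 wrongly labeled, 2 missed.
    print(f1_score(8, 2, 2))
```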


conference on information and knowledge management | 2015

Combining Text and Formula Queries in Math Information Retrieval: Evaluation of Query Results Merging Strategies

Martin Líška; Petr Sojka; Michal Růžička

Specific to Math Information Retrieval is combining text with mathematical formulae both in documents and in queries. A rigorous evaluation of query expansion and merging strategies combining math and standard textual keyword terms in a query is given. It is shown that techniques similar to those known from textual query processing may be applied in math information retrieval as well, and lead to cutting-edge performance. Striping and merging partial results from subqueries is one technique that improves results measured by information retrieval evaluation metrics like Bpref.
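One plausible reading of "striping and merging" is a round-robin interleaving of the ranked lists returned by the subqueries, skipping documents already taken from another list. The sketch below follows that reading only; it is an assumption, not the strategy evaluated in the paper, and the function name `stripe_merge` and the document identifiers are hypothetical.

```python
from itertools import zip_longest

def stripe_merge(*result_lists):
    """Interleave ranked result lists rank by rank, keeping first occurrences."""
    merged, seen = [], set()
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for tier in zip_longest(*result_lists):
        for doc_id in tier:
            if doc_id is not None and doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged

if __name__ == "__main__":
    text_hits = ["d1", "d2", "d3"]   # made-up results of a text subquery
    math_hits = ["d2", "d4"]         # made-up results of a formula subquery
    print(stripe_merge(text_hits, math_hits))
```

Interleaving by rank gives every subquery a chance to contribute to the top of the merged list, which is one reason such a scheme might help rank-sensitive metrics.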


technical symposium on computer science education | 2003

Interactive teaching materials in PDF using JavaScript

Petr Sojka

The use of the JavaScript language for adding interaction to portable teaching materials of a high typographical quality in PDF file format is described. An extended version of the program TeX called pdfTeX is extremely useful for such purposes. It is shown that applications similar to those done by CGI script on the web can be done in PDF, exploiting the embedded JavaScript engine implementation in PDF viewers.


ACM Transactions on Multimedia Computing, Communications, and Applications | 2018

Gait Recognition from Motion Capture Data

Michal Balazia; Petr Sojka

Gait recognition from motion capture data, as a pattern classification discipline, can be improved by the use of machine learning. This article contributes to the state of the art with a statistical approach for extracting robust gait features directly from raw data by a modification of Linear Discriminant Analysis with Maximum Margin Criterion. Experiments on the CMU MoCap database show that the suggested method outperforms 13 relevant methods based on geometric features and a method to learn the features by a combination of Principal Component Analysis and Linear Discriminant Analysis. The methods are evaluated in terms of the distribution of biometric templates in respective feature spaces expressed in a number of class separability coefficients and classification metrics. Results also indicate a high portability of learned features, which means that we can learn which aspects of walking people generally differ in and extract those as general gait features. Recognizing people without needing group-specific features is convenient, as particular people might not always provide annotated learning data. As a contribution to reproducible research, our evaluation framework and database have been made publicly available. This research makes motion capture technology directly applicable for human recognition.


international conference on pattern recognition | 2016

Learning robust features for gait recognition by Maximum Margin Criterion

Michal Balazia; Petr Sojka

In the field of gait recognition from motion capture data, designing human-interpretable gait features is a common practice of many fellow researchers. To refrain from ad-hoc schemes and to find maximally discriminative features we may need to explore beyond the limits of human interpretability. This paper contributes to the state-of-the-art with a machine learning approach for extracting robust gait features directly from raw joint coordinates. The features are learned by a modification of Linear Discriminant Analysis with Maximum Margin Criterion so that the identities are maximally separated and, in combination with an appropriate classifier, used for gait recognition. Experiments on the CMU MoCap database show that this method outperforms eight other relevant methods in terms of the distribution of biometric templates in respective feature spaces expressed in four class separability coefficients. Additional experiments indicate that this method is a leading concept for rank-based classifier systems.
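As background, the Maximum Margin Criterion in its standard textbook form can be written as follows. The symbols ($S_B$, $S_W$, $W$, $\mu_c$, $n_c$) are chosen here for illustration, and the specific modification used in the paper may differ from this generic formulation.

```latex
% Between-class and within-class scatter for classes c with n_c samples,
% class means \mu_c and overall mean \mu:
S_B = \sum_{c} n_c \,(\mu_c - \mu)(\mu_c - \mu)^{\top}, \qquad
S_W = \sum_{c} \sum_{x \in c} (x - \mu_c)(x - \mu_c)^{\top}

% Maximum Margin Criterion: maximize the difference of the projected scatters
J(W) = \operatorname{tr}\!\left( W^{\top} (S_B - S_W)\, W \right)
\quad \text{subject to} \quad W^{\top} W = I
```

Under the orthonormality constraint, the maximizer is formed by the eigenvectors of $S_B - S_W$ with the largest eigenvalues. Unlike classical Linear Discriminant Analysis, this form does not require inverting $S_W$, which helps when there are fewer samples than raw feature dimensions.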


arXiv: Computer Vision and Pattern Recognition | 2016

Walker-Independent Features for Gait Recognition from Motion Capture Data

Michal Balazia; Petr Sojka

MoCap-based human identification, as a pattern recognition discipline, can be optimized using a machine learning approach. Yet in some applications such as video surveillance new identities can appear on the fly and labeled data for all encountered people may not always be available. This work introduces the concept of learning walker-independent gait features directly from raw joint coordinates by a modification of Fisher’s Linear Discriminant Analysis with Maximum Margin Criterion. Our new approach shows not only that these features can discriminate people other than those they were learned on, but also that the number of learning identities can be much smaller than the number of walkers encountered in real operation.


arXiv: Computer Vision and Pattern Recognition | 2017

An Evaluation Framework and Database for MoCap-Based Gait Recognition Methods

Michal Balážia; Petr Sojka

As a contribution to reproducible research, this paper presents a framework and a database to improve the development, evaluation and comparison of methods for gait recognition from Motion Capture (MoCap) data. The evaluation framework provides implementation details and source code of state-of-the-art human-interpretable geometric features as well as our own approaches where gait features are learned by a modification of Fisher's Linear Discriminant Analysis with the Maximum Margin Criterion, and by a combination of Principal Component Analysis and Linear Discriminant Analysis. It includes a description and source code of a mechanism for evaluating four class separability coefficients of feature space and four rank-based classifier performance metrics. This framework also contains a tool for learning a custom classifier and for classifying a custom query on a custom gallery. We provide an experimental database along with source code for its extraction from the general CMU MoCap database.


technical symposium on computer science education | 2003

Animations in PDF

Petr Sojka

This paper describes a technique to create interactive teaching materials as animations that are stored and distributed in PDF file format. pdfLaTeX with a small macro package, Maple, and JavaScript are used to develop interactive animations of high typographical quality that are fine-tuned for on-screen reading.

Collaboration


Dive into Petr Sojka's collaborations.

Top Co-Authors


Thierry Bouche

Joseph Fourier University


Volker Sorge

University of Birmingham


Alan P. Sexton

University of Birmingham
