Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval | 2021

PYA0: A Python Toolkit for Accessible Math-Aware Search

 
 

Abstract


Mathematical Information Retrieval (MIR) has been actively studied in recent years and many fruitful results have emerged. Among those, the Approach Zero system is one of the few math-aware search engines that is able to perform substructure matching efficiently. Furthermore, it has been deployed in ARQMath2020, the most recent community-wide MIR evaluation, as a strong baseline due to its empirical effectiveness and ability to handle structured math content. However, in order to implement a retrieval model that handles structured queries efficiently, Approach Zero is written in C from the ground up, requiring special pipelines for processing math content and queries. Thus, the system is not conveniently accessible and reusable to the community as a research tool. In this paper, we present PyA0, an easy-to-use Python toolkit built on Approach Zero that improves its accessibility to researchers. We introduce the toolkit interface and report evaluation results on popular MIR datasets to demonstrate the effectiveness and efficiency of our toolkit. We have made PyA0 source code publicly accessible at https://github.com/approach0/pya0, which includes a link to a notebook demo.

Volume None
Pages None
DOI 10.1145/3404835.3462794
Language English
Journal Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Full Text