Transformer-Encoder-Based Mathematical Information Retrieval

Open Access

  • false

Peer Reviewed

  • true

Abstract

  • Mathematical Information Retrieval (MIR) is the task of finding relevant documents that contain text and mathematical formulas. Retrieval systems must therefore be able to process not only natural language but also mathematical and scientific notation.
    In this work, we evaluate two transformer-encoder-based approaches on a question-answer retrieval task. Our pre-trained ALBERT model demonstrated competitive performance, ranking first for p'@10. Furthermore, we found that separating the pre-training data into chunks of text and formulas improved overall performance on formula data.

Publication Date

  • January 1, 2022