BERT is not The Count: Learning to Match Mathematical Statements with Proofs

Li, Weixian Waylon; Ziser, Yftah; Coavoux, Maximin; Cohen, Shay B.


arXiv:2302.09350 (cs)

[Submitted on 18 Feb 2023]



Abstract: We introduce the task of matching a proof to a given mathematical statement. The task fits well within current research on Mathematical Information Retrieval and, more generally, mathematical article analysis (Mathematical Sciences, 2014). We present a dataset for the task (the MATcH dataset) consisting of over 180k statement-proof pairs extracted from modern mathematical research articles. We find this dataset highly representative of our task, as it consists of relatively new findings useful to mathematicians. We propose a bilinear similarity model and two decoding methods to match statements to proofs effectively. The first decoding method matches a proof to a statement without being aware of other statements or proofs, whereas the second treats the task as a global matching problem. Through a symbol replacement procedure, we analyze the "insights" that pre-trained language models have in such mathematical article analysis. We show that while these models perform well on this task, with a best mean reciprocal rank of 73.7, they rely on a relatively shallow symbolic analysis and matching to achieve that performance.
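
To make the abstract's terminology concrete, the following is a minimal sketch of a bilinear similarity score, the two decoding styles (independent versus global matching), and mean reciprocal rank. It is not the authors' implementation: every function name is hypothetical, the vectors are toy values rather than encoder outputs, and the global step uses exhaustive search over permutations as a stand-in for whatever assignment solver the paper actually uses.

```python
from itertools import permutations


def bilinear_score(x, W, y):
    """Bilinear similarity x^T W y for plain-list vectors x, y and matrix W."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def score_matrix(statements, proofs, W):
    """scores[i][j] = similarity between statement i and proof j."""
    return [[bilinear_score(s, W, p) for p in proofs] for s in statements]


def local_decode(scores):
    """Pick the best-scoring proof for each statement independently.

    Two statements may end up with the same proof.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def global_decode(scores):
    """One-to-one assignment that maximizes the total score.

    Assumption: exhaustive search, feasible only for a handful of pairs.
    """
    n = len(scores)
    best = max(permutations(range(n)),
               key=lambda perm: sum(scores[i][perm[i]] for i in range(n)))
    return list(best)


def mean_reciprocal_rank(scores, gold):
    """Average of 1 / rank of the gold proof, as a value in [0, 1]."""
    total = 0.0
    for row, g in zip(scores, gold):
        rank = 1 + sum(1 for s in row if s > row[g])
        total += 1.0 / rank
    return total / len(scores)


# Toy data (assumed): two statements, two proofs, identity weight matrix.
W = [[1.0, 0.0], [0.0, 1.0]]
statements = [[1.0, 0.0], [1.0, 0.5]]
proofs = [[2.0, 0.0], [1.0, 1.0]]
scores = score_matrix(statements, proofs, W)

print(local_decode(scores))   # both statements choose proof 0
print(global_decode(scores))  # each statement gets a distinct proof
print(mean_reciprocal_rank(scores, gold=[0, 1]))
```

In this toy case the independent decoder assigns proof 0 to both statements, while the global decoder is forced to give each statement a different proof. The function returns a fraction; reading the abstract's 73.7 as that fraction multiplied by 100 is an assumption.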

Comments:	Accepted to the Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023; 14 pages. arXiv admin note: substantial text overlap with arXiv:2102.02110
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2302.09350 [cs.CL]
	(or arXiv:2302.09350v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2302.09350

Submission history

From: Yftah Ziser
[v1] Sat, 18 Feb 2023 14:48:20 UTC (7,655 KB)
