Estimating absolute binding free energies from molecular simulations is a key step in computer-aided drug design pipelines, but agreement between computational results and experiments remains inconsistent. Both the accuracy of the computational model and the quality of the statistical sampling contribute to this discrepancy, yet disentangling the two remains a challenge. In this study, we address the sampling problem with an automated protocol based on OneOPES, an enhanced sampling method that exploits replica exchange and can accelerate sampling along several collective variables. We apply this protocol to 37 host-guest systems. The simulations are simple to set up and produce well-converged binding free energy estimates without the need to optimize simulation parameters, which provides a reliable solution to the sampling problem. This, in turn, allows for a systematic comparison of force fields and their ranking according to the correlation between simulations and experiments, which can inform the selection of an appropriate model. The protocol can be readily adapted to test more force field combinations and to study more complex protein-ligand systems, where the choice of physical model is often based on heuristic considerations rather than systematic optimization.
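The ranking step described above can be illustrated with a minimal sketch. It assumes that the Pearson correlation coefficient is the correlation measure, which the text does not specify, and that each force field provides one computed binding free energy per system. The function names, the force field labels `ff_a` and `ff_b`, and all numerical values are hypothetical placeholders, not results from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def rank_force_fields(experiment, predictions):
    """Sort force fields by correlation with experiment, best first.

    experiment:  experimental binding free energies, one per system
    predictions: dict mapping a force field label to its computed
                 binding free energies, in the same system order
    """
    scores = {
        name: pearson_r(experiment, values)
        for name, values in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder values in kcal/mol, chosen only to exercise the code.
experiment = [-4.0, -5.5, -7.0, -8.5]
predictions = {
    "ff_a": [-3.5, -5.0, -7.5, -8.0],
    "ff_b": [-6.0, -4.0, -7.0, -5.0],
}
ranking = rank_force_fields(experiment, predictions)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

A rank-based measure or an error metric such as the root-mean-square error could be substituted for `pearson_r` without changing the structure of the comparison.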