T252079#6117724 identified an incident that was exacerbated by our formatter cache storing negative results for the full 24-hour TTL.
In this incident, Lua calls incorrectly always returned no value when checking for terms, and this empty result was cached for 24 hours.
The train was rolled back in the hours after the incident; however, the cache continued to serve bad data until the next day.
In a comment on that ticket I speculated about some possible solutions:
Another possible area for investigation: is a 24h TTL actually needed for this cache?
Currently the TTL used is the same generic TTL for the "shared cache" used for entity storage.
We could experiment with this value, as a much shorter TTL would mean a shorter period of fallout if something like this were to happen again.
If we can't improve the situation then perhaps we should think about one of the other ideas in this ticket?
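
To make the TTL question concrete, here is a minimal sketch of one possible variant: keeping a long TTL for real values while giving negative (empty) results a much shorter one. This is not the existing formatter cache code. The class `TermCache`, its parameters, and the 60-second negative TTL are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TermCache:
    """Toy in-memory cache with separate TTLs for hits and misses.

    Hypothetical illustration only; not the real formatter cache.
    """

    def __init__(
        self,
        positive_ttl: float,
        negative_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.positive_ttl = positive_ttl  # seconds to keep real values
        self.negative_ttl = negative_ttl  # seconds to keep "no value" results
        self._clock = clock
        # key -> (cached value or None, expiry timestamp)
        self._store: Dict[str, Tuple[Optional[str], float]] = {}

    def get_or_load(
        self, key: str, loader: Callable[[str], Optional[str]]
    ) -> Optional[str]:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            # Entry is still fresh, whether it holds a value or None.
            return entry[0]
        value = loader(key)
        # A negative result (None) expires sooner, so a bad "no value"
        # answer stops being served shortly after the loader is fixed.
        ttl = self.negative_ttl if value is None else self.positive_ttl
        self._store[key] = (value, now + ttl)
        return value


# Assumed values: 24h for real terms, 60s for negative results.
cache = TermCache(positive_ttl=24 * 60 * 60, negative_ttl=60)
```

With a setup like this, an incident where lookups wrongly return nothing would only poison the cache for the negative TTL after a rollback, rather than for the full 24 hours. Whether a split TTL or a shorter overall TTL is preferable would still need the experimentation described above.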