Description
(Please provide the context of the performance review, and describe how the feature or service works at a high level technically and from a user point of view, or link to documentation describing that.)
Automoderator is a new extension under development by the Moderator-Tools-Team. It uses a machine learning model to automatically revert edits whose model score crosses a threshold. Each community that uses it can configure it, including turning it on or off and changing the edit summary used for reverts. We are looking to pilot our MVP on the Test and Indonesian Wikipedias, and anticipate being ready in May.
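As a rough illustration of the behaviour described above, the following Python sketch shows a revert decision driven by per-wiki configuration. The extension itself is written in PHP; the field names (`enabled`, `threshold`, `revert_summary`), the default values, and the choice to treat a score equal to the threshold as crossing it are all hypothetical placeholders, not the extension's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WikiConfig:
    # Hypothetical per-wiki settings. The real keys live in
    # MediaWiki:AutoModeratorConfig.json and may be named differently.
    enabled: bool = False
    threshold: float = 0.95
    revert_summary: str = "Automated revert of a likely damaging edit"


def revert_summary_for(score: float, config: WikiConfig) -> Optional[str]:
    """Return the edit summary for a revert, or None to leave the edit alone."""
    if not config.enabled:
        # The community has switched the feature off.
        return None
    if score < config.threshold:
        # The score does not cross the configured threshold.
        return None
    return config.revert_summary
```

With this shape, turning the feature off or changing the summary is purely a configuration change and needs no code deployment.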
Preview environment
(Insert one or more links to where the feature can be tested, e.g. on Beta Cluster.)
testwiki
AutoModerator can be enabled/disabled on testwiki by an interface admin editing MediaWiki:AutoModeratorConfig.json as described in Extension:AutoModerator#On-wiki_configuration. At the time of writing it is enabled. Other than the configuration page, there is no interface. We have been making some test edits during shakedown, and recent job errors can be viewed in the logstash dashboard linked in the final questionnaire section. Reverts can be viewed at Special:Contributions/AutoModeratorTest
beta enwiki
AutoModerator can be enabled/disabled on beta enwiki by an interface admin editing MediaWiki:AutoModeratorConfig.json. At the time of writing it is enabled.
The AutoModerator extension uses the language-agnostic revert risk machine learning model, which has a big limitation for beta wikis: it performs revision table lookups against production databases only. Deploying to beta has been useful for some basic shakedown of our extension, but the results we get back from the API are unrelated to actual edits on the beta wiki. We only get results at all because of name and ID collisions (e.g. the database names are the same and a rev ID is just an auto-incremented number). For this reason you can't make intentionally "bad" edits to trigger reverts on beta wikis.
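To illustrate why a beta edit cannot influence its own score, here is a minimal Python sketch of how a scoring request might be built. It assumes the model is addressed only by a language code and a revision ID; the endpoint URL and the payload field names (`lang`, `rev_id`) are assumptions about the Lift Wing API rather than something taken from the extension's code, and `build_score_request` is a hypothetical helper. The request is only constructed here, never sent.

```python
import json
import urllib.request

# Assumed endpoint for the language-agnostic revert risk model.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)


def build_score_request(lang: str, rev_id: int) -> urllib.request.Request:
    """Build (but do not send) a scoring request for one revision."""
    payload = json.dumps({"lang": lang, "rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        LIFTWING_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Nothing in the request says which environment the edit came from, so a
# beta revision and a production revision with the same ID look identical.
beta_request = build_score_request("en", 12345)
prod_request = build_score_request("en", 12345)
```

Because the two requests are byte-for-byte identical, the model would resolve both to whatever production revision happens to carry that ID.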
Which code to review
(Provide links to all proposed changes and/or repositories. It should also describe changes which have not yet been merged or deployed but are planned prior to deployment. E.g. production Puppet, wmf config, or in-flight features expected to complete prior to launch date, etc.).
https://gerrit.wikimedia.org/g/mediawiki/extensions/AutoModerator
Performance assessment
Please initiate the performance assessment by answering the below:
- What work has been done to ensure the best possible performance of the feature?
- The code that makes network requests and potentially reverts edits is encapsulated in AutoModeratorFetchRevScoreJob.php to minimize the impact on page load. Within the job, we run a precheck to skip over as many edits as possible before requesting a score from the LiftWing API. There should be no page load cost for view actions and only negligible cost for edits.
- What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?
- Possibly missed opportunities for early returns in the precheck, which would mean requesting a score for edits that could have been skipped.
- Are there potential optimisations that haven't been performed yet?
- We still have the ResourceLoader config from the boilerplate extension even though we don't currently provide any UI outside of the config file. That can be stripped out until we actually need to provide client-side assets.
- Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the MediaWiki Platform Team for advice: [email protected].
- Since almost all of the work is happening in the job queue, I created a logstash dashboard for our jobtype to make it easy to keep an eye on execution time and errors:
- production: https://logstash.wikimedia.org/app/dashboards#/view/1176db40-1943-11ef-a427-37449ed9b38c?_g=(filters%3A!()%2CrefreshInterval%3A(pause%3A!t%2Cvalue%3A0)%2Ctime%3A(from%3Anow-3d%2Cto%3Anow))
- beta: https://logstash.wikimedia.org/app/dashboards#/view/1176db40-1943-11ef-a427-37449ed9b38c?_g=(filters%3A!()%2CrefreshInterval%3A(pause%3A!t%2Cvalue%3A0)%2Ctime%3A(from%3Anow-3d%2Cto%3Anow))
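The precheck-then-score flow described in the first answer above can be sketched in Python as follows. This is not the code in AutoModeratorFetchRevScoreJob.php: the `Edit` fields, the specific skip criteria, and the `run_job` and `skip_reason` helpers are hypothetical and only illustrate the early-return pattern that keeps the network call off the path for most edits.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Edit:
    rev_id: int
    parent_id: int               # 0 means the edit created the page
    namespace: int               # 0 is the main (article) namespace
    user_is_bot: bool
    user_is_automoderator: bool


def skip_reason(edit: Edit) -> Optional[str]:
    """Return why an edit needs no score, or None if it should be scored.

    All criteria here are hypothetical examples of cheap local checks.
    """
    if edit.user_is_automoderator:
        return "own edit"
    if edit.user_is_bot:
        return "bot edit"
    if edit.namespace != 0:
        return "not in main namespace"
    if edit.parent_id == 0:
        return "page creation"
    return None


def run_job(edit: Edit, fetch_score: Callable[[int], float], threshold: float) -> str:
    """Run the cheap precheck first; only call the scoring API if it passes."""
    reason = skip_reason(edit)
    if reason is not None:
        return f"skipped: {reason}"
    score = fetch_score(edit.rev_id)  # the only network call in the job
    return "revert" if score >= threshold else "keep"
```

Passing the scorer in as a callable makes it straightforward to count how often the API is actually reached, which is one way to spot missed early returns.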