
Performance review of AutoModerator extension
Open, Needs Triage, Public

Description

(Please provide the context of the performance review, and describe how the feature or service works at a high level technically and from a user point of view, or link to documentation describing that.)

AutoModerator is a new extension under development by the Moderator-Tools-Team. It uses a machine learning model to automatically revert edits whose model score crosses a configured threshold. It is configurable by each community that uses it, including turning it on or off and changing the edit summary used for reverts. We are looking to pilot our MVP on the Test and Indonesian Wikipedias, and anticipate being ready in May.
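To make the per-edit decision concrete, here is a minimal sketch in Python; the function name, config keys, and threshold value are illustrative placeholders and do not reflect the extension's actual PHP internals:

```python
# Minimal sketch of the decision AutoModerator makes for each new edit,
# assuming a hypothetical per-wiki config dict; names are illustrative only.

def should_revert(score: float, config: dict) -> bool:
    """Return True when the revert-risk score crosses the configured threshold."""
    if not config.get("enabled", False):
        return False
    return score >= config.get("threshold", 0.99)

# Example: a community that enabled AutoModerator with a custom edit summary.
config = {
    "enabled": True,
    "threshold": 0.99,  # hypothetical key names and value
    "revert_summary": "Reverted by AutoModerator (report false positives on the talk page)",
}

for rev_id, score in [(123, 0.42), (124, 0.995)]:
    if should_revert(score, config):
        print(f"revert rev {rev_id} with summary: {config['revert_summary']}")
    else:
        print(f"leave rev {rev_id} alone")
```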

Preview environment

(Insert one or more links to where the feature can be tested, e.g. on Beta Cluster.)

testwiki
AutoModerator can be enabled/disabled on testwiki by an interface admin editing MediaWiki:AutoModeratorConfig.json, as described in Extension:AutoModerator#On-wiki_configuration. At the time of writing it is enabled. Other than the configuration page, there is no user interface. We have been making some edits during shakedown, and recent job errors can be viewed in the Logstash dashboard linked in the final questionnaire section. Reverts can be viewed at Special:Contributions/AutoModeratorTest.
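Reviewers who want to inspect the live configuration without an interface admin account can read the page directly with MediaWiki's standard action=raw; a minimal sketch (only the page title comes from this task, and the keys inside are whatever the extension defines):

```python
# Fetch the current on-wiki AutoModerator configuration from testwiki.
# Uses MediaWiki's standard action=raw page export; no extension-specific API involved.
import json
import urllib.request

URL = (
    "https://test.wikipedia.org/w/index.php"
    "?title=MediaWiki:AutoModeratorConfig.json&action=raw"
)

with urllib.request.urlopen(URL) as resp:
    config = json.load(resp)

# The exact keys are defined by the extension; we just pretty-print whatever is there.
print(json.dumps(config, indent=2))
```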

beta enwiki
AutoModerator can be enabled/disabled on beta enwiki by an interface admin editing MediaWiki:AutoModeratorConfig.json. At the time of writing it is enabled.

The AutoModerator extension uses the language-agnostic revert risk machine learning model, which has a big limitation for beta wikis: it does revision table lookups against production databases only. Deploying to beta has been useful for some basic shakedown of our extension, but the results we get back from the API will be unrelated to actual edits on beta wiki. We only get results due to name and ID collisions (e.g. the database names are the same and a rev ID is just an auto-incremented number). For this reason you can't make intentionally "bad" edits to trigger reverts on beta wikis.
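For context, the model is served via Lift Wing; the sketch below scores a single revision through the public API Portal endpoint (the endpoint path and response fields reflect our reading of the published revert-risk documentation and should be treated as assumptions). Because the model resolves revision IDs against production data, passing a beta rev ID here just scores whichever production revision happens to share that number.

```python
# Score one revision with the language-agnostic revert-risk model via the
# public Lift Wing endpoint on api.wikimedia.org. Endpoint path and response
# field names are assumptions based on the published revert-risk documentation.
import json
import urllib.request

ENDPOINT = (
    "https://api.wikimedia.org/service/lw/inference/v1/"
    "models/revertrisk-language-agnostic:predict"
)

payload = json.dumps({"lang": "en", "rev_id": 12345}).encode()
req = urllib.request.Request(
    ENDPOINT,
    data=payload,
    headers={
        "Content-Type": "application/json",
        "User-Agent": "automoderator-perf-review-demo",
    },
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# The probability the extension would compare against its threshold (assumed field name).
print(result["output"]["probabilities"]["true"])
```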

Which code to review

(Provide links to all proposed changes and/or repositories. This should also describe changes that have not yet been merged or deployed but are planned prior to deployment, e.g. production Puppet, wmf-config, or in-flight features expected to complete before the launch date.)

https://gerrit.wikimedia.org/g/mediawiki/extensions/AutoModerator

Performance assessment

Please initiate the performance assessment by answering the questions below:

Event Timeline

(For future reference, per https://www.mediawiki.org/wiki/Writing_an_extension_for_deployment linking to https://wikitech.wikimedia.org/wiki/MediaWiki_Engineering/Performance_Review , please use the corresponding form which adds the corresponding team project tag so they can get aware of it. Thanks!)

I had left the tag off the request for now because I was just filing a placeholder for our team's engineers to fill out. This request isn't ready to be actioned yet. Happy to leave it on if that's not confusing.

Ah, sorry! Could you set the status to stalled until the task is ready, then?

Samwalton9-WMF changed the task status from Open to Stalled. Apr 3 2024, 12:56 PM

Makes sense!

larissagaulia subscribed.

Your first setup was right.
I'll remove the tag since it's not yet ready to be triaged, as per @Samwalton9-WMF's comment. Please add the tag back in when it's done so it can appear in our inbox for us to prioritize. Thank you!

@larissagaulia I'm wondering if it wouldn't make sense to do some or all of the performance evaluation on test wiki.

The AutoModerator extension uses the language-agnostic revert risk machine learning model, which has two big limitations for beta wikis:

  • it does revision table lookups against production databases only. Deploying to beta has been useful for some basic shakedown of our extension, but the results we get back from the API will be unrelated to actual edits on beta wiki. We'll only get results due to name and ID collisions (e.g. the database names are the same and a rev ID is just an auto-incremented number). For this reason you can't make intentionally "bad" edits to trigger reverts on beta wikis.
  • we have to use an API gateway on beta wiki since it is hosted on Cloud VPS; production wikis get to bypass the gateway and use an internal endpoint (sketched below)
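A sketch of that routing difference (the internal hostname below is a placeholder, not the real service address):

```python
# Illustrative only: beta (Cloud VPS) has to go out through the public API
# gateway, while production MediaWiki can reach Lift Wing on an internal
# endpoint. The internal hostname here is a placeholder, not the real address.
def liftwing_base_url(environment: str) -> str:
    if environment == "production":
        # Hypothetical internal service address, bypassing the public gateway.
        return "https://inference.internal.example:30443/v1"
    # Beta and other Cloud VPS hosts use the public API gateway.
    return "https://api.wikimedia.org/service/lw/inference/v1"

print(liftwing_base_url("beta"))
print(liftwing_base_url("production"))
```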

@larissagaulia I'm wondering if it wouldn't make sense to do some or all of the performance evaluation on test wiki.

Yes. We generally recommend testing on beta, but of course, if it makes more sense for you to test in a different environment we can try to help you run the evaluations on other envs as well :)

Let us know if you need any help with it and feel free to ask questions.

I've been going back and forth on this. I'll fill this out for what can be checked on beta and we can determine if there should be an additional evaluation in test or our pilot wiki (idwiki).

jsn.sherman changed the task status from Stalled to Open. May 17 2024, 6:17 PM
jsn.sherman updated the task description.

We've deployed to testwiki since opening this request, so I'll update the task accordingly.

@jsn.sherman Is this something that needs to be followed up on soon, or can it wait since we're already live on trwiki?

This isn't currently blocking us since we've really only started gathering feedback from partner/pilot wikis and have feature work in the pipeline. It will be a blocker for a larger rollout after the pilot period.

I believe we'll pilot until the end of July, or at the very least until we get enough feedback from idwiki (or any of the other pilot wikis) as a second pilot wiki. As we respond to our learnings, we can pursue the request for a performance review more intently.