User impact API: Create maintenance script for refreshing data
Closed, Resolved (Public)

Description

T313393: User impact API: Create GrowthExperimentsUserImpactManager, GrowthExperimentsUserImpactLookup, and GrowthExperimentsUserImpactCompute services describes the services needed to compute, look up, and store user impact data.

This task is about creating a maintenance script that will eventually run periodically (hourly? daily?) in production.

The maintenance script should:

Event Timeline

@kostajh: Will we need any DBA help on this?

The necessary DBA work is in T317534: Create growthexperiments_user_impact table in Wikimedia production. @kostajh pointed out that we could also use the main stash, though; it has some restrictions but does not require any DB setup.

Change 833828 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] Add maintenance script to refresh user impact data

https://gerrit.wikimedia.org/r/833828

Change 834612 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] Add maintenance script for deleting expired user impact data

https://gerrit.wikimedia.org/r/834612

Change 835573 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] RefreshUserImpactData: Fix queries that filesort

https://gerrit.wikimedia.org/r/835573

Change 833828 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Add maintenance script to refresh user impact data

https://gerrit.wikimedia.org/r/833828

Change 834612 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Add maintenance script for deleting expired user impact data

https://gerrit.wikimedia.org/r/834612

Change 835573 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] RefreshUserImpactData: Fix queries that filesort

https://gerrit.wikimedia.org/r/835573

Urbanecm_WMF changed the task status from Open to In Progress. Sep 29 2022, 10:05 AM

Change 837215 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/GrowthExperiments@master] Fix RefreshUserImpactData in MySQL strict mode

https://gerrit.wikimedia.org/r/837215

Change 837215 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Fix RefreshUserImpactData in MySQL strict mode

https://gerrit.wikimedia.org/r/837215

The script fails with a fatal error if there are zero edits tagged with "newcomer task" (which happens to be the case in my dev environment). Since the script doesn't appear to be feature-flagged, it will run on all wikis, including wikis where suggested edits are not set up (the smallest ones).

Stacktrace from my dev environment:

urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$ php maintenance/refreshUserImpactData.php --wiki=awiki --editedWithin=1year --registeredWithin=1year
processing 100 users starting with 0
MediaWiki\Storage\NameTableAccessException from line 43 of /home/urbanecm/unsynced/gerrit/mediawiki/core/includes/Storage/NameTableAccessException.php: Failed to access name from change_tag_def using name = newcomer task
#0 /home/urbanecm/unsynced/gerrit/mediawiki/core/includes/Storage/NameTableStore.php(261): MediaWiki\Storage\NameTableAccessException::newFromDetails()
#1 /home/urbanecm/unsynced/gerrit/mediawiki/extensions/GrowthExperiments/includes/UserImpact/ComputedUserImpactLookup.php(214): MediaWiki\Storage\NameTableStore->getId()
#2 /home/urbanecm/unsynced/gerrit/mediawiki/extensions/GrowthExperiments/includes/UserImpact/ComputedUserImpactLookup.php(161): GrowthExperiments\UserImpact\ComputedUserImpactLookup->getEditData()
#3 /home/urbanecm/unsynced/gerrit/mediawiki/extensions/GrowthExperiments/maintenance/refreshUserImpactData.php(70): GrowthExperiments\UserImpact\ComputedUserImpactLookup->getExpensiveUserImpact()
#4 /home/urbanecm/unsynced/gerrit/mediawiki/core/maintenance/includes/MaintenanceRunner.php(309): GrowthExperiments\Maintenance\RefreshUserImpactData->execute()
#5 /home/urbanecm/unsynced/gerrit/mediawiki/core/maintenance/doMaintenance.php(85): MediaWiki\Maintenance\MaintenanceRunner->run()
#6 /home/urbanecm/unsynced/gerrit/mediawiki/extensions/GrowthExperiments/maintenance/refreshUserImpactData.php(186): require_once('/home/urbanecm/...')
#7 {main}
urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$
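
To illustrate the failure mode in the trace above: the lookup of the "newcomer task" tag ID throws when the tag has no row in change_tag_def. The following is only a minimal sketch of one way to tolerate that case, not the fix that was actually merged. `FakeNameTableStore`, the local `NameTableAccessException` class, and `getTagIdOrNull()` are hypothetical stand-ins written for this example so that it runs outside MediaWiki; only the `getId()` call and the exception name are taken from the stack trace.

```php
<?php
// Hypothetical stand-in for MediaWiki\Storage\NameTableAccessException.
class NameTableAccessException extends RuntimeException {
}

// Hypothetical stand-in for a name table store: maps tag names to numeric IDs.
class FakeNameTableStore {
	/** @var array<string,int> */
	private array $ids;

	public function __construct( array $ids ) {
		$this->ids = $ids;
	}

	// Mirrors the behaviour seen in the trace: throws when the name is unknown.
	public function getId( string $name ): int {
		if ( !array_key_exists( $name, $this->ids ) ) {
			throw new NameTableAccessException(
				"Failed to access name from change_tag_def using name = $name"
			);
		}
		return $this->ids[$name];
	}
}

// Hypothetical helper: returns the tag ID, or null when the tag was never defined,
// so the caller can treat "no such tag" as "zero tagged edits" instead of crashing.
function getTagIdOrNull( FakeNameTableStore $store, string $tagName ): ?int {
	try {
		return $store->getId( $tagName );
	} catch ( NameTableAccessException $e ) {
		return null;
	}
}

// A wiki where no edit has ever been tagged, and one where the tag exists.
$emptyWiki = new FakeNameTableStore( [] );
$activeWiki = new FakeNameTableStore( [ 'newcomer task' => 7 ] );

var_dump( getTagIdOrNull( $emptyWiki, 'newcomer task' ) );
var_dump( getTagIdOrNull( $activeWiki, 'newcomer task' ) );
```

With a guard like this, a caller could skip the tag-based part of the query when the helper returns null, rather than aborting the whole batch of users.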

Thanks. That is addressed in the as-yet-unmerged https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/853501

Change 855575 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData: Add feature flag

https://gerrit.wikimedia.org/r/855575

Change 855576 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/mediawiki-config@master] GrowthExperiments: Set feature-flag for RefreshUserImpactDataMaintenanceScriptEnabled

https://gerrit.wikimedia.org/r/855576

Change 855587 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.8] refreshUserImpactData: Add feature flag

https://gerrit.wikimedia.org/r/855587

Change 855576 merged by jenkins-bot:

[operations/mediawiki-config@master] GrowthExperiments: Set feature-flag for RefreshUserImpactDataMaintenanceScriptEnabled

https://gerrit.wikimedia.org/r/855576

Mentioned in SAL (#wikimedia-operations) [2022-11-10T14:30:55Z] <kharlan@deploy1002> Started scap: Backport for [[gerrit:855576|GrowthExperiments: Set feature-flag for RefreshUserImpactDataMaintenanceScriptEnabled (T313395)]]

Mentioned in SAL (#wikimedia-operations) [2022-11-10T14:31:15Z] <kharlan@deploy1002> kharlan and kharlan: Backport for [[gerrit:855576|GrowthExperiments: Set feature-flag for RefreshUserImpactDataMaintenanceScriptEnabled (T313395)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-11-10T14:35:53Z] <kharlan@deploy1002> Finished scap: Backport for [[gerrit:855576|GrowthExperiments: Set feature-flag for RefreshUserImpactDataMaintenanceScriptEnabled (T313395)]] (duration: 04m 57s)

Change 855587 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.40.0-wmf.8] refreshUserImpactData: Add feature flag

https://gerrit.wikimedia.org/r/855587

Mentioned in SAL (#wikimedia-operations) [2022-11-10T14:58:23Z] <kharlan@deploy1002> Started scap: Backport for [[gerrit:855525|refreshUserImpactData: Add option to use job queue (T322706)]], [[gerrit:855587|refreshUserImpactData: Add feature flag (T313395)]]

Mentioned in SAL (#wikimedia-operations) [2022-11-10T14:58:42Z] <kharlan@deploy1002> kharlan and kharlan: Backport for [[gerrit:855525|refreshUserImpactData: Add option to use job queue (T322706)]], [[gerrit:855587|refreshUserImpactData: Add feature flag (T313395)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-11-10T15:03:10Z] <kharlan@deploy1002> Finished scap: Backport for [[gerrit:855525|refreshUserImpactData: Add option to use job queue (T322706)]], [[gerrit:855587|refreshUserImpactData: Add feature flag (T313395)]] (duration: 04m 47s)

Change 855575 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] refreshUserImpactData: Add feature flag

https://gerrit.wikimedia.org/r/855575

Etonkovidova subscribed.

Checked in betalabs - the proper error is displayed for a wiki without GrowthExperiments. I could not find an example of a wiki in betalabs where the GrowthExperiments extension is enabled but SuggestedEdits is not.

$ mwscript extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=hewiktionary --editedWithin=1year --registeredWithin=1year
The "GrowthExperiments" extension must be installed for this script to run. Please enable it and then try again.