
Investigation: how many event participants have been affected by IP Blocks
Open, Needs Triage · Public

Description

IP blocking affects a number of communities (https://meta.wikimedia.org/wiki/Talk:No_open_proxies/Unfair_blocking). Experienced editors can more easily request a workaround for a block, such as the IPBlockExempt right on the wikis where they are active. However, IP blocks have acute effects on in-person events: a person who is new to the wikis often encounters the block during the event itself. We want to better understand metrics and data around the impact of IP blocks on good-faith users attending events.

We have event logging data that is recorded when a block notice is shown to a user. @Iflorez has documented several ways to identify the individual users enrolled in Event Registration or in historical events on the Programs and Events Dashboard. To determine whether the event organizer tools being built by the Campaign Product team might be a viable route for reducing the impact of IP blocks, we need to better understand the baseline and metrics around this problem.

At a minimum, we would like to be able to answer the following questions:

  • What percent of participants enrolled in an event experienced an IP block within a specific window of time (a month, a quarter, a year, etc.) around when they attended the event?
  • What percent of event participants experienced an IP block during an event they were participating in? (This may require a more complex analysis.)
  • Which geographies have the most affected users?
  • If we can divide the data between in-person and online-first events, what differences do we see in impact?
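As a starting point for the first two questions, here is a minimal sketch of how the percentage could be computed once participant and block-notice records have been pulled. It is only an illustration: the function name, the tuple layouts, and the sample data are all hypothetical, and the real event logging schema may differ. Setting `window_days=0` restricts the match to the event itself.

```python
from datetime import datetime, timedelta


def percent_blocked_near_event(participants, block_notices, window_days=30):
    """Percent of distinct participants shown a block notice near their event.

    participants:  iterable of (user_id, event_start, event_end)   [hypothetical layout]
    block_notices: iterable of (user_id, shown_at)                 [hypothetical layout]
    """
    window = timedelta(days=window_days)

    # Index block notice timestamps by user for quick lookup.
    notices_by_user = {}
    for user_id, shown_at in block_notices:
        notices_by_user.setdefault(user_id, []).append(shown_at)

    all_users = set()
    affected_users = set()
    for user_id, start, end in participants:
        all_users.add(user_id)
        earliest, latest = start - window, end + window
        if any(earliest <= t <= latest for t in notices_by_user.get(user_id, [])):
            affected_users.add(user_id)

    if not all_users:
        return 0.0
    return 100.0 * len(affected_users) / len(all_users)


# Made-up sample data: four participants in one event on 1 July 2024.
start = datetime(2024, 7, 1, 10, 0)
end = datetime(2024, 7, 1, 16, 0)
participants = [(1, start, end), (2, start, end), (3, start, end), (4, start, end)]
block_notices = [
    (1, datetime(2024, 7, 1, 11, 0)),   # during the event
    (2, datetime(2024, 6, 20, 9, 0)),   # 11 days before the event
    (5, datetime(2024, 7, 1, 12, 0)),   # not a participant, ignored
]

within_month = percent_blocked_near_event(participants, block_notices, window_days=30)
during_event = percent_blocked_near_event(participants, block_notices, window_days=0)
```

Counting distinct users rather than enrollments keeps someone who attends several events from being counted more than once; whether that is the right denominator is a decision for whoever owns the metric.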

Notes

Notes from @Iflorez:

A query to pull all participant user IDs from CampaignEvents:

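# Assumes the wmfdata package, which provides mariadb.run
from wmfdata import mariadb

# All distinct participant user IDs registered through CampaignEvents
user_ids_query = '''
    SELECT DISTINCT ce_participants.cep_user_id
    FROM ce_participants
'''
cep_user_ids = mariadb.run(user_ids_query, 'centralauth')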

And a second query to match the CampaignEvents user IDs to global usernames:

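# Get usernames for the participant IDs.
# query_vars was not defined in the original snippet; this definition assumes
# mariadb.run returned a pandas DataFrame with a cep_user_id column.
query_vars = {'cep_user_id_tuple': tuple(cep_user_ids['cep_user_id'].tolist())}
user_names_participants_query = '''
    SELECT gu_name AS username,
           gu_id AS user_id
    FROM globaluser
    WHERE globaluser.gu_id IN {cep_user_id_tuple}
'''
user_names_p = mariadb.run(user_names_participants_query.format(**query_vars), 'centralauth')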
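One caveat when formatting a Python tuple into the IN clause above: a single-element tuple renders as `(5,)`, which is not valid SQL. The following is a minimal sketch of a helper that avoids this; `sql_in_list` is a hypothetical name and not part of wmfdata or the original notebook.

```python
def sql_in_list(ids):
    """Render integer IDs as a SQL IN list such as "(1, 2, 3)".

    Coercing every value with int() also ensures that only numbers
    end up in the query string.
    """
    unique_ids = sorted({int(i) for i in ids})
    if not unique_ids:
        raise ValueError("need at least one ID to build an IN list")
    return "(" + ", ".join(str(i) for i in unique_ids) + ")"


# The default tuple rendering breaks for a single ID; the helper does not.
naive_single = str(tuple([5]))
safe_single = sql_in_list([5])
safe_many = sql_in_list([3, 1, 3])
```

The result could then be passed as the `cep_user_id_tuple` value when formatting the query.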

You can view cells 9-18 in this GitHub repo for more on editor data pulling for CampaignEvents monthly reporting.

See also this thread

Prior work

Event Timeline

mpopov subscribed.

Status update: Product Analytics is going to own a hypothesis under WE 4.2, likely in Q2 (since we are fully booked for Q1), to the effect of:

If we develop a metric for measuring how many event participants are prevented from participating in events due to IP blocks, we will be able to measure baselines across different dimensions (such as geographic regions and languages) to understand the severity and distribution of the problem.

(To be refined by the hypothesis owner (TBD) with @kostajh as KR owner)

More details in Slack.

kostajh renamed this task from Investigation: how many event participants have been effected by IP Blocks to Investigation: how many event participants have been affected by IP Blocks. Jul 12 2024, 7:58 AM
kostajh added a project: FY24-25 WE4.2.