Determine and implement SUL 3 login handshake mechanism
Open, Needs Triage · Public

Description

CentralAuth relies on a central session (a MediaWiki session on a predefined domain, like login.wikimedia.org) for looking up the user's identity on any given wiki. This requires a handshake: the browser, the server when accessed via the central session domain, and the server when accessed via the local wiki domain need to exchange information with each other in a way that allows the server to verify that the user talking to it through the login domain and the user talking to it through the local domain are the same.

Today, there are two such handshake mechanisms used:

  • Central login: after the user successfully logs in on the local wiki, prove that to the central wiki and establish a session there.
  • Central autologin: retrieve the user's central session from the central wiki and log them in on the local wiki.

T348388: Use central login wiki for login (SUL3) will change the login process - credential verification will happen on the central domain, not the local domain, so instead of communicating the fact of a successful login from the local domain to the central domain, we will need to do it in the opposite direction. We need to come up with an exact algorithm for this. This is the most security-sensitive part of CentralAuth and will require careful auditing.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Tgr moved this task from Ready to In progress on the SUL3 board.

Central login algorithm today:

  1. after a successful login on the local wiki, store a set of data in the local session:
    1. a random secret
    2. the login type (login or signup) and the "remember me" flag
    3. the three "return to" values (title, query, anchor)
  2. store a set of data in the central token store (with 1 minute expiry), keyed by a random token:
    1. the secret from above
    2. the user's name and ID
    3. the wiki ID
    4. potentially other things extensions add via the CentralAuthLoginRedirectData hook (this is used by MobileFrontend to preserve the mobile domain)
  3. redirect the user to Special:CentralLogin/start on the central login wiki, and pass the token from the previous step as a query parameter
  4. retrieve and delete the data from step 2 from the central token store, verify the user exists etc.
  5. log in the user on the login wiki by creating and persisting a local session, create a stub session in the central session backend (this is something that holds the user's name and ID but isn't actually treated as a valid session), and store the central session ID in the local session
    1. this needs to handle various edge cases:
      1. the user is already logged in on the login wiki (don't do anything, use the session ID of the existing session in the next step)
      2. the user is already logged in on the login wiki by a different name, and switching accounts (override the existing session with the stub session)
      3. the user already has a stub session (reuse it)
      4. the user already has a stub session, but with a different username (error out; also in this code path the data is not deleted in step 4)
  6. store a set of data in the central token store (with 1 minute expiry), keyed by a new random token:
    1. the central session ID (i.e. the key of the stub session in the central session backend)
    2. the secret from step 2
  7. redirect back to Special:CentralLogin/complete on the local wiki (using the wiki ID from the central token store, and with possible modifications to the URL by the CentralAuthSilentLoginRedirect hook, used by MobileFrontend to preserve the mobile domain), and pass the token from the step above as a query parameter
  8. retrieve and delete the data from step 6 from the central token store, and the session data from step 1, verify the secret matches and that the user exists
  9. replace the stub central session with a proper session, store the central session ID in the local session, delete the local session data from step 1
  10. log the user in locally by writing the session (not sure what's the point of this, it should have already happened when submitting the login form)
  11. schedule an edge login
  12. fetch the "return to" values from step 1, possibly change them via the CentralAuthPostLoginRedirect hook, and do the redirect
  13. the edge login will, among other things, make a request to Special:CentralAutoLogin/refreshCookies on the central wiki, which will copy the "remember me" flag from the central session to the loginwiki local session (which possibly means adding a cookie matching the user token in gu_token)

Steps 1-3 are on the local wiki, 4-7 on the central login wiki, 8-12 on the local wiki again, 13 on the central login wiki again.
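
The two hops above rely on the same primitive: a short-lived, single-use token that points at server-side data. A minimal sketch of that primitive, assuming an in-memory dict in place of the real central token store. The `TokenStore` class, its `put`/`take` methods and the field names are hypothetical and only illustrate the idea; they are not the CentralAuth API.

```python
import secrets
import time


class TokenStore:
    """In-memory stand-in for the central token store (hypothetical API)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def put(self, payload, ttl=60):
        """Store payload under a fresh random token that expires after ttl seconds."""
        token = secrets.token_urlsafe(32)
        self._data[token] = (self._clock() + ttl, payload)
        return token

    def take(self, token):
        """Retrieve and delete. Returns None if missing, expired or already used."""
        entry = self._data.pop(token, None)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            return None
        return payload


store = TokenStore()

# Steps 1-3, local wiki: remember a secret locally, hand the rest over via a token.
local_session = {"secret": secrets.token_hex(32)}
start_token = store.put(
    {"secret": local_session["secret"], "name": "Alice", "wiki_id": "enwiki"}
)

# Steps 4-7, central login wiki: consume the token, pass the secret back.
start = store.take(start_token)
complete_token = store.put(
    {"secret": start["secret"], "central_session_id": "stub-session-id"}
)

# Step 8, local wiki: consume the second token and compare secrets.
complete = store.take(complete_token)
secret_ok = complete is not None and secrets.compare_digest(
    complete["secret"], local_session["secret"]
)
```

Because `take` deletes the entry, replaying either URL yields nothing, and a token that outlives its `ttl` is treated the same as a missing one.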

Central autologin algorithm today:

  1. in various settings, a request (could be AJAX, a top-level redirect, a pixel etc) is made to Special:CentralAutoLogin/checkLoggedIn on the central login wiki
  2. after verifying that the user exists etc, store the user's central user ID in the central token store (with 1 minute expiry), keyed by a random token (to mitigate T59081)
  3. redirect to Special:CentralAutoLogin/createSession on the local wiki, pass the token from the step above as query parameter, and possibly various other parameters related to what should happen at the end of autologin (local wiki ID, returnto, whether the output should be JSON, JS or a pixel, whether to use the mobile domain...)
  4. retrieve and delete the data from step 2 from the central token store, create an anonymous session and store the central user ID in it.
  5. store some data in the central token store (with 1 minute expiry), keyed by a random token:
    1. the user's central user ID
    2. the ID of the local wiki (passed as a URL parameter)
  6. redirect to Special:CentralAutoLogin/validateSession on the central wiki, pass the token from the step above as query parameter, and forward other query parameters
  7. retrieve and delete the data from step 5 from the central token store, verify that the user ID and the wiki ID are correct
  8. store some data in the central token store (with 1 minute expiry), keyed by a random token (it actually uses the same token as step 5, but that doesn't seem important):
    1. the user's central user ID
    2. the ID of the local wiki (passed as a URL parameter)
    3. data needed to set up a session: username, central user token, "remember me" flag, central session ID
  9. redirect to Special:CentralAutoLogin/setCookies on the local wiki, pass the token from the step above as query parameter, and forward other query parameters
  10. retrieve and delete the data from step 8 from the central token store, verify that the user ID and the wiki ID are correct (for the user ID, compare it to the session data from step 4), the various data fields are consistent with each other, the user exists etc.
  11. validate the central user token (the user might have logged out in the meantime)
  12. log in the user by writing the local session (note the user might not have an account on this wiki at this point, but with CentralAuth the "local session" also handles second-level-domain cookies and the user might have accounts on other wikis in the same family, so we still need to do this)
  13. schedule an edge login
  14. for certain types of autologin, autocreate and log in the local user if it doesn't exist (and invalidate the CA user cache to make sure the new account is picked up); there's some SessionManager hackery to ensure 12/14 are done in a single session write
  15. handle the final output, depending on the autologin type (e.g. with type=redirect we need to redirect to returnto; with type=script we need to add JS that shows a success notice)

So step 1 is done on the local wiki, 2-3 on the central wiki, 4-6 on the local wiki (or edge wiki if this is an edge login), 7-9 on the central wiki again, 10-15 on the local or edge wiki again.
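
As an illustration of the cross-checks in step 10, here is a rough sketch of the comparison between the token data from step 8 and the session data from step 4. The function name, the dict keys and the error strings are made up for this example; the real checks also cover user existence and field consistency, which are left out here.

```python
def check_set_cookies(token_data, local_session, this_wiki_id):
    """Return None if the request may proceed, otherwise a short error string."""
    if token_data is None:
        return "token missing, expired or already used"
    if token_data["wiki_id"] != this_wiki_id:
        return "token was issued for a different wiki"
    # Step 4 stored the central user ID in the (still anonymous) local session.
    if token_data["central_user_id"] != local_session.get("pending_central_user_id"):
        return "user ID does not match the one stored in step 4"
    return None


session = {"pending_central_user_id": 42}
good = {"central_user_id": 42, "wiki_id": "enwiki", "name": "Alice"}

ok = check_set_cookies(good, session, "enwiki")
wrong_wiki = check_set_cookies(good, session, "dewiki")
wrong_user = check_set_cookies(dict(good, central_user_id=7), session, "enwiki")
```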

Some generic things to keep in mind:

  • every step should validate whether the user exists and is attached
  • every step should validate whether it is on the right wiki
  • where relevant, steps should validate whether the session used is a CentralAuth session
  • every step should set the right caching headers (mostly just means disabling caching, but anything that can be encountered by an anonymous user must be cacheable)

At a high level, without the security steps, the new login process looks like this:

  1. the user goes to Special:Userlogin on the local wiki, which starts the local AuthManager login process
  2. the authentication provider redirects to Special:Userlogin on the central wiki (or rather shared login domain, since we won't exactly have a central login wiki)
  3. the user does a normal login there
  4. the user is redirected back (via PostLoginRedirect) to the login continue step on the local wiki, the outcome of the login is passed back as a query parameter
  5. the local wiki finishes the login based on what information it received; AuthManager handles autocreation if needed

Together with the security steps, it might look like this:

  1. the user goes to Special:Userlogin on the local wiki, which starts the local AuthManager login process
  2. the authentication provider stores some data in the central token store, keyed by a random token:
    1. a random secret
    2. the wiki ID
    3. maybe the returnto parameters, to prevent tampering?
  3. the random secret is also stored in the local session
  4. the authentication provider redirects to Special:Userlogin on the shared login domain
  5. the user does a normal login there (this also handles the "remember me" flag in the central session, we are not using a stub session so no need to do any magic about that)
  6. the authentication provider, in a post-login/post-signup callback, reads and deletes the data from step 2, re-stores it under a new random token (with a different key namespace) with some extra data:
    1. the user name or ID
    2. the central session ID
    3. the "remember me" flag
  7. the user is redirected back (via PostLoginRedirect) to the login continue step on the local wiki, with the token from the step above passed as a query parameter
  8. the authentication provider on the local wiki looks up and deletes the data from step 6 in the token store, verifies the secret against the session, verifies the wiki ID, and returns the username to AuthManager to indicate a successful login (AuthManager will then log the user in and handle autocreation)
    1. FIXME 1: how to set the "remember me" flag? AuthManager expects it to come from user input via the RememberMeAuthenticationRequest; authentication requests are frozen once the authentication started
    2. FIXME 2: how to make sure the central session ID is written into the local session (which is required for it to be valid)? Otherwise the user will end up with two central sessions, and logout won't work as expected.
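
To make the proposed flow easier to review, here is a minimal sketch of steps 2-8, assuming a plain dict as the token store and plain dicts as sessions. All function names, key names and the "start"/"complete" namespaces are hypothetical, expiry is omitted, and consuming the local secret on use is an assumption of the sketch. It does not address FIXME 1 or FIXME 2.

```python
import secrets

token_store = {}  # stand-in for the central token store; expiry omitted


def put(namespace, payload):
    """Store payload under a new random token in the given key namespace."""
    token = secrets.token_urlsafe(32)
    token_store[(namespace, token)] = payload
    return token


def take(namespace, token):
    """Retrieve and delete; None if the token is unknown or already used."""
    return token_store.pop((namespace, token), None)


def local_start(local_session, wiki_id):
    """Steps 2-4 (local wiki): returns the token to send to the login domain."""
    secret = secrets.token_hex(32)
    local_session["sul3_secret"] = secret
    return put("start", {"secret": secret, "wiki_id": wiki_id})


def central_finish(start_token, user_name, central_session_id, remember_me):
    """Step 6 (shared login domain): runs after a successful login there."""
    data = take("start", start_token)
    if data is None:
        return None
    data.update(
        user_name=user_name,
        central_session_id=central_session_id,
        remember_me=remember_me,
    )
    return put("complete", data)


def local_continue(complete_token, local_session, wiki_id):
    """Step 8 (local wiki): returns the username on success, None otherwise."""
    data = take("complete", complete_token)
    expected = local_session.pop("sul3_secret", None)  # assumption: single use
    if data is None or expected is None:
        return None
    if not secrets.compare_digest(data["secret"], expected):
        return None
    if data["wiki_id"] != wiki_id:
        return None
    return data["user_name"]


session = {}
t1 = local_start(session, "enwiki")
t2 = central_finish(t1, "Alice", "central-session-id", True)
result = local_continue(t2, session, "enwiki")
```

In this sketch `result` is the username that would be handed to AuthManager. Using separate namespaces means a step 2 token cannot be replayed as a step 6 token.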

Security-wise:

  • the token mechanism ensures all data (including the login result) is passed safely, and cannot be tampered with
  • the random secret prevents session fixation (we know the user in step 8 is the same as the user in step 1 because it has the same secret in its local session; the secret has been passed along step-by-step via the token store and cannot be compromised)
  • the login on the shared domain is self-contained (AuthManager does not know or care this is part of a central login process, it's just a normal login setting normal CentralAuth cookies, same as it worked in the SUL1 phase)
  • an attacker could send a victim to the login page on the shared domain, prepared such that (after a successful login) it redirects the victim to the return wiki and page determined by the attacker. On the return wiki we'll know something is wrong (since the secret does not match) but we should think carefully about the UX implications.
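
The last bullet can be spelled out with a small sketch: the attacker starts a login in their own browser, then gets the victim to complete it. The names below are illustrative only. The point is just that the secret comparison fails in the victim's browser, which is where the UX question comes from.

```python
import secrets


def secret_matches(token_data, local_session):
    """True only if this browser's local session started this login."""
    expected = local_session.get("secret")
    if expected is None:
        return False
    return secrets.compare_digest(token_data["secret"], expected)


# The attacker started the login, so the secret lives in the attacker's session.
attacker_session = {"secret": secrets.token_hex(32)}

# The victim logs in on the shared domain; the token data carries that secret.
token_data = {"secret": attacker_session["secret"], "user_name": "Victim"}

# The victim comes back to the local wiki with their own session.
victim_without_login = {}
victim_with_own_login = {"secret": secrets.token_hex(32)}

rejected_a = not secret_matches(token_data, victim_without_login)
rejected_b = not secret_matches(token_data, victim_with_own_login)
```

Both victim cases are rejected, so the victim ends up logged in centrally but sees an error locally.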

Note that unlike central login and central autologin, which are both immediate redirect chains and can use short expiries, the new login process involves showing a login form to the user, so it's slow. The local login session times out after 5 min, which might or might not be something to fix. (That timeout won't prevent central login from succeeding, but the user won't be logged in locally at the end, and we need to explain what happened.) There might be a user expectation that a login form with username and password fields is a "safe" step: you can spend a bunch of time staring at it and it will still work. In any case the token store expiries should probably match the local session expiry.

Note that Special:Userlogin on the shared login domain is shown to the user, so the user may bookmark it. Next time, the user may open the login page on the shared domain directly, without visiting Special:Userlogin on the local wiki first. So we either need to guarantee that the login procedure succeeds when started at step 4 (without a local session), or try to create a local session (by redirecting the user first to the local wiki and then back to the login domain) and hope the browser saves it.

Tgr renamed this task from Determine SUL 3 login handshake mechanism to Determine and implement SUL 3 login handshake mechanism. · Wed, Jul 31, 6:25 PM