
defaultcontentmodel missing from most namespaces in Wikidata namespaces siteinfo (breaks pywikibot)
Closed, Resolved, Public

Description

Most namespaces on Wikidata are no longer declaring the defaultcontentmodel in action=query+meta=siteinfo+siprop=namespaces, e.g. the main (Item) namespace:

$ curl -s 'https://www.wikidata.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2&format=json' | jq '.query.namespaces["0"]'
{
  "id": 0,
  "case": "first-letter",
  "name": "",
  "subpages": false,
  "content": true,
  "nonincludable": false
}

In fact, only four namespaces still declare it:

$ curl -s 'https://www.wikidata.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2&format=json' | jq '.query.namespaces | .[] | select(has("defaultcontentmodel")) | { id, canonical, defaultcontentmodel }'
{
  "id": 640,
  "canonical": "EntitySchema",
  "defaultcontentmodel": "EntitySchema"
}
{
  "id": 641,
  "canonical": "EntitySchema talk",
  "defaultcontentmodel": "wikitext"
}
{
  "id": 2302,
  "canonical": "Gadget definition",
  "defaultcontentmodel": "GadgetDefinition"
}
{
  "id": 2600,
  "canonical": "Topic",
  "defaultcontentmodel": "flow-board"
}
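For scripts that need the same check without jq, a minimal Python sketch could look like the following. It runs on an abridged copy of the responses quoted above instead of calling the API, and the function name split_by_content_model is made up for illustration.

```python
import json

# Abridged sample of the "namespaces" object, copied from the outputs quoted above.
SAMPLE = json.loads("""
{
  "0": {"id": 0, "case": "first-letter", "name": "", "subpages": false,
        "content": true, "nonincludable": false},
  "640": {"id": 640, "canonical": "EntitySchema",
          "defaultcontentmodel": "EntitySchema"},
  "2302": {"id": 2302, "canonical": "Gadget definition",
           "defaultcontentmodel": "GadgetDefinition"}
}
""")


def split_by_content_model(namespaces):
    """Return (declared, missing).

    declared: namespace id -> defaultcontentmodel, for namespaces that have one.
    missing: sorted list of namespace ids that do not declare the field.
    """
    declared = {}
    missing = []
    for ns in namespaces.values():
        if "defaultcontentmodel" in ns:
            declared[ns["id"]] = ns["defaultcontentmodel"]
        else:
            missing.append(ns["id"])
    return declared, sorted(missing)


declared, missing = split_by_content_model(SAMPLE)
print("declared:", declared)
print("missing:", missing)
```

On a live response, the same function would be given the `query.namespaces` object from the siteinfo request shown above.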

One outcome of this is that Pywikibot refuses to operate on Wikidata, since it thinks the Item entity type isn’t supported:

>>> import pywikibot
>>> site = pywikibot.Site("wikidata", "wikidata")
>>> item = pywikibot.ItemPage(site, "Q42")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/luwe/git/pywikibot/core/pywikibot/page/__init__.py", line 3900, in __init__
    ns = site.item_namespace
  File "/home/luwe/git/pywikibot/core/pywikibot/site/_datasite.py", line 91, in item_namespace
    self._item_namespace = self.get_namespace_for_entity_type('item')
  File "/home/luwe/git/pywikibot/core/pywikibot/site/_datasite.py", line 78, in get_namespace_for_entity_type
    raise EntityTypeUnknownError(
pywikibot.exceptions.EntityTypeUnknownError: DataSite("wikidata", "wikidata") does not support entity type "item"

Event Timeline

Workaround for Pywikibot:

site._entity_namespaces['item'] = site.namespaces[0]
>>> import pywikibot
>>> site = pywikibot.Site("wikidata", "wikidata")
>>> item = pywikibot.ItemPage(site, "Q42")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/luwe/git/pywikibot/core/pywikibot/page/__init__.py", line 3900, in __init__
    ns = site.item_namespace
  File "/home/luwe/git/pywikibot/core/pywikibot/site/_datasite.py", line 91, in item_namespace
    self._item_namespace = self.get_namespace_for_entity_type('item')
  File "/home/luwe/git/pywikibot/core/pywikibot/site/_datasite.py", line 78, in get_namespace_for_entity_type
    raise EntityTypeUnknownError(
pywikibot.exceptions.EntityTypeUnknownError: DataSite("wikidata", "wikidata") does not support entity type "item"
>>> site._entity_namespaces['item'] = site.namespaces[0]
>>> item = pywikibot.ItemPage(site, "Q42")

(You’ll probably need a similar line with 120 instead of 0 for the property namespace, and possibly another one for lexemes.)
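If several entity types need patching, the workaround can be wrapped in a small helper. This is only a sketch: the helper name apply_entity_namespace_workaround is hypothetical, it relies on the private `_entity_namespaces` attribute seen in the session above (not a public Pywikibot API), and it assumes that attribute already exists, as it did there after the first failed lookup. A stand-in object replaces the real site so the example runs without Pywikibot or network access.

```python
from types import SimpleNamespace


def apply_entity_namespace_workaround(site, entity_namespaces):
    """Point each entity type at a namespace, as in the manual workaround.

    entity_namespaces maps entity type name -> namespace id, e.g. {"item": 0}.
    Assumes site._entity_namespaces already exists (private attribute).
    """
    for entity_type, ns_id in entity_namespaces.items():
        site._entity_namespaces[entity_type] = site.namespaces[ns_id]
    return site


# Stand-in for a Pywikibot DataSite; the namespace values are placeholders.
fake_site = SimpleNamespace(
    namespaces={0: "ns:0", 120: "ns:120"},
    _entity_namespaces={},
)
apply_entity_namespace_workaround(fake_site, {"item": 0, "property": 120})
print(fake_site._entity_namespaces)
```

With a real site object, the call would presumably be made once per session, after the first failing lookup and before constructing any ItemPage or PropertyPage.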

I can reproduce locally that MediaWikiServices::getInstance()->getNamespaceInfo()->getNamespaceContentModel() returns null for most namespace IDs, including 0 (wikitext on my wiki) and 120 (items on my wiki), but not 640 (EntitySchema).

Hm, this might not actually be a new issue… on English Wikipedia (still on wmf.17) there’s only one namespace with defaultcontentmodel too:

$ curl -s 'https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespaces&formatversion=2&format=json' | jq '.query.namespaces | .[] | select(has("defaultcontentmodel")) | { id, canonical, defaultcontentmodel }'
{
  "id": 2302,
  "canonical": "Gadget definition",
  "defaultcontentmodel": "GadgetDefinition"
}

And I seem to get the same result locally (namespace 0 having content model null) going back as far as wmf/1.37.0-wmf.1 (though I changed the MediaWiki core checkout for this, not extensions, so it’s just a quick check).

It is a regression in wmf.18 after all. I reset mwdebug2001 to wmf.17, and Wikidata reported the wikibase-item content model again; after pulling wmf.18 back in, it returned null:

lucaswerkmeister-wmde@mwdebug2001:~$ sudo -u mwdeploy sed -i '/\bwikidatawiki\b/ s/18/17/' /srv/mediawiki/wikiversions.{json,php}
lucaswerkmeister-wmde@mwdebug2001:~$ mwscript shell.php wikidatawiki
>>> MediaWiki\MediaWikiServices::getInstance()->getNamespaceInfo()->getNamespaceContentModel( 0 )
=> "wikibase-item"
lucaswerkmeister-wmde@mwdebug2001:~$ scap pull
lucaswerkmeister-wmde@mwdebug2001:~$ mwscript shell.php wikidatawiki
>>> MediaWiki\MediaWikiServices::getInstance()->getNamespaceInfo()->getNamespaceContentModel( 0 )
=> null

Change 712113 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser"

https://gerrit.wikimedia.org/r/712113

Change 711714 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.37.0-wmf.18] Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser"

https://gerrit.wikimedia.org/r/711714

Change 712113 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser"

https://gerrit.wikimedia.org/r/712113

Change 711714 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.37.0-wmf.18] Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser"

https://gerrit.wikimedia.org/r/711714

Mentioned in SAL (#wikimedia-operations) [2021-08-12T09:29:51Z] <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.37.0-wmf.18/extensions/Wikibase/data-access/: Backport: [[gerrit:711714|Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser" (T288724)]] (1/2) (duration: 01m 08s)

Mentioned in SAL (#wikimedia-operations) [2021-08-12T09:31:15Z] <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.37.0-wmf.18/extensions/Wikibase/: Backport: [[gerrit:711714|Revert "Inject NamespaceInfo into EntitySourceDefinitionsConfigParser" (T288724)]] (2/2) (duration: 01m 12s)

Should be fixed for now. I’ll leave the task open because we probably want tests for this, or to look into it further.
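One possible shape for such a check, sketched in Python against the siteinfo output rather than as a Wikibase PHP test: compare the declared defaultcontentmodel of selected namespaces with expected values. The function name and both sample payloads are made up for illustration; only the expected value "wikibase-item" for namespace 0 comes from the wmf.17 shell output above.

```python
def find_content_model_mismatches(namespaces, expected):
    """Compare declared default content models with expected ones.

    namespaces: the "namespaces" object from siteinfo (keys are id strings).
    expected: namespace id -> expected defaultcontentmodel.
    Returns a list of (id, expected_model, actual_model_or_None) tuples.
    """
    mismatches = []
    for ns_id, want in sorted(expected.items()):
        actual = namespaces.get(str(ns_id), {}).get("defaultcontentmodel")
        if actual != want:
            mismatches.append((ns_id, want, actual))
    return mismatches


# Namespace 0 as returned during the regression (no defaultcontentmodel).
broken = {"0": {"id": 0, "name": "", "content": True}}
# Hypothetical healthy response: the field is present again.
fixed = {"0": {"id": 0, "name": "", "content": True,
               "defaultcontentmodel": "wikibase-item"}}

EXPECTED = {0: "wikibase-item"}  # value taken from the wmf.17 shell output above
print(find_content_model_mismatches(broken, EXPECTED))
print(find_content_model_mismatches(fixed, EXPECTED))
```

Further namespaces (for example the property namespace) could be added to EXPECTED once their content models are confirmed.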

Side note on my earlier comment that English Wikipedia (still on wmf.17) has only one namespace with defaultcontentmodel, and that namespace 0 has a null content model locally as far back as wmf/1.37.0-wmf.1: I think that was a conflation of two issues. The fact that namespace 0 on English Wikipedia doesn’t declare the wikitext content model (and other special namespaces like Module don’t declare anything either) is strange, but apparently not a recent regression, and unrelated to this issue. 🤷