Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat: Difference between revisions

From Wikidata
Gettinwikiwidit (talk | contribs)
::::: You seem to agree that cleaning up is required. How would you describe the situation? Unclean? Dirty? --- [[User talk:Jura1|Jura]] 10:42, 15 August 2020 (UTC)
:::::: {{ping|Jura1}} You're not engaging with any of the information I'm providing. You're simply declaring yourself the arbiter of right and wrong. I'm happy to have a productive conversation about the data but not to genuflect. When you're ready for the former, please let me know. -- [[User:Gettinwikiwidit|Gettinwikiwidit]] ([[User talk:Gettinwikiwidit|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 10:47, 15 August 2020 (UTC)
::::::: I'm ok with a mere revert. Did you seek consensus for the change you made? As you are aware, in the meantime I listed [[Q98077491]] for deletion. --- [[User talk:Jura1|Jura]] 10:49, 15 August 2020 (UTC)


== Bots not required to be Open Source? ==

Revision as of 10:49, 15 August 2020

tree-lined

Hi, suppose that I have a pedestrian walkway, alley, avenue, promenade, road (sometimes also a parking lot or a square or a building) and I want to encode that it's "tree-lined" (in my language, "alberato"), how do I do that? It's quite a common description and sometimes they are considered scenic and part of the cultural heritage also because of this feature. So, how do we cover this information in The Avenue, Raglan Castle (Q24256091), Hamilton Terrace (Q47490961), Chaussee (Q41397963) or the next item I would like to create for Wiki Loves Monuments?--Alexmar983 (talk) 22:16, 25 July 2020 (UTC)[reply]

You could use avenue (Q7543083) for streets, not sure about parking lots etc. Ghouston (talk) 00:39, 26 July 2020 (UTC)[reply]
I don't want instances for a specific concept that is tree-lined, I want to add "tree lined" to different concepts. For example an avenue is in an urban area, you can have country roads with trees and they are not avenues. Should I try a qualifier? Should I create an item for the concept? --Alexmar983 (talk) 03:49, 26 July 2020 (UTC)[reply]
avenue (Q7543083) is not necessarily urban, and most urban streets with "avenue" in their name are not examples of Q7543083. FWIW: "avenue" in the Q7543083 sense is probably not common U.S. English, I'm not sure we even have a word for it. Pretty common in the UK, maybe the main meaning of avenue there. - Jmabel (talk) 16:02, 26 July 2020 (UTC)[reply]
There are many avenues in the UK that are tree-lined but there are also many other streets with different names that are tree-lined. There are also large numbers of avenues that aren't tree-lined. Whether the term historically had any connection to trees, I think it has lost that connection in British English. From Hill To Shore (talk) 18:20, 26 July 2020 (UTC)[reply]
@From Hill To Shore: So are you saying that UK English now does not use this sense of "avenue" any more than American English? If so, is there any word for this in contemporary UK English? Avenue (landscape) doesn't really have much to say about contemporary usage of the word, other than just its use in "the usual suite of words used in street names". - Jmabel (talk) 21:29, 26 July 2020 (UTC)[reply]
@Jmabel: People in the UK may use that sense of the word in some contexts, but if you are wanting a word that will be instantly recognised as "tree-lined" then Avenue isn't the one from a UK perspective (I know several avenues in my area that don't have trees). The text at the top of c:Category:Avenues in England perhaps puts it more clearly; "Avenues - the garden and landscape architecture feature... This is not the category for streets or roads which include the name 'Avenue', but for deliberately planted parallel rows of trees, hedges or other flora, unless the named 'Avenue' photo is focused on a regularly spaced twin line of trees." I can't think of a specific phrase that we would use in the UK other than, "tree-lined." It is possible that there is a word that fits, but I can't think of it. From Hill To Shore (talk) 23:00, 26 July 2020 (UTC)[reply]
@From Hill To Shore: would you agree, though, that avenue (Q7543083) is certainly this sense of the term? - Jmabel (talk) 01:10, 27 July 2020 (UTC)[reply]
It seems to be how it was intended, but items like Sixth Avenue (Q109873) seem to be instances based on name alone. Ghouston (talk) 04:48, 27 July 2020 (UTC)[reply]

All this discussion simply proves, besides my original comment, that you need to encode "tree-lined" as a separate concept. So how do we do it?--Alexmar983 (talk) 00:32, 29 July 2020 (UTC)[reply]

So no clue? I have two more items of tree-lined alley to create, I put them in a personal list and I am ready for whatever solution you want to propose.--Alexmar983 (talk) 16:32, 6 August 2020 (UTC)[reply]

Thanks in any case.--Alexmar983 (talk) 00:13, 11 August 2020 (UTC)[reply]
P2670 seems to be the way to go. --- Jura 00:29, 11 August 2020 (UTC)[reply]
Should I create an item to be its argument?--Alexmar983 (talk) 22:43, 11 August 2020 (UTC)[reply]

If you know how to properly use OpenStreetMap tag or key (P1282), this is the tag on OSM.--Alexmar983 (talk) 00:44, 12 August 2020 (UTC)[reply]

Do islands need to have separate items for its administrative territory? (2)

I am sorry I have not replied to the discussion before (now archived here) because of irl problems. I still want to ask about the problem I mentioned there. Say, I want to know articles about regencies in Indonesia in the Javanese Wikipedia and the query is using instance of (P31) for regency of Indonesia (Q3191695). There are regencies in Indonesia that comprise a whole island group such as Selayar Isls. Managing this using an approach like calorie (Q87260855) would cause problems. The supposed combined item for Selayar Isls. then wouldn't come up in the query because it would be a Wikipedia article covering multiple topics (Q21484471), not a Q3191695. RXerself (talk) 15:38, 2 August 2020 (UTC)[reply]
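The query described above can be sketched as follows. This is only a minimal sketch: it assumes the usual Wikidata Query Service conventions (the `wdt:`/`wd:` prefixes and `schema:about` / `schema:isPartOf` for sitelinks), the helper name `regency_query` is hypothetical, and the code only builds the SPARQL text without contacting any endpoint.

```python
def regency_query(class_qid: str = "Q3191695",
                  wiki_host: str = "jv.wikipedia.org") -> str:
    """Build a SPARQL query listing items that are direct instances (P31)
    of `class_qid` and that have an article on the given Wikipedia."""
    return f"""
SELECT ?item ?article WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://{wiki_host}/> .
}}
""".strip()


query = regency_query()
print(query)
```

An item typed only as Q21484471 would not match the first triple pattern, which is exactly the problem raised here: its Javanese article would be missing from the result.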

I'm quite sure you want at least two items for an island. One for the administrative unit, one for the island, both will have a different P31. And some islands might possibly have more administrations (many countries have like provinces/municipalities/neighbourhoods and a small island might be all three). Edoderoo (talk) 18:25, 2 August 2020 (UTC)[reply]
+1 to Edoderoo. Obviously, not every island is an administrative unit, but when they are that merits two separate items, linked with coextensive with (P3403). - Jmabel (talk) 18:31, 2 August 2020 (UTC)[reply]
Technically you can create a page "X (island)" which redirects to article "X", and have an item about the island connected to that redirect. (Or the other way around.) I am not sure everybody here likes that, but I have found it very useful. Islands and administrative entities often have very different founding dates. Most islands here have a history back to the last ice age, while most administrative territories only have a history of decades. 62 etc (talk) 18:42, 2 August 2020 (UTC)[reply]
Yes I know about the different area and different founding dates from the last thread. I want to repeat that the islands I am referring to here are the ones which also exist as their own individual administrative unit with exactly the same border and area. There are Wikipedias which treat the two as separate subjects and thus have separate articles. I understand that having two separate items would accommodate such cases. Having only one item would also create a problem when querying those articles. The question I am asking now is the vice versa problem which arises when we have two items. RXerself (talk) 16:13, 3 August 2020 (UTC)[reply]
I have just come to think about it again: connecting it to a redirect may be reasonable, and there has been an RfC about it before, but I'm not sure that it reached a consensus? I like the sound of it but I think that the list of Wikipedia articles returned from the query would contain redirect pages (and I'm not sure whether it's sound from the point of view of the data consumer). RXerself (talk) 07:29, 9 August 2020 (UTC)[reply]
Just tried one, I realized that we can't use redirect pages for Wikipedia links in Wikidata. Welp. RXerself (talk) 07:07, 10 August 2020 (UTC)[reply]
@RXerself: It's technically possible to do so, but it is deliberately made difficult. I think you have to temporarily change the page away from being a redirect, link it, and change it back. There definitely are some done like this on purpose, and I gather that there is no hard-and-fast rule against the practice. My own feeling is that if it is allowed, it shouldn't be so tricky to do and there should be a more overt override, and if it isn't allowed it shouldn't be OK to override this way, but it is as it is. - Jmabel (talk) 15:33, 10 August 2020 (UTC)[reply]
Please note that there are two separate questions here:
  1. Should there be separate entities for geographical-island and island-as-administrative-division? (Edoderoo and Jmabel just above, and I'm getting the impression general current consensus, all say "yes".)
  2. How do you backlink these entities to the Wikipedias, especially if a single Wikipedia article covers both the geographical and the administrative entities, and especially if different Wikipedias partition things differently? To handle this, do we need a third, pseudo entity for either-the-geographical-or-the-administrative-island-but-we're-not-sure? (That's where the example of calorie (Q87260855) comes in, and although I'm the one who brought it up -- and who created Q87260855 in the first place! -- it does indeed cause problems and I don't think I like it in the case of islands.)
This is the Bonnie and Clyde problem on steroids, and it's a thorny one. —Scs (talk) 15:02, 3 August 2020 (UTC)[reply]
I think the issue goes much further than this. This issue occurs not just for islands but actually most political entities such as towns, cities, municipalities (e.g. Kesswil (Q69413) and Kesswil (Q22388447)) where the human settlement at that place is co-existing with the political entity at that place and there is only one Wikipedia article (in most languages). This causes the same issue with geonames for example making the distinction ([1] and [2]). Currently it is basically ignored for the most part but using the "Bonnie and Clyde" approach would really mean that linking Wikipedia articles to these places becomes very difficult (how will you find the correct Wikipedia article for a town for example)? I think an easier solution for this problem would actually be to only have 2 articles: main article about the community itself and this is then located in the administrative territorial entity (P131) using the current political entity. This will work for simple cases (one community, one administrative unit). What do people think?--Hannes Röst (talk) 15:20, 3 August 2020 (UTC)[reply]
If it's for two administrative units on different levels but having the same name and boundary, yes I think that it is a sufficient approach. RXerself (talk) 16:56, 3 August 2020 (UTC)[reply]
Not quite, this would *not* be for two admin units but for two entities, one a settlement/town/city (the humans living there and their houses etc) and one an administrative unit. The idea would be to use Kesswil (Q69413) -> located in the administrative territorial entity (P131) -> Kesswil (Q22388447), similar to how one would use (island name) -> located in the administrative territorial entity (P131) -> Island administrative unit if the two are exactly overlapping (leading to a lot of statements "X P131 X" where X is both the name of the town/island/city and the admin unit). --Hannes Röst (talk) 18:45, 3 August 2020 (UTC)[reply]
Yeah, I think I have used it before on islands that are wholly in an administrative unit (like Boji Island (Q24824681)). Both an island and an administrative unit, if they occupy the same boundary, can be connected using that and located in/on physical feature (P706). RXerself (talk) 07:29, 9 August 2020 (UTC)[reply]
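One way a data consumer could cope with the two-item model is to follow coextensive with (P3403) when the administrative item has no article of its own. The following is a rough sketch under stated assumptions: the item IDs `Q_ADMIN`, `Q_ISLAND` and `Q_ISLAND_CLASS`, the article title, and the function name `articles_for_class` are all hypothetical placeholders, and the data lives in a plain in-memory dictionary rather than in Wikidata.

```python
# Hypothetical two-item model: the administrative unit carries the P31 value
# that the query looks for, the island item carries the Wikipedia sitelink.
items = {
    "Q_ADMIN": {"P31": ["Q3191695"], "P3403": ["Q_ISLAND"], "sitelinks": {}},
    "Q_ISLAND": {"P31": ["Q_ISLAND_CLASS"], "P3403": ["Q_ADMIN"],
                 "sitelinks": {"jvwiki": "Example article"}},
}


def articles_for_class(items, class_qid, wiki):
    """Return {item_id: article_title} for instances of class_qid.

    If an instance has no sitelink on `wiki`, fall back to the sitelink of
    an item it is declared coextensive with (P3403)."""
    result = {}
    for qid, data in items.items():
        if class_qid not in data.get("P31", []):
            continue
        title = data.get("sitelinks", {}).get(wiki)
        if title is None:
            for other in data.get("P3403", []):
                title = items.get(other, {}).get("sitelinks", {}).get(wiki)
                if title is not None:
                    break
        if title is not None:
            result[qid] = title
    return result


found = articles_for_class(items, "Q3191695", "jvwiki")
print(found)
```

The same fallback could equally be written against P131 or P706; P3403 is used here only because it states the exact-overlap case directly.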

Importing short descriptions from enwp

Hi all. The English Wikipedia has set up 'short descriptions', which replace the descriptions from Wikidata in searches etc. If you're not aware of the history, then you can find the discussions at en:Wikipedia:Short_description#History. They now have over 2 million local descriptions, and WMF might disable using the Wikidata descriptions there soon. This means that enwp has a separate set of descriptions that are maintained separately from the Wikidata descriptions, and will steadily get out of sync with the descriptions here (and their uses elsewhere) unless we do something about it.

As such, I've proposed a bot to import the descriptions at Wikidata:Requests for permissions/Bot/Pi bot 14. It has two options, either only importing descriptions where we don't already have one, or completely synchronising enwp and wikidata English descriptions. Technically, this is possible, and I contend that the descriptions are short enough to be ineligible for copyright. Should we do this? Thanks. Mike Peel (talk) 19:33, 3 August 2020 (UTC)[reply]
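The difference between the two options can be illustrated with a small sketch. It is not the code of the proposed bot: the function name `plan_description_edits`, the `overwrite` flag and the placeholder item IDs are assumptions made for illustration, and descriptions are modelled as plain dictionaries.

```python
def plan_description_edits(enwp, wikidata, overwrite=False):
    """Decide which English descriptions would be written to Wikidata.

    enwp, wikidata: dicts mapping an item ID to a description string.
    overwrite=False: only fill in items that have no description (option 1).
    overwrite=True:  also replace descriptions that differ (option 2, full sync).
    """
    edits = {}
    for qid, short_desc in enwp.items():
        current = wikidata.get(qid)
        if not current:
            edits[qid] = short_desc      # gap: Wikidata has nothing yet
        elif overwrite and current != short_desc:
            edits[qid] = short_desc      # divergence: sync replaces it
    return edits


# Placeholder IDs, not real items.
enwp = {"Q_A": "Firearms cartridge", "Q_B": "Village in Russia", "Q_C": "office building"}
wikidata = {"Q_A": "cartridge", "Q_C": "office building"}

fill_only = plan_description_edits(enwp, wikidata)
full_sync = plan_description_edits(enwp, wikidata, overwrite=True)
print(fill_only)
print(full_sync)
```

With the sample data, fill-only touches just the item without a description, while full sync also replaces the existing, differing description.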

I think you mean en:Wikipedia:Short description#History. From Hill To Shore (talk) 19:43, 3 August 2020 (UTC)[reply]
Yes, link fixed above, thanks! Mike Peel (talk) 19:52, 3 August 2020 (UTC)[reply]
The matching discussion on enwp is at en:Wikipedia:Village_pump_(proposals)#Synchronising_short_descriptions_and_Wikidata_descriptions. Thanks. Mike Peel (talk) 21:52, 6 August 2020 (UTC)[reply]
I have some concerns with this proposed import.
  1. While I expect that in most cases the concept of a Wikidata item will perfectly align with the concept of the English Wikipedia article, this is not always the case. There are a lot of items where the topic discussed in the linked Wikipedia articles strays beyond the bounds of the Wikidata item (especially where you are linking to articles in multiple languages).
  2. Many Wikidata items link to multiple sites related to a topic; Wikipedias of multiple languages, Commons and WikiSource to name a few. Just as English Wikipedia users objected to using Wikidata descriptions because they didn't meet the specific requirements of their project, wouldn't importing the changed descriptions now cause problems at Wikidata due to not meeting the requirements of other projects? Just because English Wikipedia is the largest project, that doesn't make them right all the time or cause their output to be in the best interest of unrelated projects.
Being able to reuse data to reduce our workload is a positive but we need to be careful that we don't break what we have for the sake of short term convenience. If we can get more eyes on this and think through how to handle the resulting problems, then I would be able to support the proposal. From Hill To Shore (talk) 20:12, 3 August 2020 (UTC)[reply]
"This means that enwp has a separate set of descriptions that are maintained separately from the Wikidata descriptions, and will steadily get out of sync with the descriptions here (and their uses elsewhere) unless we do something about it." This stupidity is entirely of the en.Wikipedia community's making. Regardless, I'm not clear why we would overwrite Wikidata descriptions with those from en.Wikipedia. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:37, 3 August 2020 (UTC)[reply]
  • There is no need to keep WD desc and En.wp desc synced, there are reasons to keep this unsynced and separated. WD description describes an item, En.wp description describes an article – there are many situations in which there is no 1:1 equivalency between an item and an article. What's more, there are unfortunately many situations in which people do not know that and keep adding descriptions to WD that are in fact descriptions of an article in a particular language project. So  Oppose to any bot action that would overwrite existing data based on En.wp descriptions. Wostr (talk) 20:43, 3 August 2020 (UTC)[reply]
    • What about a third option, namely adding a NEW property for the short description written by Wikipedians who feel a need to overrule whatever short description may exist in Wikidata? A bot could automatically update any "Wikipedia short description(s)" when they are specified or changed. Wikidata could be programmed to not allow a user to change that in Wikidata and would tell anyone who tried where they need to go to actually change that. Then humans could easily review the different short descriptions and decide whether to revise one or the other, make them the same, or whatever. DavidMCEddy (talk) 20:50, 3 August 2020 (UTC)[reply]
Support, subject to reviewing any edge cases and specific issues. I've edited getting on for 10,000 English Wikipedia short descriptions, manually and semi-manually, and I'm confident that almost without exception those edits would have improved the bot-generated Wikidata text, had I been easily able to get them back here. The main issue with making a direct link is likely to be not 'breaking Wikidata' but the preference of enWP to start with a capital letter, which WD normally doesn't. But that is a technical side issue that could no doubt be solved. MichaelMaggs (talk) 20:57, 3 August 2020 (UTC)[reply]
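The capitalisation difference mentioned above could be handled along these lines. This is a naive sketch, not a tested import rule: the function name `to_wikidata_style` and the `keep_capitalized` parameter are hypothetical, and deciding which first words are proper nouns is left to the caller, since no simple rule can tell "Village" from "English".

```python
def to_wikidata_style(description, keep_capitalized=frozenset()):
    """Lowercase the first letter of an enWP-style short description.

    The description is returned unchanged (apart from trimming) when its
    first word is listed in keep_capitalized, or when it does not look like
    an ordinary capitalized word (e.g. an acronym such as "NASA")."""
    stripped = description.strip()
    if not stripped:
        return stripped
    first_word = stripped.split()[0]
    if first_word in keep_capitalized:
        return stripped
    rest = first_word[1:]
    plain_word = first_word[0].isupper() and (rest == "" or rest.islower())
    if not plain_word:
        return stripped
    return stripped[0].lower() + stripped[1:]


print(to_wikidata_style("Firearms cartridge"))
print(to_wikidata_style("English writer", keep_capitalized={"English"}))
```

Any real import would still need human review for proper nouns that are not on the caller's list.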
 Oppose per Wostr there is no reason to be in sync with one particular external webpage -- but I would support a new property as described by DavidMCEddy (it seems his suggestion is merely to fetch the enWP description and display it per userScript or addon next to our existing ones similar to the Preview gadget). There should anyways soon be support for "bridges" and cross project editing so this could be an interesting testcase. --Hannes Röst (talk) 02:43, 4 August 2020 (UTC)[reply]
 Oppose also to the new property idea because articles are often about several concepts. You might not see this often if your interest is persons, but in molecular biology most articles contain several concepts. --SCIdude (talk) 07:18, 4 August 2020 (UTC) P.S. Except if you can put the new property on the sitelink...[reply]
If the Wikipedia article is about several concepts, it should not be linked to a Wikidata item about one concept. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:35, 4 August 2020 (UTC)[reply]
Practically impossible to achieve in some disciplines like chemistry/molecular biology. Wostr (talk) 10:26, 4 August 2020 (UTC)[reply]
Nonsense - but feel free to provide an example, if you disagree. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:41, 4 August 2020 (UTC)[reply]
Carbohydrates for example. Usually Wikipedia has one article about a specific carbohydrate (sometimes two: about D and L form) and it describes it as a 'chemical compound', we have a lot more items – several about 'group of isomers' (e.g. ribose (Q59817493), L-ribose (Q85553776), D-ribose (Q38176423)), several about 'group of stereoisomers' (e.g. L-ribopyranose (Q27120756), L-ribofuranose (Q27120751), D-ribopyranose (Q27120754), D-ribofuranose (Q179271)) and several about specific chemical compounds (aldehydo-L-ribose (Q27120760), aldehydo-D-ribose (Q27120759), α-L-ribopyranose (Q27120757), β-L-ribopyranose (Q27120758), α-L-ribofuranose (Q27120752), β-L-ribofuranose (Q27120753), α-D-ribopyranose (Q27120755), β-D-ribopyranose (Q27095107), α-D-ribofuranose (Q27104554), β-D-ribofuranose (Q27104584)). The description of the Wikipedia article cannot be applied to the Wikidata item it is linked with. Second example: inositol trisphosphate (Q138145) – in Wikidata we should have different items for a neutral compound and for every anion (monoanion, dianion, trianion). This is not the case in Wikipedia, where the article is named as neutral compound, but most of the text is about an anionic form. Wostr (talk) 15:41, 4 August 2020 (UTC)[reply]
You have given an example where Wikipedia has one article, and Wikidata has many, more specific, items, and where the Wikipedia article is linked to one of those items, possibly incorrectly. You have not given an example where "[a] Wikipedia article is about several concepts [and so] it should not be linked to a Wikidata item about one concept" is "Practically impossible to achieve ". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:45, 8 August 2020 (UTC)[reply]
 Strong oppose (emphasis added) This means that enwp has a separate set of descriptions that are maintained separately from the Wikidata descriptions, and will steadily get out of sync with the descriptions here (and their uses elsewhere) unless we do something about it. What? Why? It was the English Wikipedia community who unilaterally decided to fork the Wikidata descriptions; I firmly see this as their problem, not ours. They are free to reverse this (in my opinion, terrible) decision at any time and start using Wikidata descriptions again (and more importantly, contribute to them again, thereby benefiting many users outside of the walled garden of English Wikipedia short descriptions). Besides that, I disagree that the short descriptions are short enough to be ineligible for copyright. English Wikipedia is free to copy our descriptions, thanks to our CC0 license; but we cannot copy their descriptions, since I do believe they fall under CC BY-SA. --Lucas Werkmeister (talk) 09:45, 4 August 2020 (UTC)[reply]
Even allowing for the specific issues that have been mentioned, in the vast majority of cases a single well-written text will work both for Wikidata and the English Wikipedia, and can be released under CC0. So, at the very least I'd like to see a one-click or semi-manual feature so that editors working to improve descriptions here can push their edits through to Wikipedia without having to go there and duplicate the work; likewise, so that editors improving short descriptions at enWP can copy their improvements under CC0 over to Wikidata, without having to come here and duplicate the work. In both cases, of course, under user control. We should be striving to improve interoperability, and it is sad to read so many people blaming other projects (that works both ways). MichaelMaggs (talk) 11:25, 4 August 2020 (UTC)[reply]
 Oppose I find the claim that there are no copyright issues questionable. Both legally and socially for our relationship to EnWiki. There are also many cases where the goals for a good description in Wikipedia and Wikidata are different. ChristianKl 10:46, 4 August 2020 (UTC)[reply]
Don't really understand the copyright concern, as many existing Wikidata descriptions were created by bot using text automatically taken from the linked English Wikipedia article. If those existing bot-created descriptions are reasonably considered to be CC0 here (perhaps because they are short enough), why the new concern about future wordings? MichaelMaggs (talk) 11:35, 4 August 2020 (UTC)[reply]
I don't think it's acceptable to say that any major issue like this is "their problem", especially when we are talking about something like this. The English-language Wikipedia user community is too big and too important to ignore. DavidMCEddy (talk) 12:38, 4 August 2020 (UTC)[reply]
@DavidMCEddy: I think you meant to attach this to one of the other threads above (in the same section) as the topics you mention are in earlier threads. Here on Wikidata we provide a service to all Wikimedia projects and we need to consider the impacts of changes on all users. We don't ignore the input from other projects but we also need to be mindful of the effects. Swaying to the will of the largest project and not considering the potential harm to others would be very wrong. From Hill To Shore (talk) 15:50, 4 August 2020 (UTC)[reply]
I don't think that the whole out-of-sync/their problem/too big and too important to ignore is relevant here. The question is very simple - we have a "free" dataset of high-quality manually curated 1-sentence descriptions for ~2M items. There will be a few specific cases when an en-wiki article is associated with the wrong Wikidata item, but any mass import has some percentage of errors. So  Strong support for option #1 if there are no copyright issues involved (IANAL, but I believe 1 sentence cannot be copyrightable, not sure about dataset/database rights) Ghuron (talk) 14:09, 4 August 2020 (UTC)[reply]
 Oppose We are supposed to be a collection of collaborative projects. But I have seen on more than one occasion en:WP people telling Commons that "this is the way we do it" and "this is the way it's always been done", however taxonomically and epistemologically suspect their way is. As said above, it's their problem. Rodhullandemu (talk) 16:25, 4 August 2020 (UTC)[reply]
If we do not have any description, and we can import one from en.wp - we should do it. I am not happy about the en.wp community decision either, but we must leave emotions aside. Carn (talk) 17:01, 4 August 2020 (UTC)[reply]
(ec)  Support for Wikidata items that do not have an English description yet,  Strong oppose overwriting existing descriptions. Yellowcard (talk) 17:03, 4 August 2020 (UTC)[reply]
Assuming copyright is not an issue, and the capitalizations can be fixed, I would be fine with importing descriptions to items that have no English description in wikidata. There may be some very minimal descriptions here that also would be fine to overwrite ("researcher", "organization", etc.) but in general I  Oppose overwriting of existing Wikidata descriptions, or any plan to keep them in "sync", there is no need for that. ArthurPSmith (talk) 18:16, 4 August 2020 (UTC)[reply]
 Oppose If WP:en has good reasons to not use WD data, the same reasons apply in WD when using short descriptions from WP:en. Then in my field of interest, I have been cleaning items for several years now and I can confirm that in most WD items the concept is narrower than in WP articles. I strongly disagree with importing short descriptions from one WP even if an English label is missing because this can cause a discrepancy with the data in the item: the concept description in WD should be defined from the data in the item and not from an interwiki which can be wrong or can evolve with time like in a WP article. Snipre (talk) 19:25, 4 August 2020 (UTC)[reply]

 DATA!: I've sampled 500 cases where short descriptions diverge b/w enwiki and wd. Here's a taste:

Short Description in enwiki and wikidata

| Article | enwiki | wd | len(enwiki) | len(wd) | diff(lengths) |
|---|---|---|---|---|---|
| SUMS | | | 15941 | 14868 | 1073 |
| .223 Remington | Firearms cartridge | cartridge | 18 | 9 | 9 |
| 1st Gnezdilovo | Village in Kursk Oblast, Russia | human settlement in Fatezhsky District, Kursk Oblast, Russia | 31 | 60 | -29 |
| 1,8-Cineole,NADPH:oxygen oxidoreductase | index of enzymes associated with the same name | Wikimedia disambiguation page | 46 | 29 | 17 |
| 1 November 1954 Stadium (Batna) | Multi-use stadium in Batna, Algeria | building in Algeria | 35 | 19 | 16 |
| 1 Police Plaza | Office building in Manhattan, New York | office building in Manhattan, USA | 38 | 33 | 5 |

...and there are 495 more rows.
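The length columns in the sample are straightforward to recompute. The sketch below does so for the five rows shown; the function name `length_stats` is hypothetical, and the totals it prints cover only these five rows, not the full 500-row sample behind the SUMS line.

```python
# The five sample rows shown above: (article, enwiki description, wd description).
rows = [
    (".223 Remington", "Firearms cartridge", "cartridge"),
    ("1st Gnezdilovo", "Village in Kursk Oblast, Russia",
     "human settlement in Fatezhsky District, Kursk Oblast, Russia"),
    ("1,8-Cineole,NADPH:oxygen oxidoreductase",
     "index of enzymes associated with the same name",
     "Wikimedia disambiguation page"),
    ("1 November 1954 Stadium (Batna)", "Multi-use stadium in Batna, Algeria",
     "building in Algeria"),
    ("1 Police Plaza", "Office building in Manhattan, New York",
     "office building in Manhattan, USA"),
]


def length_stats(rows):
    """Return per-row (article, len(enwiki), len(wd), difference) tuples
    and the column totals over the given rows."""
    stats = [(article, len(en), len(wd), len(en) - len(wd))
             for article, en, wd in rows]
    totals = tuple(sum(s[i] for s in stats) for i in (1, 2, 3))
    return stats, totals


stats, totals = length_stats(rows)
for row in stats:
    print(row)
print("totals for these five rows:", totals)
```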

Subjectively, I would judge the WP descriptions to be somewhat better than ours. They are quite obviously handcrafted in many cases, whereas WD descriptions are rather often careless collections of whatever facts the bot writing them has.
I've yet to encounter the term "human settlement" in real life, for example. But it's the most common term in descriptions of populated places. And while it was consciously decided to avoid the term "book" in favour of "literary work", it creates a cognitive barrier to understanding for a consumer of such data not already used to it. I did some blind rating with about a hundred items and came out at about 70:30, although it's hard to keep up the blinding because our descriptions follow patterns that soon become blindingly obvious. Matthias Winkelmann (talk) 05:19, 5 August 2020 (UTC)[reply]
The subset with "index of XYZ with the same name" could be automated? --SCIdude (talk) 08:07, 5 August 2020 (UTC)[reply]
 Neutral No problem with importing and overwriting, as long as (for the former) quality is assured and (for the latter) there is a human's oversight. --Matěj Suchánek (talk) 09:00, 6 August 2020 (UTC)[reply]
 Support syncing as a general concept. (In the spirit of syncing, this comment is mostly copied from my one at en-WP.) Wikidata descriptions and en-WP short descriptions are fundamentally the same thing: short descriptions. Are there occasional instances where they might properly diverge? Sure. But does that mean they should be totally split, leading to probably thousands of hours of duplicated (i.e. wasted) editor effort to create the same thing in two separate places? Absolutely not.
It's important to connect this to the bigger picture here. The success of the Wikimedia movement is fundamentally predicated on having enough people to do the work (that's the main reason Wikipedia deletes non-notable pages). Whenever we choose to fork, that literally doubles the amount of work to be done, which when you multiply by 6 million, comes out to a gargantuan cost in editor effort. Thus, preventing forks needs to be one of our highest priorities. Worse, once a fork has been made, re-integrating becomes harder and harder over time. I recognize that there are a lot of challenges to doing so here, both because of the initial reasons for the fork and because of the hurdles from the divergence so far, but at a fundamental level, that is the path we need to be on.
Regarding the specific proposals, importing en-WP descriptions where we have none would seem to make sense, although as some have pointed out above, there are challenges we'd need to overcome. I share the impression that where they diverge, en-WP descriptions tend to be better (due to the larger user base). I think what's likely to happen is that at some point (perhaps now, perhaps in a few years) the quality gulf will grow wide enough that we'll seek to adopt en-WP descriptions here. The question then becomes how to handle the situations where they are supposed to be different. Some further discussion is needed specifically on that question to define what exactly those circumstances are, and how best to handle them (probably through some technical modification, so that e.g. "for the verb use Q12345 instead" can be tacked on to an en-WP short description). {{u|Sdkb}}talk 07:34, 7 August 2020 (UTC)[reply]
English Wikipedia short descriptions are often different, but I disagree that they are in any way better on a large scale---exceptions in both directions may exist of course. It is pretty much personal flavor what one considers "better", but this is clearly not a basis on which a global sync should be enforced. ---MisterSynergy (talk) 08:26, 7 August 2020 (UTC)[reply]
The sample table Matthias Winkelmann compiled above is extremely compelling. Feel free to look through it yourself, but it paints a very clear picture to my eyes. I will very happily stand behind Military unit as a better description for 1st Marine Regiment (Q1778866) than alligator. {{u|Sdkb}}talk 08:36, 7 August 2020 (UTC)[reply]
Yeah I said that there are exceptions. So what? Apart from few exceptions, there is nothing which convinces me that the enwiki short descriptions are better in any way. Actually, they seem to be on average more detailed and a little longer, also maybe a little more individual---but there is nothing in them which Wikidata needs. Wikidata descriptions are disambiguators in the first place; they do not need to be detailed. ---MisterSynergy (talk) 08:47, 7 August 2020 (UTC)[reply]
They have to be detailed enough to clearly identify an item among those that are named similarly. And when you are automatically populating descriptions via QuickStatements (the way we have >90% of item descriptions), this goal cannot seriously be taken into consideration. Ghuron (talk) 09:45, 7 August 2020 (UTC)[reply]
@MisterSynergy: The two million better-quality, largely hand-crafted enW short descriptions aren't "a few exceptions": they would almost always work better for Wikidata than the bot-generated texts that Wikidata has at present (though of course not in absolutely every case). The many, many bot-generated texts such as "species of insect", "English writer" or "human habitation" are at such a high level that they hardly provide any useful information at all. And if all you want is disambiguation, there would even be no need for "species of insect" against any insect item that is already labelled with a unique binomial name. MichaelMaggs (talk) 12:25, 7 August 2020 (UTC)[reply]
The more standardized they are, the more useful they are. Mind that the way the descriptions are used in Enwiki (and other Wikipedias) is *not* the main purpose of Wikidata descriptions. They have just been used for this job as nothing else was available at the time this functionality was introduced to Wikipedias by WMF around five years ago. The Wikidata descriptions may not be optimal for Wikipedia, yet they are useful for Wikidata---particularly in the relatively standardized way we have them due to the prevalent automatic generation via bots. Any attempt to "sync" both descriptions is a bad idea for Wikidata. ---MisterSynergy (talk) 12:40, 7 August 2020 (UTC)[reply]

Looking at the third example above, is this really a disambiguation page? I think we already have Wikimedia set index article (Q15623926), so why not change its P31 value? --Liuxinyu970226 (talk) 14:13, 5 August 2020 (UTC)[reply]

To be honest, I really see no real difference between an en.wiki 'set index' and a normal 'disambiguation page'. I think in most wikis there is no such distinction, and I don't understand why there is no statement Wikimedia set index article (Q15623926) → subclass of (P279) → Wikimedia disambiguation page (Q4167410). Wostr (talk) 15:48, 5 August 2020 (UTC)[reply]
You may ask @Avatar6, Michiel1972, Infovarius: for why. --Liuxinyu970226 (talk) 05:43, 7 August 2020 (UTC)[reply]

Discrepancy statistics

I was now able to compile a fairly complete comparison of English Wikidata descriptions and English Wikipedia "short descriptions" (limited to non-redirects in namespace 0, around 6.13M article pages in enwiki, data as of today):

result                                   | relative share | absolute article pages | note
case-sensitive identical                 | 7.5%           | 460k                   |
case-insensitive identical               | 3.8%           | 235k                   | does not include the case-sensitive set obviously
different                                | 25.1%          | 1.54M                  | 68.9% of cases where both a Wikidata and enwiki description exist
WD desc missing, EN short desc existing  | 3.5%           | 215k                   | could potentially be imported, as Mike proposed
EN short desc missing, WD desc existing  | 40.0%          | 2.46M                  |
description missing in WD and enwiki     | 20.1%          | 1.23M                  |

I think it is safe to claim that from the very beginning, short descriptions and Wikidata descriptions have taken quite different paths. —MisterSynergy (talk) 23:46, 7 August 2020 (UTC)[reply]
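As a quick sanity check on the table, the 68.9% figure can be reproduced from the rounded absolute counts. This is only a sketch: the dictionary keys are made-up labels, and the counts are the rounded values posted above, so the total comes out slightly different from the quoted 6.13M.

```python
# Rounded article counts copied from the table above; the key names are
# illustrative labels, not anything used on Wikidata or enwiki.
counts = {
    "identical_case_sensitive": 460_000,
    "identical_case_insensitive_only": 235_000,
    "different": 1_540_000,
    "wd_missing_en_present": 215_000,
    "en_missing_wd_present": 2_460_000,
    "both_missing": 1_230_000,
}

# Sum of all rows (rounded inputs, so only approximately the 6.13M quoted).
total = sum(counts.values())

# Pages where both a Wikidata description and an enwiki short description exist.
both_exist = (
    counts["identical_case_sensitive"]
    + counts["identical_case_insensitive_only"]
    + counts["different"]
)

# Share of those pages whose two descriptions differ.
share_different = 100 * counts["different"] / both_exist

print(f"total pages: {total:,}")
print(f"different among pages with both descriptions: {share_different:.1f}%")
```

In other words, the 68.9% is relative to the roughly 2.2M pages that have both descriptions, not to all article pages.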

P1659 "See also"

Is property related property (P1659) broken somehow? When I've tried to add a statement using it, it refuses to match any value I enter, so it cannot be used. It's been like this for the last week at least. I don't have a problem with any other properties! Thanks for any insight, DrThneed (talk) 00:11, 6 August 2020 (UTC)[reply]

  • It's a property to link properties together. It's not designed to link items together. ChristianKl 15:11, 6 August 2020 (UTC)[reply]
  • Gah, it's right there in the description too. And I'm sure I knew that once! Thanks! But now I need a way to link related items. E.g. a place that has a house, the gardens of the house, and the instance of the house during its use as a hospital, as three different items. Currently there is no way for anyone else to tell that these things relate, and I can't spot what the right property is to link them. Any suggestions?
    • You tell how items relate to each other by thinking about the relationship that the items have with each other and then choose the appropriate property. If you have an example of items where you can't find the appropriate properties, feel free to share the example. ChristianKl 09:43, 8 August 2020 (UTC)[reply]

Model identifiers of products

What's the best way to represent specific model identifiers of a product? MacBook Pro 16-inch (Q78982844) is an example of an item where it would be useful. That line of MacBook Pros has (for now) a single model with identifiers "MacBookPro16,1" and "A2141". It has four sub-models with these model designations: "MVVJ2LL/A", "MVVK2LL/A", "MVVL2LL/A", "MVVM2LL/A". Cf en:MacBook Pro#Technical specifications 5. Ehn (talk) 15:51, 6 August 2020 (UTC)[reply]

If the item is supposed to represent all these variants, add them as aliases I guess. Ghouston (talk) 04:34, 7 August 2020 (UTC)[reply]
What is the right level for Wikidata? Should there be one item for each variant or one for all of them? Or both? If one item should cover all the variants, it should arguably be an instance of model series (Q811701) rather than computer model (Q55990535). But that creates issues with adding specification, as a lot of the properties in Wikidata property for items about computer hardware (Q22969262) don't like being added to instances of model series (Q811701). Ehn (talk) 09:20, 8 August 2020 (UTC)[reply]
I don't really know. Generally you need separate Wikidata items for each device where you want to record different properties. Nested "series" are difficult, e.g., Samsung SM-G975F is one of the Samsung Galaxy S10+ models, which is one of the Samsung Galaxy S10 variants, which is in the Samsung Galaxy S series of phones which are Samsung Galaxy phones, and "Galaxy" is also a brand that includes some devices like tablets and a few cameras. Ghouston (talk) 13:03, 8 August 2020 (UTC)[reply]
@Ghouston: What is the problem with the (deeply) nested series in your Galaxy example? It seems like we would want to represent that hierarchy if it exists in reality. Ehn (talk) 17:45, 8 August 2020 (UTC)[reply]
Yeah, I guess you are right, it's just picking the best way to represent it in Wikidata. I suppose we can just have Samsung Galaxy S10+ (Q66688566) part of the series (P179) Samsung Galaxy S10 (Q60021939) and Samsung Galaxy S10 (Q60021939) part of the series (P179) Samsung Galaxy S series (Q73389) and Samsung Galaxy S series (Q73389) part of the series (P179) Samsung Galaxy (Q493064), although at the bottom of the pile, we aren't really sure what's a "model" and what's a "model series", and at the top of the pile, Samsung Galaxy (Q493064) is really just a brand, not a series of products, since phones/tablets/cameras don't form a single series. Ghouston (talk) 00:48, 9 August 2020 (UTC)[reply]

Galaxy is arguably still a model series (of handheld devices, say) as well as a brand.

In iPhone land, you would then have something like iPhone 11 Pro part of iPhone 11 Pro (series) (since it also contains iPhone 11 Pro Max) part of iPhone (series), which is perhaps not what people expect, but correct. Assuming this is the best way to represent reality, on what level should specifications go?

For example, the iPhone 11 Pro and iPhone 11 Pro Max use the same CPU but have different screen sizes. Should the CPU only be specified on the iPhone 11 Pro (series) level or on each instance thereof? Only stating common specs on the series level reduces redundancy, saves time (for the data recorder), reduces the risk of inaccurate and inconsistent data (in case an error was found and fixed in one place but not the others). Stating common specs on each device makes the data easier to consume. An extreme example, if you go for making common claims only at the highest level, Apple should only be stated as the developer on the iPhone (series) level. That is probably not what people expect.

What's the best practice for handling "inheritance" in Wikidata? Ehn (talk) 08:12, 10 August 2020 (UTC)[reply]

Basically Wikidata doesn't have any inheritance. The only data that can be queried on a given item is what's actually on the item, and any inference from a parent class has to be done by the user. They'd have to decide themselves which properties on the parent class are relevant and which are not. Ghouston (talk) 08:19, 10 August 2020 (UTC)[reply]
Should this be read as "Wikidata doesn't currently have any inheritance" or "Wikidata is intentionally designed to not have any inheritance"? I.e., should we put all relevant claims on each item regardless of redundancy, or hold off for infrastructure support? If the former, it seems anyone who would want to work on getting good coverage for even rather narrow product categories into Wikidata would have to build quite a bit of supportive user-side automation so as not to die of boredom (and make frequent mistakes). Ehn (talk) 08:29, 10 August 2020 (UTC)[reply]
Intentionally so. See Wikidata:Item classification --Oravrattas (talk) 12:52, 10 August 2020 (UTC)[reply]
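To make "inference has to be done by the user" concrete, here is a minimal sketch of user-side fallback along a part-of-the-series chain. Everything in it is an assumption made for illustration: the ITEMS snapshot is typed by hand rather than read from Wikidata, plain labels stand in for item and property IDs, and the lookup function and its inheritable parameter are hypothetical names.

```python
from typing import Optional

# Hand-written snapshot for illustration only; real data would be fetched
# from Wikidata. Labels are used instead of Q/P identifiers for readability.
ITEMS = {
    "iPhone 11 Pro": {
        "part of the series": "iPhone 11 Pro (series)",
        "screen size": "5.8 in",
    },
    "iPhone 11 Pro Max": {
        "part of the series": "iPhone 11 Pro (series)",
        "screen size": "6.5 in",
    },
    "iPhone 11 Pro (series)": {
        "part of the series": "iPhone (series)",
        "CPU": "A13 Bionic",
    },
    "iPhone (series)": {"developer": "Apple"},
}


def lookup(item: str, prop: str, inheritable: set) -> Optional[str]:
    """Return the value of prop for item, falling back to parent series.

    A value found on a parent is used only if the caller has listed the
    property in inheritable; that decision is the user's, not Wikidata's.
    """
    seen = set()  # guards against accidental cycles in the series chain
    current: Optional[str] = item
    while current is not None and current not in seen:
        seen.add(current)
        claims = ITEMS.get(current, {})
        if prop in claims and (current == item or prop in inheritable):
            return claims[prop]
        current = claims.get("part of the series")
    return None


print(lookup("iPhone 11 Pro", "CPU", {"CPU", "developer"}))
print(lookup("iPhone 11 Pro Max", "developer", {"CPU"}))
```

The second call prints None because the caller did not mark "developer" as inheritable, which mirrors the point made above: the data consumer has to decide which parent statements are relevant.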

@Oravrattas: Thanks for the pointer! That page says:

The implications of this model of classification is that while statements about a class are not inherited by its instances and subclasses, properties that are valid for a given class (see Wikidata:Domain and Wikidata:Range) are also automatically valid for all subclasses and instances of that class.

Is there a best practice for what properties should go on a class item vs its subclasses/instances? Presumably not everything should be duplicated, nor should everything be pushed to the leaves alone, as that would leave us with a hierarchy of mostly empty, non-descriptive classes and extremely detailed leaf nodes. Ehn (talk) 04:28, 11 August 2020 (UTC)[reply]

  • "MVVJ2LL/A", "MVVK2LL/A", "MVVL2LL/A", "MVVM2LL/A" all has part(s) (P527) "A2141" because they're different "retail packages" that contain different chargers, etc but the same laptop: "A2141". --Dhx1 (talk) 13:50, 7 August 2020 (UTC)[reply]
    This is incorrect. They represent different colors, CPUs, storage, probably other things. I guess Apple has yet another identifier (SKU?) for the retail packaging variants you mention. Ehn (talk) 09:23, 8 August 2020 (UTC)[reply]
    @Ehn: What causes the "M code" to change? Optional memory expansion for MacBooks, different colour, different country charger included in the box? everymac.com has some additional information that may assist. A retail box including charger will also likely have Global Trade Item Number (P3962) as a unique identifier. --Dhx1 (talk) 10:43, 8 August 2020 (UTC)[reply]

    @Dhx1: The "M codes" represent base models/configurations within a product generation. EveryMac.com refers to these strings as order numbers, Wikipedia as model numbers, whereas it seems Apple uses the term item number, at least on its receipts. For example, within the model sub-series that Wikipedia refers to as "fifth-generation MacBook Pro models", in late 2019, Apple launched a 16-inch version, which received the model identifier "MacBookPro16,1", model number "A2141" and EMC "3347". This version comes in four base models (with item numbers "MVVJ2LL/A", "MVVK2LL/A", "MVVL2LL/A", "MVVM2LL/A"), which differ in CPU and storage (2.6 GHz 6-core Intel Core i7 and 512 GB or 2.3 GHz 8-core Intel Core i9 and 1 TB) and color ("Silver" or "Space Gray").

    To further complicate matters, these base configurations can be customized and built to order, in which case the delivered laptop would, if I understand correctly, not have one of these four order numbers. It would still have the same model identifier, model number, and EMC. Given that, representing every order number may be too fine-grained, and won't include every user-configurable option anyway. Perhaps model identifier is the right level. Ehn (talk) 17:42, 8 August 2020 (UTC)[reply]

    We have the same thing with vehicles, which often come in numerous variations for a single "model", such as engine, body type and colour, and CPUs, which sometimes have a large range differing in clock speed, cache size, fabrication technique etc. Ghouston (talk) 00:54, 9 August 2020 (UTC)[reply]

Would anyone be interested in helping resolve some tricky interwiki links?

I've been working on synchronising the Commons category links on enwp with the Commons sitelinks here. Mostly it's going well (20k+ already resolved), but there are some particularly tricky items; would anyone be interested in helping with them? There needs to be a clear (and single) link between the item that sitelinks to enwp and the item that sitelinks to Commons, normally through topic's main category (P910)/category's main topic (P301) or list related to category (P1753)/category related to list (P1754). You can find the items via:

Thanks. Mike Peel (talk) 20:09, 7 August 2020 (UTC)[reply]

@Mike Peel: I'm working on something else right now, but wanted to at least cheer you on. This is very valuable work. --99of9 (talk) 08:00, 9 August 2020 (UTC)[reply]

Scientific papers help

Hi all,

I'm starting a new GLAM project; the idea is to digitise scientific papers.

Even after looking at a lot of the scientific papers described here, I do not know how to handle the following:

  • Abstract
How do I include it?

This is important for scientific articles. Yes, I know it is not really data, but it is important.

  • Bibliography
How do I include it as plain text?

For the bibliography, I understand that it would be nice to have a separate item for each reference, but every article has around 30 references, and for a large part of them the institution does not even have the ISSN. Creating a new item for each would make the project impossible to accomplish, and losing this information does not seem good either.


Thank you :*

Rodrigo Tetsuo Argenton (talk) 21:19, 7 August 2020 (UTC)[reply]

I don't think it will be possible to include the abstract here. If the paper is in the public domain (or a suitable free licence), it can be placed on Wikisource and then a Wikidata item can link to it. If the paper is not in the public domain or available under a free licence, it will be a copyright violation to place the abstract on any Wikimedia site. From Hill To Shore (talk) 21:50, 7 August 2020 (UTC)[reply]

Value Constraint for owner of (P1830)

I started adding value constraints for owner of (P1830) and added pretty high-level values. I'm unsure whether the constraint picks up the P279 inheritance correctly. There might also be a bunch of other classes that should be added to the constraint. ChristianKl 11:46, 8 August 2020 (UTC)[reply]

Connections between district and appellate courts?

Are there Wikidata properties to relate district and appellate courts? If yes what are they?

For example, appeals from United States District Court for the District of Kansas (Q7889773) go to United States Court of Appeals for the Tenth Circuit (Q286918), but I don't see that information in either of these Wikidata items. That information is in their corresponding Wikipedia articles. Might it also be good and appropriate to have it in their Wikidata items as well?

I'm not an attorney. In this particular case, appeals from a "District" court go to a "Circuit court of appeals". I don't know technically how cases get to the US Supreme Court. I'm guessing that they would be appealed from a Circuit court of appeals, but I don't know that. And appeals systems in other countries are doubtless different from the US, and I don't know how that works.

I suppose we could say that

However, neither of these feel quite right to me. Are there properties for this? If not, should they be created?

What do you think?

Thanks, DavidMCEddy (talk) 03:20, 9 August 2020 (UTC)[reply]

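@DavidMCEddy: I agree that it would be useful to have a standard way to link a lower court to its superior/appellate court. Currently this seems to be done in a variety of different ways, none of which are ideal:
# Count which properties are used to link items of class Q41487
# to items of class Q4959031 (the two court classes discussed here).
SELECT ?p ?pLabel (COUNT(*) AS ?count) 
WHERE {
  ?s ?pd ?o .                          # any direct statement from ?s to ?o
  ?p wikibase:directClaim ?pd .        # map the wdt: predicate back to its property entity
  ?s wdt:P31/wdt:P279* wd:Q41487 .     # subject: instance of Q41487 or of one of its subclasses
  ?o wdt:P31/wdt:P279* wd:Q4959031 .   # object: instance of Q4959031 or of one of its subclasses
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} 
GROUP BY ?p ?pLabel 
ORDER BY DESC(?count)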
Of those next higher rank (P3730) seems closest for this type of relation, even though not very widely used here yet. Do you think this adequately captures the relationship? I agree that part of (P361) seems wrong, though I can see why people would gravitate to using it in the absence of an obviously better property. --Oravrattas (talk) 06:05, 10 August 2020 (UTC)[reply]

Merging items

Hello, I'm trying to merge Q21506829 into Q19810348, but I'm having a bit of trouble. I read the instructions and tried the automatic method, but this didn't work. Hence I did it the manual way. I've moved over the data into Q19810348 and emptied Q21506829. However, I can't get Q21506829 to redirect to Q19810348. Both items are about the name Mika. Mikalagrand (talk) 10:19, 9 August 2020 (UTC)[reply]

Possibly because they are linked to each other. But one seems to be for use as a given name, and the other a surname. Ghouston (talk) 10:26, 9 August 2020 (UTC)[reply]
The articles linked to the items cover all uses of the name Mika; given name, nickname and surname. Mikalagrand (talk) 11:01, 9 August 2020 (UTC)[reply]
Do not merge family name with given name. --HarryNº2 (talk) 11:03, 9 August 2020 (UTC)[reply]
Hello HarryNº2 (talk), before you undo everything: why do you think Q21506829 shouldn't be merged into Q19810348? I don't think there is any need for separate items and articles for given names and surnames. If you look at the linked articles, they cover every use of the name Mika. Mikalagrand (talk) 11:15, 9 August 2020 (UTC)[reply]
Everything is fine now; the interwiki links are now linked here: Mika (Q1158495). For more information, please look here: Wikidata:WikiProject_Names#Sample_items. There you can also ask questions if something is unclear. Greetings, HarryNº2 (talk) 11:29, 9 August 2020 (UTC)[reply]
Okay, so if I understand correctly: there are supposed to be multiple items for a name. Though, since the articles cover all uses (given name, nickname, family name), they are all linked at one of these items. I understand why you would have separate items for names of people and disambiguation articles, but not why you would have a separate item for given name, nickname and family name. I feel like this increases the chance of error (i.e. new language articles not being linked at the correct item), for no obvious advantage. Greetings, Mikalagrand (talk) 11:48, 9 August 2020 (UTC)[reply]
Please discuss this here: Wikidata talk:WikiProject Names. --HarryNº2 (talk) 11:55, 9 August 2020 (UTC)[reply]
@Mikalagrand: Wikipedia articles aren't central to Wikidata items. Wikidata generally distinguishes more different entities. ChristianKl 18:53, 9 August 2020 (UTC)[reply]

.jpg --> .png

Would somebody kindly change the .jpg abomination currently in Q3269011 to c:File:ARToolKit logo.png? --Palosirkka (talk) 18:56, 9 August 2020 (UTC)[reply]

✓ Done, thank you.

Treaty website v. content?

Treaty on the Prohibition of Nuclear Weapons (Q28130514) documents the w:Treaty on the Prohibition of Nuclear Weapons, including linking to the Wikipedia article on it AND identifying all the countries that have signed and ratified it. In that Wikipedia article, the website for the treaty is referenced as {{Cite web |url=https://treaties.un.org/Pages/ViewDetails.aspx?src=TREATY&mtdsg_no=XXVI-9&chapter=26&clang=_en |title=Chapter XXVI: Disarmament – No. 9 Treaty on the Prohibition of Nuclear Weapons |publisher=United Nations Treaty Collection |date=2019-07-06 |accessdate=2017-09-21}}.

Is there any reason I should NOT create a separate Wikidata item with the same name to cite that webpage directly?

Doing so should make it easier to cite it elsewhere while also making it easier to maintain, e.g., against w:link rot.

Thanks, DavidMCEddy (talk) 19:50, 9 August 2020 (UTC)[reply]


DarTar (talk) 08:28, 19 May 2018 (UTC) Daniel Mietchen (talk) 11:24, 19 May 2018 (UTC) Maxlath (talk) 11:33, 19 May 2018 (UTC) Jumtist (talk) 11:34, 19 May 2018 (UTC) Pintoch (talk) 11:40, 19 May 2018 (UTC) JakobVoss (talk) 11:44, 19 May 2018 (UTC) PKM (talk) 20:12, 19 May 2018 (UTC) ArthurPSmith (talk) 13:47, 22 May 2018 (UTC) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits Vladimir Alexiev (talk) 12:43, 27 November 2018 (UTC) Ivanhercaz (Talk) 11:55, 3 February 2019 (UTC) Epìdosis 11:23, 15 April 2019 (UTC) Tris T7 TT me Kpjas (talk) 07:45, 2 March 2021 (UTC)[reply]

Notified participants of WikiProject Wikipedia Sources Kindly requesting some clarification with this topic please. Wallacegromit1 (talk) 20:32, 10 August 2020 (UTC)[reply]

I've created many Wikidata items for documents and referenced them using w:Template:Cite Q.
A few days ago when I searched for "Treaty on the Prohibition of Nuclear Weapons", I found there was already a Wikidata item with that title but referring to the treaty itself, not the document. I hadn't noticed a Wikidata item like that before, so I asked two of the people maintaining that Wikidata item if they saw any problem with creating another Wikidata item with the same title that referred to the document. User:Wallacegromit1 asked me to repost here.
To me this seems crudely analogous to the discussion above re., "Do islands need to have separate items for its administrative territory? (2)". One response there is that, "you want at least two items for an island. One for the administrative unit, one for the island, both will have a different P31."
??? Thanks, DavidMCEddy (talk) 22:39, 10 August 2020 (UTC)[reply]

The Source MetaData WikiProject does not exist. Please correct the name. Kindly requesting some clarification with this topic please. Thanks for the patience @DavidMCEddy: Wallacegromit1 (talk) 03:59, 13 August 2020 (UTC)[reply]

  • There is no best practice established. Do whatever you like. If you wish, write some documentation and propose it as a best practice. Probably the most developed similar discourse is how many Wikidata items to make for a book. Books exist as a concept of the work itself, then also as texts with different content in various editions and translations, then possibly also as particular unique printed copies of the book. I think in this case you are talking about the concept of the treaty as a work, then also as a text source. Do anything you wish, except implementing what you do at scale. Eventually we need a collection of cases to set a guideline. Blue Rasberry (talk) 13:58, 13 August 2020 (UTC)[reply]

Q96410061

Will a bot eventually fill in the data for Tom Gallagher (Q96410061), or do I need to do it by hand? --RAN (talk) 21:36, 9 August 2020 (UTC)[reply]

Looks like it is now filled. --RAN (talk) 13:19, 11 August 2020 (UTC)[reply]

Occupations and their Discontents

I am trying to wrap my head around the reasoning behind various odd things being defined as occupation (Q12737077), from 8 (Q340894) (and number 8 (Q2270369)) to the more relevant criminal (Q2159907) and serial killer (Q484188). There is no reason why these should be occupations, and the talk pages are full of complaints and lacking in substantive replies. Yet the practice persists, and any change is, sooner or later, reverted.

My understanding of the term is reflected in the dictionaries' definitions. Ignoring military occupations as another, entirely different concept, here are the relevant entries from the people who invented overthinking the English language:

(1) a person’s job: He listed his occupation on the form as "teacher."
Cambridge Dictionary

The primary sense of occupation here is synonymous with job or vocation. And while there is something like a "career criminal", that aspect makes up a rather small part of the meaning of the generic "criminal". instance of (P31) requires, with few exceptions, the instances to make up a subset of the class. For "criminal" there is some small overlap at best. Murder is financially motivated in only a fraction of cases, and serial murderers almost never are.

(2) An occupation is also a regular activity:
Sailing was his favorite weekend occupation.
Cambridge Dictionary

It is true that occupation isn't always financially motivated. But this sense cannot be equated with, basically, "anything anyone does", or we would need to add "cook" and "parent" to a whole lot of items, as those are activities many people spend far more time on than Ronald DeFeo Jr. (Q933143) ever did on murder. It is used only for regular activity, and most often activities undertaken for leisure. It gets close to hobby, but with a bit more focus on the act of engaging in something rather than the field itself. The related term "to occupy oneself" also hints at a sort of relaxed wastefulness (of time) here.

Again, none of the items get even close to the meaning of the word. It is quite telling that none of these items have categories at either enwiki or dewiki supporting such a claim. My understanding of other languages is limited, but I have not seen any indication that this is all caused by some Babylonian miscommunication, either.

Examining the languages, we find @Jeblad: bitterly complaining here, and mentioning that this issue is a major problem for nowiki. @Neo-Jay: gives the most substantive defense for the practice I found: it "refers to an activity on which time is spent or something that somebody does in his/her free time, not just the job by which somebody earns a living", they say. This aligns with the second dictionary definition, above. I am not a native speaker and my intuition may be off, but I just don't see how that definition, or any other, naturally fits with Dylann Roof (Q20203314)'s decision on how to spend his free time on the afternoon of June 17, 2015. Neo-Jay continues: "Many items have the statement: "occupation: serial killer". Changing "serial killer" from "occupation" to "human activity" caused lots of errors", which makes the argument entirely self-contained. Those statements are obviously just as wrong, and would need to be changed as in, for example, Ronald DeFeo Jr. (Q933143) → convicted of (P1399) → murder (Q132821).

I believe this is a French-language comment expressing surprise at seeing conspiracy theorist (Q19831149) used as an occupation. That case is actually slightly more plausible. Yet I would agree with @Korg:, as it seems to me that fearing alien abductions is not (necessarily) an activity, but closer to a belief system. Although it's probably the aliens making me think that.

Back to actual murderers, here is the same issue being raised by @קיפודנחש:, who I assume is a speaker of Hebrew. They also say they have surveyed other languages and repeatedly found this pattern. @Urjanhai: speaks Finnish, I believe, and adds the Oxford Dictionary's definition to my Cambridge version from above.

Reading these, and the replies, I get the sense that everyone thinks some other language wants or needs this. If so, I haven't found it.

The problem is larger than just murder. Running this query will give you a list of 10,000 "occupations", the majority of which would earn you stranger looks than "serial killer". Among them are "METROPOLIS" -Sicherungsstück Nr. 1: Negative of the restored and reconstructed version 2001 (Q28028253) (a copy of a movie), Joseph Prielozny (Q66084615) (purveyor of "Christian hip hop" and also, apparently, something that can be done), a mover (Q27105472) (correct) and a Shaker (the religion). Most prominent, however, is the endless list of sports positions, whose categorisation as occupations is also tenuous, which brings me to the end (Q5375780).

--Matthias Winkelmann (talk) 17:47, 9 August 2020 (UTC)[reply]

Please adhere to good faith! (“we find @Jeblad: bitterly complaining here”) Jeblad (talk) 10:14, 13 August 2020 (UTC)[reply]
Clearly the issue is that if you're trying to describe a human (Q5), after you fill in their name, birthdate and country, the next thing you want to add is what they're known for doing.
In the case of Dylann Roof (Q20203314), nobody knows or cares that he had recently been paid as landscaper (Q43184282). (In fact I'd claim it's not a notable fact at all.)
So I'd say that the sense of the word "occupation" as "how one spends one's time" is the better one to apply, and this is why P106 has as its primary label "occupation" and not "profession".
I think the issue here is that we are still very much building a structured database that backs (or is derived from) an encyclopedia. We are not building a statistically-valid database of every human. When we ask someone like Al Capone (Q80048) what his occupation is, we are not looking for the same answer as a census worker or an employment statistician.
(In other words, I'd say you can think of P106 as the slot for filling in the Q-number(s) for whatever profession, occupation, pastime, or activity you find in the first sentence of the subject's Wikipedia article, or Wikidata description.) —Scs (talk) 11:24, 10 August 2020 (UTC)[reply]
« it seems to me that fearing alien abductions is not (necessarily) an activity, but closer to a belief system. » → The current English-language description of the conspiracy theorist (Q19831149) item is « person promoting conspiracy theories »: conspiracy theorist (person promoting conspiracy theories). And Alex Jones (Q319121) does not fear or believe anything; it's only for the money. Visite fortuitement prolongée (talk) 19:32, 10 August 2020 (UTC)[reply]
Teolemon
Netoholic
econterms
Jneubert
Moebeus
Albertvillanovadelmoral
salgo60
Epìdosis

Notified participants of WikiProject Occupations and professions

It seems like what we are looking for is something very similar to instance of (P31) that captures the is-a relationship for humans. I agree that occupation (Q12737077) is probably not the correct term for many of the cases you listed. --Hannes Röst (talk) 21:31, 10 August 2020 (UTC)[reply]
instance of (P31) has sometimes been used for this purpose, and I used to agree enough to do it myself, only to see it being moved back to occupation (P106) shortly thereafter.
I've come to feel uneasy about instance of (P31) while working on the import of data from Central Database of Shoah Victims' Names (Q59522549): a statement such as (Q94975069) → instance of (P31) → Holocaust victim (Q5883980), while technically correct, doesn't quite feel right. It reduces that person and their life to only the circumstances of their death and perpetuates their murderers' power over them. And even though serial killers deserve less sympathy, it just doesn't satisfy the curious mind to reduce people to such a shallow description. Using more than one value is obviously possible, but anything specific is often not known, and adding human (Q5) almost feels like saying the silent part out loud: "...but still human". Plus, this just invites long discussions about where to draw the line between instance of (P31) and the lesser properties: my intuition says we would be more likely to consider people to "be" writer (Q36180)s than landscaper (Q43184282)s, even where as much time is spent on the latter activity as on the former (which, unlike Scs, I do find interesting in relation to Dylann Roof (Q20203314), and, even if not, could see how its inclusion could be useful to others, and have therefore added). But how much do I have to write/publish/sell to be "a" writer, and not just someone who writes? It's a continuum, making it impossible to set any fixed criteria, let alone ones everyone agrees with.
Action item: find a term that works with these values but is less definitive than "is a"? Or can we do completely without them by using individual solutions, such as Dylann Roof (Q20203314) → convicted of (P1399) → murder (Q132821), with the qualifier number confirmed (P1674) → "9"?
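For readers less familiar with qualifiers, the "individual solution" above is a single statement with a qualifier attached, instead of an occupation value. The sketch below uses a deliberately simplified model: the Statement class and its render method are hypothetical and do not reproduce the actual Wikibase data format; only the item and property IDs come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Simplified, hypothetical stand-in for a Wikidata statement."""
    subject: str                  # item the statement is about
    prop: str                     # main property
    value: str                    # main value
    qualifiers: dict = field(default_factory=dict)  # qualifier property -> value

    def render(self) -> str:
        # One-line text form: subject -> property -> value [qualifier: value]
        text = f"{self.subject} -> {self.prop} -> {self.value}"
        for qprop, qvalue in self.qualifiers.items():
            text += f" [{qprop}: {qvalue}]"
        return text


# convicted of (P1399) murder (Q132821), qualified with number confirmed (P1674)
conviction = Statement(
    subject="Q20203314",
    prop="P1399",
    value="Q132821",
    qualifiers={"P1674": "9"},
)

print(conviction.render())
```

The point of the model is that nothing about the person's "occupation" needs to be asserted; the event is recorded with its own property and details.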
The Holocaust victim (Q5883980) example came up here a few weeks ago, and it was pointed out that this should definitely not be on instance of (P31), as humans should never have anything other than human (Q5) for that. I don't think anything better was suggested, however. Perhaps significant event (P793) might be better for some of these? --Oravrattas (talk) 05:19, 11 August 2020 (UTC)[reply]

Launching Wikiproject tabular data

Commons has a namespace dedicated to data stored in a JSON-like format. While these data pages do not seem to offer anything Wikibase cannot do at all, they offer an attractive alternative in some cases, notably because they are:

  • more lightweight
  • more flexible in terms of license (CC-BY and CC-BY-SA are supported).

Demographical time series are a clear example of data we could never really get to upload to Wikidata and would be suited for tabular data. We are currently planning to transfer data about French municipalities from frwiki Lua modules to Commons' tabular data.

Those data would be much more usable if they were linked from Wikidata, and if they were standardized, probably using Wikidata qids as semantic identifiers.

I have created a stubbish Wikidata:WikiProject Tabular data. Anyone willing to help ? --Zolo (talk) 06:49, 10 August 2020 (UTC)[reply]

In my opinion the major flaw of tabular data is the lack of a way to denote how the data should be interpreted.--GZWDer (talk) 06:59, 10 August 2020 (UTC)[reply]
We could build a bridge by, for example, indicating in a dataset item the mapping between a column of the tabular data and the corresponding property where possible, and by providing a way to map the item-corresponding values to Wikidata items (an identifier property?). For example, if a column contains INSEE municipality code (P374) in a dataset related to France, we have a natural way to find the corresponding items. If a group of columns can be translated into a statement, we could specify that in the metadata (subject item column, or subject item of the whole dataset if relevant, property for the main value, main value column, qualifier columns …).
But it might be more complex, involve mapping values with maths expressions or queries, properties that will be rejected on Wikidata and so on … Maybe this would need to refer to properties outside of Wikidata in the web of data world …
Maybe we could include the data needed to configure a tool like OpenRefine to do the reconciliation easily (ideally and ultimately, you load a dataset item into OpenRefine and it loads the data, starts the reconciliation work …) author  TomT0m / talk page 09:05, 10 August 2020 (UTC)[reply]
My idea was to start with a dataset for French municipalities that would use property ids in field names and Qids in values when applicable. See c:Data:Population FR 01001 L'Abergement-Clémenciat.tab or the similar c:Data:Taipei Beitou District Population.tab by user:S-1-5-7. We definitely need to agree on standard data structures so that the data can be used as broadly as possible, starting at Wikipedias. -Zolo (talk) 11:05, 10 August 2020 (UTC)[reply]

Strategy transition design draft

I have been working for more than a month in the Strategy transition design group - a body of about 20 people who were working together to establish the principles to be used to design the events to implement the strategy recommendations. (Do not even ask me how I ended up to be part of the group). Anyway, now we have produced the draft: meta:Strategy/Wikimedia movement/2018-20/Transition/Events Outline/Draft. It is written for the whole movement, not just for the projects, and certainly not just for Wikidata, so the language from our perspective can look a bit bureaucratic, and the text a bit (or sometimes too) unspecific. However, I would encourage all the users interested in the relation between projects and WMF, and generally in the development of the Wikimedia movement, to have a look, and, specifically, to look at whether the communities (in the language of the document, online communities) will be involved enough, how they will be involved, and how this involvement can be stimulated and improved. Whereas obviously there were many people involved in the creation of the draft, and these people have very different interests, the importance of involving the projects has been recognized by everybody as a crucial issue. What we are trying to avoid is the (unfortunately, common) situation when the projects are completely decoupled from the process, the process runs on, and at some point some decision taken without even thinking about the projects comes out of the blue and gets a (predictable) very negative reaction.

The draft has been posted on Thursday 6 August and will be open for comments until 20 August (my apologies for posting here only now, I was on holidays this week and just returned home). You are welcome to leave the comments on the talk page of the draft on Meta (where it will be directly read by the WMF people running the process), or here. I will be watching both pages anyway, and will somehow make sure that useful comments do not get lost. I can probably clarify things if needed. There is also some discussion ongoing on the English Wikipedia, w:en:Wikipedia:Village pump (WMF)#Strategy transition design draft, which might (or might not) clarify some issues.

For full disclosure: although the process has been run by the WMF, I have never been paid by the WMF, nor have I ever been a member of any affiliate. I participate in the group solely in my volunteer capacity.--Ymblanter (talk) 08:48, 10 August 2020 (UTC)[reply]

Certain countries have P31s in the wiki pages but not in query result

I'm trying to find a list of all countries contained within countries but certain countries, namely England (Q21) and Scotland (Q22), seem to be missing all but one instance of (P31)'s, most critically for me country (Q6256). This isn't the case however for Wales (Q25). Does anyone know what's causing this?

Compare the result of the following query to the page for England (Q21).

SELECT DISTINCT ?p31 ?p31Label
WHERE
{
  wd:Q21 wdt:P31 ?p31 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}

Cdo256 (talk) 09:11, 10 August 2020 (UTC)[reply]

@Cdo256: This is because the value that is returned has "preferred" rank - the "wdt:" format of the query says "give me the best answers", which is all the entries of preferred rank, and if there aren't any, all the entries of normal rank. If you look on England (Q21), you'll see that there's a different icon just to the left of the "constituent country" entry, with the top arrow highlighted, to indicate that it's preferred.
To get all values, you can use this syntax:
SELECT DISTINCT ?p31 ?p31Label
WHERE
{
  wd:Q21 p:P31 ?statement . ?statement ps:P31 ?p31 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
You'll see that it now returns four answers - preferred plus the three normal. Andrew Gray (talk) 09:39, 10 August 2020 (UTC)[reply]
Ah that makes sense. Thank you! --Cdo256 (talk) 09:46, 10 August 2020 (UTC)[reply]
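For anyone who wants to see which statement carries which rank, the statement-level query can also return the rank itself. This is only a minimal sketch: it assumes the prefixes that the Wikidata Query Service predefines (as the two queries above do), and the variable name ?rank is just illustrative.

```sparql
SELECT ?p31 ?p31Label ?rank
WHERE
{
  # p: leads to the statement node, ps: to the value of that statement
  wd:Q21 p:P31 ?statement .
  ?statement ps:P31 ?p31 ;
             wikibase:rank ?rank .   # wikibase:PreferredRank, wikibase:NormalRank or wikibase:DeprecatedRank
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
```

Note that the p:/ps: path also returns deprecated statements if there are any; adding FILTER(?rank != wikibase:DeprecatedRank) would leave them out.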

Range of series numbers

I created Q98207872. I would like to add the vehicle number range 501-528, mentioned in fr:Matériel roulant de l'ELRT. How can I do this? Smiley.toerist (talk) 10:33, 10 August 2020 (UTC)[reply]

Error merging two items

Hi! I'm trying to merge this entity: Q24677919 (2016) with Q158822 (2012). However, I'm experiencing an error as one (SV) of the two projects connected currently has two separate pages for that entity (when they're the same entity). I'm not sure if this is the right channel to ping this, but does anybody have recommendations on how to resolve this? Thanks! — Infogapp1 (talk) 11:42, 10 August 2020 (UTC)[reply]

Every wiki page has its own Wikidata entry. When one language version has two pages, it has two Wikidata items as well. If they are truly the same, then the pages should be merged/redirected on sv-wiki first. Edoderoo (talk) 12:12, 10 August 2020 (UTC)[reply]
@Edoderoo: I reached out to one of the admins in the SV project. Thanks! — Infogapp1 (talk) 19:55, 10 August 2020 (UTC)[reply]
Not the same - Pontcysyllte Aqueduct and Canal (Q24677919) is part of Llangollen Canal (Q1545923), and includes Pontcysyllte Aqueduct (Q158822) and Chirk Aqueduct (Q5101943). Peter James (talk) 22:24, 10 August 2020 (UTC)[reply]

Q19423984

At William Scudder Stryker (Q19423984) there are two ways to format "described by" for the same source, which should we harmonize on? --RAN (talk) 13:26, 10 August 2020 (UTC)[reply]

Add an additional qualifier, statement is subject of (P805) = Appletons' Cyclopædia of American Biography/Stryker, William Scudder (Q89921249), to the statement described by source (P1343) = Appletons' Cyclopædia of American Biography (Q12912667).
Also, on Appletons' Cyclopædia of American Biography/Stryker, William Scudder (Q89921249) add main subject (P921) = William Scudder Stryker (Q19423984). Jheald (talk) 15:05, 10 August 2020 (UTC)[reply]
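To check how the item currently looks, something like the following could list every described by source (P1343) statement on the item together with its statement is subject of (P805) qualifier, where one exists. It is a sketch only, assuming the standard Wikidata Query Service prefixes; the variable names are made up for illustration.

```sparql
SELECT ?statement ?source ?sourceLabel ?article ?articleLabel
WHERE
{
  # every "described by source" statement on William Scudder Stryker
  wd:Q19423984 p:P1343 ?statement .
  ?statement ps:P1343 ?source .
  # the qualifier is optional, so statements without it still show up
  OPTIONAL { ?statement pq:P805 ?article . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

Rows with an empty ?article are the statements that still lack the qualifier.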

Wikidata weekly summary #428

Marking logo images that exist but are not in public domain

I recently uploaded an old seal image (P158) of an organization that has fallen into the public domain. I marked its start date and end date, but to make extra sure no one thinks it's the current logo, I would like to somehow mark that it has a more recent seal, but that it just isn't in the public domain yet. How can I do that when there's no image on Commons to link to? "Unknown value" and "no value" both don't seem fully appropriate, since there is a known value, just not one I can represent with a Commons file. Thoughts? {{u|Sdkb}}talk 21:55, 10 August 2020 (UTC)[reply]

If it's a logo that is no longer in use, but the organization is extant, and it would be misleading for the logo to appear in any of the robot sources that mindlessly suck from Wikidata (e.g. if it would be the default image from a simple Google search) then it might be marked as deprecated, and reason for deprecated rank (P2241) perhaps given as anachronism (Q189203) or property is not suitable for entity (Q86191979). Or maybe there's a better way, if so, do that: Wikidata is still the Wild West. Anything goes, buckaroo. -Animalparty (talk) 02:04, 11 August 2020 (UTC)[reply]
@Animalparty: Sounds alright. From a longer-term perspective, we might want to find a way to note the existence of fair use images, since I doubt this is the only instance that's ever come up where it's been desired to link to one. {{u|Sdkb}}talk 03:03, 11 August 2020 (UTC)[reply]
deprecated rank isn't meant for this. --- Jura 08:54, 12 August 2020 (UTC)[reply]
@Sdkb: User @Animalparty: is wrong here. Deprecated rank should not be used for old information which is factual. An end time (P582) and end cause (P1534) qualifier are both suitable for such a statement. There may be a suitable end cause to show that the logo is replaced by a newer one. As for a new statement to add, I would say that "unknown value" with a matching start date is most appropriate. It's really more like "some value", just that it is unknown to whoever retrieves it from Wikidata (as it's not stored here, which can be for many reasons). --SilentSpike (talk) 10:27, 12 August 2020 (UTC)[reply]
SilentSpike Cool. Is there a place where this is explicitly stated on Wikidata? I've been on Wikidata for years and most formal guidance, if there is any, seems obscure and hard to find. Hence, wild wild west. -Animalparty (talk) 01:41, 13 August 2020 (UTC)[reply]
Ranks are explained at Help:Ranking. --- Jura 05:29, 13 August 2020 (UTC)[reply]
@Jura1: Looking at that page, the right thing to do seems to be to give the current value (of unknown, per User:SilentSpike) preferred ranking, and just leave the old one at normal. Suggestion for Wikidata's interface: Help:Ranking should be linked underneath the options anytime someone opens the selection for them. Is there anything specific I should use for reason for preferred rank (P7452)? {{u|Sdkb}}talk 16:21, 13 August 2020 (UTC)[reply]
@Sdkb: Could use most recent value (Q71533355) --SilentSpike (talk) 18:00, 13 August 2020 (UTC)[reply]

To what degree should this item be used on identifiers representing social media accounts? --Trade (talk) 07:21, 11 August 2020 (UTC)[reply]

probably not at all if the social media account is public. --Hannes Röst (talk) 16:20, 11 August 2020 (UTC)[reply]

Migrate sports team colors from enwiki

I've started a discussion on English Wikipedia about migrating some sports team color data to Wikidata (it's currently stored in a local Lua module on enwiki): en:Module talk:College color#Proposal: Migrate to Wikidata. –IagoQnsi (talk) 07:38, 11 August 2020 (UTC)[reply]

Wikidata: edit the number of registered users/contributors

In the Wikidata graph of Wikidata, it claims that the site has 2,565,510 number of registered users/contributors (P1833) but offers no citation. This is probably from Special:Statistics (from 4 years ago).

I'd like to add a new number at a new point in time (today) and properly cite. Unfortunately, the help on editing isn't very forthright about editing semi-protected entries. Can anyone provide any direction?  – The preceding unsigned comment was added by Schmudde (talk • contribs) at 08:30, 11 August 2020 (UTC).[reply]

Hi Schmudde, semi-protection is described in the page protection policy. Hazard-SJ (talk) 05:51, 15 August 2020 (UTC)[reply]

Wikidata Bridge v1 to be deployed on Catalan Wikipedia

Hello all,

Screenshot of the test Bridge

As you may know, in the past months, the Wikidata development team has been working on the first version of the Wikidata Bridge, the feature that will allow Wikipedia editors to edit Wikidata content directly from their infoboxes on the Wikipedia interface.

The first version is about to be deployed on Catalan Wikipedia, the first Wikipedia community to try it, who’s been providing great enthusiasm and support towards the project. The deployment will take place on August 18th and we’ll carefully monitor the possible issues during the upcoming weeks.

The first version of the Bridge includes minimal features, such as editing string datatype values, displaying the existing Wikidata references, adapting the rank depending on the reason for the edit (fix or update), and redirecting to Wikidata for actions that are not yet possible with the Bridge. It is not the final version, many improvements will be made in the future, but we wanted to have a first version out as soon as possible, to see if it works for the editors and collect feedback.

On Wikidata, edits coming from the Bridge will be flagged with the tag Data Bridge. We also made sure to check the different levels of protections, so items that are protected on Wikidata cannot be edited from Wikipedia, and users who are blocked on Wikidata cannot edit from Wikipedia.

It’s an exciting step in the direction of more collaboration between Wikidata and Wikipedia editors, and we’re looking forward to welcoming more Wikipedia editors on Wikidata!

You can follow the progress of the feature on this documentation page and on the related talk page. If you have any questions or suggestions, let us know. Cheers, Lea Lacroix (WMDE) (talk) 11:59, 11 August 2020 (UTC)[reply]

A series of queries .. work in progress. --- Jura 12:06, 12 August 2020 (UTC)[reply]

Q11705477

Hello,

I just made some edits to Q11705477 and I would like to double check that the edits are set up correctly before I do the 71 others. I used the help articles and an example, but it is better to be sure than to have to clean up 72 articles if it goes wrong.

I didn't find a helpdesk-like page, so I hope my post here is correctly placed. Thanks! Ziminiar (talk) 17:37, 12 August 2020 (UTC)[reply]

Weather Database and WikiProject Weather

Good afternoon. I'd like to bring your attention to a proposal for a WikiProject Weather, to organize all weather articles under a single project umbrella. There are many areas that Wikipedia has missed, maybe because it was a long time ago, or it is in a country that doesn't speak English. Also, given climate denialism, I believe it is important to establish a Worldwide Weather Database. Google already uses Wikipedia's infoboxes for their search results, and Alexa often uses Wikipedia (partly because it's more accurate, partly because of how comprehensive it is). I'm not exactly sure the best way to implement it, but I believe that WikiData, and the proposed Abstract Wikipedia, could be a part of this endeavor.

As part of uniting all weather articles, each event would be tagged with categories such as location, date, fatalities, injuries, damage total, and weather event - all the basic stuff we usually include in infoboxes. The international disasters database - [3] - already kinda does this, but not everyone knows to look there. Wikipedia, on the other hand, is known as an institution at this point.

Thanks for your time reading this. Any thoughts and feedback would be appreciated. If this proposal is in the wrong place, please direct me to the correct location. Thanks a lot. Hurricanehink (talk) 18:29, 12 August 2020 (UTC)[reply]

@Hurricanehink: Yes, this is a good place to leave such a message. However, you may want to leave additional messages at Wikidata:WikiProject Weather observations/en and Wikidata:WikiProject Climate Change. From Hill To Shore (talk) 20:08, 12 August 2020 (UTC)[reply]
Consider also Wikidata:WikiProject Humanitarian Wikidata. Blue Rasberry (talk) 13:59, 13 August 2020 (UTC)[reply]

What is the property for the organization in charge of publication when not publisher?

There seems to sometimes be a third publishing layer between author and publisher, and I'm not clear on what property to use to name it: namely, when the work is written for an organization. In this WorldCat record, they describe it as "Responsibility: Peter W.C. Uhlig, James Baker;" "Issued by: Ontario Forest Research Institute;" "Publisher: Ontario Ministry of Natural Resources." So it's a publication of the Forest Research Institute, written by two researchers working at the institute, published by the Ministry. — Levana Taylor (talk) 02:36, 13 August 2020 (UTC)[reply]

Would using approved by (P790) as a qualifier on publisher (P123) do what you need? From Hill To Shore (talk) 05:32, 13 August 2020 (UTC)[reply]
Well, the actual author is the researchers. I suppose one could list the organization as an additional author but it'd need some sort of qualifier. — Levana Taylor (talk) 18:03, 13 August 2020 (UTC)[reply]
Probably not so far off, since there are actual humans that wrote the text, but the text may represent the opinion of a government agency, so in that sense it is a corporate author. Note that no corporation ever writes a text; it's always a human who writes the text on behalf of an organization. Maybe we can use issued by (P2378) and expand its definition to cover not just identifiers but any sort of publication or document? There is also editor (P98) but I don't think it fits.--Hannes Röst (talk) 03:48, 14 August 2020 (UTC)[reply]

Linking Medieval Sources (mostly from the Regesta Imperii (Q316838))

Full disclosure: I work for the Regesta Imperii.

  • We from the RI have reworked some of our published and unpublished indices of Persons and Places and identified existing entities in our registries and added new entities on Wikidata for which we have enough source material.
  • As a next step, we want to link the URIs of our Regesta to the respective Entities on Wikidata via the described by source (P1343) property. I have started with that, but there have been some.. challenges. Sometimes it worked, sometimes it created empty statements.
  • Structurally, the Regesta Imperii has 14 departments structured around dynasties and emperors, representing around 180,000 Regesta, sourced from charters between the 9th and 16th century. From these 180,000, ~30,000 correspond to identified entities on Wikidata. We are in the process of creating entities for our collections, for example Regesta Imperii XIII (Q97879676) and linking the respective Regesta to its department, so the user can differentiate.
  • The main benefit from linking the Regesta in the first place is, besides linking to sources, the ability to trace the activity of a noble family, for example, over the centuries. We have linked individual nobles to their respective noble families, and when the Regesta are linked to the respective nobles, one could ascertain what members of these noble families have been up to during the reigns of different emperors and dynasties.
  • A secondary benefit, which we have no direct control over, is to link other collections of medieval sources via the same property, so users can get the contemporary breadth besides the temporal one. We are currently in talks with the Germania Sacra project (Germania Sacra (Q1514123)) to link Regesta to the entities they have identified in their indices.
  • Based on the experiences from the test runs, we have decided to only link to Person entities identified in our indices for now, because most of our places are cities and there is a disproportionate number of Regesta pertaining to these cities, without meaningful distinction as of yet. We have to come up with a better ontology for these cases and would love some input for that. In general, we want to start with more general statements and work up to more specific ones, as the quality of our data improves.
  • Excellent idea to integrate these here. Just a few comments:
described by source (P1343) you are using is generally meant for encyclopedias or other reference works that include an article about the subject. E.g. Q150575#P1343 would have articles about the emperor. It's not meant as a way to include every time a person is mentioned in a document. It can be an entry in an online database.
Maybe we should clearly distinguish a couple of elements:
  • (a) the website www.regesta-imperii.de
  • (b) available indexes on the website (1 entry per person, place etc.)
  • (c) the series Regesta Imperii (Q316838) (or Regesta Imperii XIII (Q97879676) ) or reproduced on www.regesta-imperii.de
  • (d) individual documents in the series
  • (e) persons and places mentioned in these documents (or signatories of them)
  • (f) the regesta about the document (d) in the series
As mentioned on your talk page, there are essentially two ways of doing that:
  1. one links (b) with described by source (P1343)
  2. one creates items for (d) and uses the appropriate properties to link persons, places etc. (similar to Magna Carta (Q12519) )
I'm not entirely sure which approach I'd suggest. It seems clear that [4] or [5] isn't ideal.
BTW 180000 would be (d) and 30000 (e)? --- Jura 13:17, 13 August 2020 (UTC)[reply]
This sounds like a great project, and I agree with Jura that we do not want a collection of outlinks added to individual items such as Ferdinand III, Holy Roman Emperor (Q150575) where the entity is simply mentioned. It does not seem like there is a 1:1 mapping between entities in WD and your entities, e.g. you don't have an "article" about Ferdinand III, Holy Roman Emperor (Q150575) but you have correspondence that is "tagged" with "Wien" (either written in Wien or related to Wien). It seems there is this but this is auto-generated. As Jura pointed out, in this case it would be best to create a new item that now has a 1:1 mapping to your entry, for example create an item titled "[RI XIII] Friedrich III. (1440-1493) - [RI XIII] H. 26" and then link it to http://www.regesta-imperii.de/id/1474-01-17_1_0_13_8_0_12354_346 -- best using a Property with an external identifier (1474-01-17_1_0_13_8_0_12354_346). You can then use all available Wikidata properties to link that item to other items in Wikidata, such as using author (P50) for the author of a document. Do you have an example document that you could create an item for so that we can give a concrete example? --Hannes Röst (talk) 14:47, 13 August 2020 (UTC)[reply]
I agree described by source (P1343) does not fit particularly well. But do we want more specific properties to model issuer (Q780558), witness (Q196939), addressee (Q28008314) etc.? A much more general term such as mentioned in clearly won't do here (too high a risk for abuse IMO). Perhaps FactGrid (Q90405608) could be a better starting point? --HHill (talk) 15:16, 13 August 2020 (UTC)[reply]
We have significant person (P3342) (with high potential for abuse) that could be used here. But I agree, more specific properties would be better. --Hannes Röst (talk) 16:37, 13 August 2020 (UTC)[reply]

@Vicwestric: I have made an example item Q98380333 based on http://www.regesta-imperii.de/id/1474-01-17_1_0_13_8_0_12354_346 - do you think that contains all the information you need? There are a few questions, for example whether we should have two items here: one for the Regest and one for the document (I think that is a bit overkill). I would opt for Q98380333 instance of (P31) document (Q49848) and not for Q98380333 instance of (P31) Regesta (Q1933248). Best regards --Hannes Röst (talk) 21:01, 13 August 2020 (UTC)[reply]

@Hannes Röst: Thank you very much, I think that works well. We will implement that for a test set. Thanks to everybody else for the input as well. We would of course also be interested in implementing more specific properties later on. Again, excuse the confusion, I'm very excited.
@Vicwestric: I suggest to do a few hundred examples and then look at which properties you are missing and we can propose these properties or find ways to use existing properties to model what you would like to model. For Q98380333 I have found that most properties that I needed were already there, but probably there would be a better way to model the people involved other than significant person (P3342). --Hannes Röst (talk) 15:08, 14 August 2020 (UTC)[reply]

@Vicwestric, Hannes Röst: Maybe I should have added (f) in my enumeration above, i.e. the Regesta about document (e). As mentioned by Hannes, this would mainly change the instance of (P31) for these. On items for documents described by source (P1343) linking to the entry about a Regesta would probably be perfect. Not sure though if doing items about (f) or (e) is better. --- Jura 07:01, 14 August 2020 (UTC)[reply]

@Vicwestric, Jura1: I have added point (f) in your list above. I am not sure if we need that really, it seems a lot to create a new item for an abstract only. I think we can use described by source (P1343) Q316838 subject has role (P2868) Q1933248 for example. But a very detailed modelling approach would of course model the document itself and the Regesta as two different items, but that seems like overkill to me. I definitely think we should start to create items about the documents (d) themselves first since these are the items of interest and then we could decide whether it's necessary to also do (f). I think you can do (d) without (f) but you cannot do (f) without (d). --Hannes Röst (talk) 15:08, 14 August 2020 (UTC)[reply]

SPARQL, how to show references

Folks, who can help me to get the references (links) under the tree of heritage designation (P1435) with archaeological heritage monument in Bavaria (Q97154904). I’m looking for the links which are related to archaeological heritage monument in Bavaria (Q97154904).

SELECT DISTINCT ?item ?itemLabel ?SG WHERE {
  { SELECT ?item WHERE { ?item (wdt:P1435/(wdt:P279*)) wd:Q97154904. } }
  ?item (wdt:P131*) wd:Q10451;
    wdt:P1435 ?SG.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Many thanks for your help. --Derzno (talk) 13:43, 13 August 2020 (UTC)[reply]
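One possible approach, offered as a sketch rather than a tested answer: references hang off the statement node, so the query has to go through p:/ps: instead of wdt:, and then follow prov:wasDerivedFrom to the reference. The sketch assumes the standard Wikidata Query Service prefixes and assumes the links are stored as reference URL (P854); the variable names ?statement, ?ref and ?refURL are made up here, and wd:Q10451 is taken unchanged from the query above.

```sparql
SELECT DISTINCT ?item ?itemLabel ?SG ?SGLabel ?refURL WHERE {
  # statement node for heritage designation (P1435) and its value
  ?item p:P1435 ?statement .
  ?statement ps:P1435 ?SG .
  # keep only designations that are Q97154904 or a subclass of it
  ?SG wdt:P279* wd:Q97154904 .
  # same location restriction as in the query above
  ?item wdt:P131* wd:Q10451 .
  # references are optional, so unreferenced statements are listed too
  OPTIONAL {
    ?statement prov:wasDerivedFrom ?ref .
    ?ref pr:P854 ?refURL .   # assumption: the link is a reference URL (P854)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

If the references use a different property (for example an external identifier instead of a URL), the pr:P854 line would need to be swapped for the matching pr: property.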

Constraint violation (subclass)

I have a question regarding this constraint violation: it states that "Values of programming language statements should be instances of one of the following classes (or of one of their subclasses), but C++11 currently isn't: " -- however, C++11 (Q1061570) is a subclass of (P279) C++ (Q2407), which does fulfil the constraint. How should we deal with these cases? Is this an error in modelling, so that C++11 (Q1061570) should be instance of (P31) programming language (Q9143), or is it an error in the constraint-checking code, and attributes of the parent should be applied to the subclasses as well, so that C++11 (Q1061570) implicitly already is instance of (P31) programming language (Q9143)? I tend towards the latter but I would be interested to hear other people's opinion. --Hannes Röst (talk) 20:12, 13 August 2020 (UTC)[reply]

Strange. C++11 has got the property subclass of C++ and C++ is both defined as a subclass and as an instance of programming language (Q9143) (which is also awkward by the way, shouldn't it have only one of those properties?). As the constraint explicitly says, "or of one of their subclasses", I don't understand how this could be a violation. Bever (talk) 22:49, 13 August 2020 (UTC)[reply]
Fixed. The warning text only states that the property takes subclasses, but actually it took only instances. You have to look at the actual constraint statement. --SCIdude (talk) 08:07, 14 August 2020 (UTC)[reply]
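The difference between the two readings of the constraint can be checked directly. The following is only an illustrative sketch, assuming the standard Wikidata Query Service prefixes; ?viaInstance and ?viaSubclass are made-up variable names. The first test follows instance of (P31) and then any number of subclass of (P279) steps, the second follows subclass of (P279) steps only.

```sparql
SELECT ?viaInstance ?viaSubclass WHERE {
  # "instance of" reading: C++11 is an instance of programming language (Q9143) or of one of its subclasses
  BIND(EXISTS { wd:Q1061570 wdt:P31/wdt:P279* wd:Q9143 } AS ?viaInstance)
  # "subclass of" reading: C++11 reaches programming language (Q9143) through subclass of (P279) alone
  BIND(EXISTS { wd:Q1061570 wdt:P279+ wd:Q9143 } AS ?viaSubclass)
}
```

A constraint that accepts only instances corresponds to the first test; one that also accepts subclasses would be satisfied if either test is true.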

Covid deaths, should many of these people have Wikidata Items?

I'm wondering about some Wikidata Items I've seen created recently. Should A'zariah Akira Dorsey (Q97350167), Adrienne Eugina Doolin Howard (Q95882581), Alexandra Louise Polansky (Q95890156) really have Wikidata Items? I doubt any of them will ever have links to other Wikimedia projects.*Treker (talk) 21:02, 13 August 2020 (UTC)[reply]

I don't see a problem with them passing Wikidata:Notability and therefore they should stay. They are clearly identifiable human people with a serious reference (NY times). However I think some of the descriptions need to be adapted, for example "Had a passion for soul food, cooking, music and her church." is not something I would keep. --Hannes Röst (talk) 21:13, 13 August 2020 (UTC)[reply]
Agreed. While the Wikidata inclusion criteria is ludicrously low (basically anything or anyone to which a few words are thrown in any source ever, from Covid victim #195,483 to Donald Trump's hair (Q27493213)), Wikidata is not an obituary or memorial site, and descriptions should be succinct and neutral. -Animalparty (talk) 21:26, 13 August 2020 (UTC)[reply]
And yes, eventually, all of us here will be items in Wikidata, during our lives or after, whether we want it or not. It will happen, maybe not today or this century, but eventually it will. The Data demands it (blessed be its name). All of humanity and all of creation will be joined eternally in structured Data, so that robots and future alien visitors can forever query the Earth as it is and as it was in human times. -Animalparty (talk) 21:39, 13 August 2020 (UTC)[reply]
I think this leads to a good dataset that likely wouldn't be available elsewhere. ChristianKl20:49, 14 August 2020 (UTC)[reply]

Pearle vs Pearle vision

Hi,

I noticed that Pearle (Europe) and Pearle Vision (US) don't have separate Wikidata pages. I was wondering if those should be split (seeing as they are different divisions and have different branding as well).

Cheers, Thibault  – The preceding unsigned comment was added by Thibaultmol (talk • contribs) at 22:37, 13 August 2020 (UTC).[reply]

@Thibaultmol: Hi. Can you please give the Q number for the item page you found? You can post it here using the {{Q}} template. For example, {{Q|12345}}. Once we know the page, someone can offer you advice. From Hill To Shore (talk) 22:53, 13 August 2020 (UTC)[reply]
@From Hill To Shore: This is the item page Pearle Vision (Q2231148) that covers both right now. But I feel like it should be split maybe
It even seems like the European stores are not owned by the same company: en:Pearle_Vision : "The Pearle chain of opticians in Europe is now part of Grandvision and has more than 1000 branches ". So yes, should be split. --Hannes Röst (talk) 02:39, 14 August 2020 (UTC)[reply]

Request to merge two items

I would like to ask to merge the item Template:Pretenders to the Korean throne (Q19694643) into Template:House of Yi (Q20163606) - George6VI (talk) 07:12, 14 August 2020 (UTC)

Historic monuments of France

Could one of our French colleagues please take a look at monument historique inscrit (Q10387575) and Historical Monument (Q916475), and their Wikipedia links, and clarify the relationship between them (or merge them)?

fr:Pont_Vieux_(Cluses), for example, says the bridge is a fr:Monument historique (France) (aka Q916475); while Wikidata says it is a Q10387575, which doesn't even have a fr.Wikipedia article... Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:03, 14 August 2020 (UTC)[reply]

There are two types of Historical Monument (Q916475): classified historical monument (Q10387684) and monument historique inscrit (Q10387575). Both of the latter are filed as subclass of (P279) the former, which seems appropriate. Jean-Fred (talk) 09:16, 14 August 2020 (UTC)
(This topic is in the scope of Wikidata:WikiProject France/Monuments historiques − I added maintained by WikiProject (P6104) to all three items. Jean-Fred (talk) 09:18, 14 August 2020 (UTC))[reply]
Jean-Fred: Thank you; so does "L'édifice est inscrit au titre des monuments historiques en 1975." in fr:Pont_Vieux_(Cluses) need to be made more specific? And are el:Πρόσθετος κατάλογος ιστορικών μνημείων, eo:Registrita Historia Monumento (Francio) and es:Inventario suplementario de monumentos históricos appropriate interwiki links from Q10387575? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:56, 14 August 2020 (UTC)
Jean-Fred (talk) 13:01, 14 August 2020 (UTC)[reply]
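
The subclass relationship described above (both Q10387684 and Q10387575 filed under Historical Monument (Q916475) via P279) could be checked with a query along these lines. This is only a minimal sketch: the helper name build_subclass_query is made up for illustration, it assumes the usual wd:/wdt: prefixes of the Wikidata Query Service, and it only builds the query text without sending it anywhere.

```python
import re


def build_subclass_query(parent_qid: str) -> str:
    """Return SPARQL text listing direct subclasses (P279) of parent_qid.

    Hypothetical helper for illustration; the query is built, not executed.
    """
    # Reject anything that does not look like an item ID such as "Q916475".
    if not re.fullmatch(r"Q[1-9]\d*", parent_qid):
        raise ValueError(f"not a valid item ID: {parent_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P279 wd:{parent_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en". }\n'
        "}"
    )


# Historical Monument (Q916475), as named in the thread.
print(build_subclass_query("Q916475"))
```

Pasting the printed text into the query service should, if the modelling is as described, list the two subtypes among the results.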

Model item (El Comité 1973 (Q53644712)) uses country (= P17) but I think country of origin (= P495) should be used (description: country of origin of this item (creative work, food, phrase, product, etc.)). Chrzwzcz (talk) 14:14, 14 August 2020 (UTC)[reply]

Yes. Creative work should use country of origin (P495). Thierry Caro (talk) 09:12, 15 August 2020 (UTC)[reply]
OK, thanks, and should we initiate change P17→P495 for every instance of Q41298? Chrzwzcz (talk) 10:11, 15 August 2020 (UTC)[reply]
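
If such a P17→P495 change were agreed, one way to prepare it would be to generate a batch for review rather than editing by hand. The sketch below is an assumption-laden illustration, not an agreed procedure: the function name is hypothetical, the country value in the example is a placeholder rather than the item's real statement, and the output assumes the tab-separated QuickStatements V1 format in which a leading minus sign removes a statement.

```python
def p17_to_p495_commands(rows):
    """Build QuickStatements-style lines moving a value from P17 to P495.

    rows: iterable of (item_qid, country_qid) pairs taken from existing
    country (P17) statements. Hypothetical helper for illustration.
    """
    lines = []
    for item, country in rows:
        # Add the value as country of origin (P495) first ...
        lines.append(f"{item}\tP495\t{country}")
        # ... then remove the matching country (P17) statement.
        lines.append(f"-{item}\tP17\t{country}")
    return lines


# Q53644712 is the model item from the thread; Q99999999 is a placeholder
# country value, NOT the item's actual P17 value.
for line in p17_to_p495_commands([("Q53644712", "Q99999999")]):
    print(line)
```

Note that this simple form does not carry over qualifiers or references from the old statement, so those would need separate handling before any real run.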

Wonder if that property should be applied only to true explosions/blasts, i.e. explosions that have occurred (like 2020 Beirut explosions (Q98073118)), or whether it could also apply to weapon (Q728) things (like Blue Peacock (Q885837)): weapons that did not explode, or simply sorts of weapons rather than specific weapons. Bouzinac (talk) 18:49, 14 August 2020 (UTC)

Currently it does not seem to cover it, but I think we should expand the scope of explosive energy equivalent (P2145) to not just cover explosions but also types of weapons that could produce explosions of a certain size. --Hannes Röst (talk) 19:16, 14 August 2020 (UTC)[reply]

Q96785431

Was killer of (Q96785431) created to describe a property that once existed or was once planned? Is it normal to describe a property we do not have? This is my first time encountering a situation like this. --RAN (talk) 19:11, 14 August 2020 (UTC)[reply]

  • It has nothing to do with a property being planned or having previously existed. It's an inverse property label item (Q65932995). Those are used by the relatedItems gadget. It seems quite clear to me from the existing description and instance of (P31) statement. As this is not the only time this confusion has come up, it might be good to do something to make it clearer. Do you have an idea of what the item would need for it to be clear to you what it's about? ChristianKl 20:46, 14 August 2020 (UTC)

Azerbaijani help -- merging two items with separate AZ wiki entries

I believe that these two entries are for the same person, and I'd like some advice (or better: help) on how to proceed, as they cannot be merged as-is due to conflicts:

They both have Azerbaijani Wikipedia articles, so the Wikidata entries cannot be merged, but I believe they are different names for the same person. It seems to me that the sensible thing to do would be to combine the articles in the AZ wiki, then merge the Wikidata, but I feel like that ought to be in the hands of someone with the right language skills, which I decidedly lack. It appears that the subject is known by enough different names that someone presumably created the newer article not realizing that the subject was already covered.

The reasons I think they are the same are:

Anyone know how we could proceed with this? Thanks -Kenirwin (talk) 23:44, 14 August 2020 (UTC)[reply]

Entity Explosion - a new browser extension driven by Wikidata

I'm delighted to announce that Entity Explosion is now live. It allows you to discover links and information about a topic you are browsing on other sites. It works on all Wikimedia sites, but also everywhere we have linked external IDs. It wouldn't have been possible without everyone in the community's hard work. Thank you!

Download here: https://chrome.google.com/webstore/detail/entity-explosion/bbcffeclligkmfiocanodamdjclgejcn

The on-wiki project page has more details Wikidata:Entity Explosion.

It's not just for Wikidatans, but I hope you like it! --99of9 (talk) 06:04, 15 August 2020 (UTC)[reply]

Senator Harris

Can someone clean up the three statements about her tenure in the Senate?

Apparently there is said to be some consensus on how to do it, but the statements on Q10853588#P39 seem somewhat messy. There is no position called "United States senator (116th Congress)". --- Jura 07:34, 15 August 2020 (UTC)[reply]

@Jura1: Can you provide sources to clarify your claim? I plan on cleaning this up. Adding Q98077491 is part of this process. I think the driver should be both accuracy and descriptiveness. The more descriptive the more info can be gleaned from the source data.
Also, I plan on cleaning this up for all historical senators. Doing this piecemeal by senator is not the right approach in my opinion. -- Gettinwikiwidit (talk) 10:20, 15 August 2020 (UTC)[reply]
Can you revert the mess and then outline what you plan to do? It's not ok to do experiments on live items. --- Jura 10:00, 15 August 2020 (UTC)[reply]
@Jura1: I'm sorry. I don't understand why you insist on hurling insults and making this conversation unproductive. It makes it very difficult to add value in that setting. -- Gettinwikiwidit (talk) 10:20, 15 August 2020 (UTC)[reply]
You seem to agree that cleaning up is required. How would you describe the situation? Unclean? Dirty? --- Jura 10:42, 15 August 2020 (UTC)[reply]
@Jura1: You're not engaging with any of the information I'm providing. You're simply declaring yourself the arbiter of right and wrong. I'm happy to have a productive conversation about the data but not to genuflect. When you're ready for the former, please let me know. -- Gettinwikiwidit (talk) 10:47, 15 August 2020 (UTC)[reply]
I'm ok with a mere revert. Did you seek consensus for the change you made? As you are aware, in the meantime I listed Q98077491 for deletion. --- Jura 10:49, 15 August 2020 (UTC)[reply]

Bots not required to be Open Source?

Have I missed something? I thought bots were required to point to their source, and I'm doing this, but when I skimmed over a few bot user pages nothing could be found. How can you sensibly review a bot if no source is given? On the other hand, is there a single reason why you would not require this? --SCIdude (talk) 09:20, 15 August 2020 (UTC)

  • Where did you get this from?
Bots are usually approved based on test edits and operators' history with automated edits (or refused based on messes they created before). --- Jura 09:33, 15 August 2020 (UTC)
What's wrong with publishing these, e.g. using gist? Not only does it create trust, it helps other people, those writing bots themselves, or scientists doing surveys about them. I still haven't seen a good reason for not requiring it. --SCIdude (talk) 10:04, 15 August 2020 (UTC)
I generally think it's a good idea to make bot code open source (Pi bot's is), and it's required if you run anything on Toolforge. However, some other bot operators have issues with making code open source, e.g., see phab:T189747 (constraint violation updates). Thanks. Mike Peel (talk) 10:32, 15 August 2020 (UTC)

Countries vs. sovereign states

We currently have a number of items like Spain (Q29), which has instance of (P31)=sovereign state (Q3624078) but not instance of (P31)=country (Q6256). I noticed this as commons:Module:WikidataIB's 'location' code looks for instance of (P31)=country (Q6256) to know where to stop following location properties; without this you end up with cases like "Adeje, Santa Cruz de Tenerife Province, Canary Islands, Spain, Iberian Peninsula, Europe, Northern Hemisphere" in the infobox at commons:Category:Casa Fuerte de Adeje - it should stop after 'Spain'.

It seems that country (Q6256) has been removed from these items, initially randomly and then systematically (see [7]), with the rationale that sovereign state (Q3624078) has subclass of (P279)=country (Q6256). One solution to the infobox problem would be to add a check for sovereign state (Q3624078) as well. However, I don't think that the modelling here is correct, as it seems to confuse a government system with a territory. So I'd like to remove subclass of (P279)=country (Q6256), and add back instance of (P31)=country (Q6256) to the affected items (and remove any 'preferred' ranks to avoid shadowing of the country (Q6256) value). Does that make sense?

Pinging @RexxS, Oravrattas: as this follows up on previous discussions with them. Thanks. Mike Peel (talk) 10:45, 15 August 2020 (UTC)[reply]
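
To illustrate the infobox behaviour described above, here is a small sketch of a location walk that stops at the first item whose instance of (P31) values include a stop class. It is not the actual Module:WikidataIB code (which is Lua); the toy data, the function name location_chain and the placeholder class strings are all made up for illustration. Only the two QIDs for country and sovereign state come from the discussion.

```python
COUNTRY = "Q6256"
SOVEREIGN_STATE = "Q3624078"

# Toy stand-in for Wikidata: label -> (instance-of values, next location up).
# Class strings other than the two QIDs above are invented placeholders.
TOY = {
    "Adeje": ({"municipality"}, "Santa Cruz de Tenerife Province"),
    "Santa Cruz de Tenerife Province": ({"province"}, "Canary Islands"),
    "Canary Islands": ({"autonomous community"}, "Spain"),
    "Spain": ({SOVEREIGN_STATE}, "Iberian Peninsula"),
    "Iberian Peninsula": ({"peninsula"}, "Europe"),
    "Europe": ({"continent"}, "Northern Hemisphere"),
    "Northern Hemisphere": ({"hemisphere"}, None),
}


def location_chain(start, stop_classes, data=TOY):
    """Follow parent locations upward, stopping after the first stop-class item."""
    chain = []
    current = start
    while current is not None:
        classes, parent = data[current]
        chain.append(current)
        if classes & stop_classes:
            break  # reached a country-like item; do not climb further
        current = parent
    return chain


# Checking only for country (Q6256) overshoots, as in the infobox example.
print(", ".join(location_chain("Adeje", {COUNTRY})))
# Also accepting sovereign state (Q3624078) stops the walk at Spain.
print(", ".join(location_chain("Adeje", {COUNTRY, SOVEREIGN_STATE})))
```

The second call corresponds to the "add a check for sovereign state as well" option; restoring instance of (P31)=country (Q6256) on the items would make the first call stop at Spain too.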