Manually evaluate section-level image suggestions
Closed, Resolved, Public

Description

Using the tool built for this purpose in T316149, conduct the following manual evaluation of section-level image suggestions:

  • Evaluate results in English, Portuguese, Indonesian, Russian, Arabic, Czech, Bengali, French and Spanish Wikipedias
  • Evaluate 500 random section-level image suggestions across 500 different random articles, per wiki
    • Ambassadors will need to count and keep track of how many suggestions they have evaluated in their language -- the tool will not capture that.
  • For each result for each unillustrated article, manually decide whether the match is good, okay, or bad. Evaluators also have the option to choose "unsure" if they're not confident in their selection.
  • General comments or questions during evaluation can be posted as comments in this ticket.
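Since the tool does not record how many suggestions each evaluator has completed, a small personal tally can help. The following is a minimal sketch, not part of the evaluation tool: the label set is an assumption pieced together from the description and the comments on this task, and the `tally` and `remaining` helpers are hypothetical names.

```python
from collections import Counter

# Assumed label set: the four options named in the description plus the
# "Section should not have an image" option mentioned in the comments.
RATINGS = ("good", "okay", "bad", "unsure", "section should not have an image")
TARGET = 500  # suggestions to evaluate per wiki, from the description


def tally(ratings):
    """Count ratings case-insensitively, rejecting labels not in RATINGS."""
    counts = Counter()
    for rating in ratings:
        label = rating.strip().lower()
        if label not in RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
        counts[label] += 1
    return counts


def remaining(counts, target=TARGET):
    """Number of suggestions still to evaluate before reaching the target."""
    return max(target - sum(counts.values()), 0)


# Hypothetical log of one evaluator's choices so far.
log = ["good", "bad", "Good", "unsure", "okay"]
counts = tally(log)
print(dict(counts), "remaining:", remaining(counts))
```

Appending each choice to `log` while working through the tool keeps the running count that the tool itself does not capture.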

The estimated time for the manual evaluation is 3 hours for the 500 images. If you reach 3 hours without finishing the test, please leave a comment.

  • As a result of the evaluation done in this ticket, product managers will determine what confidence scores to move forward with and if any algorithm changes are necessary.
  • We will also evaluate what percentage of bad matches come from section alignment, and in which wikis specifically, so we can decide whether a next step is needed here
  • We will also evaluate whether the production dataset should exclude images that are not .jpgs, to prevent icons from being suggested
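The two follow-up analyses above could be sketched roughly as follows. This is an illustration under stated assumptions, not the team's actual analysis code: the row fields (`wiki`, `source`, `rating`, `image`), the source label `section_alignment`, and the sample rows are all hypothetical. Leaving "unsure" ratings out of the denominator and treating `.jpeg` as a JPEG extension are also assumptions.

```python
from collections import defaultdict


def bad_share(rows):
    """Return {(wiki, source): fraction of ratings that are 'bad'}.

    Assumption: 'unsure' ratings are left out of the denominator.
    """
    totals = defaultdict(int)
    bad = defaultdict(int)
    for row in rows:
        if row["rating"] == "unsure":
            continue
        key = (row["wiki"], row["source"])
        totals[key] += 1
        if row["rating"] == "bad":
            bad[key] += 1
    return {key: bad[key] / totals[key] for key in totals}


def is_jpg(file_name):
    """True for .jpg files (assumption: .jpeg counts too), case-insensitive."""
    return file_name.lower().endswith((".jpg", ".jpeg"))


# Hypothetical evaluation rows, for illustration only.
rows = [
    {"wiki": "eswiki", "source": "section_alignment", "rating": "bad", "image": "View.jpg"},
    {"wiki": "eswiki", "source": "section_alignment", "rating": "good", "image": "Map.svg"},
    {"wiki": "cswiki", "source": "section_alignment", "rating": "unsure", "image": "Icon.png"},
    {"wiki": "cswiki", "source": "section_alignment", "rating": "okay", "image": "Photo.JPG"},
]

shares = bad_share(rows)
jpg_only = [row for row in rows if is_jpg(row["image"])]
print(shares)
print([row["image"] for row in jpg_only])
```

Comparing the ratings of `jpg_only` against the full set would show how many good matches the .jpg filter costs alongside the icons it removes.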

Evaluation note

  • We first want to evaluate small datasets iteratively internally on the SD&Research and PM team before involving ambassadors

Event Timeline

We've run into image suggestions in English and Arabic that were for articles that already included the same image within an infobox. So far this issue and the previously discussed issue with images being suggested for sections that have charts / tables are the main issues I've seen.
I'm sure Growth Ambassadors will have other feedback before Monday though.

In Spanish Wikipedia, I am finding different situations:

  • The article is very short in Spanish and, since it already has an infobox with an image, it does not need more images. In these cases, I am selecting "Section should not have an image".
  • When the suggested image is already used in an equivalent section in another language, the content of that section is sometimes different in Spanish, so there may be no point in suggesting or adding that image. In these cases, I am selecting "Not good".
  • The article already has many images similar to the one suggested and, when the section is very short, there is no point in adding another. I am choosing "Section should not have an image".
  • Sometimes the suggested image is already in the article, e.g. this article and this image.
  • In the original article, the images are part of a graphic or table (like this one). In Spanish, the images may not be used in the same layout, so there is no point in adding just one. I am selecting "Section should not have an image".
  • The same article is suggested more than once (like this one).
  • IMO, a number (or date) should not be the trigger for suggesting an image.
  • Often, an image of a landscape is suggested for a Demographics section. For example, in the article Condado de Letcher, [[ https://commons.wikimedia.org/wiki/File:View_from_Pine_Mountain_(Kentucky).jpg | this image ]] was suggested for the Demographics section on the grounds that it is also used in the same section of the enwiki article (which is not true). That image would probably fit better in another section.
  • In some cases, like in this article, the image is suggested for the wrong section (in this case, Condecoraciones, when it would make more sense in the Later life section).

Thank you!

Here is feedback from my evaluation (from cswiki perspective):

  • In some cases, the suggested image is a good fit. However, it should be placed at the top of the article (not in the section), as the article doesn't have an image at all. An example is this cswiki article.
  • In many cases, an image is a good fit topic-wise, but a similar image is already present (in the infobox or another prominent place). As such, the suggested image adds little to no value to the article, and should not be added.
  • In many cases, an image is technically a good fit, but the section is too short to have an image (alternatively, the article is too short to fit another image). I was labelling those as "Section should not have an image", although most of the time the problem is not in the section itself.
  • In some cases, an image is suggested for a section listing several similar items (cast of a movie, awarded people, people of a certain occupation, etc.). Using a single image in this type of section would give more weight to one of the listed items. Adding images would be ok if multiple images are shown next to each other (see example in enwiki), avoiding the added "this item is more important" bias.
  • For some "this image is used in equivalent section at xxx wiki" suggestions, the other wiki's section mentions an additional topic (loosening the connection between the image and the article). This often happens when the image is in a subsection that is not present in both articles (enwiki's Aurangzeb has Reign -> Military equipment, for which the image is all right, while cswiki's article has a Reign section that does not mention anything about military equipment).
  • Many times, a link in an article section only means the section talks about that topic tangentially. The image should ideally illustrate the main point of the section.

Hope it helps!

In bnwiki,

  • Most of the suggestions were for pages like 2019 in India, the cast section of a film, or the list of universities in an Ivy League-type article, where just a list of names (of actors/players/award winners/scientists) is given.
  • For a few cases, the suggested image was already present in the infobox (e.g., Zeid bin Hussein)
  • I selected "unsure" for an image where the section contained only a redirect to another article. The suggested image was indeed relevant, but the section didn't describe anything.
  • Merely mentioning a topic in a section doesn't make an image of that topic relevant to the section, but images of this type were also suggested.
  • An image was suggested for a section on the result of an election, in which the prime minister and other ministers were mentioned. However, the image was of one person only (not the prime minister), which I think should not happen. I selected "Section should not have an image".
  • The good-fit images were generally suggested in text-based articles (almost never in list-type ones).
  • The number of suggested images was below 100 for bnwiki (I got repeated suggestions and selected some of them again, reaching around 60). Then I stopped, as I kept getting suggestions for only two articles (গোয়া -- image and দুর্যোধন -- image) again and again.

@Ankan_WMF there should be more suggestions available for bnwiki now

Evaluation from ar:wp

  • Some sections shouldn't have images, like == See also == or sections with multiple columns;
  • Some images are suggested because they are used in the article in another language, but that article is so much more developed that the image is relevant only in that language;
  • Very short articles, with no other images, shouldn't be suggested;
  • Very good articles (labeled ones) shouldn't be suggested, because the community will not easily accept changes to them, especially from newbies;
  • Articles with long infoboxes create some issues with the display of the article;
  • Collage images (images with multiple sources) shouldn't be suggested in sections;
  • How should an image be chosen from a Wikidata item? Which one? (See mariage);
  • Some sections already have media such as 3D models (STL files), so we don't have to add an image to them.

Here is my feedback after testing section-level image suggestions for Indonesian Wikipedia:

  • If the article is short (a stub), it is better not to suggest any images.
  • Featured articles should not be suggested, because the communities have already evaluated and checked their quality (featured articles usually already have many images).
  • Images should not be suggested for "see also" sections (in Indonesian: == Lihat pula ==).
  • The "this image is used in equivalent section at xxx wiki" message is helpful, but sometimes the suggestions are not related to the article's main topic.
  • Some suggested images were already present in the infobox, but this happened rarely.

But overall, this tool is great for adding images more quickly than before. Thank you for all your hard work!

My observation, from RuWP perspective:

  • The tool doesn't correctly select image suggestions for sections. Perhaps this is because the section has links to other articles, and based on them, the tool suggests photos, but not all suggestions are correct.
  • Some sections, such as list-only sections, shouldn't contain images, and therefore the tool shouldn't suggest images for them.
  • In some cases, the tool suggests images in sections that already have images.

For now, this is all from me.
Thanks!

CBogen claimed this task.

Based on the evaluation results, we will be moving forward with the section alignment and intersection based suggestions, which all scored very well. We will not be moving forward with depicts-based suggestions. We will be pausing on P18 (Wikidata image) based suggestions until we can do more work to refine and evaluate those results -- while those will still remain in the pipeline data, we can remove them on the client side.

We will also be working to further refine our code that removes suggestions for sections that are tables/lists, and removes suggestions from sections that already have images.
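The filtering decisions above (hiding paused P18-based suggestions on the client side, and dropping suggestions for list/table sections or sections that already have images) could look roughly like the sketch below. It is illustrative only and does not reflect the actual pipeline or client code: the `Suggestion` fields, the kind labels, and the `keep` helper are hypothetical, and how a section is detected as a list or table is outside its scope.

```python
from dataclasses import dataclass

# Hypothetical label for the suggestion source paused in the comment above.
PAUSED_KINDS = frozenset({"p18"})


@dataclass
class Suggestion:
    section_title: str
    kind: str  # hypothetical labels, e.g. "section_alignment", "intersection", "p18"
    section_is_list_or_table: bool
    section_has_image: bool


def keep(suggestion, paused=PAUSED_KINDS):
    """Return True if the suggestion should still be shown to the user."""
    if suggestion.kind in paused:
        return False  # remains in the pipeline data, hidden on the client side
    if suggestion.section_is_list_or_table:
        return False  # tables and lists should not receive an image
    if suggestion.section_has_image:
        return False  # the section is already illustrated
    return True


# Hypothetical suggestions, for illustration only.
suggestions = [
    Suggestion("History", "section_alignment", False, False),
    Suggestion("Cast", "intersection", True, False),
    Suggestion("Geography", "p18", False, False),
    Suggestion("Later life", "intersection", False, True),
]

shown = [s.section_title for s in suggestions if keep(s)]
print(shown)
```

Keeping the paused kinds in a set means a source can be re-enabled later without touching the pipeline data.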

Follow up tickets to reflect all of this work will be created shortly. Meanwhile, this round of evaluation can be considered complete.

My observation:

One of the issues I have observed in using the tool is that the high rate of images with little or no description becomes a problem when it comes to assessing whether they are a correct suggestion or not. It works well for topics we know, but it's very difficult when we don't know the subject. This note is for images with no description at all or images that only have a description in the local language. In other words, the description plays an important role in contextualising the image.