User talk:InternetArchiveBot

From Meta, a Wikimedia project coordination wiki



Connect with the developers and other users

Telegram IRC (irc.libera.chat #iabot)

Operation status

For the most up-to-date information, see the run pages or the Wiki Operations Summary on Airtable.

  • 🟢 InternetArchiveBot is currently running on 300+ Wikimedia wikis.
  • 🟢 We have moved the management interface to a new server. Please start using iabot.wmcloud.org instead of iabot.toolforge.org. Please let us know if anything broke during this process.
  • 🟡 Testing is stalled on Alemannisch Wikipedia (als), Asturian Wikipedia (ast), and Japanese Wikipedia (ja).
  • 🔴 Bot is approved but disabled indefinitely pending software improvements on French Wikipedia (fr), MediaWiki.org, Norwegian Nynorsk Wikipedia (nn), Polish Wikipedia (pl), and Portuguese Wikipedia (pt).

Last updated: 15:55, 14 June 2024 (UTC)

How this page works

  1. Ask your question in any language. Questions in English or German will receive the fastest responses.
  2. Our team will try to respond within seven days.
  3. Seven days after our response we will mark the thread as resolved. This queues the thread for archiving.
    If our response does not answer your question, you are welcome to remove the "section resolved" tag and write an additional comment.
  4. Seven days after the thread is marked as resolved, it will be archived. Once a thread is archived, it should not be un-archived. Instead, create a new thread and link to the old one.


SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days.


Inaccessible link

IABot “fixed” a link that it reported as inaccessible here: https://nl.wikipedia.org/w/index.php?title=Chora_%28Patmos%29&diff=66511300&oldid=66454976

However, the link works fine with http on my end. Now I do agree that https is safer (although in this case it was hardly an improvement), but that's no reason to treat a link as “inaccessible”. Mondo (talk) 18:15, 13 December 2023 (UTC)Reply

Hello Mondo. The bot did not necessarily declare the link inaccessible, though the edit summary would indicate that because the bot's edit summaries are very imprecise. The bot upgrades HTTP links to HTTPS where possible, separately from its process of fixing dead links. Harej (talk) 18:20, 13 December 2023 (UTC)Reply
Hello Harej, in that case, it's against the guidelines of the Dutch Wikipedia. We have the guideline “bij twijfel niet inhalen”, which is similar to the one on EN:WP called If it ain't broke, don't fix it, except that ours is much more detailed. The link was not broken and https hardly made a difference with this specific link, therefore it was against the guideline. I have reverted IABot and added the article to the deny list, but I hope this can be fixed, because this will happen again on other pages. Mondo (talk) 18:23, 13 December 2023 (UTC)Reply
.
3750 2409:4081:2E1B:10CF:C8EC:865C:203E:844A 06:20, 16 April 2024 (UTC)Reply

It's not resolved. I explained what the issue was back in December and nothing has changed. Mondo (talk) 19:27, 3 April 2024 (UTC)Reply

Mondo, as explained above, it is our practice to replace HTTP with HTTPS on all wikis, and we are not changing that. Continuing to remove the "section resolved" template will not change this. If changing HTTP to HTTPS is in fact against policy, please cite the policy. Harej (talk) 20:09, 3 April 2024 (UTC)Reply
If you wanted me to cite the policy, it would've been nice to know that when I posted my last comment instead of not responding to me for months. But here you go:

https://nl.wikipedia.org/wiki/Wikipedia:Bij_twijfel_niet_inhalen
“De ene goede variant door de andere goede variant vervangen is geen verbetering of verslechtering, maar een neutrale bewerking. Dergelijke bewerkingen zijn ongewenst”

Which translates to: “Replacing one good variant with another is neither an improvement nor a deterioration, but a neutral edit. Such edits are undesirable.”

Replacing http with https is exactly that: http works fine, i.e. it's a good variant, which makes it against policy. Now I could see it being somewhat useful if it's a URL where security is of the utmost importance, but in this case it's a link to a spreadsheet file. There's nothing that https will do to protect the user in this case. (Or if the http link was dead and replaced with https.) Mondo (talk) 20:18, 3 April 2024 (UTC)Reply

The bot keeps adding an archive link where it isn't required

Hello. The bot always tries to add this link, but it isn't needed. It has happened more than three times, and I had to undo the change every time. https://web.archive.org/web/20211012034604/https://incubator.wikimedia.org/w/index.php?hidebots=1&translations=filter&hidecategorization=1&hideWikibase=1&limit=50&days=3&title=Special%3ARecentChanges&testwiki=wp%2Fryu&urlversion=2

The unwanted modification occurs on this page: https://incubator.wikimedia.org/wiki/Wp/ryu/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8

And this is an example of the unwanted modification. https://incubator.wikimedia.org/w/index.php?title=Wp/ryu/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8&diff=prev&oldid=6254326 Patronus95 (talk) 04:51, 2 May 2024 (UTC)Reply

I just looked and this seems to have fixed itself. Is there anything more you need me to look at? —CYBERPOWER (Chat) 14:54, 15 May 2024 (UTC)Reply
Patronus95, pinging for your attention. Harej (talk) 20:10, 3 July 2024 (UTC)Reply
Hi, no, everything is fine now. Thank you.
--Patronus95 (talk) 05:48, 12 July 2024 (UTC)Reply

Can't archive

Hi IAB admins, I'm having an issue. With one of my articles, en:Aston Martin Rapide, I've been trying to archive the sources, but it comes up with this. No links were analyzed for some reason. Any idea why? 750h+ (talk) 05:08, 25 May 2024 (UTC)

I can't see what you are referring to. The image is not showing for me.—CYBERPOWER (Chat) 21:30, 29 May 2024 (UTC)
750h+, pinging for your attention. Harej (talk) 20:12, 3 July 2024 (UTC)Reply
No, I fixed the problem. Sorry about that. 750h+ (talk) 03:46, 4 July 2024 (UTC)

The bot wrongly claims links are dead

Hi! The bot keeps claiming that links in the following article are dead, but they are ok, as far as I see. https://el.wikipedia.org/wiki/%CE%95%CE%B8%CE%BD%CE%B9%CE%BA%CE%AE_%CE%95%CE%BB%CE%BB%CE%AC%CE%B4%CE%B1%CF%82_(%CE%A6%CE%B5%CE%BD%CF%84_%CE%9A%CE%B1%CF%80) Can you do something, please? Thank you. --Harry Deconstructing (talk) 12:11, 29 May 2024 (UTC)Reply

Can you please provide more concrete examples?—CYBERPOWER (Chat) 21:32, 29 May 2024 (UTC)Reply
If you check the references, most of the links are deemed dead. However, they work. For example:
Reference No 25: Greece - Mexico 1 - 2 (1983)[νεκρός σύνδεσμος]
The link is https://www.billiejeankingcup.com/en/draws-and-results/W-FC-1983-WG-M-GRE-MEX-01?matchId=itf_2610164d79ebc202150c3ed3669cb0b6 Harry Deconstructing (talk) 00:32, 30 May 2024 (UTC)Reply
User:Harry Deconstructing, the website is returning error codes despite otherwise having content on them, so we added it to our permalive list so the bot will treat the website as alive. Harej (talk) 20:21, 3 July 2024 (UTC)Reply
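
The permalive behaviour described in this reply can be sketched roughly as follows. This is only an illustration, not IABot's actual code: the function name `treated_as_alive`, the `PERMALIVE_DOMAINS` set, and the fallback status-code rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the real permalive list is maintained by the IABot team.
PERMALIVE_DOMAINS = {"billiejeankingcup.com"}


def treated_as_alive(url: str, status_code: int) -> bool:
    """Return True if a link should be treated as alive.

    Assumption: a host on the permalive list is always treated as alive,
    regardless of the HTTP status code the site returns.
    """
    host = (urlsplit(url).hostname or "").lower()
    # Match the listed domain itself and any of its subdomains.
    on_permalive_list = any(
        host == domain or host.endswith("." + domain)
        for domain in PERMALIVE_DOMAINS
    )
    if on_permalive_list:
        return True
    # Otherwise fall back to a simple status check (assumed rule: 2xx/3xx is alive).
    return 200 <= status_code < 400
```

Under these assumptions, a page on the listed domain that answers with an error code is still treated as alive, while the same code from an unlisted domain is not.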

Category:CS1 maint: url-status at EN Wikipedia

Hello, I was wondering if InternetArchiveBot could go through CS1_maint:_url-status on EN Wikipedia. I checked some of the articles on the list. Articles like Alan Barinholtz and Football at the 2024 Summer Olympics have a working URL and don't need |url-status=live. So far, I haven't seen a |url-status=dead parameter that's missing an archived URL and archive date. There are over 2,000 articles to check. Thanks! MrLinkinPark333 (talk) 23:33, 30 May 2024 (UTC)

MrLinkinPark333, the bot has been queued on those pages, but we can't guarantee the bot will fix those pages that are members of the maintenance category. It should fix the ones that are explicitly marked url-status=dead. Harej (talk) 20:29, 3 July 2024 (UTC)Reply

False positive dead link

Hi, https://geoportal.rsd.cz (see https://cs.wikipedia.org/w/index.php?title=D%C3%A1lnice_D1&diff=prev&oldid=23963893) is not dead; it may just be geo-restricted. --Harold (talk) 15:53, 31 May 2024 (UTC)

User:Harold, it is indeed georestricted. We have marked the rsd.cz domain as permalive so that the bot will treat its links as alive. Harej (talk) 20:37, 3 July 2024 (UTC)

Ambiguous message

Hi,

There's this message:

Once the search results load, which can take time, select the domains it found from the list that you want to modify and push "Submit" on the bottom of the page.

What does "that you want to modify" refer to? The domains or the list?

If it's the domains, could it perhaps be written like this:

Once the search results load, which can take time, select the domains it found and that you want to modify from the list and push "Submit" on the bottom of the page.

I don't know the tool well, and I'm just guessing and it's possible that I'm wrong :) Amir E. Aharoni (talk) 19:56, 1 June 2024 (UTC)Reply

User:Amire80, you are correct, feel free to file a pull request. Harej (talk) 20:44, 3 July 2024 (UTC)Reply
Thanks: https://github.com/internetarchive/internetarchivebot/pull/153 Amir E. Aharoni (talk) 14:27, 7 July 2024 (UTC)Reply

A message with "from all wikis"

There's this message:

All citation templates are listed here from all wikis. Format it as if you are transcluding the template, and put new templates on a new line. Failure to follow correct formatting may break the bot.

What does "from all wikis" mean here?

Could it perhaps be rephrased as "All citation templates from all wikis are listed here."? Amir E. Aharoni (talk) 19:44, 2 June 2024 (UTC)Reply

@Amire80:, please file a pull request for this as well. Harej (talk) 20:46, 3 July 2024 (UTC)Reply
Thanks: https://github.com/internetarchive/internetarchivebot/pull/154 Amir E. Aharoni (talk) 14:30, 7 July 2024 (UTC)Reply

non-critical DB

Hello again :)

There's this message:

A non-critical DB has returned an error. This may have minor impacts on reliability.<br>Error {{errno}}: {{errormessage}}

It may be correct, but I wanted to make sure: Is it really a non-critical DB? Like, is there a critical DB and a non-critical DB?

I'm asking because the same sentence also mentions error, and it's much more common in software user interfaces that the errors are critical or non-critical, and not the DBs.

If the message is correct as is, everything's fine :) Amir E. Aharoni (talk) 22:19, 3 June 2024 (UTC)Reply

@Amire80:, the distinction is that replicas are non-critical. The main database that the UI works off is a critical database. It's a caution that the user interface might be slower because the databases that supply redundancy aren't working right. Harej (talk) 20:50, 3 July 2024 (UTC)Reply
Thanks! Amir E. Aharoni (talk) 14:23, 7 July 2024 (UTC)Reply

Weird "%20" added to archive URL

Hello, I'm relatively short on time at the moment due to being on holiday among other things, but in this edit on the English Wikipedia, the bot took the URL http://www.washingtontimes.com/news/2010/feb/27/us-clinches-medals-total-canada-most-golds/ and added an archive link https://web.archive.org/web/20181225175051/https://www.washingtontimes.com/news/2010/feb/27/us-clinches-medals-total-canada-most-golds/%20/ (which doesn't work). The correct archive link should be https://web.archive.org/web/20190203234549/https://www.washingtontimes.com/news/2010/feb/27/us-clinches-medals-total-canada-most-golds/ but when I try to modify the URL data in Internet archive bot, it says that URL doesn't match the original link. Graham87 (talk) 19:49, 6 June 2024 (UTC)Reply

@Graham87: we checked the link database and can confirm the archive link is now correct, without the %20 encoded space. We weren't able to reproduce the original edit that caused the encoded space to be added to begin with, so we think this is a transient error. Please let us know if you see it happen again. Harej (talk) 21:04, 3 July 2024 (UTC)Reply
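
For anyone who wants to check other citations for the same symptom, here is a minimal sketch of how an archive URL with a stray encoded space at the end could be detected. It is not part of IABot; the function name and the regular expression are assumptions, and the pattern only covers plain Wayback Machine URLs with a 14-digit timestamp.

```python
import re

# Assumed shape of a Wayback Machine URL:
# https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def has_trailing_encoded_space(archive_url: str) -> bool:
    """Report whether the archived target ends in an encoded space.

    Trailing slashes are ignored so that ".../%20/" is also caught.
    """
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        # Not a URL shape this sketch understands.
        return False
    original = match.group(2).rstrip("/")
    return original.lower().endswith("%20")
```

Applied to the two archive links quoted in this thread, the check flags the broken one and passes the corrected one.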

A few more message corrections

Hi!

Something slightly different this time.

I've completed the translation of the bot into Hebrew on translatewiki. Along the way, I sent a few more message corrections. There are now six pull requests at GitHub. It would be nice to review them.

Thanks! :) Amir E. Aharoni (talk) 20:40, 7 June 2024 (UTC)Reply

Blank page

Hi and thank you for your bot. I am running it on a giant article that used to take maybe 5 minutes. Now after a couple minutes the screen goes blank and doesn't recover. I guess due to lack of patience I ended up with three different IABot edits. You need a better way to tell the user when the bot is done. P.S. Clearing my cache does not help. -SusanLesch (talk) 23:33, 7 June 2024 (UTC)Reply

To follow up, today the bot ran correctly. No blank page. Thank you. -SusanLesch (talk) 20:22, 24 June 2024 (UTC)Reply

Enable bot for Asturian Wikipedia (astwiki)

Hi everyone. I'd like to re-enable the bot for the Asturian Wikipedia. According to the current bot status for astwiki, it was disabled in October 2023 with the reason: "dead link template still not working right". I checked the template (as well as other related ones), and they are mostly the same as on Spanish Wikipedia (eswiki), where the bot is indeed active. I'd like to ask which config or template needs review for the bot to be re-enabled on astwiki. Thanks in advance! YoaR (talk) 15:19, 10 June 2024 (UTC)

It's not working well

Reporting this. Skyfall (talk) 21:50, 10 June 2024 (UTC)

"Rescued"

Hello. On the justification "Rescuing 1 sources and tagging 0 as dead.", the bot changed a correct link to an incorrect link here: [1]. Please identify the issue that caused this so that it never happens again. I will go ahead and revert the bot's edit since it unequivocally ruined the original edit. Thanks for your hard work! Geographyinitiative (talk) 10:21, 11 June 2024 (UTC)Reply

Internet Archive snapshots of Google Books pages: valid or not?

It is my understanding that Google Books is globally deprecated as an archive and all links to it are treated as permanently dead by the bot. However, I was able to manually find an Internet Archive snapshot of a Google Books link tagged as dead. Do you encourage or discourage adding such snapshots to the bot's archive database? Huntthetroll (talk) 21:46, 14 June 2024 (UTC)

IABot refusing to archive?

Hey there, I've tried to run the bot several times on en:Howl's Moving Castle (film), but it never ends up adding any archives and simply ends the attempt after a few seconds. I'm not sure what the cause of this is, and I'd appreciate any help. Let me know if you have any questions! TechnoSquirrel69 (sigh) 18:56, 19 June 2024 (UTC)

I had a similar issue with en:Greystanes, New South Wales a few days ago, but it was working for other articles. Adam Black (talk · contributions) 20:09, 23 June 2024 (UTC)

dewiki

IABot is adding webarchive to links which are still online. For example: de:Spezial:Diff/245966106, but I've seen this several more times since last month. Please fix that. Thanks for your help, TenWhile6 (talk | SWMT) 20:43, 19 June 2024 (UTC)

some more: de:Spezial:Diff/245905312, de:Spezial:Diff/245997642. TenWhile6 (talk | SWMT) 20:46, 19 June 2024 (UTC)Reply

Removal of the ref text that wasn't redundant

The bot made this edit in ukwiki. It cleared the content of a tag "<ref group="к" name=":0">" that wasn't redundant, which caused an error (the FAQ says: "The bot often makes maintenance edits to articles in the course of its work. This includes removing redundant citations from articles."). I assume it's because there is a tag with the same name in another group. MonX94 (talk) 11:49, 23 June 2024 (UTC)
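
If that assumption is right, the distinction can be illustrated with a small sketch that identifies each named reference by both its group and its name, so that same-named references in different groups are not mistaken for duplicates. This is not IABot's code; the helper name `ref_keys` and the simplified regular expressions (which only handle double-quoted attributes) are assumptions for the example.

```python
import re

# Matches an opening or self-closing <ref ...> tag and captures its attributes.
# The \b keeps <references /> from matching.
REF_TAG_RE = re.compile(r"<ref\b([^>]*)>", re.IGNORECASE)
# Simplified: only attributes written with double quotes are recognised.
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def ref_keys(wikitext: str) -> list[tuple[str, str]]:
    """Return (group, name) pairs for every named <ref> tag in the text."""
    keys = []
    for tag in REF_TAG_RE.finditer(wikitext):
        attrs = dict(ATTR_RE.findall(tag.group(1)))
        if "name" in attrs:
            # A missing group attribute means the default (empty) group.
            keys.append((attrs.get("group", ""), attrs["name"]))
    return keys
```

With this keying, a reference named ":0" in group "к" and a reference named ":0" in the default group produce two distinct keys, so neither would be treated as a redundant copy of the other.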

Bot breaking references, bot duplicating references

In this edit the bot caused harv/sfn no-target errors for "U.S. Department of State / Office of the Historian (1950b)", and "Mitterrand (1990)", and a multiple-target error for "Wörner (1991)". I have undone it, but would appreciate it if the bot were taught not to screw up references. I can be contacted on en-wiki. DuncanHill (talk) 12:12, 28 June 2024 (UTC)Reply

It's just done it again. If you can't be bothered to answer then I'll try to get the bot blocked on enwiki. DuncanHill (talk) 09:22, 4 July 2024 (UTC)Reply

Page blanked in original source and archive.org. New page available.

Take es:Cuenca del río Copiapó as an example. I used this source:

*{{cita libro
 |apellido     = Niemeyer F.
 |nombre       = Hans
 |enlace-autor = Hans Niemeyer
 |título      = Hoyas hidrográficas de Chile, Tercera Región
 |ubicación   = Santiago de Chile
 |editorial    = [[Ministerio de Obras Públicas (Chile)]], Dirección General de Aguas
 |año         = 
 |edición     = 
 |url          = http://documentos.dga.cl/CUH2886v3.pdf
 |ref          = harv
 |fechaacceso  = 25 de julio de 2019
 |urlarchivo   = https://web.archive.org/web/20181111130729/http://documentos.dga.cl/CUH2886v3.pdf
 |fechaarchivo = 11 de noviembre de 2018
}}

There are 12 such documents by Hans Niemeyer F., one for each of the (1980) regions of Chile.

Most of the rivers, lakes, salars, geysers and many other objects of Chile have one of these documents as bibliography. I guess they appear about 1500 times in the Spanish Wikipedia.

The "urlarchivo" data was added by your bot.

All these sources on the servers of "Ministerio de Obras Públicas (Chile)" have been deleted, and archive.org has also deleted these documents, I suppose at the ministry's request.

Now, a "DGA-0065.rar" (23.52 MB) file appeared with all these documents under

I saved this (meta-data) file in

and the content of the rar-file is saved in

This file contains all 12 PDF files, and I hope they will not change the link again for some time.

Is it possible for you to re-read the article and change "http://documentos.dga.cl/CUH2886v*.pdf" to "https://bibliotecadigital.ciren.cl/items/052fafdb-960c-4be1-a5ff-9c3458500220"?

That will lead to the metadata page with the link to the RAR file containing the PDFs. Many thanks, Juan Villalobos (talk) 12:49, 30 June 2024 (UTC)
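
The requested replacement can be expressed as a simple pattern-based remapping. The sketch below only illustrates the request; it is not an existing IABot feature, the function name is hypothetical, and it assumes the "*" in the old URL stands for a volume number.

```python
import re

# The wildcard pattern from the request; "*" is assumed to be a volume number.
OLD_URL_RE = re.compile(r"^http://documentos\.dga\.cl/CUH2886v\d+\.pdf$")
NEW_URL = "https://bibliotecadigital.ciren.cl/items/052fafdb-960c-4be1-a5ff-9c3458500220"


def remap_url(url: str) -> str:
    """Return the replacement URL for a matching old URL, else the URL unchanged."""
    return NEW_URL if OLD_URL_RE.match(url) else url
```

Every volume of the old document series maps to the same metadata page, while unrelated URLs on the same server are left untouched.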

Request to add IAbot to three Miraheze wikis

The request is already at T340089 but perhaps that's no longer the place to make it? The three Miraheze wikis are:

Please could InternetArchiveBot be enabled on these wikis? It already has bot user rights on all three from previously guarding against link rot (but in the move to WikiTide and back, something got turned off or broken). Rob Kam (talk) 22:43, 3 July 2024 (UTC)

changing (working) https to insecure http and linking dead url as live

See here - the bot claims that the original link is still live (which it isn't), and changes the archive link (archived via trove, not the internet archive) from https to http. Nigel Ish (talk) 09:38, 4 July 2024 (UTC)Reply