Wikipedia:Bots/Noticeboard: Difference between revisions

Revision as of 18:54, 14 April 2020

Bots noticeboard

Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments will be welcome. Just make sure you are aware of our bot policy and know where to post your issue.

Do not post here if you came to

discuss non-urgent bot issues, bugs and suggestions for improvement. Do that at the bot operator's talk page
discuss urgent/major bot issues. Do that according to instructions at WP:BOTISSUE
discuss general questions about the MediaWiki software and syntax. We have the village pump's technical section for that
request approval for your new bot. Here is where you should do it
request new functionality for bots. Share your ideas at the dedicated page

New Pywikibot release 3.0.20200326

(Pywikibot) A new Pywikibot release, 3.0.20200326, was deployed as a Gerrit tag and on PyPI. It was also marked with the „stable“ and „python2“ tags. The PAWS web shell depends on the „stable“ tag. The „python2“ tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among others, the changes include:

When executing a script all required external modules are checked.
Data attributes of WikibaseEntity were refactored. (Task T233406)
Dysfunctional cgi_interface.py was deleted. (Task T248292, Task T248250, Task T193978)
The compat submodule has been deleted. (Task T183085)
Python 2.6 supporting backports.py has been deleted. (Task T244664)

The following code cleanup changes are announced for one of the next releases:

Test Site should be invoked by Site('test', 'wikipedia') instead of the deprecated Site('test', 'test').
MediaWiki versions prior to (LTS) 1.19 will no longer be supported (Task T245350).
For Python 2 the package ipaddress is mandatory (Task T243171); the submodule tools.ip will be deleted.
Featured articles interwiki link related functions will be desupported.

All changes are visible in the history file, e.g. here

Best @xqt 12:04, 27 March 2020 (UTC)[reply]

Re-examination of ListeriaBot

I think that this bot's BRFA was defective and that this deficiency has been shown repeatedly. I first encountered the bot when it was used by an editor to create a bunch of list articles in mainspace (see entries which start List of museums). This bot did not seem to be approved to operate in mainspace, which seems to have since been confirmed in this BOTN discussion earlier this year. Today there is evidence of repeated problems with the bot and non-free images. From today: Bot operator's talk page & AN. Earlier discussions: User talk:ListeriaBot/Archive 1#Adding non-free images, Wikipedia:Village pump (technical)/Archive 154#ListeriaBot adding non-free images to Wikipedia namespace page, Wikipedia:Village pump (technical)/Archive 158#Listeria bot and non-free images, User talk:Magnus Manske/Archive 6#Non-free images being added by Lysteria bot and Wikipedia:Village pump (technical)/Archive 159#Lysteria bot and shadowing. This bot, even during the approval process, was operated outside of policies and the operator does not seem to be responsive when concerns are raised. I ask for two actions:

The bot be partially blocked from article space so that it may not edit there
Changes be made to the bot, either by the current operator or by a new operator, to ensure that future non-free image problems do not occur and to the extent that they do that changes are made in a responsive manner

I recognize that this bot is important to the running of many behind the scenes tasks and do not want to disrupt those; it's only because of that that I am not asking for the bot's approval to be revoked until action 2 can be completed. However, I believe that this bot has, since before its approval, been operating outside of the bot policy and that this continues to the present time and in disruptive ways. Best, Barkeep49 (talk) 14:34, 11 April 2020 (UTC)[reply]

Partial block was floated last time, this time I'll implement. Article-space only, but at the moment that seems to be the main concern. Should stop all the bickering at {{Wikidata list}} as well. Primefac (talk) 15:02, 11 April 2020 (UTC)[reply]

Thanks Primefac, but I do wish to note that today's issue around non-free images was caused in userspace. Hence action request 2. Best, Barkeep49 (talk) 15:06, 11 April 2020 (UTC)[reply]

Fair enough, though I'd like to hear other input before blocking wholesale again (unless you think only allowing in WP-space until this is sorted isn't too unreasonable).

Actually, that could still be an issue in WP-space. Do we know if things are properly set up on places like WiR? Primefac (talk) 15:12, 11 April 2020 (UTC)[reply]

If there are issues with the bot, a complete block makes the most sense until we can ascertain what needs to be fixed. Note I don’t think it’d be considered wheel warring because a consensus is emerging that it was a bad unblock. TonyBallioni (talk) 15:55, 11 April 2020 (UTC)[reply]
As approved, this bot should operate only within a handful of talk pages. Editing outside of that space without approval upsets the careful balance of community approval and oversight required for bots, a balance that was struck after a great deal of bot-related disruptions in the past. Good block, and there would be no issue extending this to a full block or to all of the namespaces other than user/user talk. Best, Kevin (alt of L235 · t · c) 16:19, 11 April 2020 (UTC)[reply]
Do you mean namespaces other than WP/WPT? Seems like WiR is the main user of this bot's functionality. Primefac (talk) 16:41, 11 April 2020 (UTC)[reply]
Bot is working just fine in many different wikis including this one. If people make a list that includes images, the bot will only use images from Commons. The English Wikipedia shouldn't have unfree images with the same name and if it happens then an easy fix is to rename the local image here (just append "(non-free)" to the name). I don't expect any bot code will be updated just for this edge case. Probably best if someone just publishes a collision report here for images that have the same name here and on Commons and the local one is not free. Bonus points for including if the image is used on a wikidata item. People who care a lot about non-free images can keep an eye on this report and take care of any collisions. Multichill (talk) 16:27, 11 April 2020 (UTC)[reply]
- @Multichill: Speaking of the bot being used on other wikis, it hasn't been approved for global use and its use on small wikis has thus drowned out the recent changes queue (which is limited to the last 1000 edits) for patrollers trying to keep those wikis clear of spam and vandalism. And saying that you don't expect any bot code will be updated just for this edge case seems pretty backwards; bots ultimately exist to relieve Wikipedians of doing tasks that we'd otherwise have to do, not create greater problems for us. I completely agree that in a perfect world we'd have eliminated all the enwiki/commons conflicts long, long ago, but unless you're volunteering, I think it'd be a good start to comply with the copyright laws and copyright policy. Kevin (alt of L235 · t · c) 16:38, 11 April 2020 (UTC)[reply]
  And I am sure you know that Multichill was doing this single-handedly for a long time and stopped doing this in I believe 2012 or 2013 when he was told his work is not appreciated, and nobody in the community stood up for him.--Ymblanter (talk) 17:44, 11 April 2020 (UTC)[reply]
  @Ymblanter: I'm sorry, I didn't know that. I was unnecessarily cold in my message and I regret that, Multichill. My core point is that dealing with these cases is the responsibility of the bot operator whose bot is adding non-free images in contravention of policy. Best, Kevin (alt of L235 · t · c) 17:48, 11 April 2020 (UTC)[reply]
I concur with Multichill. I might even go one step further. Force all nonfree images to have a tag to that effect included in the filename. But that would need to be discussed elsewhere. Anyway the suggestion of Multichill would solve the Listeriabot problem from a practical point of view. Agathoclea (talk) 16:45, 11 April 2020 (UTC)[reply]
- As Kevin says, bots should be reducing the burden on other editors, not increasing it. It would be better to update the bot once to not violate policy on the use of non-free images, rather than requiring editors to continuously watch for images with the same name but different licensing. The bot doesn't care about checking one extra thing for the images it uses, it takes a few CPU cycles. If the bot isn't able to operate without wasting volunteer time to make sure it doesn't violate copyright, then the bot shouldn't be operating. ST47 (talk) 16:46, 11 April 2020 (UTC)[reply]
The bot is used by a number of users and project pages. This alone is enough to prove that there is wide consensus of its usefulness. Nemo 20:59, 11 April 2020 (UTC)[reply]
I believe completely blocking ListeriaBot should be a last resort. It is used quite a lot and people clearly find it useful. I recognize that it probably will be difficult to prevent the bot from adding non-free images since the botop is inactive, but there are still ways to remedy the issue. My first idea would be making JJMC89 bot add {{bots|deny=ListeriaBot}} to pages with {{Wikidata list}} if it has to remove non-free content and logging it somewhere. If that was done ListeriaBot could continue as usual while avoiding copyright concerns. Pinging JJMC89. ‑‑Trialpears (talk) 21:19, 11 April 2020 (UTC)[reply]
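The {{bots|deny=...}} suggestion relies on the bot exclusion convention, under which a compliant bot inspects a page's wikitext before saving. A minimal Python sketch of such a check is given here, assuming the bot already has the wikitext in hand. The function name is_denied is hypothetical, only the deny= and {{nobots}} forms are handled, and nothing in this thread establishes whether ListeriaBot itself honours the template.

```python
import re

# {{bots|deny=Name1,Name2}} lists bots that must not edit the page.
_BOTS_DENY = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^}|]*)\}\}", re.IGNORECASE)
# {{nobots}} asks every compliant bot to stay away.
_NOBOTS = re.compile(r"\{\{\s*nobots\s*\}\}", re.IGNORECASE)


def is_denied(wikitext, bot_name):
    """Return True if the page asks the named bot not to edit it.

    Simplified: the allow= and optout= forms of the template are ignored.
    """
    if _NOBOTS.search(wikitext):
        return True
    for match in _BOTS_DENY.finditer(wikitext):
        names = {name.strip().lower() for name in match.group(1).split(",")}
        if "all" in names or bot_name.lower() in names:
            return True
    return False


if __name__ == "__main__":
    page = "{{bots|deny=ListeriaBot}}\n{{Wikidata list|sparql=...}}"
    print(is_denied(page, "ListeriaBot"))
```

With a check like this, a page flagged after a non-free removal would simply be skipped on the next run until someone resolves the filename collision and removes the flag.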
Unless or until the bot is configured to comply with copyright laws of the United States and our local policies, it should not be approved to run. That is the bare minimum we should expect of a responsible bot operator. If the operator cannot or will not do so, the bot should have its flag revoked and blocked from all editing until the threat to the project is resolved. — Wug·a·po·des 21:22, 11 April 2020 (UTC)[reply]
Unblock and restore approval. The bot performs a very useful function (building Wikipedia-space lists of articles linked on other Wikipedias but missing from this one, in aid of people looking for articles in need of creation) and the issue causing the block is minor, incidental, and caused by other problems unrelated to the bot (we should not have non-free images that shadow free ones and it is unsurprising that when we do it will confuse both the bot and human editors). If possible, the bot should be made to recognize this situation (I am not sure how difficult this task is and how much extra load it will put on the bot — currently it gets its information from a single SparQL query and this would likely mean that it would have to perform hundreds of additional queries per page to check the status of images) but I see that as a long-range enhancement to aim for and not a reason for blocking. Any images identified as causing problems for the bot can be moved to better names, fixing the problem without any need to resort to a block. —David Eppstein (talk) 23:05, 11 April 2020 (UTC)[reply]
It's impossible to write a bot whose edits are guaranteed to comply with copyright laws, just as no editor can make that guarantee for their edits. If you add a free image from Commons to an article, you have no way of guaranteeing that somebody will not then upload a different non-free image locally with the same filename. The image in the article now breaches fair use, so who is responsible for the breach? Is it the editor who originally added the free image from Commons? or the editor who uploaded the non-free image locally? You solve those problems, as well as the one the bot experiences, by making sure that non-free images held locally have a designation in the filename – e.g. (NF) – . That would need a greater consensus than we can create here, but there would be huge incidental benefits for re-users of our content who would immediately know what part of our content isn't available under CC-BY-SA. --RexxS (talk) 23:21, 11 April 2020 (UTC)[reply]
RexxS, as it happens that is discussed at Wikipedia talk:Non-free content#Requiring non-free content to indicate that in their filenames. ‑‑Trialpears (talk) 23:29, 11 April 2020 (UTC)[reply]

RexxS, “breaches fair use” You mean breaches our non-free content criteria. Fair use is a copyright violation defense, not a rigid set of laws. —TheDJ (talk • contribs) 08:50, 12 April 2020 (UTC)[reply]

@RexxS: As explained in detail at the other discussion renaming the files would not accurately inform people of the copyright status of any image (not all free images are cc-by-sa, and even the ones that are are not all the same version; non-free files may be free for a re-user depending on their commercial status, geographical location and whether they need to allow for the possibility of derivatives). It would however actively mislead reusers into thinking the filename was all they need to know. Thryduulf (talk) 10:06, 12 April 2020 (UTC)[reply]
@TheDJ: I'm pretty certain everyone will be aware of what would be breached despite my shorthand.
@Thryduulf: I can see your point, but I'm not sure I agree with it. To all intents and purposes, a file hosted on Commons is usable anywhere by third parties given that they meet the license conditions. The minutiae of the licence does not change it from a free-to-use file into a non-free-to-use file. A recognisable token in the filename of a non-free-to-use file hosted on enwiki would at least indicate to which of two groups they belong, and a third-party would be able to differentiate immediately which content is not available from content that may be available upon meeting the conditions of its licence. --RexxS (talk) 16:48, 12 April 2020 (UTC)[reply]
I'm still amazed that the whole problem here is that En.wp and Commons users/sysops cannot get together to delete or move either one of the colliding files, which would clearly be the least disruptive, most effective and easiest solution to a relatively rare problem and something that should happen, whether the file is in use or not honestly. but no if it aint my boathouse ill just block the bot. —TheDJ (talk • contribs) 09:10, 12 April 2020 (UTC)[reply]
@TheDJ: Actually, that is the most common solution to a not-so-rare problem. But sometimes a deletion (or two deletions) are a better move and in that case the problem lingers for a while. More importantly, someone has to notice the shadowing situation, first, and GreenC bot doesn't do so instantaneously ... in a way it's a race condition between Listeriabot and GreenC bot (+the admins who process Category:Wikipedia files that shadow a file on Wikimedia Commons) Jo-Jo Eumerus (talk) 09:21, 12 April 2020 (UTC)[reply]
A bot should never be edit warring with another bot. When one of those bots is enforcing enwiki policy, and the other is violating it, it is clear where the problem is. This bot should remain blocked until the maintainer corrects the issue. Bot policy requires that maintainers be responsive to issues that may arise. I don't fault the operator for allowing this to happen the first time, as there are indeed a lot of edge cases in our little encyclopedia. However, once it has been discovered, the problem needs to be fixed, or the BAG should withdraw approval from this bot. ST47 (talk) 15:49, 12 April 2020 (UTC)[reply]
@ST47: In this case, it would actually be better if the Listeria edits were allowed to stand, and generic NFC removal bots were stood down on Listeria pages, because the presence of a NFC file on a Listeria page is a good, very specific, really easy to detect tell-tale for a filename with the file-shadowing problem, that is otherwise rather hard to detect, and which is best dealt with by specific action to rename one or more of the filenames and resolve the collision, which surfacing the problem enables. Here, removing the tell-tale is not helpful. Jheald (talk) 11:46, 13 April 2020 (UTC)[reply]
@Jheald: How do you anticipate the issue would get flagged if NFC bots are not involved? Nikkimaria (talk) 00:15, 14 April 2020 (UTC)[reply]
@Nikkimaria: The issue that causes the problem here is very specific. It is not your generic NFC problem of somebody having purposely added an NFC image to an article or list where it shouldn't be, that the bots typically deal with. Instead, ListeriaBot is trying to show a completely legitimate Commons image, but the issue is that there is a local NFC image which unfortunately has the same name. GreenC bot is already on the case, adding such images to Category:Wikipedia files that shadow a file on Wikimedia Commons, for humans to decide the best new filenames to resolve the issue. (Which then ensures it cannot occur again for that file). But if this is not enough, it would be straightforward to run a specific SQL query to find images that are on Listeria pages, and also in Category:All non-free media, which could also be used to flag the files into the category for them to get the attention they need. Jheald (talk) 14:32, 14 April 2020 (UTC)[reply]
I concur with User:Multichill, and I'm actually appalled by the attitude of this community towards Wikidata and its tools, since it goes against every and all principles we have about cooperation, mutuality and good faith. I've been silent this whole time, since Wikidata's very inception, so I beg you to forgive me this burst of annoyance. --Sannita - not just another it.wiki sysop 13:48, 14 April 2020 (UTC)[reply]

let us consider what Listeria lists stands for

This discussion has been closed. Please do not modify it.

I am quite happy to have the bot flag of ListeriaBot considered but let's do it properly. When you really, really, really want to go this way. Let us consider what Listeria lists stand for, its Wikipedia alternatives. Disambiguation, red links blue links and black links. Most importantly how we share in the sum of all knowledge and how English Wikipedia can play a vital role in it. Let's include images linked to people, the role Commons can play in this. How English Wikipedia can keep its non free images and inform on the images that it keeps in this way. Let this conversation not be about an edge case.

By all means discuss a bot flag for ListeriaBot but do present a serious alternative. Serious not in intentions but serious in that it will serve us in a way that is imho missing in what Wikipedia stands for in its dismissal of collaboration on multi project and multi language levels. Without a reasonable outcome branding us all as Wikipedia is mostly painful because of what we could stand for together. Thanks, GerardM (talk) 18:03, 11 April 2020 (UTC)[reply]

GerardM, I don't say this often, but you just wrote a lot of text without saying anything. What are you trying to say? Primefac (talk) 18:12, 11 April 2020 (UTC)[reply]

Primefac maybe you understand my blogpost better.. For me this episode is another reason why I do not want to be associated with Wikipedia. What is it with you people? Thanks, GerardM (talk) 08:44, 12 April 2020 (UTC)[reply]

@GerardM: Your blog post also doesn't say anything that is useful to this situation. Everybody agrees that the core job this bot does is useful, so simply repeating that it is useful and explaining why it is useful adds nothing of value. The issue is that the bot will not be unblocked unless and until it is reprogrammed so that it doesn't edit outside of what it has authorisation to do. This is exactly how every other bot that has bugs is treated - if the operator does not stop it then it is blocked. Listeria bot is not special, it is being held to the same standards required of every bot that operates on the English Wikipedia. Thryduulf (talk) 12:15, 12 April 2020 (UTC)[reply]

@Thryduulf: at issue is that English Wikipedia has pictures that are not free. These pictures only show on English Wikipedia. What we can do is include a link to the Wikidata item on Commons and only show pictures marked that way. It has additional benefits because those pictures will be easier to find including in "other" languages like Russian, Kannada, Comanche. It says so in the blogpost.

Also do you not think that this is where English Wikipedia untouchables take a position where the penalty to our community is excessive. What are you guys thinking??

Also, when are we going to discuss and act on issues with quality on English Wikipedia. At least 4% of list entries in English Wikipedia are erroneous and the quality of maintenance by hand is substantially less than Listeria maintained lists. Together we will do a better job. Thanks, GerardM (talk) 12:31, 12 April 2020 (UTC)[reply]

How is any of that relevant to this discussion? It doesn't matter what else Listeria bot does, can or could do, it will not be unblocked unless and until it is reprogrammed so that it doesn't do anything it does not have community consensus to do. Thryduulf (talk) 12:37, 12 April 2020 (UTC)[reply]

How is it that you are only willing to consider a perceived wrong and not willing to consider the meat of the matter, quality? When you insist on branding ListeriaBot as ill behaved because of a corner case, a four percent improvement fixing the false friends in Wikipedia lists is quite substantial and should have your attention. Thanks, GerardM (talk) 13:10, 12 April 2020 (UTC)[reply]

Simply put, the harm done by one avoidable copyright violation outweighs all the other good things the bot does. Human editors that behave in the way this bot does (knowingly or recklessly introducing copyright problems, editing otherwise than in accordance with consensus, ignoring editing restrictions) are regularly blocked. The good they do elsewhere is not regarded as justifying the harm they cause. Listeriabot is not special and there is no reason to treat it differently than any other bot or editor would be treated. Thryduulf (talk) 14:23, 12 April 2020 (UTC)[reply]

Simply put, substantiate your claims. Your argument is about a corner case where a procedure exists to overcome issues. We are talking about issues at Commons, a project that is more stern in its maintenance of copyright than English Wikipedia is. On the other hand, an error rate of 4% of all English Wikipedia lists is substantial, there is no mitigation the case is well argued. In addition Magnus has demonstrated that Listeria lists are better maintained than the average manually maintained list. Now consensus is something to hide behind when arguments fail you. Such behaviour is not special and harms our cause. Is quality of English Wikipedia a consideration at all? Thanks, GerardM (talk) 14:47, 12 April 2020 (UTC)[reply]

Accidental usage of non-free images

The issue raised here seems to be centered on the accidental use of non-free images via the edits of this bot. I've posted a proposal at Wikipedia_talk:Non-free_content#Requiring_non-free_content_to_indicate_that_in_their_filenames that would resolve this without requiring changes to this bot. Please comment there! Thanks. Mike Peel (talk) 18:45, 11 April 2020 (UTC)[reply]

You do realize that until that RFC concludes you're essentially saying you're okay with situations like this, where two bots mindlessly edit war for no reason other than the fact that one bot is not "behaving" correctly? Primefac (talk) 20:33, 11 April 2020 (UTC)[reply]

@Primefac: I'm suggesting a broader solution that would fix this issue while simultaneously avoiding any similar situation arising in the future. I could implement it tomorrow if there's consensus to do so. Thanks. Mike Peel (talk) 20:43, 11 April 2020 (UTC)[reply]

Then by all means, please fix the bot so that it doesn't add non-free files to non-articles. Primefac (talk) 21:13, 11 April 2020 (UTC)[reply]

@Primefac: I'm suggesting fixing enwp so the bot wouldn't cause problems. Mike Peel (talk) 21:15, 11 April 2020 (UTC)[reply]

While I realize it's only been a few hours, there is currently no consensus to implement your plan. I am genuinely curious, why is there such reluctance to make this change? Primefac (talk) 21:18, 11 April 2020 (UTC)[reply]

It is not entirely obvious to me that this change is an easy one to make. How many additional server queries would be required to detect commons-images-shadowed-by-non-free-local-images, compared to the queries the bot already makes to do its work, how much extra load would be caused by the queries, and what information from the server is available for the bot to determine whether a local image that shadows a commons image is unfree? Have you actually done this analysis? Or do you just assume that because you can do it as a human by clicking on and using your human ability at natural languages to read a few pages that it will be equally easy for a bot? I don't actually know that it's difficult, but I don't know that it's easy, and I don't see convincing evidence that you do either. —David Eppstein (talk) 23:47, 11 April 2020 (UTC)[reply]

The bot would just need to check whether the image is in Category:All non-free media. This can easily be done using the categorymembers API, or if the bot runs on toolforge, using the database mirrors. ST47 (talk) 23:53, 11 April 2020 (UTC)[reply]
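As a rough illustration of the category check described here, the following Python sketch builds a MediaWiki Action API request and reads the result. It is an assumption-laden sketch, not the bot's code: ST47 names the categorymembers list, while this example uses the closely related prop=categories module with its clcategories filter, which asks the question per file instead of listing the whole category. The helper names build_query_url and nonfree_titles are hypothetical, and the sample response is hand-written in the formatversion=2 shape rather than fetched from the server.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
NONFREE_CATEGORY = "Category:All non-free media"


def build_query_url(file_titles):
    """Build one API request that checks several file pages at once.

    The API caps the number of titles per request, so a real bot
    would send the titles in batches.
    """
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "categories",
        "clcategories": NONFREE_CATEGORY,  # only report this one category
        "cllimit": "max",
        "titles": "|".join(file_titles),
    }
    return API_URL + "?" + urlencode(params)


def nonfree_titles(response):
    """Return the titles whose local file page is in the non-free category.

    Pages that are not in the category come back without a 'categories' key.
    Titles are returned as normalised by the API (underscores become spaces).
    """
    pages = response.get("query", {}).get("pages", [])
    return {page["title"] for page in pages if page.get("categories")}


if __name__ == "__main__":
    print(build_query_url(["File:A.jpg", "File:B.png"]))
    sample = {
        "query": {
            "pages": [
                {"title": "File:A.jpg", "missing": True, "known": True},
                {
                    "title": "File:B.png",
                    "categories": [{"ns": 14, "title": NONFREE_CATEGORY}],
                },
            ]
        }
    }
    print(nonfree_titles(sample))
```

Because many titles fit in one request, the extra load is a handful of requests per list page, not one per image. How that compares with the bot's existing query volume is not measured here.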

@David Eppstein and ST47: ST47 beat me to it. Glancing at the source, the bot makes ample use of SQL queries. A patch for this issue would be no more than 10 lines of code (perhaps I'll create a pull request...). --Mdaniels5757 (talk) 00:06, 12 April 2020 (UTC)[reply]

@Mdaniels5757: I suspect you may find it will take a bit more than that, and be rather more server-intensive than you seem to think, to deal with a problem of shadowed file-names that shouldn't exist anyway. Jheald (talk) 12:41, 12 April 2020 (UTC)[reply]

I don't think that the bot has to detect shadowed files; it needs just to detect whether the enwiki filepage is non-free and that can be done by checking for Category:All non-free media or the {{Non-free media}} template. It's not really reasonable to expect a bot (or even a human) to detect incorrectly licenced files on either project; I'd file these under GIGO and let editors take care of them as they come across them. Jo-Jo Eumerus (talk) 08:52, 13 April 2020 (UTC)[reply]

@Jo-Jo Eumerus: Sure. I don't disagree.

But (as I suggest in more detail in the section below) what will add to the complexity of the bot, and the load on the servers, is having to detect when it is adding files at all, and then having to run a SQL request to check each one of them -- and having to do so in a way that is specific to en-wiki distinct from any of the other 70 wikis the code is serving, fracturing what otherwise is relatively simple single unified code.

Moreover, again as argued below, the most relevant point I think is it may actually be beneficial that the bot is surfacing files with this shadowing issue, that (once the edit is made) can then be rather easily picked up by an SQL intersection of images in the non-free category and images on Listeria pages, so that the underlying filename problem can then be identified and fixed, rather than it continuing to fester under the surface. Jheald (talk) 09:18, 13 April 2020 (UTC)[reply]
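The "SQL intersection" mentioned here can be illustrated with a toy database. The Python sketch below uses an in-memory SQLite database with a deliberately simplified, hypothetical schema (listeria_pages, imagelinks, nonfree_files); these are not the real MediaWiki replica tables, whose layout differs and has changed over time. It only shows the shape of the join: files displayed on a Listeria page that are also in the non-free set.

```python
import sqlite3


def build_demo_db():
    """Create a toy database; table and column names are illustrative only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE listeria_pages (page_title TEXT PRIMARY KEY);
        CREATE TABLE imagelinks (page_title TEXT, file_title TEXT);
        CREATE TABLE nonfree_files (file_title TEXT PRIMARY KEY);
        """
    )
    conn.executemany(
        "INSERT INTO listeria_pages VALUES (?)",
        [("Wikipedia:Demo list",)],
    )
    conn.executemany(
        "INSERT INTO imagelinks VALUES (?, ?)",
        [
            ("Wikipedia:Demo list", "Free painting.jpg"),
            ("Wikipedia:Demo list", "Logo.png"),
            ("Some article", "Logo.png"),  # legitimate article use, not a Listeria page
        ],
    )
    conn.executemany("INSERT INTO nonfree_files VALUES (?)", [("Logo.png",)])
    return conn


def shadowing_candidates(conn):
    """Files shown on a Listeria page that are also in the non-free set."""
    query = """
        SELECT DISTINCT il.file_title, il.page_title
        FROM imagelinks AS il
        JOIN listeria_pages AS lp ON lp.page_title = il.page_title
        JOIN nonfree_files AS nf ON nf.file_title = il.file_title
        ORDER BY il.file_title, il.page_title
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for file_title, page_title in shadowing_candidates(build_demo_db()):
        print(file_title, "on", page_title)
```

Each row of the result is a candidate for renaming or for Category:Wikipedia files that shadow a file on Wikimedia Commons. The legitimate article use of the same file is left out because that page is not a Listeria page.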

But again, why does it need to display the image for that purpose? As repeated ad nauseam through all three discussions, the fix isn't some kind of lengthy recoding; it's literally the addition of a single colon so the relevant section of the reports generates [[:File:Filename.jpg]] instead of the current [[File:Filename.jpg]]. ‑ Iridescent 09:58, 13 April 2020 (UTC)[reply]
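The single-colon change is easy to picture as a text transformation. The Python sketch below is an illustration only, not the bot's own code: the function linkify and its parameter only are hypothetical. Passing a set of normalised file names restricts the change to those files (the narrower variant of flagging only local non-free files); leaving it unset converts every embedded file on the page.

```python
import re

# Matches the start of an embedded file, e.g. "[[File:Name.jpg" or "[[Image:Name.jpg".
# Links that already start with a colon ("[[:File:...") do not match.
_EMBED = re.compile(
    r"\[\[\s*(?:File|Image)\s*:\s*(?P<name>[^|\]]+)", re.IGNORECASE
)


def _normalise(name):
    """Approximate MediaWiki title normalisation for comparison purposes."""
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def linkify(wikitext, only=None):
    """Insert a leading colon so matching files are linked, not displayed.

    Only the colon is added; size and alignment parameters are left in
    place and would then show up as link text, so a real patch would
    probably drop them as well.
    """
    def repl(match):
        if only is not None and _normalise(match.group("name")) not in only:
            return match.group(0)
        return "[[:" + match.group(0)[2:].lstrip()

    return _EMBED.sub(repl, wikitext)


if __name__ == "__main__":
    row = "| [[File:Free.jpg|center|128px]] || [[File:Logo_x.png|center|128px]]"
    print(linkify(row, only={"Logo x.png"}))
```

The choice between the two modes is the trade-off debated in this thread: converting everything needs no lookup but hides all images, while the selective mode keeps free images visible at the cost of first identifying the non-free ones.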

(ec) @Iridescent: As I understand it, that is not the fix that Jo-Jo was suggesting. He was suggesting making that change only for the files that are local non-free ones. Which is a much larger coding job, with the issues noted above.

What you suggest would remove display of all the images in all the Listeria lists - all 2500 of them on enwiki - to deal with a transient incidental issue that affects at most only a handful of images at a time out of all of those lists, and in talk-space not main space. It means for example, in a Listeria blue-link/red-link list of paintings by an artist, Listeria would no longer show what the paintings were, which can be hugely useful for the identification of paintings that may go by various different names, or for the identification of duplicates; it makes it hard to identify paintings that may have substandard images, or eg to prioritise article creation for paintings that have really good images. So I do think that the blanket turning off of all images on Listeria pages is a step to be avoided if we possibly can.

Also, it means that the mechanism described above, of being able to identify shadowed images by a simple SQL query through their being used on a Listeria page would fail, because they would no longer be being used on a Listeria page. Jheald (talk) 10:21, 13 April 2020 (UTC)[reply]

Mind you, we already have a bot that flags shadowed files (GreenC bot), so I wouldn't consider another bot doing the same as a large advantage. Jo-Jo Eumerus (talk) 10:17, 13 April 2020 (UTC)[reply]

If it's doing such a good job, then why do we have this problem? Jheald (talk) 11:36, 13 April 2020 (UTC)[reply]

Probably because ShadowsCommons situations were not really time sensitive matters until ListeriaBot began getting confused by them. I am also not sure how the latter helps "surfacing" the shadowed files. Also, I think that I and Iridescent are approaching different ends of Listeriabot in an attempt to resolve this issue (I am looking at the input, Iridescent is proposing a change to the output) Jo-Jo Eumerus (talk) 12:36, 13 April 2020 (UTC)[reply]

@Jo-Jo Eumerus: Thinking about this a bit more, from a strictly coding point of view, the cleanest approach (if there really is a problem here that needs to be dealt with) might be to let the existing code make its edit, wait a few seconds for the SQL tables to update, then run an extra script to make an SQL query to see whether any of the images now on the page were also in the non-free images category, and if so then edit the page to insert a colon to turn the displayed file into a link, and also make sure that the file was categorised in Category:Wikipedia files that shadow a file on Wikimedia Commons.

This would have the advantage of requiring only the most minimal changes to the existing main Listeria script, that runs across 71 wikis; and also making sure that the identified files were in the category for fixing. But it would come at the cost of an extra edit, and of the files still appearing where they shouldn't for a few seconds. Do you think such an approach would be (a) workable, and might be (b) acceptable? Jheald (talk) 13:59, 13 April 2020 (UTC)[reply]

Status of bot

I've restored the full block based on consensus here and at AN that the bot was operating outside policies in multiple areas and that the initial block was good, and that the unblock did not coincide with our normal practices on bot unblocks. The only objection raised at AN was procedural (wait 24 hours), and since that thread has been closed and this one is dealing more with the technical issues, I felt that it was best to act on the community consensus while there was still an obvious link to the discussion. I feel that this fulfills the requirements of WP:WHEEL that discussion occur first and consensus be reached. If there is consensus that I have acted out of line and that I am misinterpreting policy, another administrator is free to reverse. TonyBallioni (talk) 21:31, 11 April 2020 (UTC)[reply]
- "The only objection raised at AN was procedural (wait 24 hours)" That is not true. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:02, 12 April 2020 (UTC)[reply]
  - Special:Permalink/950518164#Reblock_discussion TonyBallioni (talk) 14:37, 12 April 2020 (UTC)[reply]
    - Your link does not substantiate your claim. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:11, 12 April 2020 (UTC)[reply]
The bot indeed does not have approval to violate our fair use policy and should remain blocked on enwiki until the copyright issues are solved. That it is deemed useful does not make copyright policy optional. Headbomb {t · c · p · b} 13:24, 12 April 2020 (UTC)[reply]

What happens to Wikidata updates?

Apologies if I've missed this, but what's going to happen with the Wikidata list updates from now on (apart from them not happening)? Thanks. Lugnuts ^{Fire Walk with Me} 07:27, 12 April 2020 (UTC)[reply]

As it is, they happen. Just not on English Wikipedia. Thanks, GerardM (talk) 08:39, 12 April 2020 (UTC)[reply]

Lugnuts, which Wikidata lists specifically, can you give examples ? There are many possible interpretations of your phrasing, and more exact answers require more exact questions. —TheDJ (talk • contribs) 08:46, 12 April 2020 (UTC)[reply]

@TheDJ:, ones such as WP:WIROLY. This would be updated every day or so, removing links that now have wiki articles. Lugnuts ^{Fire Walk with Me} 08:56, 12 April 2020 (UTC)[reply]

Lugnuts, the only effect is that updates won’t happen. And you can't make new lists either. —TheDJ (talk • contribs) 09:03, 12 April 2020 (UTC)[reply]

What happens is that the hundreds, if not thousands, of editors whose work is assisted by Listeriabot, and by whose consensus it has operated for years, get badly inconvenienced. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:00, 12 April 2020 (UTC)[reply]

Inconvenience in non-essential tasks is a small price to pay when the alternative is mass copyright violation and a bot operator that cannot and/or will not follow basic bot policy. Thryduulf (talk) 09:58, 12 April 2020 (UTC)[reply]

Citation needed. Mass copyright violation, REALLY who are you fooling! Thanks, GerardM (talk) 11:27, 12 April 2020 (UTC)[reply]

@Thryduulf: There simply isn't mass copyright violation. That's bullshit, and really you are better than this. There are a handful of edge cases, that would be well-handled by renaming the images so they don't clash with the Commons names. (Something we ought to be doing and ought to have been doing anyway). Jheald (talk) 12:02, 12 April 2020 (UTC)[reply]

There is nothing stopping the bot as currently programmed from committing mass copyright violations, and given the bot operator does not see this as a problem with the bot, then I'm sorry but the encyclopaedia is better off without the bot. Thryduulf (talk) 12:07, 12 April 2020 (UTC)[reply]

@Thryduulf: The bot isn't committing "mass" copyright violations. I don't know if you've looked at the bot's contribution history and scrolled back through the last 7 days to get an idea of the number of projects and contributors using this bot to organise and present their workflows, but it's quite a number. So no, the encyclopedia is not "better off without this bot". The present block is a wildly disproportionate over-the-top response to deal with a tiny handful of edge cases that shouldn't exist anyway if en-wiki had been doing its job. Jheald (talk) 12:37, 12 April 2020 (UTC)[reply]

@Jheald: the correct number of copyright violations is zero. Any bot making greater than that many copyright violations is better off blocked regardless of what else it does - if it is important then the bot will be fixed or someone else will code a replacement bot that doesn't violate core policies. It is always the responsibility of a bot operator to ensure that it operates in accordance with policy and consensus, it is never the job of the English Wikipedia to change policies or practices to make allowances for a badly coded bot. Thryduulf (talk) 12:49, 12 April 2020 (UTC)[reply]

@Jheald: fixing the ping. Thryduulf (talk) 12:49, 12 April 2020 (UTC)[reply]

Others have dealt with your fatuous "mass copyright violation" claim. Who gets to decide which tasks are "non-essential"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:29, 12 April 2020 (UTC)[reply]

A task is essential if (a) the encyclopaedia will cease to function if it ceases, or (b) the encyclopaedia or its editors will suffer real-world harm if it ceases. So removing copyright violations is essential, adding them is not. That applies whether you regard repeatedly introducing multiple copyright violations to multiple pages for multiple years as "mass" or not. Thryduulf (talk) 15:50, 12 April 2020 (UTC)[reply]

I note that you didn't answer my question, but instead indicate that virtually nothing is essential, in your own reckoning, and by logical extension it is OK to inconvenience - to greatly inconvenience, in this case - anyone and everyone not working on that very limited set of tasks that you deem essential. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:58, 12 April 2020 (UTC)[reply]

What else could reasonably be deemed essential? Inconveniencing people, greatly or otherwise, is unfortunate but as Headbomb points out this does not make complying with policy optional. Thryduulf (talk) 18:40, 12 April 2020 (UTC)[reply]

I need to use Listeria to replace my manually edited page List of Catholic churches in Salvador, Bahia. I can't do this manually anymore, it's too difficult. How do I get started with Listeria? Thanks! Prburley (talk)

That would not have been permitted even before ListeriaBot was blocked. * Pppery * _{it has begun...} 14:43, 14 April 2020 (UTC)[reply]

@Pppery: Why not? It's just too overwhelming to edit these lists manually, and information on historic heritage sites is a vital part of the WP mission--not to mention that they're being damaged or destroyed frequently. Prburley (talk)

@Prburley: Because ListeriaBot is not approved for that. Headbomb {t · c · p · b} 17:09, 14 April 2020 (UTC)[reply]

@Headbomb: What's the process for having it approved? The solution below will work, but I think Listeria is what I'm looking for. Prburley (talk)

@Prburley: See WP:BOTAPPROVAL. Headbomb {t · c · p · b} 17:40, 14 April 2020 (UTC)[reply]

@Prburley: It could use a bit of tidying-up, but as a basic Listeria list, if you put something like this on a page in your own user-space, Listeria would then update it to give you something like this. Having hand-checked it, you could then copy & paste it to a live page in article space (and similarly with updates down the line). As Pppery says, Listeria currently isn't allowed to edit or update article space on en-wiki directly (unlike pt-wiki where this week this Listeria list made "featured list" status), but as I understand it, there is no objection to using Listeria to create a list in your own user-space and then copying that to main-space, so long as you have personally hand-checked it, and it is appropriately referenced.

There are various tweaks that could be added to the pretty quick and rough example above -- for example the coordinates could be presented more attractively, a notes column could be added and populated by adding described by source (P1343) statements to the Wikidata entry for each church, some English labels or English sitelinks may be missing, we could probably do better for "location", etc; and there may be entries missing from the list because they don't have items, or aren't currently identified as churches; or don't currently have the right diocese (P708) information. But I hope it gives some idea at least of what is possible. And of course, having curated it for English Wikipedia, the data is then also immediately available for anyone wanting to get Listeria to make a version of the page in Portuguese or any other language. Jheald (talk) 15:24, 14 April 2020 (UTC)[reply]

Also probably worth noting that the page above was generated on Wikidata. If it was generated on en-wiki, then blue-links would be to en-wiki articles, with a choice of red-links or plain text for items not matched to en-wiki. Jheald (talk) 15:36, 14 April 2020 (UTC)[reply]

Version with what Wikidata has at the moment for location (P276) now added [1]. Currently a bit sparse, but could easily be improved. Jheald (talk) 15:33, 14 April 2020 (UTC)[reply]

Thank you for your amazing suggestions! I really appreciate it. Prburley (talk)

Forking the bot

Everyone seems to agree that the bot should be updated to fix the issue, but that isn't possible since the operator is inactive and hasn't fixed the issue the other times it's been brought up. The solution then seems to be forking the bot and implementing a patch. Since the source is public on bitbucket it shouldn't be too hard. Anyone who would volunteer to take on the task? ‑‑Trialpears (talk) 10:01, 12 April 2020 (UTC)[reply]

@Trialpears: I'm not quite sure who the "everyone" is that you are referring to. I see a number of editors above saying that the more appropriate solution would be to deal with the handful of files with names that shadow Commons, that it would be worth fixing anyway. Jheald (talk) 12:48, 12 April 2020 (UTC)[reply]

@Jheald: The point repeatedly made and ignored is that we already do deal with those images, and that no other bot has any issues with the status quo. Thryduulf (talk) 12:51, 12 April 2020 (UTC)[reply]

@Thryduulf: If en-wiki already is dealing with these images with shadow names, then the block here is even more a gratuitous unnecessary overkill than was first apparent. If even the handful of problems that have caused this fuss are already getting resolved within a few days, then why block the bot? If the issue is already in hand, and gets resolved routinely, then why all this fuss? Jheald (talk) 13:59, 12 April 2020 (UTC)[reply]

Corollary - user is found adding copyrighted content to a page. They are reverted and warned. They do it again, and they are reverted. Repeat ad nauseam. Let's say every time they add the copyrighted content they are reverted within a few days.

After how many reverts and warnings would we block the user? 1? 3? 10? My personal experience says 3, and from what I've seen the bot has previously done this on a single page more than a dozen times.

Yes, the images the bot is trying to place are being removed. HOWEVER, the bot shouldn't be placing them in the first place. It shouldn't be editing in article space (which is a secondary/minor issue being brought up again in this thread). "Living" editors get blocked all the time for this sort of behaviour, regardless of how otherwise useful their edits may be. Thus, it only makes sense to block a bot that is performing in the same manner. Primefac (talk) 14:04, 12 April 2020 (UTC)[reply]

@Primefac: Oh I'm sorry, I thought when User:Thryduulf said "we already do deal with those images" he meant that something was actively being done to prevent the problem recurring, by renaming the badly named local images on en-wiki. That fixes the problem. The purpose of this bot is to show what the SPARQL query returns, producing a facility that hundreds of users (or thousands, if you include all wikis) are using. If you're just removing the images from the page, then you're not solving the problem. On the other hand, if you solve the actual problem, by renaming the image (which ought to be renamed anyway), rather than covering up the conflict, then the issue goes away for good, and no change is needed to the bot. Jheald (talk) 14:20, 12 April 2020 (UTC)[reply]

At this point I think we're talking past each other. Yes, the images can/should/are being renamed when they are found. I'm not saying that shouldn't happen, because it already happens. But the bot should not be adding them to pages anyway. Why does it have to be one or the other? Why can't it be both? Primefac (talk) 14:36, 12 April 2020 (UTC)[reply]

(edit conflict) @Jheald: The reason for the block has been explained several times. The copyright issue has been ongoing and ignored since at least 2017 - that's far, far, far longer than anyone has a right to expect to ignore a problem without sanction, and that ignores the other issues of the bot operator not responding to multiple other complaints about editing beyond its authorisation. There are two ways that shadow images can happen, the first is for a new file to be uploaded to enwp but this does not happen as only sysops can do that and they get a warning about it. The other way is for a new file on Commons to be uploaded with the same name as a file here - there is no way that en.wp can be anything other than reactive to that situation (technologically or otherwise) so we have implemented a mitigation strategy that seemingly works for every single other bot on the project and, as multiple people more knowledgeable than me about coding, have said would be trivial to implement. This is an issue that would not have arisen had the operator of listeriabot operated in accordance with the rules every other bot operator has to follow. If you choose to base your workflow on a bot that operates outside of policy then that's a risk you choose to take. Thryduulf (talk) 14:16, 12 April 2020 (UTC)[reply]

@Thryduulf: As you say, there are multiple people more knowledgeable than you about coding. I suspect that such a change would not be as trivial to implement as some people may have airily pronounced above (without any diff to back up their assertions), because the lists on the Listeria pages are not being generated from SQL queries that could be extended, but directly from WDQS via a SPARQL query -- and moreover, not a SPARQL query specified by the bot creator, but from whatever SPARQL query the user creating the list chooses to submit. Converting the results of that query (plus a couple of helpful macros) straight to wikitext is pretty straightforward, and makes for a nice clean straightforwardly-coded bot. Having to fish around in those results to see whether any of the columns returned are for a file, then to run a SQL query to check each file for a list that may be several hundred entries long is a significant coding overhead, adding unnecessary messiness and complication and a non-negligible additional load on the servers. It would also make it more likely that a particular update may as a result fail for a given list, and make the cause of such failures harder to diagnose.

I submit that that is not worth the candle for an issue which is unintended and transient and having its underlying cause already being taken care of by other bots. In fact, it sounds as if, by surfacing these filename collisions, which are then fixed, the bot in its present form may actually be doing some useful service.

In recent years Magnus has been being extraordinarily productive and creative, constantly producing and refining a non-stop stream of tools that are now underpinning a vast quantity of projects and work across Wikipedias, Commons, and Wikidata, as well as personally maintaining Mix'n'match which has now reached 3,500 different catalogues of identifiers, all being actively matched and cross-referenced. Listeria works. It does what it is meant to, displaying the results of a SPARQL query on a wiki page. If in the process it exposes some bad en-wiki filenames so that they can then be fixed, then so much the better. Given how much Magnus is achieving with his time at the moment, and how many new sorts of work he is making possible, and how useful Listeria is as it currently is, I would not seek to waste one moment of his precious limited time on an issue that is transient, is exposing fixes to filenames that need to be made anyway, and which according to you already get dealt with, when there is so much more he could be achieving doing other things. Jheald (talk) 15:14, 12 April 2020 (UTC)[reply]
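For reference, the first part of the extra step described at the start of the comment above (looking through arbitrary SPARQL results for file values) can be sketched in Python as follows. This is a sketch under assumptions, not Listeria's code: it assumes the results arrive in the standard SPARQL JSON results format, and that WDQS returns Commons media values as URIs beginning with `http://commons.wikimedia.org/wiki/Special:FilePath/`. The names `FILEPATH_PREFIX` and `file_names_in_results` are hypothetical.

```python
from urllib.parse import unquote

# Assumed URI prefix for Commons media values in WDQS results.
FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def file_names_in_results(results: dict) -> set:
    """Collect file names from every column of a SPARQL JSON result set.

    The query is user-supplied, so the column names are unknown in advance;
    every cell of every row is inspected.
    """
    names = set()
    for row in results.get("results", {}).get("bindings", []):
        for cell in row.values():
            value = cell.get("value", "")
            if cell.get("type") == "uri" and value.startswith(FILEPATH_PREFIX):
                # Decode percent-escapes and use spaces, as in page titles.
                name = unquote(value[len(FILEPATH_PREFIX):])
                names.add(name.replace("_", " "))
    return names
```

The set returned here is only the list of candidate files. Checking each one against the local non-free category would still need a separate lookup, which is the additional cost the comment above refers to.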

Your entire argument is predicated on the problems the bot is causing being trivial. They are not. Copyvios, even transiently, are a big effing deal. A bot editing pages it is not authorised to edit is a big effing deal. If Magnus is not able to properly maintain the bot then he shouldn't be operating it, exactly the same as any other bot operator. Why they are not able to properly maintain it is irrelevant. Nobody's time is too valuable to edit in accordance with consensus, and if they think it isn't then they should not be editing at all. Good contributions elsewhere do not, and cannot, justify disruption. Claiming that these errors should be allowed to stand because they cause work for other editors fixing problems that may (or may not) need to be fixed by others is disrupting Wikipedia to prove a point. Thryduulf (talk) 15:43, 12 April 2020 (UTC)[reply]

Off-topic discussion. Relevant discussion moved below this close

Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:08, 12 April 2020 (UTC)[reply]

Are you seriously trying to argue that a bot making copyright infringing edits to the encyclopaedia is not a problem with the bot!? I'm sorry but that's the most ridiculous argument I've seen so far! 18:43, 12 April 2020 (UTC) - — Preceding unsigned comment added by Thryduulf (talk • contribs) 19:43, 12 April 2020 (UTC)[reply]

I'm seriously suggesting that you don't know what you're talking about. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:49, 12 April 2020 (UTC)[reply]

This is a blatant personal attack which must lead to a block, but unfortunately Pigsonthewing is unblockable on the English Wikipedia.--Ymblanter (talk) 19:22, 12 April 2020 (UTC)[reply]

That's not the first time you've falsely accused me of making a personal attack; I suggest you desist. To continue to do so would betray a lack of understanding of what constitutes a personal attack. In other words, it would demonstrate that you don't know what you're talking about. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)[reply]

I think it is pretty clear that you continue your uncivil behavior only because I said several days ago that I am involved with you and will not block you on the English Wikipedia, and you clearly expect that your wikifriends will cover this behavior, as it happened multiple times in the past. It does not make it more civil though.--Ymblanter (talk) 12:11, 14 April 2020 (UTC)[reply]

I think it is clear who is being uncivil; and making personal attacks; and bringing a grudge from elsewhere; and is not here to contribute. And it is not me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:29, 14 April 2020 (UTC)[reply]

Let's all calm down a bit.

Andy: could you please elaborate on how the issues are not "being caused by the bot"? As I see it, there is a very clear link between the bot making edits and non-free content appearing in a manner prohibited by policy. Mdaniels5757 (talk) 19:50, 12 April 2020 (UTC)[reply]

As I said; this has been explained previously. The bot makes a good-faith edit, at the request of a random editor, to link to a free image that exists on Commons. This fails, because this project stupidly allows a different, non-free, image to exist, using the same file name. (Notwithstanding that in one case a non-free image had apparently been placed on Commons by someone else; you can no more blame the bot operator or requesting editor for that, than you would a human who inadvertently included it on a page.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)[reply]

@Thryduulf: Wikipedia has a reputation for being sparing with non-free content, which is hard-won and has been achieved with some pain, but is very very valuable and useful. But let's dial the rhetoric back to reality, because over-dramatisation really isn't helpful. In terms of legal risk construed narrowly, such as might "cause the encyclopaedia or its editors [to] suffer real-world harm", that's quite the extreme bit of knicker-wringing above, and we would do better to keep the discussion here rather more grounded in reality. In narrow legal terms, most of what the bot might inadvertently add due to the filename confusion would probably be protected as fair use or fair dealing anyway, even though it would fall outwith the narrower limits of policy. Moreover, because of the notice-and-takedown protections granted to content hosts, legally the clock only starts running once a notice has been received, if the site is slow on acting on it. That is unlikely, given (i) that, according to you, procedures are in place to fix the name conflicts as soon as they become visible; and (ii) even if they had slipped through, an insta-fix (renaming the file) would be available as soon as any notice was received to bring it to awareness. So legal consequences are far-fetched. That leaves the potential for reputational consequences. I in no way dismiss the significance of such consequences just for being reputational -- I think most of us would agree that Wikipedia's overall reputation is orders of magnitude more important than whether we happen to be upheld or not in a single legal case. But I think there is room to differ on whether allowing Listeria to surface a handful of files transiently for a few days until an organised procedure fixes the problem by renaming them (a renaming which I think we agree is desirable anyway in its own right) actually has any reputational significance. I frankly don't see it.
Others might differ, but if we do have an organised procedure which is identifying and fixing these filename collisions that have the potential to confuse users, even at the cost of those files as a result sometimes being made visible for a few days where they shouldn't, then to me I think that's actually quite positive to Wikipedia's reputation: we have identified a problem of potential filename confusion, and have an active and effective system in place that helps identify cases and deal with them. To me, that actually seems reputation-positive, rather than reputation-negative. Looked at cold, I don't think there is either a legal or a reputational risk here, so long as the filename issues being exposed are indeed getting rapidly dealt with.

Further on that point, it's increasingly clear that when Listeria surfaces one of these filename collisions, that is actually helpful -- because in general these collisions are rather hard to find, whereas the intersection of files that are non-free and files that are on Listeria pages is rather easy to identify, with a cause that is very clear, making this rather a useful way for them to be surfaced, so that they can then be fixed (which is something we want to do). So it would be helpful if bots that auto-remove non-free content would ignore Listeria pages, so that files with this very specific issue can be left in place, so that they can then be identified and picked up by the established procedure that is specifically appropriate for them. Shooting the messenger, or killing the canary in the coalmine, is not actually helpful if what we are wanting to do is to identify and fix these collisions.

Finally, it is worth noting that Listeria currently operates across 71 different wikis (keeping in all over 66,000 different lists actively updated), all from the same code. From a maintenance point of view it is not good design to make the code more complicated than it needs to be. It is not good design to make the bot mask filename problems rather than expose them, so they can be more readily identified and fixed. And, particularly when thinking about when a deep maintenance change may be required (such as recently when the wbterms table was retired), it is absolutely not good design to fragment that single piece of code into a multitude of different scripts, each specialised to a different wiki, that then all have to be updated separately. Such a change is not something to enter into lightly.

Luckily in this case no such change is actually necessary, because (as discussed above) in the present case the way the bot is surfacing filename issues is not just tolerable, it is actually useful, and should be retained. Jheald (talk) 23:41, 12 April 2020 (UTC)[reply]

That's a hell of a lot of words to say "Please can we knowingly violate the non-free content policy because the bot is useful and fixing it would be a lot of work?". The answer to that can only be "No. Complying with policy is not optional.". Your comment about the set of non-free files on Listeria lists being visible to bots is also interesting, because that requires bots to easily be able to distinguish free and non-free files - a task that those arguing for this policy exception claim is very complicated and/or impossible for Listeriabot. Which is it? You make grand noises about safe harbour, protection from legal harm, etc. but that's not the point at all - those only apply because we take reasonable steps to minimise the likelihood of copyright violations happening in the first place (that's the point of the NFCC), we can't abandon that and still claim protection. Finally, you say that the images appearing in the lists are probably fair use anyway - no, they aren't. A non-free image in a list cannot be being used for critical commentary or parody of the image. Thryduulf (talk) 00:39, 13 April 2020 (UTC)[reply]

@Thryduulf: WP:IAR: Understand the purpose of rules, and do what is best for the encyclopedia, rather than apply them blindly.

In this case, where the root problem is the name collision, the community has taken the view that it needs human intervention to hand-choose appropriate new names, so a bot can't fix the underlying problem.

Identifying images that are on Listeria pages and in the non-free category after Listeria has operated (and referring them for human intervention) is comparatively easy -- and, I am putting to you, the preferable option in any case, because then we fix the actual underlying problem. But it relies on Listeria having made the edit, so that the images are on the page, and can therefore be found by the SQL service as being on a Listeria page.

Changing Listeria so it doesn't add the image at all can't use this simple approach, would fragment the Listeria code with the consequences discussed above, and -- the key point -- is undesirable in its own terms because it doesn't end up with the underlying problem of the bad filenames getting fixed.

As to the narrow legal point, what you assert above is simply not the law. Unless you are facilitating piracy on the scale of something like The Pirate Bay, where assisting piracy is the very purpose of the site, the obligation laid on platforms by the law is to deal promptly with asserted copyright infringements as soon as the site is made aware of them by a DMCA notice or its equivalent. It is hugely to Wikipedia's credit, and to the huge benefit of our reputation, that we go way way beyond that, and as a result get very very few infringement notices. But we should look at this with clear eyes. Allowing a file (or at most a very small handful of files) to briefly surface in the wrong place, which we then rapidly fix by renaming the file, thereby definitively removing the possibility of any further confusion between the two files down the line, is a responsible course of action which is not going to damage WP's reputation, or in any way weaken the standing of the NFC policy. So long as the name collision is picked up quickly and then rapidly fixed, there is no more significance here than our practice, say, of leaving files briefly in place and in context while their appropriateness is reviewed at WP:FFD. So long as there is an efficient mechanism in place that is dealing with the name collisions once Listeria surfaces them, as you assure me there is, then Listeria's action is actually helping a useful process, and the prospect of legal or reputational harm by that process is non-existent. Asserting otherwise does not reflect reality. Jheald (talk) 08:19, 13 April 2020 (UTC)[reply]

Hat removed. @Headbomb: This is not a red-herring discussion. It directly pertains to what is the right way forward here: what are the actual costs and benefits of the different ways forward that might be pursued; and, indeed, whether the bot surfacing these shadow filename issues is actually a problem at all, or whether it may actually be helpful, as a beneficial element of fixing files with this issue. Those are very germane issues, worth wider discussion. Jheald (talk) 10:09, 13 April 2020 (UTC) [reply]

Sorry, IAR is only for uncommon situations where the encyclopaedia would definitely be improved. Non-free images in a list is not an improvement to the encyclopaedia, ignoring a rule on an ongoing basis is never acceptable (if the reason for wanting to do so is a good one then you will have no trouble getting consensus to change the rule in such a way that you don't have to ignore it), and it being easier to ignore the rule than comply with it are all reasons to say "no, you may not ignore this rule". Thryduulf (talk) 11:15, 13 April 2020 (UTC)[reply]

The fact that this non-issue was ignored for so long is telling us that it is not an issue. The underlying issue needs to be solved better, but it is already sufficiently taken care of so as not to be a real problem. I suspect the whole storm is not whipped up because of the non-free images issues, but using that as an excuse to shut the bot down because some people do not agree with the source of its data. If we want to have a functioning Wikipedia in 10 years' time we need to be forward-looking. Looking forward means also not to close your eyes to the solution. Agathoclea (talk) 12:01, 13 April 2020 (UTC)[reply]

@Agathoclea: I can't speak for everyone of course, but I'm one of the most vocal in support of this bot's block but I'm also a strong supporter of Wikidata and of the benefits it provides. The reason I'm making a fuss now is that this is the first time I've been aware that the non-free files problem has existed. Unlike the bot controller who has been aware since at least 2017. Thryduulf (talk) 12:15, 13 April 2020 (UTC)[reply]

"to say 'Please can we knowingly violate the non-free content policy..?'". That is not what is being said, and it is wrong and misleading of you to suggest that it is. What is being said is more like "If a bot very occasionally and inadvertently causes a thumbnail of a non-free image to be shown on a non-mainspace page, because this project stupidly allows file names that duplicate those of different images on Commons, please can we deal with that in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:16, 14 April 2020 (UTC)[reply]

You want the bot to be allowed to display a non-free image on a non-mainspace page. Displaying non-free images on non-mainspace pages is explicitly against the non-free content policy. You therefore want the bot to be allowed to violate the non-free content policy. You can try to weasel out of it by blaming others for not making fundamental changes to the core software, writing additional bots and/or taking other actions that mean this one bot wouldn't need to be fixed, but that does not alter the fundamental nature of your request. Thryduulf (talk) 14:22, 14 April 2020 (UTC)[reply]

I've just told you that your claim was wrong and misleading, and your response is to make another claim that is false and misleading? I want nothing of the kind. I want - as I just said - us to resolve such rare issues in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:35, 14 April 2020 (UTC)[reply]

No, you asserted that my claim was false and misleading. I simply demonstrated that it was accurate. It's true you only want to be allowed to do it temporarily, but that doesn't change that you want to do it at all. Yes you want the "errant" image renamed, but that request is being discussed at WT:NFC and is unrelated to what is being requested here. What is being requested here is permission to display the non-free image until it is renamed to something that does not shadow a free image from Commons. Doing that would require an exemption from the policy prohibiting the display of non-free images outside mainspace. Preventing images on en.wp and Commons having the same file name (the only way that shadowing can be prevented) would require a change to MediaWiki software - something that is completely outside the power of en.wp to implement and so irrelevant for the purposes of this discussion. Thryduulf (talk) 15:42, 14 April 2020 (UTC)[reply]

Note The above discussion has nothing to do with possible forks of ListeriaBot, do not re-open. Move it to WP:VP if you have to. Headbomb {t · c · p · b} 15:33, 13 April 2020 (UTC)[reply]

@Headbomb: It's not for you to close a discussion you are party to.

And yes, it is absolutely appropriate to look at the negative consequences that could arise from forking ListeriaBot, as well as the practicality of the suggestion. That's why we have discussions here, so people can raise and work through exactly such points.

So, reverted. Jheald (talk) 15:36, 13 April 2020 (UTC)[reply]

I am not a "party to this discussion", I'm a BAG member and as a BAG member, I can tell you that nothing above is pertinent to a forking discussion, nor does it tackle the "costs and benefits" of forking. There are basically three paths forward: a) fixing ListeriaBot, which requires Magnus to communicate and update their code; b) forking it so someone else can update the code and run it instead of Magnus; c) fixing all filename collisions and preventing them from happening in the future. a) would be quick if Magnus gets around to fixing their code and lets us know they've done so. b) can be quick: someone just needs to fork the publicly available code and make a WP:BRFA. c) is the least likely to occur, given it involves identifying and fixing all collisions, and preventing them from happening in the future. c) is not impossible to do, but it's a much bigger and slower effort than either the a) or b) solutions. Headbomb {t · c · p · b} 15:34, 14 April 2020 (UTC)[reply]

You being a member of BAG gives you no authority to manage the discussion here as you have attempted to do, including closing discussions immediately after you have participated in them. You have just closed another section - [added] to which you were very much a party - where someone has falsely accused me of trolling, after I pointed out they were making provably incorrect claims, leaving me no avenue to refute that accusation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:46, 14 April 2020 (UTC)[reply]

Refute it elsewhere. The conversation was going nowhere and is cluttering up the more-valid discussions about the bot and its activities. Primefac (talk) 16:53, 14 April 2020 (UTC)[reply]

@Headbomb: Other paths might include:

d) recognising that these NFC glitches are few and far between, get flagged for clearance pretty quickly already by Green C bot, and are essentially harmless.

e) creating a script specifically to identify and mop up these occasional glitches, perhaps along the lines suggested here, to run either periodically, or soon after each Listeria edit

f) some other approach that some experienced bot author may suggest here.

It is valuable to flag and recognise that complicating or forking Listeria are not zero-cost options. Listeria has infrastructure-level significance across 70 wikis, allowing any arbitrary SPARQL query to immediately be presented as a fully-formatted list on the wiki, that will then be kept updated. This currently supports 65,000 live pages across those wikis, including wide use for project management, wide use for individual curation projects by individual users, and wide use to present the results of collaborative curations with external partner institutions. So this is code that it is important to keep as clean and maintainable and unified as possible. It is not code to mess around with lightly, not code to add complexity to unnecessarily; and it is code to avoid fragmenting as far as we possibly can, so that any fixes or updates immediately apply everywhere, and so that it continues to reliably work in the same way with the same syntax wherever it is used, so that the pages calling Listeria can continue to be portable between one wiki and another immediately without change.

Do these considerations trump all others? Not necessarily. But they are not small things either. Jheald (talk) 16:57, 14 April 2020 (UTC)[reply]

Those are all variations of c). Headbomb {t · c · p · b} 17:07, 14 April 2020 (UTC)[reply]

Not really. (d) suggests that this whole issue is a de minimis trifle, and not worth further worrying about; (e) suggests leaving Listeria unchanged, but identifying the issues case-by-case as they come up on the fly, which is rather different to up-front mass-fixing all filename collisions, and could be rather smaller and easier than (a) or (b); (f) might be something entirely different again. Jheald (talk) 17:22, 14 April 2020 (UTC)[reply]

That was going to be my next suggestion - to extract out the code that does the wiki-list updates and run that via another bot. I have no experience in this area, but I'm sure it can be done. Advanced thanks to anyone and everyone who can help with this. Lugnuts ^{Fire Walk with Me} 11:56, 12 April 2020 (UTC)[reply]

This is just a suggestion. I think the tool can be placed in Maintainer needed section in Phab. Adithyak1997 (talk) 12:09, 12 April 2020 (UTC)[reply]

If someone wishes to fork out the bot and put through a new BRFA, please do. Having an active bot operator and a clearly-defined bot task is much preferred over the current situation. Primefac (talk) 13:28, 12 April 2020 (UTC)[reply]

"Inactivity" of the operator

This is going nowhere. I will leave this un-hatted so participants can read it, but the consensus here is that the bot operator is essentially inactive on the English Wikipedia, which is the salient point for an en-wiki bot. Primefac (talk) 15:08, 14 April 2020 (UTC)[reply]
I'll pre-emptively add that whether two months without an edit on Wikipedia qualifies as 'inactivity' is immaterial. The point here is that WP:BOTCOMM matters, and the expectation of the English Wikipedia community is that bot operators may not ignore communications that occur on the English Wikipedia, or require that issues are raised on a different forum. Headbomb {t · c · p · b} 15:20, 14 April 2020 (UTC)[reply]

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Split from #Forking the bot above.

"the operator is inactive" Maybe you could get a clue about what your fellow volunteers contribute, before you disparage them with such ignorance? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:04, 12 April 2020 (UTC)[reply]

@Pigsonthewing: He could be sweating like a galley slave elsewhere, but as far as en.wp is concerned, he hasn't been active since (early) February. Which means: as far as leaving a talk-page message in the traditional fashion goes, the likelihood of receiving a reply is receding rather than improving. HTH. ——SN 54129 18:40, 12 April 2020 (UTC)[reply]

But that wasn't the claim made. Regarding your latter point, have you looked at his talk page? It appears that not one of the people loudly complaining about the bot has taken heed of the guidance there. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:47, 12 April 2020 (UTC)[reply]

Unfortunately, not many editors here know (or probably care!) about bitbucket; why should they? After all, en.wp has plenty of ways and places itself for on-wiki communication. Specifically, saying "Beats spreading [messages] over half a dozen talk pages" is slightly disingenuous: there's only one talk page he needs to worry about, and it's that one. ——SN 54129 19:04, 12 April 2020 (UTC)[reply]

@Pigsonthewing: it has been a longstanding principle of bot operation that enwiki users don't need to register elsewhere to take their concerns to a bot operator. Magnus hasn't been active on enwiki in 2 months. While the threshold of what exactly is "inactivity" will differ from person to person, saying that Magnus has been inactive in the past two months is hardly "disparaging them" or being "ignorant". So instead of complaining here that enwiki editors prefer to keep enwiki issues on enwiki, you could contact Magnus on BitBucket yourself if you think this will lead to a speedier resolution. Headbomb {t · c · p · b} 00:02, 13 April 2020 (UTC)[reply]

{{citation needed}} And don't attempt to close discussions immediately after posting to them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:13, 13 April 2020 (UTC)[reply]

Citation: WP:BOTCOMM: "Bot operators should take care in the design of communications, and ensure that they will be able to meet any inquiries resulting from the bot's operation cordially, promptly, and appropriately. This is a condition of operation of bots in general. At a minimum, the operator should ensure that other users will be willing and able to address any messages left in this way if they cannot be sure to do so themselves." WP:BOTACC: "All policies apply to a bot account in the same way as to any other user account." Thryduulf (talk) 12:26, 13 April 2020 (UTC)[reply]

Those are indeed citations. Just not for the claim that was made. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:00, 13 April 2020 (UTC)[reply]

Those citations support the claims immediately preceding your request for citations. If you were requesting citations for something else then you need to actually specify what that something else is, we cannot read your mind. Thryduulf (talk) 20:05, 13 April 2020 (UTC)[reply]

"Those citations support the claims immediately preceding your request for citations" They do not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)[reply]

Were he the operator of just one bot, on just one project, your point might have a shred of validity. As he is not, it does not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 13 April 2020 (UTC)[reply]

Incorrect, the same rules apply to everybody: if you want to operate a bot on the English Wikipedia you must be available to respond to issues on the English Wikipedia. What the operator does or does not do on other projects is irrelevant. If an operator is unable or unwilling to deal with issues related to their bot on the English Wikipedia then they will have their operator privileges for the English Wikipedia withdrawn, regardless of why they are unable or unwilling to follow basic policy. Thryduulf (talk) 12:20, 13 April 2020 (UTC)[reply]

Poppycock; try reading the post I was replying to. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:24, 13 April 2020 (UTC)[reply]

I read, and have re-read, the post you were replying to. There is only one talk page he has to worry about regarding the English Wikipedia. If he has to pay attention to other talk pages for other business that's his choice, but it doesn't make either my or Serial Number 54129's posts incorrect. If there is too much for him to keep track of then he needs to either stop something or hand it over to someone else who can resolve the issues. 20:05, 13 April 2020 (UTC)[reply]

I note that the claim you now make is not the claim to which I replied; both 54129's and your earlier post are incorrect. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)[reply]

Andy, I am making the same claim in both posts just using different language because you apparently misinterpreted it the first time. Both make the same claim that I understand SN54129 was making. Additionally your comments in these discussions are getting increasingly towards a style of "I disagree with something you said, but I'm not going to tell you what it was or why I disagree with it, because I'm right and you are wrong." This is not how to resolve a dispute. Thryduulf (talk) 14:16, 14 April 2020 (UTC)[reply]

You are indeed making the same claim more than once. However it is different to the claim made by 54129, which you wrongly said I was incorrect to describe as not valid. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)[reply]

On the assumption you've indented correctly, yes you were replying to me. Incorrectly. As, I repeat, it does not matter how many bots he runs or where he does so: what he does on the English Wikipedia will be discussed on the English Wikipedia, there are literally no two ways about it. You are either accidentally or deliberately misunderstanding what (multiple) editors are telling you; since you are clearly competent, it can only be assumed that the latter applies. ——SN 54129 14:28, 14 April 2020 (UTC)[reply]

I see that you, too, are now making a different claim to the one I originally described as invalid. The confusion, deliberate or otherwise, is not mine. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)[reply]

I can't imagine why you feel the need to troll the discussion, but, here we are. ——SN 54129 15:05, 14 April 2020 (UTC)[reply]

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Recent coronavirus-related publicity for the bot

See this Hacker News thread on a Listeriabot-created list of Wikidata-notable people who have died from the Coronavirus. It would be helpful if we could get this issue resolved so that the bot can continue keeping the list up-to-date; it is of current interest and rapidly changing. —David Eppstein (talk) 19:20, 13 April 2020 (UTC)[reply]

David Eppstein, that list is on Wikidata and thus is not affected by the bot being blocked on English Wikipedia. The underlying issue doesn't really affect other sites since non-free content isn't used to the same extent (or at all) on other wikis. ‑‑Trialpears (talk) 20:01, 13 April 2020 (UTC)[reply]

Yes, I was just coming back to post a clarification about this. I'm skeptical that other Wikipedias don't use non-free content, but maybe one fix would be to migrate the various Listeriabot redlink lists to Wikidata? Or would that be unacceptable as the redlinks are targeted at a specific Wikipedia (the one for which they are redlinks)? —David Eppstein (talk) 20:04, 13 April 2020 (UTC)[reply]

I guess that would work as a temporary measure if something is important to have updated several times a week. I am quite confident that this will be resolved within a week and everything will be back to normal. ‑‑Trialpears (talk) 21:04, 13 April 2020 (UTC)[reply]

What a crock. Listeria does not link to any specific red links OTHER than to the red links on that very Wikipedia. Remember the same query will result in the same content. However, red links are different. Check out the COVID-19 deaths Listeria list on the Dutch Wikipedia. English Wikipedia is one of the exceedingly few Wikipedias that supports non-free imagery. This disaster is one of your own making. Thanks, GerardM (talk) 16:00, 14 April 2020 (UTC)[reply]

Gerard, this is factually incorrect. Among bigger projects, only Dutch, Spanish, and Swedish Wikipedias disallow fair use, and German is very restrictive. It is by far not the majority.--Ymblanter (talk) 16:20, 14 April 2020 (UTC)[reply]

Hatting of comments

Off-topic (non-admin closure) ——SN 54129 18:53, 14 April 2020 (UTC)[reply]

The following discussion has been closed. Please do not modify it.

My reply to Thryduulf that "Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam." is neither off topic nor irrelevant, but has been included in a section collapsed as such; including by an editor with whom I am in disagreement on that point, and by an involved admin who blocked the bot in question. Another section has been closed in a most partisan manner, by an editor whose earlier closure of that section was also reverted after he tried to use his closure to have the last word. It really is unacceptable for people on one side of a discussion to try to manage it in this manner. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:20, 14 April 2020 (UTC)[reply]

Oh wait, he didn't make the latest close to the latter; he just inserted his comment after it was closed. Can we all do that, or is that too only for people on one side of the discussion? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:41, 14 April 2020 (UTC)[reply]

@@ Line 253: / Line 253: @@
 === Hatting of comments ===
+{{hat|reason =Off-topic {{nac}} [[User:Serial Number 54129|<span style="color:black">'''——'''</span>]][[Special:Contributions/Serial Number 54129|<span style="color:black">''SN''</span>]][[User talk:Serial Number 54129|<span style="color:#8B0000">54129</span>]] 18:53, 14 April 2020 (UTC)}}
 My reply to Thryduulf that {{tq|"Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam."}} is nether off topic nor irrelevant, but has been included in a section collapsed as such; including by an editor with whom I am in disagreement on that point, and by an involved admin who blocked the bot in question. Another section has been closed in a most partisan manner, by an editor whose earlier closure of that section was also reverted after he tried to use his closure to have the last word. It really is unacceptable for people on one side of a discussion to try to manage it in this manner. <span class="vcard"><span class="fn">[[User:Pigsonthewing|Andy Mabbett]]</span> (<span class="nickname">Pigsonthewing</span>); [[User talk:Pigsonthewing|Talk to Andy]]; [[Special:Contributions/Pigsonthewing|Andy's edits]]</span> 17:20, 14 April 2020 (UTC)
 :Oh wait, he didn't make the latest close to the latter; he just {{diff|Wikipedia:Bots/Noticeboard|950922265|950920561|insrtered his comment}} after it was closed. Can we all do that, or is that too only for people on one side of the discussion? <span class="vcard"><span class="fn">[[User:Pigsonthewing|Andy Mabbett]]</span> (<span class="nickname">Pigsonthewing</span>); [[User talk:Pigsonthewing|Talk to Andy]]; [[Special:Contributions/Pigsonthewing|Andy's edits]]</span> 17:41, 14 April 2020 (UTC)
+{{hab}}
