Wikipedia talk:Automated moderation

About this page
I created this page because of increasing interest in applying machine learning to support the Wikipedia community in monitoring and planning responses to misconduct. Associated with this page I just created Misconduct, a page listing types of misconduct which automated moderation might seek to identify.  Blue Rasberry  (talk)  14:27, 26 July 2019 (UTC)

Not sure of category
I put this into Category:Wikipedia enforcement policies but it is not actually a WP:Policy, instead this might be an "enforcement information guide". Perhaps it is something else. Ideas?  Blue Rasberry  (talk)  14:27, 26 July 2019 (UTC)

Need for list of misconduct
I just set up WP:Misconduct and Category:Wikipedia misconduct. The point of this is to better collect and document the kind of behavior which bots might try to detect. Does anyone have thoughts about grouping these behaviors together as a distinct concept?  Blue Rasberry  (talk)  14:32, 26 July 2019 (UTC)
 * Category:Wikipedia misconduct appears to be pretty close to a duplicate of Category:Wikipedia user conduct and its various subcats. Why not just use the original cat? Ivanvector (Talk/Edits) 15:15, 26 July 2019 (UTC)

Speculation and rumors - automated attacks on wiki
One application of automated moderation is to relieve the labor burden of the moderation work which humans currently do.

Another application is the use of automated moderation to counter automated attacks on wiki. This is a different use case because we do not currently acknowledge automated attacks as a problem to address.

We do not currently have documented case studies of automated attacks, so far as I am aware. If anyone has documentation on past automated attacks then please share whatever exists. My department at my university does automated moderation research on Wikipedia. Incidentally, and outside of our primary research objectives, we observed what appeared to be otherwise undetected automated attacks on Wikipedia. In talking with other researchers, I have heard them report the same observation.

I do not want to sound any alarm at this time, because I am not aware of a major problem and because I do not have data to describe this as anything better than speculation and rumors. I am posting here because this talk page is a relatively quiet place that someone can find if they are also thinking about this. Talking about it here should not unnecessarily cause fear, uncertainty, and doubt, because this page is low traffic and almost no one will come here.

The speculation is that some organizations - perhaps commercial or governmental - or perhaps some curious but misbehaving amateur researcher, have group-registered cohorts of farmed user accounts and then used automation to have them all engage in a similar non-human and detectable pattern of behavior. I am not entirely sure this is happening, but I do believe that it is possible and plausible.  Blue Rasberry  (talk)  18:34, 26 July 2019 (UTC)
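 * As an illustration of what a "similar non-human and detectable pattern of behavior" could mean in practice, here is a minimal sketch of one possible signal: accounts whose inter-edit timing is suspiciously clock-like. This is a hypothetical toy example, not an actual tool in use on Wikipedia; the function names, threshold, and account data are all invented for illustration, and a real detector would need many more signals than timing alone.

```python
from statistics import mean, stdev

def timing_regularity(timestamps):
    """Coefficient of variation of the gaps between edits.

    Values near 0 mean the account edits on a clock-like schedule,
    which is one possible hint of automation. Timestamps are in
    seconds, sorted ascending.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    m = mean(gaps)
    return stdev(gaps) / m if m else float("inf")

def flag_botlike(accounts, threshold=0.1):
    """Return account names whose edit timing is suspiciously regular.

    `accounts` maps an account name to a sorted list of edit
    timestamps; accounts with fewer than three edits are skipped
    because the statistic is meaningless for them.
    """
    return sorted(
        name
        for name, ts in accounts.items()
        if len(ts) >= 3 and timing_regularity(ts) < threshold
    )

# Invented example data: one account editing exactly every 60 seconds,
# one with irregular, human-looking gaps.
accounts = {
    "ExampleBot": [0, 60, 120, 180, 240],
    "ExampleHuman": [0, 45, 300, 310, 900],
}
print(flag_botlike(accounts))  # → ['ExampleBot']
```

A cohort of farmed accounts driven by the same script might share near-identical regularity scores and activity windows, which is roughly the kind of group-level pattern the speculation above refers to.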