MediaWiki talk:Spam-blacklist/archives/August 2019

drive.google.com


Cannot think of a responsible use of this other than for the Google Drive article. I see this being used to cite original research or otherwise unreliable sources, or worse, for malware/spam distribution.

Unfortunately, a number of articles are using Google Drive links as references or otherwise. I picked a random article to see what kind of content was being used - Hyperinflation in Brazil seems to link to original research in the Google Drive link used there.

I would like to see additional input - I think it isn't a problem to use these in project or userspace, but I would say that 90% of mainspace usage would be problematic. Does the community have any other thoughts? Jon Kolbert (talk) 19:46, 5 July 2019 (UTC)
 * Upon further reflection this would probably be best as an edit filter to limit the blacklist to mainspace and allow extended-confirmed users to use it elsewhere since spam-blacklist is for every namespace. Jon Kolbert (talk) 20:12, 5 July 2019 (UTC)
 * Some of these seem to be historic documents - could they, should they, be transferred to archive.org ? Case in point, the NSA interview transcripts from "Rasterschlüssel 44"? DS (talk) 21:09, 5 July 2019 (UTC)
 * I feel like that would be a more stable, reliable solution than a link to a Google Drive folder whose owner can change the contents at any point in time. Jon Kolbert (talk) 21:48, 5 July 2019 (UTC)
 * 'Some of these seem to be historic documents' .. is this google drive the only place where they are available? I would argue that although it is certainly valuable to have a link to an online copy, it is not absolutely needed (as long as you uniquely describe the document).  And if they are out of copyright (so really historic) then they could easily be incorporated into WikiSource.  --Dirk Beetstra T  C 13:32, 7 July 2019 (UTC)
 * Comment I have alerted WP:RSN to this discussion as the above comment relates to reliability. I myself support this proposal for the following reasons:
 * It probably fails WP:ELNO.
 * Some pages also fail WP:ELNO.
 * It unambiguously fails the reliable sources criteria as user-generated content, and better sources are almost always available. –LaundryPizza03 ( d c̄ ) 22:47, 15 July 2019 (UTC)


 * The assumption that there is a specific kind of source involved is misguided. A public, general-purpose file storage service is ambiguous. Like YouTube, it can be used for reliable sources (primary or secondary) and appropriate external links, or inappropriate ones. This is why WP:YOUTUBE gives caution but says that "Links should be evaluated for inclusion with due care on a case-by-case basis." Just the other day I cited a magazine which distributes its back issues through Google Drive. Is there widespread abuse, compared to similar sites, that would justify the drastic step of blacklisting it? Kim Post (talk) 00:29, 16 July 2019 (UTC)
 * gauging abuse here is a difficult one. If 10% turns out to be (likely) copyright violations then yes, there is abuse.  Abuse in the term of spamming, I don't think so (but then we would not discuss this if that was the case).  I agree that the case seems similar to Youtube, but I don't know about the ratios - how many are copyright violations, how many are convenience, how many are not replaceable, etc. (noting that of the material on Youtube that is useful to Wikipedia the percentage of (likely) copyright violations is higher than the overall percentage on Youtube).  --Dirk Beetstra T  C 04:38, 16 July 2019 (UTC)


 * Special:Search/insource:"drive.google.com" shows 2,550 articles currently citing Google Drive. If only 90% of mainspace usage is problematic, it means 255 articles are using Google Drive as a legitimate source, which is too high for blacklisting. If the content of the sources is appropriate, though not in the best format, an edit filter showing a warning message, or having a bot undo additions by new users, is a better approach than blacklisting the link and requiring all uses to be whitelisted. feminist (talk) 01:58, 16 July 2019 (UTC)
 * 'If only 90% of mainspace usage is problematic' .. only? If that 90% of the cases has roughly 20% (likely) copyright violations (the first link I clicked on was link to a personal copy of an article copyrighted by Elsevier where I would consider that this is likely/maybe out of scope of what Elsevier allows, and, obviously, there is a proper link to the proper, albeit paywalled, article) then we are talking hundreds of copyright violations.  That is way too high to allow unlimited inclusion (and hence, blacklist might be appropriate).  (in short: you would need a full analysis of all, not just eyeballing 10% is fine, for all you know, it is only 1% that is fine, which is something that the whitelist can easily handle).  I could however agree with adding this to XLinkBot or an edit filter to step this up and reconsider blacklisting after a couple of months.  --Dirk Beetstra T  C 04:38, 16 July 2019 (UTC)
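The numbers traded in the two comments above can be checked with a little arithmetic. The sketch below only reuses figures quoted in the thread (2,550 articles, 90% problematic, roughly 20% likely copyright violations, and the 1% worst case); the percentages are the participants' estimates, not measured values, and the helper name `share` is made up for illustration.

```python
# Figures quoted in the discussion; the percentages are guesses made
# by the participants, not the result of an actual link audit.
ARTICLES_CITING_DRIVE = 2550


def share(total: int, percent: int) -> int:
    """Return `percent` percent of `total`, rounded down (hypothetical helper)."""
    return total * percent // 100


# "If only 90% of mainspace usage is problematic..."
problematic = share(ARTICLES_CITING_DRIVE, 90)
legitimate = ARTICLES_CITING_DRIVE - problematic

# "...if that 90% of the cases has roughly 20% (likely) copyright violations..."
likely_violations = share(problematic, 20)

# "...for all you know, it is only 1% that is fine..."
legitimate_if_one_percent = share(ARTICLES_CITING_DRIVE, 1)

print(f"problematic articles:        {problematic}")
print(f"legitimate articles (10%):   {legitimate}")
print(f"likely copyright violations: {likely_violations}")
print(f"legitimate articles (1%):    {legitimate_if_one_percent}")
```

Under these assumed ratios both positions hold up numerically: about 255 legitimate uses on one side, and several hundred likely copyright violations on the other. The disagreement is over which estimate of the ratios is realistic.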

Very much in two minds: yes, it is no different from any other storage medium, but (as others have pointed out) it might also (as a storage medium) have stuff that would pass RS. At this time I lean to no. Slatersteven (talk) 09:24, 16 July 2019 (UTC)
 * It's a storage medium, no more or less likely to contain good or bad content than any other arbitrary website. --GRuban (talk) 19:40, 16 July 2019 (UTC)
 * by the Devil's advocate: so it is just as likely to contain bad content as the website of the BBC, youtube, Elsevier, or blogger? --Dirk Beetstra T  C 20:30, 16 July 2019 (UTC)
 * Respectively, no, yes, no, and yes. The point is that the BBC and Elsevier exercise editorial control. Blogger and YouTube and Google Drive do not. So, yes, most stuff on YouTube and Blogger and Google Drive doesn't meet our criteria as reliable sources; but some does, so we shouldn't throw the expert self-published opinions out with the overwhelming majority of self-published opinion. --GRuban (talk) 21:38, 16 July 2019 (UTC)
 * The BBC are not a storage medium, they are a creator.Slatersteven (talk) 08:54, 17 July 2019 (UTC)
 * The BBC is a creator who stores their info on their own site, many people who are a creator and do not own an own site store it somewhere else, like on youtube, blogspot or on drive.google.com.
 * Exactly, Gruban. Blogger, YouTube and Google Drive do not have editorial control, and are generally unreliable.  With the first 2 of those we exhibit quite strong editorial control.  They are on XLinkBot, and we generally do not hesitate when abuse is so bad that material needs blacklisting (there are several blogger sites on the blacklist, and specific Youtube videos/channels).  Others of those 'free storage sites' we have blacklisted, like Hulu and examiner, based on a similar discussion to this one.  The question is whether the good material (material that is really needed) outweighs the bad material (rubbish, copyright violations, 'spam', etc.).  As for the point 'no more or less likely to contain good or bad content than any other arbitrary website': this one falls well in the range of blogger, Youtube, Hulu and examiner, and I wonder whether it is just as likely to have bad material as YouTube, or just as likely to have bad material as Examiner (to pick 2).  --Dirk Beetstra T  C 15:46, 17 July 2019 (UTC)


 * Support. Everything on Google Drive is user-generated content. Yes, some reliable sources are offline and/or behind a paywall, but that is no reason to re-publish them on Google Drive as pirate copies. Also, Wikipedia should not use URLs that point to those pirated resources. Are there any genuine sources hosted on Google Drive? Please point one out, as a black swan, among the 8,932 Google Drive entries currently on Wikipedia. Matthew hk (talk) 11:57, 17 July 2019 (UTC)
 * If you really want to collect anecdotes: out-of-print issues of C3i Magazine, a publication about wargames, are hosted on Google Drive. The official website provides these links. More to the point, the bare fact that a website is open to the public, and so could be used for bad sources or external links, is not a reason to put it on the spam blacklist. Kim Post (talk) 03:14, 18 July 2019 (UTC)
 * It is not a question of could. The link used on Zohar e.g. leads to a pdf which looks like the printout of another website.  That is very likely a copyright violation, also because I can find the text elsewhere.  I saw another example like that earlier but it seems to have been removed.  It remains a question of balance between use and abuse, and how much abuse you want to take.  And in the area of spam, anything that could be abused eventually will be.  --Dirk Beetstra T  C 03:45, 18 July 2019 (UTC) (found the links to publications, see below  --Dirk Beetstra T  C 05:00, 18 July 2019 (UTC))

It gets more interesting: e.g. diff, diff, diff &c. link consistently to work by the same authors, and in all cases link to a google-drive copy of the authors' work (not, as is more normal, using a doi or a link to the published papers). Now looking at the IP (which is in New York University, New York), that does overlap with the stated location of one of the three editors. Looking at the personal copy document, it states: In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. I agree that the copyright status there is a grey area, but this is likely a case of someone promoting their work (i.e. spamming) using this website as the medium for the spam. --Dirk Beetstra T C 05:00, 18 July 2019 (UTC)
 * This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited.
 * At the very least this should be on an edit filter, but actually I think it should be blacklisted. Think about it: a Google Drive document can be updated by the owner at any time, is typically not archived by the archiving services, may be in violation of copyright, and we have no proof that it's an authentic copy of the material even if it's not. Oh, and several document types allow for infection with malware, which typically gets screened out by reputable online sources. Any original papers should be identified by DOI reference not by links to personal copies on file sharing platforms, and academics almost always have the ability to upload to space within their institution's own website. Linking to Google Drive, OneDrive and the like seems like an open invitation for abuse. Guy (Help!) 09:56, 2 August 2019 (UTC)

onezorse.com


Spam additions including reference substitutions. Guy (Help!) 09:50, 2 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 21:55, 4 August 2019 (UTC)

youngstownwater.com


Spammy links added to external links sections and as references - Eureka Lott 20:56, 4 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 21:54, 4 August 2019 (UTC)


 * Please add promptly, this is not the actual water company and is being used for phishing. Risker (talk) 22:56, 4 August 2019 (UTC)

BLP spam - self-published and dangerous source


Spam on a BLP with some determination, also turns up in other BLPs. Guy (Help!) 15:35, 7 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist. --Guy (Help!) 15:40, 7 August 2019 (UTC)

walkultimate.com
Seems relatively recent, but these contemptible people are manipulating existing references for spammish purposes. See here, here, here and here for some examples. Thanks, Cyphoidbomb (talk) 23:32, 11 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 23:36, 11 August 2019 (UTC)
 * Thanks! Cyphoidbomb (talk) 23:37, 11 August 2019 (UTC)

business-sale.com
Is this what is referred to as a 'regex' issue?

When including a link to this domain, contributors get the message: Your edit was not saved because it contains a new external link to a site registered on Wikipedia's blacklist. ...... --->

"The following link has triggered a protection filter: -sale.com"

It appears that because the domain has '-sale.com' contained therein, it is by default being classed as a spam domain. In fact it is an industry-recognized Google News-listed authority site at least 15 years old with journalists employed to research news about companies falling into insolvency or larger companies that have been put up for sale or have been divested. -- Montymoore (talk) 01:05, 11 August 2019 (UTC)
 * . If WP:RSN establishes that it is a reliable source, then . To delist, . — JJMC89 (T·C) 23:42, 11 August 2019 (UTC)
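The false positive described in this request can be illustrated with a short sketch. It assumes the blacklist holds a regex fragment equivalent to `-sale\.com` (the actual entry is not quoted in the thread; only the triggered text "-sale.com" is) and that a fragment may be preceded by any run of hostname characters. The pattern and the function name `blocked_fragment` are illustrative assumptions, not the extension's real code.

```python
import re

# Hypothetical blacklist fragment, inferred from the filter message
# "-sale.com"; the real blacklist line is not shown in the discussion.
BLACKLIST_FRAGMENTS = [r"-sale\.com"]

# Assumed matching scheme: a fragment matches anywhere in the host part,
# so any hostname characters may come before it.
URL_PATTERN = re.compile(
    r"https?://[a-z0-9_\-.]*(" + "|".join(BLACKLIST_FRAGMENTS) + r")",
    re.IGNORECASE,
)


def blocked_fragment(url: str):
    """Return the blacklist fragment that `url` triggers, or None."""
    match = URL_PATTERN.search(url)
    return match.group(1) if match else None


# The domain from the request is caught because it merely ends in "-sale.com".
print(blocked_fragment("https://www.business-sale.com/news"))
# A URL without the fragment passes.
print(blocked_fragment("https://www.example.org/"))
```

Because the fragment is not anchored to the start of a domain name, every domain ending in "-sale.com" is caught, whether or not it was the original spam target. In practice such cases are usually handled by whitelisting the specific domain rather than loosening the blacklist entry.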

econlib.org
This link appears to have been grouped in with spammy immigration law additions? [] in a March 2017 addition to the Local blacklist: []

Maybe the spamvertizers also used this link in some of their posts...but this link is to the real website for this organization: The Library of Economics and Liberty. It may be a biased reference, but it is currently cited in one article: Wage share.

I'm guessing this was a false positive? --- Avatar317 (talk) 23:11, 5 August 2019 (UTC)
 * Discussed many times. Black Kite (talk) 23:15, 5 August 2019 (UTC)
 * no, the spamvertizers (declared paid editing ring) also had a close connection to the subject econlib and edited and created pages related to econlib. Maybe not part of the paid editing, but definite conflict of interest.  FYI, except for the encyclopedia (which is whitelisted) all information is full text available from other libraries, and in many cases even from WikiSource.    --Dirk Beetstra T  C 04:09, 6 August 2019 (UTC)


 * , I fixed this. It was an absolutely standard example of the genre: a public domain work, linked to the right-wing think tank, and listed as being published by them. That last bit of deceptive attribution is very common, I have found and removed hundreds, often to very well known works like Gibbon's Decline And fall Of The Roman Empire. In this case the full text is available at Gutenberg, and the publisher is not the Orwellian-titled "Library of Economics and Liberty" but John Murray, of London. I don't think there were many right-wing think tanks operating in 1817, and this one certainly wasn't. Guy (Help!) 07:18, 6 August 2019 (UTC)
 * for the record. — JJMC89 (T·C) 23:43, 11 August 2019 (UTC)

selfgrowth.com
Appears to have been extensively spammed by many users (too many to list here, but COIBot lists them, of course). No extant use in mainspace. However, the COIBot report stops at 2015, but shows a number of possibly empty sections (a possible bug?) Thanks, — Paleo  Neonate  – 22:48, 5 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 23:48, 11 August 2019 (UTC)

Spam for advisory websites


Recurring spam for two - possibly related - advisory and registration websites (see COIBot reports for more details). Several warnings have been ignored. GermanJoe (talk) 09:50, 5 August 2019 (UTC)
 * The IPs of the servers that the websites seem to run on are very similar. --Dirk Beetstra T  C 10:26, 5 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 23:52, 11 August 2019 (UTC)

biglybt.com
Long-running campaign, mostly by IPs or SPAs, to get this program into the encyclopedia despite multiple removals as a separate article, external link or section. For the section, see the examples here. For a link, see the example here. The developer claims that the demand for inclusion in "Comparison of BitTorrent clients" is unfair: Talk:Comparison of BitTorrent clients. The Banner talk 00:00, 12 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 00:24, 12 August 2019 (UTC)

census2011.co.in
This is the official website containing the 2011 census data of India. Why is it blocked? It is used as a source on countless articles. SD0001 (talk) 10:04, 17 August 2019 (UTC)
 * . Official government sites are on .gov.in. Guy (Help!) 17:27, 17 August 2019 (UTC)

CGAP.org
Is it possible to remove CGAP.org from the blacklist? It is a reliable source on financial inclusion. — Preceding unsigned comment added by Noel92140 (talk • contribs) 14:45, 15 August 2019 (UTC)

cgap.org is a reliable source on financial inclusion. It is a trust fund housed by the World Bank and it provides relevant publications on financial inclusion. Adding this link would be beneficial for all pages providing information on financial inclusion, digital credit, policy, and customer protection. — Preceding unsigned comment added by Noel92140 (talk • contribs) 14:54, 15 August 2019 (UTC)


 * Here's the previous discussion from 2012. – Thjarkur (talk) 17:52, 15 August 2019 (UTC)
 * Have you discussed this on WP:RSN? Guy (Help!) 17:28, 17 August 2019 (UTC)

Kyrgyz medical school phishing sites
The official websites (.edu.kg) of Medical Institute, Osh State University and International School of Medicine Kyrgyzstan are being repeatedly changed to these almost identical phishing websites (containing the same @gmail.com contact information). – Thjarkur (talk) 10:23, 18 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 21:37, 18 August 2019 (UTC)

skillsaustralia.edu.au


Repeated linkspamming from new accounts to evade scrutiny, so it seems that it is time to add this to the blacklist. - MrOllie (talk) 13:03, 21 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist. --Guy (Help!) 22:23, 24 August 2019 (UTC)

straitsresearch.com


Repeated spamming for a self-styled "research" and marketing website. Multiple warnings and a block have been ignored. GermanJoe (talk) 13:10, 22 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist. --Guy (Help!) 22:23, 24 August 2019 (UTC)

targetworldtours.com


Usual problem: refspamming of a tour operator. Unlikely ever to be of any use as a source. Guy (Help!) 09:59, 21 August 2019 (UTC)


 * Ah, they first tried to do that using a redirect service which then got blacklisted (typical spammer behaviour - it doesn't show what you spammed so if you are lucky we don't notice, and we didn't). --Dirk Beetstra T  C 11:27, 21 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist then, should we also be looking at the redirect? --Guy (Help!) 11:55, 21 August 2019 (UTC)
 * It appears that most redirects were blacklisted (they go by definition on meta), but it is better to check. --Dirk Beetstra T  C 11:58, 21 August 2019 (UTC)

For tracking reasons:

See also Wikipedia_talk:WikiProject_Spam/2019_Archive_Jul_1. --Dirk Beetstra T C 12:00, 21 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist. --Guy (Help!) 22:25, 24 August 2019 (UTC)

reloadbench.com


Unreliable self-published source widely used (with at least some likely spam), now defunct and hosting malware. Guy (Help!) 22:22, 24 August 2019 (UTC)
 * ✅. Guy (Help!) 15:42, 25 August 2019 (UTC)

businesstelegraph.co.uk


This domain publishes scraped articles from other publications. Editors are accidentally citing this domain instead of the original source. Examples: I have already removed or replaced all links to businesstelegraph.co.uk in articles. —  Newslinger  talk   01:37, 12 July 2019 (UTC)
 * 1) https://www.businesstelegraph.co.uk/airbnb-rentals-in-london-block-sparks-call-for-action (from Special:Diff/888209120) is a copyright violation of a Financial Times article
 * 2) https://www.businesstelegraph.co.uk/andy-palmer-revs-up-iconic-british-sports-car-firm-aston-martin-for-a-5bn-float (from Special:Diff/872611028) is a copyright violation of a This is Money article
 * 3) https://www.businesstelegraph.co.uk/icon-of-icons-autocar-awards-readers-champion-bmw-3-series (from Special:Diff/886530753) is a copyright violation of an Autocar article
 * The noticeboard discussion has been archived to . —  Newslinger  talk   03:22, 18 July 2019 (UTC)
 * Unarchived from MediaWiki talk:Spam-blacklist/archives/July 2019. —  Newslinger  talk   08:17, 25 August 2019 (UTC)


 * to MediaWiki:Spam-blacklist. --Guy (Help!) 15:42, 25 August 2019 (UTC)

marketingtochina.com


This marketing company's website has been repeatedly cited by editors despite being a self-published source. Recent removals: There is no valid use case for this domain. —  Newslinger  talk   15:24, 25 August 2019 (UTC)
 * 1) Special:Diff/912432865 in Cosmetic surgery in China
 * 2) Special:Diff/912432732 in Baidu Tieba
 * 3) Special:Diff/912432662 in Meitu
 * 4) Special:Diff/912432351 in Baidu
 * 5) Special:Diff/912432243 in WeChat
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 20:05, 25 August 2019 (UTC)

Hikvision spams


This domain was spammed to Hikvision article as well as. May be an unrelated site, another set of ip spammed a rival site on the same article. Matthew hk (talk) 11:17, 25 August 2019 (UTC)
 * And it seems there is a third spam site. Thus added to the filing. Matthew hk (talk) 11:22, 25 August 2019 (UTC)

One more? (please check the COIBot user-reports). --Dirk Beetstra T C 11:57, 25 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 20:02, 25 August 2019 (UTC)

It seems hikvision.com is the official site and not a spam site. Matthew hk (talk) 01:04, 26 August 2019 (UTC)

paydayloansto1000.com
Hello! Please remove this source from the blacklist. This site has useful information about payday loans and doesn't use spam methods. — Preceding unsigned comment added by GrandmasterXIII (talk • contribs) 21:34, 27 August 2019 (UTC)
 * . It looks like a sketchy 5-month-old proxied domain, asking for sensitive personal information. I can't even begin to imagine why you feel it would be useful, unless for some reason I need to quickly have my identity stolen. Kuru   (talk)  01:26, 28 August 2019 (UTC)

slotozilla.com

 * and a few more (see COIBot report)

Recurring spam for a slot game website. Multiple warnings and several blocks have been ignored. No foreseeable encyclopedic usage. GermanJoe (talk) 16:37, 27 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 04:45, 28 August 2019 (UTC)

indiagift.in


This link has been spammed across a mix of articles since the start of August. Blackmane (talk) 03:29, 28 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 04:48, 28 August 2019 (UTC)

globalresearch.ca


This domain is operated by Michel Chossudovsky's Centre for Research on Globalization. There is consensus at that the website is almost never usable on Wikipedia, as it is a highly questionable source that is well-known for publishing conspiracy theories. Editors have expressed interest in adding this domain to the spam blacklist, as it has been added to. —  Newslinger  talk   11:22, 27 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 04:39, 28 August 2019 (UTC)

Thanks, ! For the record, the following two domains are also associated with the Global Research website, although they are not currently being used on Wikipedia:. Could you add them as well?



—  Newslinger  talk   04:50, 28 August 2019 (UTC)
 * to MediaWiki:Spam-blacklist. — JJMC89 (T·C) 04:53, 28 August 2019 (UTC)

Famous Birthdays
This website provides neutral bios for living people, they only post articles for notable people, and their editors compile the info that they find from reliable sources. UseTheWiki (talk) 19:02, 29 August 2019 (UTC)
 * ❌ The consensus here is that the site does not meet reliable sources guidelines, and thus has no use on Wikipedia. OhNo itsJamie Talk 19:16, 29 August 2019 (UTC)

le-corps.com


Repeatedly spammed by (now blocked) user Simransharma7440. (check contribs) James-the-Charizard (talk) 04:52, 24 August 2019 (UTC)
 * Can someone look at this...? James-the-Charizard (talk) 21:04, 29 August 2019 (UTC)