User talk:HelpUsStopSpam/Archive 1

Welcome!
Hello, HelpUsStopSpam, and welcome to Wikipedia! Thank you for your contributions. I hope you like the place and decide to stay. Here are a few links to pages you might find helpful:
 * Introduction and Getting started
 * Contributing to Wikipedia
 * The five pillars of Wikipedia
 * How to edit a page and How to develop articles
 * How to create your first article
 * Simplified Manual of Style

You may also want to take the Wikipedia Adventure, an interactive tour that will help you learn the basics of editing Wikipedia.

Please remember to sign your messages on talk pages by typing four tildes ( ~~~~ ); this will automatically insert your username and the date. If you need help, check out Questions, ask me on my talk page, or ask for help on your talk page, and a volunteer should respond shortly. Again, welcome! RJFJR (talk) 13:54, 7 February 2015 (UTC)

hierarchical k-means clustering
You claim that 'hierarchical k means clustering' is already mentioned but I can't see it. Could you edit the appropriate part to say "(also known as hierarchical k-means clustering)", because I don't get information from reading the existing article. I've never heard of "x-means" but I have heard of hierarchical-.


 * @Fmadd: Look for X-Means and G-Means. You may not have heard of X-means, but it has 1683 citations on google scholar. These two are "hierarchical k-means", done right. HelpUsStopSpam (talk) 08:40, 14 May 2016 (UTC)

so there are even more variations? great. Can they be listed somewhere, along with the term 'hierarchical k-means'? This is the beauty of a wiki surely. people can discover information more easily. I would never have come across these terms were it not for that. Someone else might search for 'hierarchical k-means' - it's an intuitive label to give such a technique, given two more widely known terms. "see also" / "a variation of ..." - whatever - this concept should be explained.

You also say "done right", are you really claiming that these are 'right' for every use case that will ever appear, in every niche? There are so many variations of contexts that algorithms appear in, different blends work better in different cases.. size/space/accuracy tradeoffs Fmadd (talk) 08:46, 14 May 2016 (UTC)


 * @Fmadd it's "done right" because they tackle the obvious problem of "when to stop", and not just propose the trivial idea of repeatedly running k-means (which is not a notable idea on its own). There are also about 1000 k-means variations, not all of which are notable. HelpUsStopSpam (talk) 08:48, 14 May 2016 (UTC)

ok, so sure with 1000 variations not everything deserves a unique article. What we have now with "hierarchical variations such as ... x means.." is kind of ok, but I still think variation exists here. With many variations you'd want to describe broad categories as well as drill down.

I've read the x-means paper now. If I've understood it correctly, its main goal *is* the determination of the best value of k? (which is why it's called x-means) hk-means however is aimed specifically at producing a tree structure of clusters (which in turn may be motivated by acceleration. "obvious problem of when.." .. but sometimes you still have a fixed 'k' target, something that will fill a buffer, a minimum efficient vector batch, whatever).

The x-means paper talks about using kdtrees to accelerate. So I see they are definitely related (in running a division step), but they are aimed at slightly different problems.

"not just propose the trivial idea of repeatedly running k-means (which is not a notable idea on its own)" ... so I'll stop short of wanting a 'dedicated article'.. it just wants a mention or section somewhere. the hierarchical clustering article is probably a better place for a more detailed description, but it can link back. Fmadd (talk) 10:27, 14 May 2016 (UTC)

"which is not a notable idea on its own"/ "non notable re-invention of the wheel"
It's an easy-to-find term, from which you could create links to other algorithms. The other algorithms you've presented don't deal with the specific situation I have in mind.

I am after a clustering algorithm that gives me a fixed depth and breadth, e.g. 4096 clusters organised as 16^3 or 4^6, and an algorithm that fills that optimally. It would be a variation of 'hierarchical k-means'. I'm not looking to 'spam' wikipedia with *that*, but it strikes me that wikipedia should be able to describe 'the space of clustering algorithms' such that there's a point that would describe such a solution. I don't know what it's called but I'm sure someone, somewhere else, has already done it and given it a name. Clustering algorithms are used in graphics for building BVHs for collision detection & rendering.
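For readers following this thread, the "fixed depth and breadth" idea can be sketched roughly as below. This is only a minimal illustration of plain recursive k-means (the "trivial" variant discussed above), not X-means or G-means, and it makes no attempt at the optimal filling asked for: a node may end up with fewer than `branching` children, and leaves may be unevenly sized. The function names `kmeans`, `hierarchical_kmeans` and `leaves` are hypothetical and chosen for this example.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples.
    Returns the non-empty clusters (at most k lists of points)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumes len(points) >= k
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared Euclidean distance)
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[j].append(p)
        for i, c in enumerate(clusters):
            if c:  # keep the old center if a cluster went empty
                centers[i] = tuple(sum(x) / len(c) for x in zip(*c))
    return [c for c in clusters if c]

def hierarchical_kmeans(points, branching, depth):
    """Split recursively into up to `branching` clusters for `depth` levels.
    Returns a nested dict; a node without children is a leaf."""
    if depth == 0 or len(points) <= branching:
        return {"points": points, "children": []}
    children = [hierarchical_kmeans(c, branching, depth - 1)
                for c in kmeans(points, branching)]
    return {"points": points, "children": children}

def leaves(node):
    """Collect the point lists stored at the leaves of the tree."""
    if not node["children"]:
        return [node["points"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]
```

With `branching=16` and `depth=3` this yields at most 4096 leaf clusters; filling exactly that many leaves in a balanced way would need an extra constraint that this sketch does not implement.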


 * Wikipedia does not aim to be a WP:DIRECTORY of all existing algorithms. Only of notable variants. HelpUsStopSpam (talk) 07:41, 15 May 2016 (UTC)

Waterfall mg / Mehr86
I will take your word for it that User:Waterfall mg is a sock of User:Mehr86, but shouldn't there be some sort of warning on his talk page, etc.? Argyriou (talk) 18:33, 4 August 2016 (UTC)


 * like the "Please don't spam" warning he removed himself? Both accounts are a clear case of citation dropping. All they did was insert references to the same three articles wherever possible, unfortunately. HelpUsStopSpam (talk) 09:49, 5 August 2016 (UTC)

Further reading on the Big data article & spam assessment
Peaceray (talk) 20:00, 16 September 2016 (UTC)

Data Mining
I have the Italian version of the book, but there's also an English version. If you read at least the summary, you will see that time-series analysis belongs to the data mining universe. Lbertolotti (talk) 23:59, 1 October 2016 (UTC)


 * I'm not saying that time-series analysis is not related to data mining. But in the original definition it was not included as one of the methods. One may well argue that A) time-series analysis long predates data mining (e.g. it has been done in statistics for decades) and B) it is not a method, but change detection and auto-regression are methods. Fayyad is the most notable source for "data mining". That Italian book is not particularly well-known with just 13 citations (Fayyad has 8034); and it's also based on the rather peculiar "business intelligence" point of view. We don't just add everything because someone has said so, but we want the most notable sources. Fayyad is supposedly the definition of the data mining process, so we should treat it as authoritative. He mentions time-series, but as an application, not as a method. The way you added time-series contradicts that original source! HelpUsStopSpam (talk) 15:09, 2 October 2016 (UTC)

You may think the source is not relevant, but this doesn't automatically make it "spam". Lbertolotti (talk) 16:39, 2 October 2016 (UTC)


 * almost every time an addition just adds a reference, it is WP:CITESPAM; in particular when it is a reference with next to no citations, slightly off-topic, and added to a popular article. And spam is a major issue on Wikipedia, in particular citation spam. HelpUsStopSpam (talk) 19:30, 2 October 2016 (UTC)

I added it to verify article content, furthermore I only added it to the data mining article, so I can't see how that is citation spamming. Nevermind, I just think something should be said about time-series analysis, otherwise the reader will think it has nothing to do with data mining. Lbertolotti (talk) 20:51, 2 October 2016 (UTC)

Polariton Interferometer
Please have a look at Polariton Interferometer, and User:Polaritonics. I've tried to clean up the article (after he citespammed Permeability (earth sciences)), but I'm not sure the Polariton really counts as notable, or if the article is still too promotional. Argyriou (talk) 05:09, 23 October 2016 (UTC)
 * Note Conflict of interest/Noticeboard has been opened for comments. - Brianhe (talk) 01:59, 3 November 2016 (UTC)

Redundant categories?
Hi! I saw that you reverted the edits I made in Feature learning, Cluster analysis and K-means clustering, with the motivation that the categories are redundant. In which way are they redundant? —Kri (talk) 11:38, 5 January 2017 (UTC)


 * Avoid putting entries into all categories that seem fit. Feature learning is in Category:Machine learning already; so don't put it into Unsupervised and Supervised additionally. When you have such closely related categories, put it into the most adequate category only. Since it can be both supervised or unsupervised, the parent category is the best match. Also avoid Overcategorization. Focus on navigation: how often is a user likely to use this for navigating from one topic to another? "Unsupervised learning" is likely not very helpful, because the user will not have a generic "unsupervised problem", but rather e.g. clustering, or image recognition, or he is doing deep learning; so he will use only these categories for navigation. (Plus, in my opinion, categories are dead anyway.) HelpUsStopSpam (talk) 18:41, 5 January 2017 (UTC)

Breitbart
hi. please check and update thanks

http://www.breitbart.com/london/2017/01/08/fake-news-fake-news-media-sow-division-with-dishonest-attack-on-breitbarts-allahu-akbar-church-fire-story/ --A3420783249083290324098432 (talk) 05:04, 9 January 2017 (UTC)

Thank you for removing the references
Thank you for removing the references — Preceding unsigned comment added by Stateditor (talk • contribs) 03:04, 6 November 2017 (UTC)


 * I would prefer if you didn't add them in the first place... HelpUsStopSpam (talk) 01:40, 8 November 2017 (UTC)

Deletion of my edit on Machine Learning (ML)
Why did you delete my insertion on Machine Learning on the applications of ML? There are plenty of applications there and one NEW one is forecasting of the housing market, as it was published in a well-respected journal: “Forecasting the U.S. Real House Price Index”, with T. Papadimitriou, V. Plakandaras and R. Gupta, Economic Modelling, vol. 45, pp. 259-267, 2015. Impact Factor 0.736. So what is the problem with that? Why is it spam? Periklis Gogas (talk) 08:27, 22 April 2016 (UTC)
 * Because you have a WP:Conflict of Interest, advertising your friend-and-coauthor's work. HelpUsStopSpam (talk) 20:35, 22 April 2016 (UTC)

It is notable academic work in prestigious economics and finance journals. This my friend is not spam. In academia we call it a reference. There are no other notable economics or finance references included in the page. I guess you are not an economist... Advertising means promoting something for profit. Academic articles in highly regarded peer-reviewed journals are not advertisement. It is a citation. Periklis Gogas (talk) 08:46, 10 July 2016 (UTC)

And I quote from COI rules: "Using material you have written or published is allowed within reason, but only if it is relevant, conforms to the content policies, including WP:SELFPUB, and is not excessive. Citations should be in the third person and should not place undue emphasis on your work. When in doubt, defer to the community's opinion."

Thus mine was highly relevant, is not excessive and not written in first person. All the requirements for a scientific citation are there. It is strange and counterproductive, all this for a valid reference from a prestigious journal! Periklis Gogas (talk) 08:54, 10 July 2016 (UTC)


 * Notability is not transitive. Just because it was published in a notable journal does not make every article notable. Time will tell what is notable, the article is not yet notable; in particular not for machine learning (it may be notable for housing prices). The way you inserted your article into the article is clearly for advertisement of your own work, and thus WP:COI. How many articles do you think there are that apply machine learning? We must not cite all of them. There are some 30 domains listed where machine learning has been used - we can probably find about 10000 articles... why include yours, and not, say, Facebook using machine learning to detect faces? HelpUsStopSpam (talk) 09:59, 10 July 2016 (UTC)
 * you being comparable to Ray Solomonoff and Vladimir Vapnik (because you placed yourself just after these two!), but that may be stretching your contributions to Machine Learning a little bit. HelpUsStopSpam (talk) 10:51, 10 July 2016 (UTC)


 * You do not seem to want to understand the point here. First you accuse me of ADVERTISING myself and I have proven you wrong.

First you may find 1000000 articles using ML for Facebook but you will find VERY few using it in Economics. Less than 10 in Macroeconomics. You are not an expert on ALL fields try to be more modest! I included mine because there was no example of ML from Economics in Wikipedia as there is from other areas and that is wrong. If someone else publishes an article in a more prestigious journal, with better results or whatever he/she can replace mine as an example. I do not care.


 * You cannot accept that you are wrong but it is understandable it is human nature... The article is one of the first attempts in using ML in economics. Of course there are more (not a lot as it is new in economics you are no expert on ALL fields).


 * For EVERY article there are plenty of others. According to your reasoning NO article should be used as an example unless it is the FIRST one published and of course we know this is wrong. I know it is hard to accept that you are WRONG... The article was placed as an example article from the field of economics and it is in accordance with all requirements. I can find 100s of similar references that are not spam. Obviously you are no academic and have no idea what a reference and a citation are. My article being there does nothing for me. It helps the readers see one of the first examples of such work in economics. The work is published where it should be. It is not spam as you WRONGLY suggested and I have proven this from the guidelines I quoted above. It is a simple reference and you are biased and cannot accept you are wrong. I understand, you feel like you have power. Enjoy the "power"... :D I have better things to do than being biased and wrong in Wikipedia...


 * PS I will not read the response I do not care. You are biased.
 * Periklis Gogas (talk) 10:22, 19 July 2016 (UTC)


 * I don't need to be an expert to be able to tell the difference between:
 * 1. Christopher Bishop (1995). Neural Networks for Pattern Recognition, 24769 citations,
 * 2. Vladimir Vapnik (1998). Statistical Learning Theory, 27157 citations, inventor of SVM.
 * 3. Ray Solomonoff (1956, 1957, 1964). An Inductive Inference Machine, 2571 citations, inventor of algorithmic probability.
 * 4. Periklis Gogas et al. (2015). Yield Curve Point Triplets in Recession Forecasting, 0 citations, advertising himself on Wikipedia.
 * It is very objective that you don't belong in this list. HelpUsStopSpam (talk) 19:34, 24 July 2016 (UTC)


 * Well, I'm afraid you may have gone overboard here. I see many of your points and your posts serve a great purpose. If Wikipedia only cites original inventors, then it would be mostly a static reference guide. You can't compare the number of citation of a new article to papers published dozens of years ago. I think you should take back this unprovoked attack and revert back the post of Periklis Gogas. At a bare minimum - you should bring that issue up on the Talk page for this topic. Thanks.  — Preceding unsigned comment added by 141.213.172.238 (talk) 23:36, 14 November 2016 (UTC)
 * I agree with this. I have my own issues stated below. Something is terribly wrong here. This "HelpUsStopSpam" takes out whole contributions - not just the attached reference - and does not even bother to check the talk-page to see if other wiki editors accepted the contribution in the first place. "HelpUsStopSpam" is unilaterally and thoughtlessly cutting entire sections of wikipedia apparently without consulting other editors. This is not how it works. He cut out a 2009 contribution of mine, agreed upon by other editors, and successively edited over the years (and accepted) and he tears the whole section out! I do not believe wikipedia editors could agree to this. TonyMath (talk) 05:22, 4 December 2017 (UTC)

Deletion of an added reference to article on Foldy–Wouthuysen_transformation
We have the right to know if "HelpUsStopSpam" is an appointed wiki editor(s) or a self-appointed vigilante(?) Also does his background include Physics? Moore's decoupling technique described in Foldy–Wouthuysen_transformation is cited in ref. 13 (a 1990 J. Comp. Phys. paper - a high quality computational Physics journal), and well established as Moore's technique in references therein. Moore's work is also cited in the papers by e.g. W. Kutzelnigg. The reference removed was an earlier reference to his work and put in the right context - just before the formulation. If you add a reference by an author of a certain work already established decades ago in the very wiki section about the work in question, what can possibly be the conflict of interest? Moreover, this is certainly not a self-publication! This involves respectable journal publications (I also note the complaints and points made by other wiki users in this talk page). I do get it that we do not want cheap promotion of e.g. companies or recent not-so-good papers but I submit "HelpUsStopSpam" goes overboard. When it comes down to it, wiki contributions have to be made by people with at least a certain experience in the very areas they contribute to. That is especially true for Science and Mathematics. Experts in any specialized area know other experts in that very same area and (of course) have their own high-held opinions about their respective work. Even Scientists and Mathematicians promote their work while rating the work of others. There is no such thing as a purely disinterested party; a totally disinterested party would not even make the contribution to wiki in the first place. If "HelpUsStopSpam" has been appointed by the body of wiki editors to perform these article reversion tasks, that is one thing but if he has not been appointed, then he is himself an interested party imposing his own standards on all of Wikipedia itself and that will not do! TonyMath (talk) 16:33, 3 December 2017 (UTC)


 * first of all, it looks very much as if you are an author of that paper. If so, WP:COI tells you that you should certainly not add it there. Furthermore, that article, on Google Scholar, has exactly 2 citations. Since 1987. That indicates this may be either completely irrelevant research, or there is something obviously wrong in that paper... but by any means, the key issue here is WP:SELFCITE. That you already spammed Wikipedia in 2009 and nobody noticed  doesn't make this any better. HelpUsStopSpam (talk) 23:47, 3 December 2017 (UTC)


 * This will not do. I remember my first contribution to wikipedia and I was invited to write a section and put references to my papers after agreement by other editors in the corresponding talk-page. I was told to be bold. We are encouraged to make pro-active contributions like this. Self-interest is not necessarily conflict of interest. And I quote from the COI rules: "Using material you have written or published is allowed within reason, but only if it is relevant, conforms to the content policies, including WP:SELFPUB, and is not excessive. Citations should be in the third person and should not place undue emphasis on your work. When in doubt, defer to the community's opinion." This has already been quoted to you and you have ignored it. Moreover, you don't have the right to vandalize content for whatever reason and unilaterally impose your veto. Taking out content has to be discussed in the corresponding talk-page and agreed upon by other editors, often with a main wikipedia editor to make a final decision. The J. Comp. Phys. paper has 18 citations which is pretty good for a specialized niche area in theoretical physics. I see that your contributions start in early 2016. That is recent. The 2009 Wikipedia entry was not spam and was noticed and accepted by the other wiki editors. You identify yourself as "We" but unless other editors, in particular, a main editor, vouch for you and your actions to me or on your talk-page, I will assume you are nothing more than a self-appointed vigilante and act accordingly. Someone who claims that ResearchGate is a spam generator and that citations to prominent journals are also spam, is a subjective extremist (and that is putting it nicely). I do not believe you are a main wikipedia editor and empowered to make these changes. TonyMath (talk) 03:08, 4 December 2017 (UTC)
 * Something else I noticed, when you state "...If you are an author...", it's clear that you are assuming that the contributor is the author without proof and without even being certain and then you cut out whole wiki sections citing the author, that and the very fact that you don't bother to check the talk-page to see if the contribution was accepted and agreed upon by other editors, or for that matter even understand the material and its dynamics, shows just how reckless and subjective you are. Your so-called contributions are just knee-jerk thoughtless removal of whole wiki sections based on your rigid notions of self-cite.  TonyMath (talk) 05:41, 4 December 2017 (UTC)
 * This should make you happier: After reverting your changes, I removed a couple of citations which you claimed as spam. So you largely get what you want.  These citations were repeated from other wiki sites to make the interconnecting sites self-contained (Wiki is always begging for citations).  Mind you, I restored the text (not the citation) in the quantum mechanical section of the Euler 3-body section.  That site wanted my contribution (By pulling the whole thing out as you did, you were doing a hatchet job on the site). Some things you need to understand: The Physics and Math communities are much smaller than IT.   Hence the number of citations and impact factors are commensurately much smaller than e.g. Medicine or IT. Wikipedia wants references in high quality journals but citations is not part of its criteria per se.  It's just yours and you don't seem to realize that the impact metrics change from area to area. E.g. Scholarpedia has high quality articles - no spam, no fat.  However, because of the rarefied atmosphere of Scientific communities, it is a safe bet that the author/expert of their own wiki articles probably know the very authors of the subjects they write about.  Yes, it's a comparatively more inbred community.  When Einstein wrote his famous 1916 review on General Relativity, only 3 people in the world understood it (and they all knew Einstein).  However, it had immediate impact but it would take a long time before it would be vindicated and for citations to build up. TonyMath (talk) 05:49, 5 December 2017 (UTC)


 * I am simply not online every day.
 * "invited on the talk page" - where?
 * "18 citations" - out of which at least 5 are self-cites. I doubt this is considered to be good for any domain, including theoretical physics...
 * The Kutzelnigg paper cites an earlier work, by Moore, to quote: "significantly different and less satisfactory".
 * I am concerned with the amount of paper spam on Wikipedia - which is a LOT - and yes, I may be rather inclined to clean up rather than to keep everything. My editing pattern is not particularly hard to understand. See e.g. . If I see new references added, and the stuff is just published ("Date Added to IEEE Xplore: 16 October 2017") and has next to no citations, then it's someone doing WP:CITESPAM. Which happens a lot. Your edit exhibits exactly such a "citation farming" behavior. The edit only adds a citation, no text, and the paper has 2 citations since 1986 on Google Scholar. In other words,  your paper was completely ignored by the scientific community (except for the dismissal as "more complicated" and "less satisfactory"). But then why is it "encyclopedic"? HelpUsStopSpam (talk) 22:05, 7 December 2017 (UTC)
 * You are taking the quote out of context. The paper was obviously not ignored because Kutzelnigg himself used Moore's method (and cites it) as the basis of his own work which he believed he had improved upon, hence the comments. You obviously don't know what you are talking about and there is no point in continuing this discussion. TonyMath (talk) 06:35, 8 December 2017 (UTC)

HelpUsStopSpam creates vandalism
Hello, HelpUsStopSpam

We have noticed you are removing categories from data-science articles (such as DataMelt). We have issued an official complaint to Wikipedia about your activity. Your activity also correlates with several attempts to reach DataMelt administrators, offering help to improve the DataMelt article (for money). After we declined this offer, somebody started to edit the DataMelt article (removing categories). We do not have any doubts this was YOU, since we did not see this vandalism in the many years this article has existed.

Please stop your activity.

T.Smatsar, jWork.ORG administrator


 * by the definition of WP:COI, you should not edit the article, and not remove the COI tag. If someone "offered" such services to you, that was not me. I disdain commercial editing to Wikipedia, and flag these as "COI", just as I did with your article... and as you can see from the reactions, others agree with the COI problem on your article. HelpUsStopSpam (talk) 20:11, 7 May 2018 (UTC)


 * in fact, these were probably your friends at predictiveanalyticstoday - they appear to do this kind of advertisement jobs, that I have to undo then. HelpUsStopSpam (talk) 20:34, 7 May 2018 (UTC)

Hello, HelpUsStopSpam. If it was not you who offered to review DataMelt on Wikipedia, then I apologize. This may be a very strange coincidence, since your edits of this article started right after this offer.


 * well, it was not me offering that. HelpUsStopSpam (talk) 22:09, 8 May 2018 (UTC)

No headline
Stop spamming the data science article... you dont know a thing about data science. I am a professor of data science for god's sake. — Preceding unsigned comment added by 144.214.107.76 (talk) 03:50, 27 April 2018 (UTC)


 * And I am Donald Duck. So what? You then probably should know that classification is part of machine learning.

We should report your stupid editing to the world. Empty brain... show me one single publication about you actually have. — Preceding unsigned comment added by 61.10.112.50 (talk) 14:44, 11 May 2018 (UTC)


 * show me one single publication about you actually have yourself. And stop insulting people, Mr. 'Call me Professor' Anonymous. HelpUsStopSpam (talk) 17:39, 11 May 2018 (UTC)


 * For a "professor", you are behaving rather childishly, and not professionally at all! HelpUsStopSpam (talk) 16:12, 19 May 2018 (UTC)

Multimodel Deep Learning
Hi HelpUsStopSpam, How are you? I removed your prod. I monitor the computer science article alerts, and I need to know there is an absolutely rational reason for deleting it. If you still think it needs to be removed from WP, please take it to WP:AFD. scope_creep (talk) 09:18, 10 June 2018 (UTC)
 * Articles for deletion/Multimodel Deep Learning HelpUsStopSpam (talk) 11:33, 10 June 2018 (UTC)

Good faith, really?
HelpUsStopSpam, we are quite sure that you are one of the editors who strike Wikipedia articles after you don't get what you want ($). You do not have this big appetite for removal of software articles ( AIDA_(computing), HippoDraw etc.) which do not have references, and no people behind. They are useless for you, so you make comments for such articles, but keep them in Wikipedia. DataMelt, deleted because of you, is different. Right? It had too many references, reviews etc., so it was a good target. We plan to contact admins related your activity. The internet is bitter than Wikipedia, so pay attention to other resources. — Preceding unsigned comment added by Tsma73 (talk • contribs) 00:50, 1 August 2018 (UTC)


 * As I told you before, I do not do paid edits, and I have no idea who offered you whatever they offered you. Why don't you just disclose the offer that you got? I would be in fact interested in who contacted you with such an offer - and undo their edits: because of undisclosed paid editing and the resulting WP:COI.
 * My interest here is to fight spam. In particular, I am concerned by people that abuse Wikipedia just for marketing their non-notable software (commercial spam) and advertise their publications (academic spam). And those who qualify as both, such as yours. Non-notable software that relies on Wikipedia for sales. HelpUsStopSpam (talk) 19:55, 1 August 2018 (UTC)

CRISP-DM
Dear HelpUsStopSpam,

Thank you for your keen attention to the CRISP-DM page. I'm Majid Bahrepour, and I have been using CRISP-DM for the last 10 years. In my recent projects, I saw some shortcomings of this model and added a new step to it. This new step may seem a marginal change, but it actually has a big impact on reducing errors and data-quality problems (which usually happen in the data preparation phase) for CRISP-DM users.

I believe one section is needed to be added in CRISP-DM Wiki page to let everyone present their modifications on CRISP-DM.

The readers then are able to see the recent advances in CRISP-DM as well.

I have retrieved my text and hope you agree with leaving a section for recent modifications on CRISP-DM.

Best regards,

Majid Bahrepour Mojassamehleiden (talk) 13:42, 21 August 2018 (UTC) Leiden, the Netherlands


 * no. What Wikipedia is not. If you think you substantially improved CRISP-DM (which is dead as a duck), then publish this in appropriate channels (e.g., a journal or magazine). When others praise your ideas in reliable sources, then we can add it. HelpUsStopSpam (talk) 22:41, 21 August 2018 (UTC)

sourced to particular wikipedia page revisions. i.e., actually unsourced.
I see the above (Subject line) comment from you to my edit when it was removed.

1. I didn't read anywhere that says Wikipedia pages cannot be referenced! If Wiki pages are credible, why can't they be sourced? 2. In this case, the edits were to address the career & disciplines driving Big Data at high-level on the Big Data page itself.

Could you please direct me to the terms of use or policy that describes what you stated in your comment?!

That way, newbie editors like me would be aware of such nuances and also to know that you are not spamming our edits!

Thank you. — Preceding unsigned comment added by Apar rajendran (talk • contribs) 19:09, 16 December 2018 (UTC)


 * Wikipedia is a WP:TERTIARY source. It writes on sources writing about things. "Wikipedia articles may not be used as tertiary sources in other Wikipedia articles, but are sometimes used as primary sources in articles about Wikipedia itself.". There is also WP:CIRCULAR which says: "Do not use articles from Wikipedia (whether this English Wikipedia or Wikipedias in other languages) as sources. Also, do not use websites that mirror Wikipedia content or publications that rely on material from Wikipedia as sources. Content from a Wikipedia article is not considered reliable unless it is backed up by citing reliable sources. Confirm that these sources support the content, then use them directly." HelpUsStopSpam (talk) 18:04, 17 December 2018 (UTC)


 * As for your contribution - which got reverted again - please provide a source that actually backs your claim (that there are these roles in big data; not just some remotely related source). HelpUsStopSpam (talk) 18:06, 17 December 2018 (UTC)

Cluster Analysis page edited by SPA
The Cluster Analysis page is being overedited/overhauled by a special purpose account. It deserves a look and some scrutiny. He keeps reverting me and ignoring my signals. Limit-theorem (talk) 18:04, 2 February 2019 (UTC)


 * The Cluster Analysis page has been extended by me and all the statements made have references to state-of-the-art scientific papers published in peer-reviewed journals and conferences. I have not deleted any of the original material, just restructured it a bit, and significantly extended and modified some statements (citing trusted scientific papers). I believe that the article has been improved by these changes and would be glad to receive some feedback and refinements for the edits, but the account just "undo"-ed my extensions multiple times instead of refining them further. --Glokc (talk) 18:15, 2 February 2019 (UTC)


 * While the edits are a bit too excessive for my liking (and therefore should probably have been discussed on the talk page first, at least after the first revert), and a bit suspicious because the user has not edited anything before, I am not convinced that they are spam. I don't like them because much of the addition is a rather useless list of abbreviated algorithms. And some of the "survey" articles are pretty bad these days, too... Some of the rewriting contradicts my shallow understanding of the matter (e.g., "Nowadays ... clustering is used for various applied tasks including ... construction of the numerical taxonomy, botryology", when numerical taxonomy is where clustering was actually originally used first? Like in the 60s, not so much "nowadays".) There is one reference in there that I consider suspicious (Lutov 2018, no citations), but apart from that it does not look like the usual spam to me, so I leave this to the domain experts to handle. When you get reverted once, using the Talk page is usually a good idea. Try to find consensus rather than trying to "win" by repeated reverting. HelpUsStopSpam (talk) 22:21, 2 February 2019 (UTC)
 * Thanks. We will watch. Limit-theorem (talk) 00:08, 3 February 2019 (UTC)
 * With the latest additions, it seems that the user may indeed have a WP:COI and is trying to promote Clubmark, which has 0 citations and now plays a prominent role here. In fact, it very much looks like a WP:COPYVIO of at least page 2 of: https://exascale.info/assets/pdf/icdm18_clubmark.pdf IMHO it is also mostly misplaced, because there is a separate article on community detection... What do you think? HelpUsStopSpam (talk) 09:58, 4 February 2019 (UTC)
 * I extended the Cluster Analysis page with all the published benchmarking frameworks for evaluating clustering algorithms that I am aware of. Please refine the article by restructuring or modifying it to make it better. Yes, Clubmark is the newest benchmark; it was presented at ICDM this November, and either it does not have any citations yet or the citing publications have not been indexed. However, this benchmark does have a published paper and provides both executables and sources, as do most of the other mentioned benchmarking frameworks.
 * Community structure detection is a subset of the clustering task from the perspective of graph clustering specified by pairwise relations, and a superset of the clustering task from the perspective of extending clustering with additional heuristics to model and recover actual social communities instead of (or based on) the clusters found by some (intrinsic) statistical properties. All the mentioned benchmarking frameworks and toolkits evaluate pure clustering algorithms and/or the clusters formed by these algorithms. --Glokc (talk) 10:28, 4 February 2019 (UTC)
 * Moreover, when the resulting clusters are evaluated by both intrinsic and extrinsic measures, it does not matter whether the clusters are abstract or represent social communities, taxonomy topics, functional modules in biology, or anything else. In any case, all the listed frameworks evaluate the pure resulting clusters, and some of them additionally evaluate the resource consumption of the executing algorithms or provide visualizations and recommendations based on the numerical results. --Glokc (talk) 10:43, 4 February 2019 (UTC)

RD1 request
You requested an RD1 for Eigenfunction, but you didn't identify the source. I thought perhaps it would be this site, but I don't see it. -- S Philbrick (Talk) 22:39, 18 May 2019 (UTC)
 * Sorry, I put it into the template, but without the url= prefix. HelpUsStopSpam (talk) 23:13, 18 May 2019 (UTC)
 * Thanks. I completed the RD1. S Philbrick (Talk) 00:19, 19 May 2019 (UTC)

Consensus clustering Wikipedia page
Dear HelpUsStopSpam

We have just had this paper accepted by Scientific Reports, and it is totally a notable piece of work. The Bioconductor package has around 250 independent users a month downloading it. I would like to point out that the section we added our work to comes entirely from a single author's paper. Our work was based on theirs, and we built and extended upon it. I am happy to discuss this matter in more detail by email, because I strongly believe this is not spam, regardless of self-citation: chris.r.john@gmail.com.

Best wishes,

Chris — Preceding unsigned comment added by Chris.r.john (talk • contribs) 11:40, 15 January 2020 (UTC)


 * Please review the Conflict of interest guidelines. Everybody thinks their own work is totally notable; but only time and other authors will tell. Adding everything makes articles unreadable! For scientific work, we mostly go by how much attention it has achieved in other scientific work as well as news article coverage. HelpUsStopSpam (talk) 16:31, 18 January 2020 (UTC)

Dear HelpUsStopSpam

In the consensus clustering Wikipedia article, what I added was limited and proportionate, and it contributed towards making a better article. I have also added much more detail on other people's work throughout the page and made it better overall. Just because an article is new and does not have many citations does not mean it is not notable. Some methods with many citations have fundamental problems, as citations do not equate to validity. I think it is better to judge what is notable based on a good understanding of the area of interest, although I can understand and appreciate your perspective.

Best wishes,

Chris — Preceding unsigned comment added by Chris.r.john (talk • contribs) 07:07, 19 January 2020 (UTC)