Talk:STATISTICA

EntropyAS
This is NOT the place to give a full sales spruik on Statistica. There is a link to Statsoft's website (do you work for them perhaps?) where people can get that kind of detail.

Perhaps you might want to add a concise list of the features Statistica has that differentiate it from SPSS, SAS, S and R. That would be useful as long as it wasn't a sales brochure. As an example of what is NOT appropriate, take your section on the use of Statistica in various industries: SPSS and SAS are used in all the industries you list, and given that Statistica doesn't run on mainframes, you'll find that banking and finance (for example) at the big end of town would regard Statistica as a toy.

Don't get me wrong, Statistica is an awesome piece of software for performing stats on a PC, and is a pleasure to use (much nicer to use than SPSS). But WP is not the place for sales brochures. Johnpf 23:24, 3 October 2006 (UTC)

I should also point out that the continuing dumping of text will make people who look at the History of this page doubt Statsoft's ethics. You might want to think about that before you TextDump here again. Johnpf 23:26, 3 October 2006 (UTC)

63.172.193.43
This person is an employee of Statsoft Inc:

$ whois 63.172.193.43
Sprint SPRN-BLKS (NET-63-160-0-0-1) 63.160.0.0 - 63.175.255.255
STATSOFT, INC. SPRINTLINK (NET-63-172-193-32-1) 63.172.193.32 - 63.172.193.47

Perhaps StatSoft would be happier for this page to be deleted? Or should we just get a block from their netblock?

Please Mr StatSoft/EntropyAS, if you are going to write an article on STATISTICA then make it an article, not a sales brochure. Mind you, if STATISTICA is as good as YOU claim it is, with the market penetration you claim, then why do you have to put a sales brochure on WP anyway? Johnpf 22:30, 12 October 2006 (UTC)

Do you really think deletion of this talk page in any way supports your position of using WP as a sales brochure? Look at the edit history of this discussion page. Johnpf 23:14, 13 October 2006 (UTC)

Johnpf
Thank you for removing the vandalism from the Statistica page. I can see that you are a very active and concerned member of the Wikipedia community, and I do admit that I am new at this, so please forgive me if I am asking this in the wrong forum. My intention when adding info to the Statistica page was to present information about the software in a similar fashion as SAS, SPSS, and other software companies had done. I am wondering what was specifically different about the information that was originally on the Statistica site and what is currently on other software vendors' pages. I am well aware of the policy against advertising on Wikipedia, and I'm very happy that these conventions are in place so that I can learn only objective facts about a subject matter. However, it seems like each of my attempts to revise this page and add only basic facts is labeled as "Advertising". Could you please offer suggestions about how you draw the line between fact and advertising? Or perhaps let us know which paragraphs you feel are offensive to the policy? I appreciate the guidance. Sincerely EntropyAS 22:26, 27 October 2006 (UTC)


 * I have to admit that I don't really care about the SPSS/SAS articles, as I don't have any recent experience with their products, and as such can't provide an objective and impartial view of an article about them. Over 10 years ago I taught stats at a University, and the version of SPSS for Windows we used certainly would have done a good job of selling copies of STATISTICA, as it looked like the older batch version I'd used on DEC-20 systems back in 1981, but with a set of forms in front of it - a truly horrible interactive program. Perhaps it's changed by now (I'd certainly hope so!). In terms of what you've done in the past, you've basically dumped a VERY long list of the tests and procedures that the product has. I took the liberty to look at some of the marketing material available from StatSoft, and what you wrote looked very much like it was lifted straight from there. Maybe it wasn't a straight lift, but it LOOKED like it was, and that is what made me think "ads!". The information in question read like a marketing spiel, complete with "weasel words". Perhaps you could list the features that discriminate STATISTICA from the other players (not just a list of every feature) so as to contrast the difference (in my mind it's the fact that STATISTICA is a far more interactive program, but that's probably a bit simplistic). Look at http://en.wikipedia.org/w/index.php?title=STATISTICA&diff=74104353&oldid=71901776 and see the section on Data Miner in the Analytics section. Things listed there are pretty much the menu choices - "General Modeler/Multivariate Explorer" is the Statsoft name for some functionality, and doesn't actually tell us very much about what it does (explorer - like Windows Explorer? or Internet Explorer? or perhaps it automagically tries to "explore" the data and find a multivariate function the data fits?). Do you see how the actual information content is really very low? Most of the page below the contents box is like that.
Go down further, where much is made of "Sarbanes-Oxley Compliance", yet no link is included to what that actually is. That section could be changed to include links to Sarbanes-Oxley and Document Management Systems and then be reduced to "STATISTICA includes tools to assist in complying with Sarbanes-Oxley requirements for data collection and retention.", which includes the facts, allows readers to look further into what Sarbanes-Oxley is, and is parsimonious.


 * Actually, parsimony is an important keyword for articles like this. Wise use of Reference links will allow you to point the reader to Statsoft pages for a line-by-line breakdown of the entire feature set of STATISTICA. Try to put yourself in the role of someone who has no idea what STATISTICA does or is, and then look at that page. Will the reader be enlightened or confused? I maintain that a huge text dump confuses the reader. The use of marketing language ("weasel words") could easily leave the reader believing that STATISTICA is solely a tool for business to perform market research, and little else. We both know that's not the case; in fact, STATISTICA is probably the ideal tool for quick data exploration in the field for ecologists and social scientists.


 * I can appreciate that you are probably very proud of your software, and I know from experience that one gets emotionally attached to the "modules" and their labels when you've developed a great piece of software. But when writing an encyclopedia article you must put all that aside.  Ask yourself the following question: "In 30 years time, when Statsoft is gone and forgotten, and people are wanting to look at the technologies of the late 20th and early 21st century, would this article let them know what STATISTICA is, or would it try to sell them a product that no longer exists?".  Keep in mind as well that someone who knows what all the "buzzwords" mean probably already has been exposed to STATISTICA - if they're looking at the article in WP they are probably looking for other links and references, or else are looking to discriminate the product from the others in the market.  The person who doesn't know that much about stats (i.e. someone who is just looking for a tool to do a job) won't know all the buzzwords, and probably doesn't care about them anyway.  They want something that seems easy to use and apply correctly (which STATISTICA does, although you'd be hard pressed to recognise that from the article!!).


 * Finally, I've had articles torn to pieces, I've had articles deleted, vandalised and corrupted. I've had articles re-written by history revisionists (particularly to do with bushrangers). In my opinion it's much harder to write well for Wikipedia than it is to write for most journals. I don't revert your edits just because you have a Statsoft IP address (although that is bending the rules). I revert them because they're not good encyclopedia articles, and they especially don't do the product justice; they read like a sales brochure. A program that is clear and easy to use should have a WP article that reflects that, not an article that attempts to "impress by volume". I appreciate that there is probably someone high up at Statsoft who wants an article in WP - they need to realise that a good WP article will entice people to visit Statsoft, without reading like a sales brochure. Have you ever been to an academic conference, where there's a short "sales pitch" by some company like Hearnes, where they try to sell you all sorts of books and software? Remember your reaction to the glossy "we can change the way you work" literature in the promo packages? That's what you should desperately avoid!


 * I hope this helps, and please feel free to ask me any questions the above has raised. Johnpf 01:46, 28 October 2006 (UTC)

I appreciate you taking the time to address my questions -- You have definitely given me some good things to think about. Have a good weekend. EntropyAS 13:48, 28 October 2006 (UTC)

COIN
This article has been flagged at Conflict of interest/Noticeboard. Rees11 (talk) 18:29, 27 May 2009 (UTC)

my thoughts as a reader
I saw this on the editor request page and thought it was something with which I may be familiar and able to comment.

I have had a few related issues come up while writing a piece for Dendreon and another draft idea that involves me personally. Drug companies are interesting because it is hard for the FDA to find informed parties who have no financial or personal interest in a drug candidate so disclosure is used where NPOV is difficult and multi-POV cases are presented for many people to examine.

In general, it is hard to find people who have an incentive to write responsible critical commentary on any business or product; even competitors either don't benefit or can't be objective or write with an absence of malice.

I thought this article was a bit bland and superficial but only contained passing puffery and obvious promotional language. I'm still left wondering: is there something this does or did to make it unique? Is the API patented, or did someone use this to find a telegram from ET in a background of cosmic noise? Why would this be more interesting to wiki readers than an article on how my sister's kids did at a school play (an example of something of generally personal interest competing with that of others' for time and attention)?

Nerdseeksblonde (talk) 13:34, 28 May 2009 (UTC)

NPOV dispute
The article pervasively reads like advertising, containing many non-notable facts, biased words, original research and improper handling of trademarks. The article does contain non-biased facts and can be redeemed. I'd like to clearly state that this is a discussion, so please give feedback.

Suggested changes or points of contention:

1. Only the first letter of Statistica should be capitalized, and italics may be used. See WP:MOSTM.

2. "STATISTICA has an interactive graphical user interface with customizable menus and toolbars." This is not noteworthy; a graphical user interface is common among statistical software. (The GUI could be combined with another sentence if needed.)

3. There are three basic channels to which you can direct all analysis results spreadsheets and graphs: workbooks, reports, and standalone windows. Workbooks and reports are Active X document containers, which means they can hold all native STATISTICA documents as well as other types of Active X documents, including Microsoft Excel spreadsheets and Microsoft Word documents. Workbooks contain two panels: a navigation tree on the left and a document viewer on the right that allows the user to view the document that is selected on the left. Reports display a series of spreadsheets, graphs, or other objects sequentially in a word processor style document. Workbooks have a file extension of .stw and reports have a file extension of .str. Data files are typically displayed in spreadsheets. The basic form of the spreadsheet is a two-dimensional table arranged as cases (rows) and variables (columns). STATISTICA spreadsheets have a file extension of .sta.

This block is unnecessarily verbose and contains information that belongs in a Statistica manual and not an encyclopedia article. The design of the document viewer window and name of file extensions are not notable.

Suggested re-write: Analysis results may be channeled to workbooks, reports, or standalone result windows. Workbooks and reports can be Active X documents including Microsoft Excel and Word documents.

4. "STATISTICA contains several main groups of analytic techniques, which can be accessed from the Statistica menu bar. The software provides options for common basic statistics, multivariate techniques, and advanced linear/nonlinear models. Quality control techniques include quality control charts and process analysis. The Design of Experiments module provides an interface for setting up and analyzing experiment data sets. Other main types of analyses include neural networks and data mining."

The menu bar reference is extraneous, and the utility/functionality section could be more concise. I'd recommend using bullet points, possibly linking each technique to its Wikipedia entry.

5. The software also provides three options for creating macros automatically. Analysis Macros record the settings and output options chosen for a particular analysis. A Master Macro can record a series of analyses. A Keyboard Macro records the actual series of keystrokes that a user enters via the keyboard.

This is unnecessarily verbose, and is it noteworthy? This kind of seems like reaching, and expanding verbosely on details like this in some way diminishes the product. I'd recommend including "macro framework", or creation of macros, in the functionality section.

Data import section:

6. "With this mechanism STATISTICA helps to bridge the gap of being able to get your data into the software. This is useful for the end-user who logs their own data."

Referring to the reader is improper tone and "helps to bridge the gap" is marketing speak.

7. "STATISTICA provides two interfaces for textual importing; largely because the interfaces would become too cluttered if they were combined."

I'm not sure this is noteworthy, and it contains original research. How do you know why they made this decision? WP:OR

8. "These interfaces handle fixed width text files and variable width text files. More importantly, STATISTICA's import facility is based on smart, sophisticated parsing that the end-user has good control over."

Does anyone consider this unbiased and not blatant advertising?

9. If the data being imported were viewed like an object, then the user would tell STATISTICA what characters, or sequence of characters to expect to cause a line of data to be defined. In the same way, the end-user would tell STATISTICA what characters, or sequence of characters to expect to cause a break in data points. The import facility also allows you to tell the system what to use as a decimal point, what to use as string delimiters, and what to use as escape sequence.

Is this noteworthy? Do Excel, Access, Numbers, OpenOffice, etc. not function the same way?

10. Does Sarbanes-Oxley Compliance deserve its own section? R.Vinson (talk) 15:11, 28 May 2009 (UTC)

"smart, sophisticated parsing" - that was exactly the puffery that got my attention. Most of the other stuff appears to lack notability or even general interest. Puffery is generally the use of words that don't add factual information and can't be logically determined to be true or false. I would contrast this to a list of notable positive attributes (whatever "positive" means LOL) that together could be taken as a POV. Patents, controversies, and specific notable uses for the product personally strike me as something I would want to read. I guess the salvageability issue hinges on notability - if there really are no derogatory comments (has anyone come out against the theory of relativity lately? Are they notable because you can run Monte Carlos 10x faster than something?) then a positive tone may create a POV. Alternatively, this is also a solicitation for negative notability (did this ever have a bug that cost a company a lot of money? Did it make the headlines of the WSJ due to this?).

Is there a wiki page on stat packages into which this could be combined?

Nerdseeksblonde (talk) 15:29, 28 May 2009 (UTC)

Thank you for your comments and I certainly understand your concerns. I approach you all in good faith and would appreciate the chance to edit the article some more. For example, when I noticed that another user had added a section about importing data that was not neutral, I edited this down last week. A more recent reversion by another user undid these changes. Also note that STATISTICA (capitalized) is the software's brand name, so I do not intend for my capitalization to be spamming or unproductive. As for the page's general interest, several competitors to STATISTICA have pages of similar length and subject matter, so I do not think removing only the STATISTICA page or combining it with a general page on those grounds is justified. I will edit the page right now to remove all but the most basic information and then work on adding citations and items that would be worthy of note. EntropyAS (talk) 15:52, 28 May 2009 (UTC)


 * Thank you for considering the input. I'm sure the article will be beyond NPOV dispute in no time. Regarding the Statistica (STATISTICA) issue the WP:MOSTM guidelines are pretty clear. See Time for an example. R.Vinson (talk) 16:22, 28 May 2009 (UTC)

Notability and References
This article is in dire need of some references to reliable sources to demonstrate notability. In fact, as it stands it's borderline speediable in my opinion as spam, and I am surprised it's been around for so long without being nominated for deletion per an AfD discussion. – ukexpat (talk) 16:52, 28 May 2009 (UTC)


 * I doubt there can be any problems with notability - for example there are books "Statistika su STATISTICA" (Vilnius: Margi raštai, 1998) and "Duomenų analizė su STATISTICA" (Vilnius: Margi raštai, 2003, ISBN 9986-09-256-6) by Virgilijus Sakalauskas. And if there are relevant books in Lithuanian, there simply must be at least some in English. Of course, actually citing them in the article wouldn't hurt... --Martynas Patasius (talk) 00:30, 29 May 2009 (UTC)


 * Statistica is Italian for statistics (which might cause some issues for us). The only Statistica book I found that wasn't published by Statsoft covered multiple software packages.
 * --R.Vinson (talk) 05:00, 29 May 2009 (UTC)

One poster suggested mention in academic papers, but that alone may not be enough. Can you figure out how to get CiteSeer to find "STATISTICA" without "statistical" etc? Or do you have some citations that show a unique attribute - if you know of a unique capability or author using this, can you search for that term?

And wiki doesn't seem to like primary sources in isolation (although personally as a reader I would be happy if authors hunted them down and included them, since some reviews and secondary sources are no better than the National Enquirer and as a rule you try to go right to the original source). There are many types of non-notable references. A mere footnote "we used Statistica to calculate an average of 2 numbers" wouldn't personally strike me as interesting or significant.

I did a Google search on "statistica review" and in 1994 it seems their company was quite vocal about a PC mag or similar review. Other reviews suggest it has certain good points and bad points, but no really exciting features are noted; it may be a good product for many uses, but I guess inclusion would be an editorial call here.

As a point of reference, my first article outline was based on something I personally found interesting and could describe in a very short article. I never flipped it to main pages but the one person who commented did think it may be notable and ok even with my personal involvement. The page is here,

http://en.wikipedia.org/wiki/User:Nerdseeksblonde/Marchywka_Effect

this name also seems to be associated with software with interesting features, apparently here,

http://www.spottext.com/marchywka/glutp/docs.html

that mentions another page topic, Dendreon. Should I do an article on this software or cite it from Dendreon page? I did cite my own publications on Dendreon page but you need to keep a focus when dealing with stuff of personal interest.

Clarification
Just to be clear, here on Wikipedia "references" does not mean wikilinks to other articles, it means citations to reliable sources to support statements made in the article. They are essential to support the key inclusion criterion of notability as it applies to companies. – ukexpat (talk) 16:30, 29 May 2009 (UTC)

How do you put references on talk pages? I wanted to put mine at bottom but no luck. Thanks Nerdseeksblonde (talk) 18:34, 29 May 2009 (UTC)


 * Same as for article pages, see below. – ukexpat (talk) 19:29, 29 May 2009 (UTC)


 * References

Further reading section

 * Bi-E, Cheng, and Shun-Yu, Chen. STATISTICA Manual I - Basic Statistics. Taipei, R.O.C.: HwaTai Publishing, 1999.


 * Byun, Jong Seok; Choi, Young Hoon; Lee, Sueng Chun; Park, Dong Ryun; Sung, Woong Hyan. Using STATISTICA- Statistical Information Analysis. Seoul: Tomjin, 1998.


 * Data Mining - jak z Vašich dat vytežit maximum. Seminár StatSoft, Praha, Bratislava: 2002.


 * Kobus P., Pietrzykowski R., Zielinski W. Statystyka z pakietem STATISTICA. Fundacja "Rozwój SGGW", Warszawa: 1998.


 * Luszniewicz, Andrzej. Statystyka w biznesie: Laboratoria komputerowe STATISTICA PL wersja 6. Warszawa: 2002.


 * Mutoh, Mr.Shinsuke. Data Analysis Using STATISTICA. Asakura Shoten: 2000.


 * Nisbet, Robert, Handbook of Statistical Analysis and Data Mining Applications. Academic Press: 2009. ISBN: 978-0-12-374765-5.


 * Röhr, Michael. STATISTICA für Windows. Addison Wesley Longman Verlag GmbH: 1997.


 * Sakalauskas, Virgilijus. Duomenų analizė su STATISTICA. Vilnius: Margi raštai, 2003.


 * Tong Lee-Ing and Wang Chung-Ho. STATISTICA V5.5 and Basic Statistic Analysis. TsangHai Publisher, Taiwan, R.O.C.: 2002.

Most of the books are out of print, are in a foreign language, or I cannot find any reference to them. Listing out-of-print books on a software package is disingenuous unless they provide historical context. They are also not referenced correctly (which I would fix if they were demonstrated to be valuable). R.Vinson (talk) 20:56, 29 May 2009 (UTC)
 * Borovikov, Vladimir. STATISTICA. 2003.


 * The only English book I could find a reference to doesn't have a chapter on Statistica, but does include a trial version of the software. R.Vinson (talk) 21:10, 29 May 2009 (UTC)


 * No problem. I added a book that is available on Amazon back to the article. There seem to be different acceptable ways to reference books, but feel free to rearrange or correct as you see fit. EntropyAS (talk) 02:40, 30 May 2009 (UTC)


 * I've changed the reference to use the appropriate citebook template. R.Vinson (talk) 03:46, 30 May 2009 (UTC)

As of right now, these are the listed references, which appear to be press releases, catalogs, industry surveys (like lists of advertised products), articles about "R", and a general user's manual on various stat programs. If you are going to cite promotional material, do any of your customer testimonials include significant findings only possible with STATISTICA or other interesting results? I had R installed and really like it for some things but wrote my own code to look at Dendreon. Has anyone used your product to produce some novel results or predictions - credit ratings? Clinical trials? Marketing analysis?

1. "StatSoft Announces Version 9 of STATISTICA"
2. "StatSoft Product Catalog"
3. "Second Annual Rexer Analytics Data Miner Survey" Rexer. September 2008.
4. Christian H. Weiss "Commercial meets Open Source: Tuning STATISTICA with R" R-Project. March 2008.
5. "StatSoft Certifies REvolution Computing R Language" HPCwire. December 2008.

Sá, Joaquim (2007). Applied Statistics Using SPSS, STATISTICA, MATLAB and R. Berlin: Springer. —Preceding unsigned comment added by Nerdseeksblonde (talk • contribs) 11:50, 30 May 2009 (UTC)

Removal of  and  tags?
Should we remove the  and  tags and add    for now? R.Vinson (talk) 16:32, 30 May 2009 (UTC)
 * done R.Vinson (talk) 23:21, 1 June 2009 (UTC)

Numerical stability/correctness
Not putting this on the page itself, because I don't have a citation handy, but I do remember reading a review of "econometric"/stats software in Econometrica (or was it J. of Appl. Econometrics?) that lambasted Statistica for giving blatantly incorrect solutions to various regression problems. Mind you, each numerical stats algo is not exact... 212.188.108.142 (talk) 08:43, 13 September 2009 (UTC)

What is reasonable content to expect here?
The page has less info than there is for other software products. Possibly some of the comments have been a bit hard on Statistica: the bar is being set higher than for, say, SPSS. So I want to make a plea for more objectivity, and I say this as an occasional user of Statistica, and I'm not a StatSoft employee either. Inevitably some of the material on the page is going to be sourced from the Statistica website itself. Is this any different to how the other software is being treated? I think we have to treat Statistica the same way other products have been treated, and I'm not sure that is currently the case. What do other Editors feel are reasonable expectations for the content of the page? Looking at the SPSS page, the obvious ones are:
 * lists of features
 * description of user interface
 * additional modules
 * data format
 * versions

So if no-one objects, I propose that those headings be added to the page for starters.

Mango bush (talk) 23:11, 25 March 2010 (UTC)

Made some changes, particularly to the list of features and the modules. These were based on StatSoft material, and they seemed accurate enough given my knowledge of the software. There are some features of Statistica that are pretty good and need to be mentioned. For example, SPSS did not (at the time of my using it) have the association links or Weibull distribution. So I think it's appropriate to list features that help inform readers of the capability of the various software packages, and this necessarily means listing some detail.

I also deleted the opening comparison to SPSS etc. Reasons: there are other places to provide comparison; the SPSS site does not reciprocate; there are many other software packages not mentioned; the statement had a prejudicial tone. This page is about Statistica, not the other software. Mango bush (talk) 00:21, 26 March 2010 (UTC)

OTRS Pending
I have removed the OTRS pending tag from this article; 2 years is plenty of time for any OTRS request to have been dealt with.

Potentially add info about popularity and customer satisfaction
I am a novice Wikipedia editor. MrOllie and other experienced Wikipedia editors have helped me understand COI conventions (thanks!). I see that I should not have added material to the Statistica page that cited research that I am one of the authors of. It has been suggested to me that if I think material with COI is important to add, a good strategy is to propose it on the talk page, and let other editors evaluate and decide if it is appropriate to add to the article page or not. I propose that it would enhance the Statistica page to add information about the software's adoption/popularity, and whether or not customers are satisfied with the product. So, I will put some material here that I think would be good to add to the page. Then I will stand back and let other more experienced editors decide if the material should be added, how to modify this text, and where on the page would be the best place to put this material. Here is the sentence I propose adding (with citations):

 * Polls and surveys show that Statistica has very strong adoption among data miners and high user satisfaction.

Karl (talk) 16:32, 11 November 2012 (UTC)

conflicting info on page about most recent version number
I noticed that there is conflicting info on this page about whether version 10 or 11 is the most recent version of this product. However, I could not find citations about which info is correct. Karl (talk) 17:19, 11 November 2012 (UTC)