Talk:Software testing

"Post release"?
I mightn't be familiar enough with the practice, but I generally don't think of alphas and betas as "post" release testing. To me, post-release testing applies to things like patches and other updates. If the software has been released to the client for general use, then it's not really an alpha/beta anymore, just an undertested and potentially unstable initial release. "Releasing" an alpha/beta to, say, an executive for testing is still considered pre-release. Am I wrong? Ham Pastrami (talk) 21:00, 3 April 2008 (UTC)
 * You are right. I have now moved the information on alpha and beta testing into the 'Pre-release' testing section. Andreas Kaufmann (talk) 20:27, 5 April 2008 (UTC)

Software testing is the process of establishing the correctness of a software application as well as finding its defects. —Preceding unsigned comment added by 122.167.109.27 (talk) 04:52, 7 June 2008 (UTC)

People seem to be forgetting here that 'Beta Test' is in fact not a Testing Group function. Beta Test is a Marketing function to test the features of the product against what the targeted users want/desire the product to do. It is not intended to find actual development or programming bugs per se. —Preceding unsigned comment added by 205.225.192.66 (talk) 16:48, 6 January 2009 (UTC)
 * Three things
 * Your description of beta testing is far from accurate. What you describe seems to be more in line with user acceptance testing.
 * Beta tests can, and should, be controlled by the testing group. The results of the wider, consumer-based testing during the beta cycle of development should be captured and analyzed by a representative on the testing team. Those results should be compared to known, reported defects and triaged in the same way as any internally reported defect at that point. If the beta programme is controlled by marketing, then there's not an easy way for defects to find their way into the defect tracking system.
 * What does beta testing have to do with "post release" testing, which is what the heading is? It is done before the official release of products. In fact, some software seems to be in perpetual beta; Gmail is the most notable example.
 * --Walter Görlitz (talk) 20:39, 6 January 2009 (UTC)

Checking not "Exercising"
Some parts of the testing process have nothing to do with exercising, so I changed the heading.

Having said that, I need to get some exercise, so signing out. ;-) -- Pashute (talk) 11:10, 25 June 2008 (UTC)

"Checking" software to verify it (?!); "exercise" sounds better, as in the DO-178 standard:
Testing - The process of exercising a system or system component to verify that it satisfies specified requirements and to detect errors. [in DO-178 SOFTWARE CONSIDERATIONS IN AIRBORNE SYSTEMS AND EQUIPMENT CERTIFICATION] Thread-union (talk) 17:25, 4 July 2008 (UTC)
 * Software testing encompasses more than just exercising the software - it usually encompasses a variety of checks, such as static analysis, code review, etc. AliveFreeHappy (talk) 00:07, 28 June 2011 (UTC)
 * I disagree - I don't think those activities are usually called testing. Rp (talk) 14:50, 6 May 2013 (UTC)

Controversy
One of the bullet points under the 'Controversy' section says "...and mostly still hold to CMM." What is CMM? There's no mention of this acronym earlier, and no obvious antecedent in the bullet point. Mcswell (talk) 00:49, 13 August 2008 (UTC)


 * Capability Maturity Model Tedickey (talk) 00:52, 13 August 2008 (UTC)

I added a link to the CMMI article and changed "CMM" to "CMMI" NoBot42 (talk) 20:00, 21 August 2008 (MEZ)

The references to CMMI seem to be misplaced. CMMI defines process improvement guidelines independent of the development model used (be it agile, waterfall, or otherwise). This 'controversy' is creating a false dichotomy between agile development and process maturity. There is nothing inherent in CMMI that precludes its implementation in an agile environment. The conflict between 'agile' vs. 'traditional'/waterfall is that agile methods emphasize continuous testing, where the traditional method was to begin testing only at the end of the development process. 76.164.8.130 (talk) 22:24, 21 November 2008 (UTC)

---

The comments above about CMMI are correct. The CMMI is about ensuring that the process is managed and does not prescribe any mechanics about how to perform the functions. At best it provides examples, but those examples are more to illustrate what they mean for a particular process area than they are to prescribe how to do it. By implementing a good agile process, you will by matter of course address most of the disciplines spelled out in the CMMI. It is true that the U.S. Government often mandates CMMI level 2 or 3 compliant processes, but they usually leave the implementation of that process to the company they contracted with. Many companies are learning the value of applying agile processes to the CMMI. The U.S. Government may impose its implementation of process on a company, but that is not the norm.

One of the chief mechanisms used by all agile processes to mitigate the risk of the costs associated with change in the development of the application is that of tight iterations. Essentially you break down the work into 1-2 week iterations where you do your specification to integration, with testing lagging behind an iteration. The costs of bad requirements, design, or implementation are essentially reduced to very manageable levels. That is what makes them agile. 137.100.53.254 (talk) 20:17, 16 April 2009 (UTC)

--- I think there needs to be a greater focus on test automation here. When advocates of agile development recommend automating 100% of all tests, they often mean "I want to have 100% code coverage with the unit tests we are running". Even if you have 100% code coverage, there is still plenty of room for error (usability bugs, integration bugs, etc.). If there is a way to automate usability testing, I am unaware of it.

Why is this important? There are a lot of uninformed PHBs (pointy-haired bosses) who listen to (software automation tool) salesmen and believe that record-playback tools will allow their QA team to completely automate their (blackbox) tests. Some mention of this "snakeoil" in a highly respected, evolving medium like Wikipedia could put some damper on unreasonable expectations of this type. There should also be a clear distinction made between GUI automation (which is subject to varying degrees of brittleness and requires as much or more time to maintain than it does to write the initial tests: the argument against automation) and unit test automation (which usually requires less maintenance than GUI automation).

(Before this goes any further, I think someone should provide citations showing some prominent Agile advocates who do, in fact, propose 100% coverage. I'm absolutely certain that none of those I've met in real life do, and some of them get very irritated by this claim, so I'm not convinced this whole idea isn't a popular misconception/misunderstanding. 82.71.32.108 (talk) 01:12, 10 February 2009 (UTC))

--

The 100% coverage is usually quoted for unit tests, and not for all tests for a system. At best the 100% coverage is a goal, but may not be feasible due to limitations of the coverage tool or other mitigating circumstances. Many agile teams do not employ Test Driven Development (that is only one application of the agile processes), so they don't always measure the test coverage. Of the ones that do, 100% code coverage does not necessarily mean 100% of the cases have been tested.  137.100.53.254 (talk) 20:14, 16 April 2009 (UTC)
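
To illustrate the last point, here is a minimal sketch in Python. The function `average` and both checks are hypothetical names invented for this example; they are not taken from the article or from any tool. A single test executes every line of the function, so a line-coverage tool would report 100%, yet the empty-list case is never exercised.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers (hypothetical example)."""
    total = sum(values)
    return total / len(values)


def test_average_typical_input():
    # This one test runs every line of average(): 100% line coverage.
    assert average([2, 4, 6]) == 4


def empty_input_is_handled():
    """Probe a case that the test above never exercises."""
    try:
        average([])
    except ZeroDivisionError:
        return False  # the fully "covered" function still fails here
    return True


test_average_typical_input()
print("empty input handled:", empty_input_is_handled())
```

The coverage figure says which lines ran, not which inputs were tried, which is why full coverage and full case coverage are different things.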

--

Manual testing vs. automated Some writers believe that test automation is so expensive relative to its value that it should be used sparingly.[47] Others, such as advocates of agile development, recommend automating 100% of all tests. More in particular, test-driven development states that developers should write unit-tests of the x-unit type before coding the functionality. The tests then can be considered as a way to capture and implement the requirements.
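
As a rough illustration of the test-first idea in the passage above, here is a small sketch using Python's standard `unittest` module, which belongs to the xUnit family. The function `is_leap_year` and the helper `run_suite` are hypothetical names chosen for this example. In test-driven development the three test methods would be written first, each one capturing a requirement, and the function body would be filled in afterwards until they pass.

```python
import unittest


def is_leap_year(year):
    # Implementation added only after the tests below were written
    # (hypothetical example of the test-first order).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # Each test states one requirement in executable form.
    def test_year_divisible_by_four_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_year_divisible_by_400_is_leap(self):
        self.assertTrue(is_leap_year(2000))


def run_suite():
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all requirements met:", run_suite())
```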

--

"Should testers learn to work under conditions of uncertainty and constant change or should they aim at process 'maturity'?" This very quote from the "Agile vs. Traditional" approach is a misrepresentation. Essentially the only difference is when testers become involved in the software life cycle. With the traditional approach, testers don't get involved until the software is 100% feature complete (all the requirements have been satisfied). With both agile and iterative approaches testers are involved as soon as an iteration is done. Agile methodologies have either weekly or bi-weekly iterations, which means testing begins in the second or third week of development. The testers are increasing the coverage of their tests to match the requirements (or user stories as agilists tend to call them) that were developed at that time. The benefit being that by the time an agile project has reached the 100% feature complete stage, most of the bugs have already been caught and dealt with.

Agile processes are mature, and that has little to do with working under conditions of uncertainty and constant change. It has to do with tight iteration cycles, so that when the client needs the team to refocus their efforts they are able to do so. Let us also not forget that there are other mature processes which espouse many of the same principles, such as Rational Unified Process (RUP) and other iterative methodologies where the requirements→design→build→test→maintenance cycle is repeated several times before the project is 100% feature complete. 72.83.188.248 (talk) 02:26, 20 April 2009 (UTC)

Specification Based Testing
"Specification based testing is necessary but insufficient to guard against certain risks. [16]". This sentence is completely irrelevant, because
 * - there is no testing methodology that is sufficient to guard against all risks
 * - the article linked to does not contribute to the validity of this statement

I suppose this sentence is only there to introduce a link to the author. This is why I removed it. —Preceding unsigned comment added by 80.219.3.124 (talk) 18:20, 21 August 2008 (UTC)

Finding faults section
The metrics for the effort of finding and fixing bugs are interesting. I have seen them many times. It occurred to me that the corollary is that it is xx times easier to introduce bugs at the requirements and architecture phases. Is there any literature on this? I.e., what is cause and what is effect? Many times these seem to be used by those used to a waterfall method to justify over-specifying things. 69.77.161.3 (talk) 20:50, 19 November 2008 (UTC)

TMAP advertisement
The trademark is probably inappropriate usage; makes the reference here an advertisement. Tedickey (talk) 01:16, 30 November 2008 (UTC)
 * OK, I removed it. (Exin is a non profit exam organization, and here is the list as they put on their website: http://www.exin-exams.com/exams/exam-program.aspx) —Preceding unsigned comment added by Raynald (talk • contribs) 02:10, 30 November 2008 (UTC)

Oashi edits
From another software tester, stop making things up as you have done on the Software testing article. You're making up terms and concepts. If you don't stop, I'll consider your unsourced edits as vandalism and report you. For instance, not a single book in my library mentions "Optimistic testing". It sounds like you're defining positive and negative testing. Nothing else. You're giving them elaborate terms, one of which conflicts with another term. Your elaborations on destructive testing, which has a Wikipedia article, are incorrect. Please stop unnecessarily elaborating on the terms. --Walter Görlitz (talk) 06:34, 21 July 2009 (UTC)


 * I see you deleted a few of my sentences. I understand your objection to this one; OK, the term "Optimistic testing" may not be in general use.


 * However, why did you delete all the sentences? If you have objections, then correct the terms or replace them with better ones. Immediate deletion strikes me as unconstructive. See the philosophy of Incrementalism: step by step, the article will get better and better.


 * So, I will rewrite the terms as "to pass" and "to fail"; I hope this will satisfy you better. I try to be open to every objection, so let's continue to discuss them. Kindly, Franta Oashi (talk) 15:26, 21 July 2009 (UTC)


 * The reasons I deleted them were:
 * They were unsourced and original research,
 * Where there was a grain of truth to the material, it was narrowly focused and does not apply to the software testing industry as a whole,
 * The grammar was bad.
 * As such, I felt the sections you added were unsalvageable and had to be removed. I did the same with your new sections on Planned vs. ad-hoc. The very fact that you wrote that ad hoc testing should be limited betrays your misunderstanding of its use. Ad hoc testing and exploratory testing are done to the exclusion of other test methodologies, or as its practitioners would call it: a mindset, with great success. See the section in exploratory testing on 'Benefits and drawbacks'. I also reworked some of the other sections you modified to remove a narrow focus and have attempted to remove the test-to-pass language as a whole. The various phases don't always use positive test cases. In fact, test-driven design emphasizes both positive and negative tests. This happens at the unit and integration levels and as such its inclusion here is pure fiction, despite its existence in real world environments. --Walter Görlitz (talk) 17:14, 21 July 2009 (UTC)
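
For readers following the positive/negative distinction in the comment above, here is a minimal Python sketch. The function `parse_age` and its 0 to 150 range are assumptions invented for the example, not something taken from the article. The positive test checks that valid input is accepted; the negative test checks that invalid input is rejected. Both sit side by side at the unit level.

```python
def parse_age(text):
    """Convert user input to an age, rejecting values outside 0-150 (hypothetical rule)."""
    age = int(text)  # raises ValueError for non-numeric text
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


def test_positive_valid_age():
    # Positive test: well-formed input produces the expected value.
    assert parse_age("42") == 42


def test_negative_rejects_bad_input():
    # Negative test: each malformed input must be refused.
    for bad in ["-1", "151", "forty"]:
        try:
            parse_age(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad!r} was accepted")


test_positive_valid_age()
test_negative_rejects_bad_input()
print("positive and negative tests passed")
```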


 * There are so many things to which I could react... Just a few for the moment:
 * The titles of the chapters you removed intentionally included "vs." (versus) as a comparison of opposite, contradictory, yet related topics. That's why planned and ad-hoc are together; the contrast was the point.
 * Sure, I like ad-hoc testing, and I am very successful with it. However, I do not agree with yours: . It was meant in the article that testing in general, and ad-hoc testing and creativity even more so, may discover a great many bugs or none at all. You do not agree, that   ? A project always has only limited funding, so Pareto must be applied, and thus ad-hoc "creative" testing must also be limited in time and scope. You could test a certain part of the software forever, so it needs a limit, and such a limit will indeed be set artificially. ...I was pushed many times to make a "professional expert time estimate", even without any information available yet. Is this what you do not agree with?
 * Another topic was planning and, mainly, "giving a result". SMART goals are needed: you need to state, before the testing itself, what the trigger is to start and to finish. You also need some measurability: you need an explicit result, yes/no. Did you get such an answer at all, that the goal was reached? Such a "need for a result" can be seen as "obvious"; however, a tester can easily lose sight of his given goal, moving from test-to-pass into test-to-fail, and even to ad-hoc, eventually missing his plan and causing slippage of the manager's delivery estimates to clients. There is always a tension between creative and planned testing, hence the chapter. --Franta Oashi (talk) 18:41, 21 July 2009 (UTC)


 * What you have written shows that you don't understand how ad hoc testing can be, and is, used in everyday testing. It is not limited. It is used exclusively in many companies. Please read the articles: ad hoc testing and particularly exploratory testing --Walter Görlitz (talk) 18:51, 21 July 2009 (UTC)
 * Well, I have read the articles you recommended to me, but I still believe that I understand "ad-hoc" perfectly! I would like to pass "the ball to your side of the playground": I have already tried to explain my point of view to you, but you have rejected it without any explanation, just a short no.
 * You said I am wrong. So, please, show me my misunderstanding or mistake; show me the contradiction between what I said before (the chapters you have deleted) and the ad hoc testing article. Thanks.
 * I still believe that I have used the terms correctly, I still do not agree with the deletion, and I still do not see your argument, sorry. It will be time-consuming, but the only way I see is to debate the sentences (or even terms) one by one. --Franta Oashi (talk) 22:16, 21 July 2009 (UTC)


 * Particularly exploratory testing. There are also a number of documents on the topic on http://www.stickyminds.com/ written by Kaner, the Bach brothers and Bolton. Bach has written that exploratory testing is "An interactive process of simultaneous learning, test design, and test execution." This process has been developed into a teachable discipline. All of my sources and references are on my wiki but as I said, there are many more on http://www.stickyminds.com --Walter Görlitz (talk) 23:41, 21 July 2009 (UTC)


 * And the : tests can be constructed as questions, such as "do this; will this appear?", with the expectation that either (a) appearing or (b) not appearing is correct, so the test can pass as well as fail on an "appeared" event. Yes, we both know.
 * But that idea is not related to test-to-pass! Again, the definition can be seen in the difference from "test to fail": a) the trigger event, i.e. the moment in the project's progress at which these appear (see SMART), and b) the purpose/scope of such testing: passing along the length ("I found a successful way from A to B", as these are the main scenarios in the FS), or development across the width ("I found a way by which I did not get from A to B"). These cannot be performed in the same state of the project, but rather on different builds, and mainly with different purposes: the testing was started with some "question about the project", giving a completely different answer. (Can we start further development based on this library? / Can we integrate the library and its modules into the main?)


 * Do you have any documentation to back up these claims? Try http://www.stickyminds.com/ or some other reputable source. Possibly a book. Until then, it's original research. --Walter Görlitz (talk) 18:51, 21 July 2009 (UTC)


 * Regarding integration: test-to-pass cases are simply always included. Your new edit is acceptable to me. A great example of the "deeper" testing during regression is exactly the "testing-to-fail" cases! I think your text there supports my "to-pass vs. to-fail" chapter. But what is used in regression is not described now; you deleted that. --Franta Oashi (talk) 18:41, 21 July 2009 (UTC)

Code completeness
Looking for a clear definition of the term code completeness, I ran into this passage. Clearly its first sentence is wrong: the paragraph is about completeness of the testing code, not the code being tested, although the latter is one of the things being tested. Rp (talk) 11:16, 18 August 2009 (UTC)
 * As with most software development terms, there are usually no clear definitions for code complete. In some places it could mean that no new features will be added but bug fixes will be made. In others, the previous definition would be feature complete, and code complete means we're not touching the code any more. Any bugs that are found from now on will have to be fixed by the maintenance team. --Walter Görlitz (talk) 16:50, 24 October 2009 (UTC)
 * Exactly. The paragraph isn't about that at all - it is about test coverage.  I changed the title accordingly. Rp (talk) 12:40, 29 October 2009 (UTC)

Definition of software testing
I made a minor edit to the opening definition. The definition was, "'Software testing' is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test, with respect to the context in which it is intended to operate." I dropped "with respect to the context in which it is intended to operate" because I think it is confusing and unnecessarily narrowing. People often use products in ways far different from the maker's intention. It often makes perfect sense to test a product's behavior under foreseeable uses or in foreseeable contexts, not just in the intended ones. (Think of life-critical products for example.) CemKaner (talk) 15:39, 29 December 2009 (UTC)

Definition of software testing: ISTQB vs. Common Usage
There is a very clear definition of "Software Testing" in the IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990 (paywalled!), and I cite:


 * 1) "The process of operating a system or component under specified conditions, observing or recording the results, and making an evaluation of some aspect of the system or component"
 * 2) "(IEEE Std 829-1983) The process of analyzing a software item to detect the difference between existing and required conditions (that is, bugs) and to evaluate the features of the SW items."

More simply, in "Seven Principles of Software Testing" (Bertrand Meyer (ETH Zürich and Eiffel Software), in “IEEE Computer”, August 2008, pp. 99-101; paywalled!), this simple definition is given:

"To test a program is to try to make it fail."

The ISTQB (International Software Testing Qualifications Board) does not give a proper definition of testing (in particular, no definition can be found in "Standard Glossary of Terms used in Software Testing Version 2.0 by the ISTQB") but extends the meaning informally to include what they call "static testing" as a complement to "dynamic testing". Static testing covers activities like reviews, code inspections and static analysis, which generally would fall under design and quality management (more power to them I guess). The following informal definition is given in the "ISTQB Foundation Level Syllabus":

Test activities exist before and after test execution. These activities include planning and control, choosing test conditions, designing and executing test cases, checking results, evaluating exit criteria, reporting on the testing process and system under test, and finalizing or completing closure activities after a test phase has been completed. Testing also includes reviewing documents (including source code) and conducting static analysis.

In particular, in "Point/Counterpoint - Test Principles Revisited" (Bertrand Meyer (ETH Zürich) vs. Gerald D. Everett (American Software Testing Qualifications Board), “IEEE Software”, August 2009, pp. 62-65; paywalled!), we read the following by Mr. Meyer:

"Mr. Everett and the ISTQB broaden the definition of testing to cover essentially all of quality assurance. In science one is free to use any term, with a precise definition, to denote anything, but it makes no sense to contradict established practice. The ISTQB’s definition goes beyond dynamic techniques commonly known as testing to encompass static ones. Hundreds of publications discuss static analysis, including proofs, versus tests. Such comparisons are of great relevance (including to me as the originator, with Yuri Gurevich, of the Tests and Proofs conferences, http://tap.ethz.ch), but the differences remain clear. Ask practitioners or researchers about testing; most will describe dynamic techniques. If the ISTQB wants to extend its scope to quality assurance, it should change its name, not try to redefine decades-old terminology."

78.141.139.10 (talk) 17:27, 19 March 2013 (UTC)
 * The numbered definitions are excellent and should be used as a reference. However, "to test a program is to try to make it fail" is not a good definition as not all testing is trying to make it fail. Some testing is simply to determine its limitations (performance testing is one example of that). Walter Görlitz (talk) 18:10, 19 March 2013 (UTC)

Agree with that. Meyer clearly wanted a short and pregnant "core definition" that certainly was not intended to cover all aspects of testing. Here is the context, from the above-cited paper:

The only incontrovertible connection is negative, a falsification in the Popperian sense: A failed test gives us evidence of nonquality. In addition, if the test previously passed, it indicates regression and points to possible quality problems in the program and the development process. The most famous quote about testing expressed this memorably: “Program testing,” wrote Edsger Dijkstra, “can be used to show the presence of bugs, but never to show their absence!” Less widely understood (and probably not intended by Dijkstra) is what this means for testers: the best possible self-advertisement. Surely, any technique that uncovers faults holds great interest for all “stakeholders,” from managers to developers and customers. Rather than an indictment, we should understand this maxim as a definition of testing. While less ambitious than providing “information about quality,” it is more realistic, and directly useful. Principle 1: Definition To test a program is to try to make it fail. This keeps the testing process focused: Its single goal is to uncover faults by triggering failures. Any inference about quality is the responsibility of quality assurance but beyond the scope of testing. The definition also reminds us that testing, unlike debugging, does not deal with correcting faults, only finding them.

78.141.139.10 (talk) 17:15, 22 March 2013 (UTC)


 * Ironically, Wikipedia is not a reliable source. I don't have a problem with a core definition. The problem is coming up with one. Walter Görlitz (talk) 18:37, 22 March 2013 (UTC)

Definition of "Testing" bleeds into "Testing Methods" (and there is confusion about V&V)
Under "Testing Methods", we read the following:

Static vs. dynamic testing

There are many approaches to software testing. Reviews, walkthroughs, or inspections are referred to as static testing, whereas actually executing programmed code with a given set of test cases is referred to as dynamic testing. Static testing can be omitted, and unfortunately in practice often is. Dynamic testing takes place when the program itself is used. Dynamic testing may begin before the program is 100% complete in order to test particular sections of code and are applied to discrete functions or modules. Typical techniques for this are either using stubs/drivers or execution from a debugger environment.

Static testing involves verification whereas dynamic testing involves validation. Together they help improve software quality.
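
The stubs/drivers technique mentioned in the quoted passage can be sketched roughly as follows in Python. All names here (`PaymentGatewayStub`, `checkout`, `driver`) are hypothetical and invented for illustration. The stub stands in for a component that does not exist yet, and the driver calls the module under test directly because no user interface or main program is available, which is how dynamic testing can begin before the program is complete.

```python
class PaymentGatewayStub:
    """Stands in for a payment component that is not written yet."""

    def __init__(self):
        self.charged = []

    def charge(self, amount):
        self.charged.append(amount)  # record the call for later inspection
        return True  # canned answer: always approve


def checkout(cart, gateway):
    """Module under test: totals the cart and charges the gateway."""
    total = sum(price * qty for price, qty in cart)
    if total <= 0:
        return None
    return total if gateway.charge(total) else None


def driver():
    """Test driver: invokes the module directly, since no UI exists yet."""
    stub = PaymentGatewayStub()
    result = checkout([(10, 2), (5, 1)], stub)
    return result, stub.charged


result, charged = driver()
print("checkout returned", result, "and the stub recorded", charged)
```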

The nonstandard use of "Dynamic Testing" and "Static Testing" comes directly from the ISTQB syllabus. The phrase "Static testing can be omitted, and unfortunately in practice often is." does not make any sense, because "Static testing" belongs to design, lifecycle management and quality control. It's not testing. Can it be left out? Is it? Is that unfortunate? The answer is "it depends". Of course doing it, time and money permitting, helps improve SQ; that's what this is all about.

And then:

"Static testing involves verification whereas dynamic testing involves validation. Together they help improve software quality."

NO! On the Wikipedia webpage on V&V, we read:

Validation. The assurance that a product, service, or system meets the needs of the customer and other identified stakeholders. It often involves acceptance and suitability with external customers.

Verification. The evaluation of whether or not a product, service, or system complies with a regulation, requirement, specification, or imposed condition. It is often an internal process.

So-called "static testing" therefore applies to both Validation and Verification. So-called "dynamic testing" definitely applies to Verification ("does it meet listed requirements") but to a far lesser degree to Validation.

Conclusion: Rewrite needed. One should clarify the "Static/Dynamic" thing vs. the "Plain Testing" thing, it's very confusing.

78.141.139.10 (talk) 17:32, 22 March 2013 (UTC)

About Braguet test
Hi, I've seen that you deleted my subsection on software testing. I didn't know that previously unpublished articles could not be added to Wikipedia. What counts as "previously published"?

I've published at: http://experiencesonsoftwaretesting.blogspot.com/2010/01/braguet-testing-discovery-of-lucas.html so it's no longer unpublished material.-- Diego.pamio (talk) 15:43, 5 January 2010 (UTC)


 * The preceding was taken from my talk page.
 * * It's still self-published. Blogs are not sources according to Wikipedia's guidelines. See WP:PRIMARY. The fact that you published the blog entry half an hour ago, just before you posted the comment to my talk page, possibly just so you could have a source, makes it even more dubious.
 * * The definition makes it no different than smoke testing or sanity testing. Just because Lucas A. Massuh doesn't know the terms doesn't mean he can't invent a new term that means the same thing. It does mean that we don't have to use that new term. I suggest that you take those two existing concepts to Lucas A. Massuh so that he is aware of them. --Walter Görlitz (talk) 16:08, 5 January 2010 (UTC)

Software Testing Quote is added under 'Overview' section
Does it make sense? Please let me know.


 * The origin of the quote was not given, and it didn't fit there. Tedickey (talk) 09:47, 22 January 2010 (UTC)


 * So, no it doesn't make sense. --Walter Görlitz (talk) 14:53, 22 January 2010 (UTC)

Certifications
The discussion of certification cites me for two propositions.
 * First that I say that the testing field is not ready for certification because none of the current certs is based on a widely accepted body of knowledge.
 * Second that I say that certification CANNOT measure an individual's productivity, skill or practical knowledge.

Regarding the first, I have often argued that software engineering (including testing) is not ready for LICENSING because we lack an accepted body of knowledge. (See, for example, John Knight, Nancy Leveson, Lori Clarke, Michael DeWalt, Lynn Elliott, Cem Kaner, Bev Littlewood & Helen Nissenbaum, "ACM task force on licensing of software engineers working on safety-critical software: Draft report", July 2000. See also http://www.acm.org/serving/se_policy/safety_critical.pdf.) However, certification is not licensing. We can certify someone as competent in the use of a tool, the application of a technique, or the mastery of a body of knowledge without also asserting that this is the best tool, the most applicable technique, or the "right" (or the only valid) (or the universally accepted) body of knowledge. I think the current certifications claim too much, that they misrepresent the state of knowledge and agreement in the field and in doing so, several of them promote an ignorant narrow-mindedness that harms the field. But this is a problem of specifics, a problem of these particular certifications.

Regarding the second, I have not said that certification CANNOT measure these things. Look at Cisco's Expert-level certification, for example. I see this as a clear example of a certification of skill. Similarly, I see no reason to argue that we CANNOT measure a programmer's or tester's productivity or practical knowledge. However, I believe that the current certifications DO NOT attempt to measure these things, and I do not think anyone could reasonably argue that any of these certs does a credible job of making such measurements. CemKaner (talk) 22:30, 9 April 2010 (UTC)

Software Testing ROI
Return on investment (ROI) is often a misunderstood term. This term can get even more complex when measuring your investment return around software testing.

How to evaluate Software Testing ROI and How to improve it?

http://blogs.msdn.com/b/robcaron/archive/2006/01/31/520999.aspx

http://www.mverify.com/resources/Whats-My-Testing-ROI.pdf NewbieIT (talk) 03:53, 10 June 2010 (UTC)

CMMI or waterfall
The heading "CMMI or waterfall" was utterly incorrect. Neither the Capability Maturity Model for Software (SW-CMM), the CMM for Systems Engineering (EIA-731), nor the Capability Maturity Model Integrated (CMMI) have ever mandated the waterfall life cycle. In fact, many of the original CMM authors have lectured on the topic of how the waterfall originated from a mis-quoted and misunderstood speech by Winston Royce in 1973, and therefore arguably was never a legitimate life cycle in its strictest interpretation. Spiral, incremental, iterative, OOPS, sushi, fountain, etc. are all life cycles that may be the basis for project and test execution, and all may be used to address the practices called for in the CMMI. Vic Basily has written an excellent article discussing how test-everything-at-the-end approaches are not "traditional", and how incremental/iterative approaches to development and software testing have been around as long as the industry. The "test first" approach promoted by agile methods is actually a revival of long-standing disciplines. —Preceding unsigned comment added by 63.241.202.8 (talk) 19:31, 9 August 2010 (UTC)
 * I don't think the word "or" in this heading is meant to equate them but rather to contrast them. Both CMMI and Waterfall processes suggest similar things, but they are not being equated in the section. Also, any time content is removed without a comment, no one knows why the change was made. When that change was made by an anonymous edit, it's even more suspicious. If you want to come up with a better heading feel free, but don't forget to explain why you're making the changes to avoid arousing suspicions of vandalism. --Walter Görlitz (talk) 20:10, 9 August 2010 (UTC)

The heading for this section was so jarring that I had to stop and see if anyone had raised the issue. I'm glad to see someone has. The first writer above is 100% correct. The second...  CMMI and waterfall don't really suggest things that are in the slightest way similar. This heading appears to have been written by someone who hasn't the slightest clue about CMMI. 65.201.123.212 (talk) 17:16, 6 May 2013 (UTC)

I agree with 65.201.123.212 and Walter Görlitz: CMMI is a completely different model, and what would be better referenced here is TMMI and how it relates to the overall SDLC. This section should debate Waterfall or Agile. Sandelk (talk) 11:00, 17 December 2013 (UTC)

Manual vs Automated testing
I am by no means an expert on the matter, which is why I struggle to understand this concept: why is automated testing so much more expensive? Once you have your automation architecture in place, I see no additional costs. If you are performing testing activities as a one-off thing I can see it being expensive, but in an environment where constant testing takes place, I imagine it's worth the investment? Cronax (talk) 12:21, 28 September 2010 (UTC)
 * It's more expensive since 1) the tools are rarely free, 2) the specialists of the tools are more costly to hire and maintain than manual testers, 3) when a test fails, it has to be repaired, quite often requiring a determination of what the original test was attempting to verify, and 4) the tests fail frequently. Unit tests are much less expensive, in fact automated, reusable unit tests are generally less expensive than hiring manual testers to find errors down-the-line. However "automation" is generally the term used to describe automated GUI functional testing, and it's quite often more expensive. With this in mind you also have to add that the people who sell the tools, particularly the commercial tools, do so by selling the tool's record-and-playback features, which work well for a while, but are quite brittle to minor changes. Feel free to read Test Automation Snake Oil (by James Bach) and Automation Myths (by M. N. Alam) for more information and greater details on these points. --Walter Görlitz (talk) 14:33, 28 September 2010 (UTC)

I will actually disagree a little with Walter Görlitz here. Automation can actually prove to be cheaper: it has a really high return on investment, but it requires a lot of initial expense to pay for the respective tools, the engineers who write the scripts, and the additional challenges that will be faced. If done right, the investment pays off greatly after 2-3 years, especially when you consider the time saved through the automation efforts. This article needs to look at automation more objectively and highlight both the pros and cons more effectively. Sandelk (talk) 11:09, 17 December 2013 (UTC)
 * You're not disagreeing with me but with two professionals who say that automation cannot find new bugs and so is a waste of money. Feel free to read those two articles. Walter Görlitz (talk) 11:40, 17 December 2013 (UTC)
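The disagreement above is partly about where the break-even point lies. As a rough sketch only, with entirely hypothetical cost figures and function names (none of them come from the cited articles), the trade-off between a high up-front cost and a lower per-cycle cost could be computed like this:

```python
import math


def break_even_cycle(setup_cost, automated_cost_per_cycle, manual_cost_per_cycle):
    """Return the first test cycle at which automation is no more expensive
    than manual testing, or None if it never pays off.

    automated_cost_per_cycle is assumed to include script maintenance and
    the repair of failing tests, which the replies above say is significant.
    """
    saving_per_cycle = manual_cost_per_cycle - automated_cost_per_cycle
    if saving_per_cycle <= 0:
        # Automation costs at least as much per cycle, so the setup
        # cost is never recovered.
        return None
    return math.ceil(setup_cost / saving_per_cycle)


# Hypothetical figures in arbitrary cost units.
print(break_even_cycle(100, 5, 15))   # setup recovered after 10 cycles
print(break_even_cycle(100, 20, 15))  # None: maintenance outweighs the saving
```

Under these assumptions both positions can hold: automation pays off when per-cycle maintenance stays low, and never pays off when brittle tests make each cycle cost more than a manual run.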

Controversy: Test team
Should there be a new controversy point: A separate test team vs business analysis team conducting the tests? I'm looking for information on this, and depending on the comments to this topic, a new controversy point may be added. —Preceding unsigned comment added by Culudamar (talk • contribs) 12:21, 20 October 2010 (UTC)
 * First I've heard of this as a potential controversy or even a subject of discussion. The article isn't a place for talk. There are many forums where this could be discussed but if you can't find any WP:V sources for it outside of Wikipedia, this isn't the place to come for information about it either. Any information you or others added wouldn't last very long without a WP:V source either. --Walter Görlitz (talk) 13:59, 20 October 2010 (UTC)

Testing vs. Quality Assurance
At Yahoo!, a major software company with ~15,000 employees, the terms "Quality Assurance" and "Quality Engineering" are used primarily to refer to what is termed Software Testing on this page. All testing except for Unit Testing falls to a Quality Assurance team, and the members of those teams have Quality Engineer as a part of their job titles. This leads me to wonder: does the distinction made here reflect industry practice? It could be that Yahoo! is an exception, or it could be that the usage of the terminology has shifted. I don't know enough about the issue to say, but I wanted to raise the question.

CopaceticOpus (talk) 04:25, 1 February 2011 (UTC)
 * It's not just Yahoo! that use the term "quality" in relation to what is actually a "test" role. There is much speculation, but in short where companies actually have distinct quality and test groups, the roles of the test group match the role of quality groups in companies that don't have both. --Walter Görlitz (talk) 05:07, 1 February 2011 (UTC)


 * My sense is that Quality Assurance looks at the process used to develop the software, while Quality Control is used to measure the software quality. That is, QA is an audit role to make sure that the analysis, design, development and testing comply with the established process that ensures quality software. Testing is not a QA role, but a QC role. "Quality Engineering" would be the process used to develop software with an eye for 100% verification and validation. It would be an overarching process. The PMBOK might call it "Quality Management". QA ensures that QE(QM) techniques are employed to ensure QC. WikiWilliamP (talk) 20:33, 1 February 2011 (UTC)

Baloney Definition
Definition does not make sense at all: "Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test."

Bluh. One couldn't be more unspecific. Why not say "Software testing is an activity, that a lot of people earn their money from easily without even understanding the basics of reason, by scrumming up and crying "yeah, we have a profession too!"."?

The "software testing industry" and ISTQB is a religion that absorbs any decent thought. —Preceding unsigned comment added by 115.189.199.174 (talk) 23:10, 22 April 2011 (UTC)
 * The existing definition meets the requirements of several competing visions of what software testing is. ISTQB represents one flavour of software testing. So your definition, pessimism excluded, doesn't work for the other groups. --Walter Görlitz (talk) 00:57, 23 April 2011 (UTC)

Thank you, Walter.

I state it does not, because it is not a definition. A definition must distinguish the object defined from the rest of the universe, else it is not a definition. I will show this defect by replacing terms of the "definition" without changing the semantics, noting for each step why it is correct:

"Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test."

<--> ("stakeholders" can be anyone)

"Software testing is an investigation conducted to provide information about the quality of the product or service under test."

<--> ("quality" is undefined and describes arbitrary features)

"Software testing is an investigation conducted to provide information about the product or service under test."

<--> ("quality" is undefined)

"Software testing is an investigation conducted to provide information about the product or service under test."

<--> ("product or service" can be anything)

"Software testing is an investigation conducted to provide information about the object under test."

<--> ("software testing" is the activity performed)

"Software testing is an investigation conducted to provide information about its object."

<--> ("investigation" is any activity trying to reveal information on something)

"Software testing is an investigation on its object."

<--> ("object" is any arbitrary thing)

"Software testing is an investigation on something."

<--> ("something" is an indifferent thing)

"Software testing is an investigation."

<--> ("investigation" is any activity trying to reveal information on something, but there is nothing specified)

"Software testing is an investigation on everything."

<--> ("an investigation on everything" focuses only on esoterics and exoterics)

"Software testing is religion."

<--> ("religion" is irrelevant to software quality, the only thing "Software testing" is relevant to)

"Software testing is irrelevant."

<--> ("irrelevant" means, it has no relation to other things)

"Software testing is!"

<--> (everything is)

"!"

So: not a definition there. The existing article should be replaced with an "!".

A pessimist is someone who states the truth too early. —Preceding unsigned comment added by 115.189.239.224 (talk) 09:15, 3 May 2011 (UTC)
 * Need we begin to point out your flawed logic on that one? –WikiWilliamP (talk) 15:54, 3 May 2011 (UTC)

Thank you, WikiWilliamP. I would love to see you try. —Preceding unsigned comment added by 115.189.38.100 (talk) 21:03, 3 May 2011 (UTC)
 * You change the definition of terms several times. That's a pretty major flaw in logic. --Walter Görlitz (talk) 21:20, 3 May 2011 (UTC)
 * An example of this can be seen in the following:
 * No cat has eight tails.
 * One cat has got one more tail than no cat.
 * Therefore, one cat, has one more tail than no cat, which has eight tails, and so it has nine tails.
 * This is known as the "fallacy of equivocation". Your "equation" above suffers from this particular fallacy to an extreme degree. --Walter Görlitz (talk) 21:28, 3 May 2011 (UTC)
 * And then there are the simple errors. For instance, "'stakeholders' can be anyone" which is only partially true. Mozart can't be a stakeholder since he's dead. Similarly there are about 6 billion people in the world who cannot, or more importantly will not, be stakeholders. Therefore, the stakeholders are those people who have a vested interest in the product. So you can't simply remove them from the definition. Ever. The information isn't created for the sake of having information, it's created to inform a specific group of people who cannot be removed from the definition. --Walter Görlitz (talk) 21:32, 3 May 2011 (UTC)
 * "stakeholders are those people who have a vested interest in the product". Please include this definition in the article. The word 'stakeholder' is jargon and needs explaining. Tim flatus (talk) 07:52, 27 October 2012 (UTC)

"Testing can never completely identify all the defects within software"
The article states: "Testing can never completely identify all the defects within software". This would really be bad if it were true. Consider simple programs that can be fully verified for all inputs. In these cases, the testing process can assure that the algorithm operates as expected unless the operating system, or the hardware fails, or data becomes corrupted by other parts of the software. All of these conditions lie outside of the scope of testing a single algorithm or collection of algorithms. However, usually software is too complex to allow for complete verification or proof of correctness. Even so, many bugs can be found through testing. You could say that a bug that cannot be found is one that does not exist, given sufficient time for testing (which may be a lot of time...). While you can never be sure that all errors in an algorithm have been found through testing, unless all input/output combinations are verified, you may still have found all bugs in the software. Therefore, a much more cautious wording is required here. I would propose to say, "Randomized testing cannot ensure that all defects within software have been found." — Preceding unsigned comment added by ScaledLizard (talk • contribs) 17:34, 20 June 2011 (UTC)
 * It is both true and not at all bad. No change is needed. If you insist, I can find references to back this opinion, but not all defects can be found in even the most "simple" pieces of software because of interaction with compilers, operating systems, and other elements. --Walter Görlitz (talk) 19:25, 20 June 2011 (UTC)
 * I added a ref along those lines - there are plenty more. It's unfortunately all too true. AliveFreeHappy (talk) 00:05, 28 June 2011 (UTC)
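The point raised at the start of this thread, that a small program can be checked for every input, can be illustrated with a minimal sketch. The functions below are hypothetical examples, not taken from the article or its references, and the check only covers the algorithm itself, not the compiler, operating system, or hardware mentioned in the reply above.

```python
def popcount_fast(x):
    """Code under test: count the set bits of an 8-bit value with bit tricks."""
    x = x - ((x >> 1) & 0x55)           # 2-bit partial sums
    x = (x & 0x33) + ((x >> 2) & 0x33)  # 4-bit partial sums
    return (x + (x >> 4)) & 0x0F        # final sum, at most 8


def popcount_reference(x):
    """Obviously correct but slower reference implementation."""
    return bin(x).count("1")


def exhaustive_failures(under_test, reference, domain):
    """Return every input in the domain on which the two functions disagree."""
    return [x for x in domain if under_test(x) != reference(x)]


# The whole input domain is only 256 values, so every input can be tried.
failures = exhaustive_failures(popcount_fast, popcount_reference, range(256))
print(failures)  # an empty list means no input in the domain disagrees
```

The same approach stops being practical very quickly: a function taking two 32-bit arguments already has 2**64 input pairs, which is the combinatorial explosion referred to in this thread.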

That is what "combinatorial explosion" is all about. 78.141.139.10 (talk) 17:17, 22 March 2013 (UTC)
 * "That is what "combinatorial explosion" is all about." Yes. Which is the point. It's odd that you'd make it. A combinatorial explosion is a tipping point into another state, not an absolute. The previous state was, of course, stable...."debuggable", if you will. I'm curious when the human race lost the ability to find all the faults in a linear series of Boolean values. I'm guessing Walter knows. lol...If I ever had an applicant tell me they couldn't write - and guarantee no bugs in a "Hello World" for an Atari 400, the interview would be over: I know I'm not talking to a programmer. Mad Bunny (talk) 16:27, 28 February 2014 (UTC)

Your dispute seems due to a simple ambiguity. "Testing can never completely identify all the defects within software" can be interpreted as (1) "There exists no software for which testing can completely identify all the defects", or (2) "There exists software for which no testing can completely identify all the defects", or something in between, e.g. (3) "For typical software used in practice, no testing can completely identify all the defects". I think most people would consider the first statement to be false and the second to be true. The third statement seems closest to what is intended and I think it's hard to refute. Rp (talk) 17:41, 28 February 2014 (UTC)
 * That's what my revision - subsequently reverted by someone who holds their ignorance in far too high esteem - stated. The fundamental problem here is the notion that a product broken by a 3rd party was broken because of a defect created by the first party software producer. "Everything is your fault" is untenable. Mad Bunny (talk) 00:25, 31 March 2014 (UTC)

Under the presence of specific testing hypotheses, there exist finite test suites such that, if the implementation under test (IUT) passes them all, then it is necessarily correct with respect to the considered specification. For instance, it is well known that, if we assume that the IUT behavior can be denoted by a deterministic finite-state machine with no more than n states, then the (finite) test suite consisting of all sequences of 2n+1 consecutive inputs is complete (i.e. it is sound and exhaustive). That is, if the IUT passes them all then we know for sure that it is correct (check e.g. "Principles and methods of testing finite state machines" by Lee and Yannakakis). Moreover, the existence of finite complete test suites has been proved for many other (very different) sets of assumed hypotheses; see hierarchy of testing difficulty in this article for further details. So, the sentence "Testing can never completely identify all the defects within software" is false and should be replaced by something like "Under the absence of appropriate testing hypotheses, testing can never completely identify all the defects within software." In order to explain this addition a little bit and avoid controversy, perhaps a link to the hierarchy of testing difficulty section in this article should be included right after the sentence, so it could be something like: "Under the absence of appropriate testing hypotheses, testing can never completely identify all the defects within software (though complete test suites may exist under some testing hypotheses, see hierarchy of testing difficulty below)." If nobody argues against this change within a few days, I'll do it. --EXPTIME-complete (talk) 20:00, 25 August 2014 (UTC)
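To make the finite-state machine example concrete, here is a minimal sketch. The two machines, the alphabet, and the function names are all hypothetical, and the sequence length 2n+1 is simply taken from the comment above; the code illustrates the idea and does not prove the bound.

```python
from itertools import product


def run(machine, inputs, start=0):
    """Feed inputs to a deterministic Mealy machine and return its outputs.

    machine maps (state, input symbol) to (next state, output symbol).
    """
    state, outputs = start, []
    for symbol in inputs:
        state, out = machine[(state, symbol)]
        outputs.append(out)
    return tuple(outputs)


def passes_bounded_suite(spec, iut, alphabet, n):
    """Compare the IUT with the specification on every input sequence of
    length 2n+1, where n is the assumed upper bound on the number of states."""
    return all(
        run(spec, seq) == run(iut, seq)
        for seq in product(alphabet, repeat=2 * n + 1)
    )


# Hypothetical two-state specification over the inputs 'a' and 'b'.
SPEC = {
    (0, "a"): (1, "x"), (0, "b"): (0, "y"),
    (1, "a"): (0, "y"), (1, "b"): (1, "x"),
}

# Hypothetical faulty implementation: one transition leads to the wrong state
# while still producing the expected output on that step.
FAULTY = dict(SPEC)
FAULTY[(1, "b")] = (0, "x")

print(passes_bounded_suite(SPEC, SPEC, "ab", n=2))    # True
print(passes_bounded_suite(SPEC, FAULTY, "ab", n=2))  # False
```

The suite is finite but grows exponentially with n (here 2**5 = 32 sequences), which is why the approach is only affordable when the assumed state bound is small.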

Since nobody has argued against my point in the previous paragraph, I have changed the text to: “Although testing can precisely determine the correctness of software under the assumption of some specific hypotheses (see hierarchy of testing difficulty below), typically testing cannot completely identify all the defects within software.” I think this is a consensus sentence. On the one hand, it shows that testing cannot guarantee the software correctness if some specific hypotheses cannot be assumed (a typical case indeed). On the other hand, it shows that some hypotheses enable completeness in testing. Note that the power of hypotheses in testing is relevant and worth mentioning here because, in any field, new knowledge can be gathered only if some hypotheses are assumed: Mathematicians need to assume axioms, Physicists need to assume that observations are correct and “universal rules” will not suddenly change in a few minutes, etc. Software testing is not an exception. Moreover, even when testing cannot guarantee the system correctness (the typical case), many hypotheses are also implicitly or explicitly assumed. --EXPTIME-complete (talk) 8:56, 28 August 2014 (UTC)
 * What's to argue against? I removed the WP:WEASEL words though because no software is sufficiently simple to be considered to qualify. Walter Görlitz (talk) 13:55, 28 August 2014 (UTC)
 * (“Argue against” should be “refute”, sorry) Actually, some real software qualifies. Let me come back to my previous example about finite-state machines. This model and its variants are used by software developers of web services, communication protocol or user interfaces. Typically, the developer designs a graphical state model, and next this model is automatically translated by a tool into executable code in some language (for instance, case tools transform UML statecharts into code; web service generators produce WS-BPEL or WS-CDL code from graphical state models; etc). Given some program that was generated in this way, testing it in a black-box manner until defects can be completely discarded is feasible if the state model this code is equivalent to is deterministic, we know an upper bound of its number of states, and this number is low enough. Programs being equivalent to some unknown finite-state machine with only 10 states might not look so complex, but they happen often in practice, they are also error prone (interactions with them can be arbitrarily long), and testing a black-box program like this up to discarding all defects is completely affordable (and, if sophisticated methods are used, for much more than 10 states). You could think that the automatic transformations performed by the tools generating that code could be wrong. Well, the correctness of many of these transformations has been formally proved. So, if you do not trust them, then any theorem mentioned in Wikipedia should also be distrusted. I could show other similar examples not involving finite-state machines.


 * By the way, after removing the word “typical”, I see two possible interpretations of the first sentence. If the reader considers that "testing cannot identify all the defects within software" means “testing cannot identify all the defects in all software artifacts”, then it’s Ok: in some cases (most of them), detecting all defects is impossible. However, if the sentence means "for all software, testing cannot identify all the defects within that software" then it is false, as I pointed out in the previous paragraph. Moreover, the whole sentence from “Although” on would be contradictory: "Although testing can do X if Y, testing cannot do X." (!!) Do you think all readers will consider the first (and correct) interpretation, because it is the only one that is not contradictory? Not sure! In order to remove the ambiguity here, the part of the sentence saying "testing cannot identify all the defects within software" could be replaced by “under the absence of strong hypotheses, testing cannot identify all the defects within software", or by "in virtually all practical cases, testing cannot identify all the defects within software.”--EXPTIME-complete (talk) 22:32, 28 August 2014 (UTC)
 * I have no need to refute anything, but I could argue against your suggestions. A refutation would be to prove something is wrong or false in some way, but I don't have to believe an argument is wrong or false to argue against it if I think a different approach is needed although the facts are correct.
 * There is too much weight in adding "typical" and no "real" software applies. See Kaner's discussion on this and Bezier's. Walter Görlitz (talk) 03:56, 29 August 2014 (UTC)
 * I admit that the word "typically" isn’t strong enough to emphasize that, in virtually all practical cases, testing cannot detect all defects, something I do agree with. That’s why I proposed two alternative approaches in my last comment (see the last lines). Anyway, if you don’t like these alternatives either, I think I could live with the sentence in its current state (even though it is not my favorite choice): one possible interpretation is correct, and the other one is “almost” correct for me. I guess that, if other readers see the contradiction I noted and it's not just me, they will eventually come here and tell.--EXPTIME-complete (talk) 08:49, 29 August 2014 (UTC)

Test recording and reporting
There is no mention of how tests should generally be recorded in each category. Some generally accepted guidelines would be useful, such as tester, date+time, title, detail, action, resolution, etc. This depends on the category; for example, performance testing and regression testing are quite different.

It would also be useful to describe how to summarise results on an ongoing basis to developers & managers; where the test timeline stands, proportion of resolved issues, criteria for acceptance (not necessarily 100% success). All essential to managers. SombreGreenbul (talk) 13:07, 20 July 2011 (UTC)


 * That assumes that there is agreement on how tests should be recorded. See Test case in relation to formal and informal test cases. --Walter Görlitz (talk) 14:15, 20 July 2011 (UTC)
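Purely as an illustration of the fields suggested at the start of this thread, and not as an agreed or standard format (the reply above notes there is no such agreement), a record and a simple progress summary might be sketched as follows. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ResultRecord:
    """One recorded test result, using the fields listed in the comment above."""
    tester: str
    timestamp: datetime
    title: str
    detail: str
    action: str = ""
    resolution: Optional[str] = None  # None until the issue has been resolved


def resolved_proportion(records: List[ResultRecord]) -> float:
    """Return the fraction of records that have a resolution (0.0 if empty)."""
    if not records:
        return 0.0
    resolved = sum(1 for r in records if r.resolution is not None)
    return resolved / len(records)


records = [
    ResultRecord("tester A", datetime(2011, 7, 20, 13, 0), "Login fails",
                 "Error shown on submit", action="Reported", resolution="Fixed"),
    ResultRecord("tester B", datetime(2011, 7, 20, 14, 0), "Slow search",
                 "Query takes too long", action="Reported"),
]
print(resolved_proportion(records))  # 0.5
```

A summary for managers could be built on top of such records, but the acceptance criteria (for example, whether anything less than 100% resolved is acceptable) would still have to be decided per project.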

Environment testing
Often, software is portable between platforms, e.g. between Windows ME, Vista and 7. Testing of the software on relevant platforms should be a subsection within Non-Functional Testing. SombreGreenbul (talk) 14:13, 20 July 2011 (UTC)
 * This isn't known as environment testing though. It's more-commonly known as cross-platform testing. --Walter Görlitz (talk) 14:16, 20 July 2011 (UTC)

Reverting Changes on Grey Box Testing
Moving this from my talk page: Hi, I am unable to understand your ideas about grey box testing. Why are you reverting my edits? I am providing references that are relevant to my edits. Please check them and tell me where I am wrong, but please don't undo my edits. — Preceding unsigned comment added by Netra Nahar (talk • contribs) 2011-09-12T17:52:40


 * I removed the content because it was poorly written, poorly researched, and had no basis in reality. I do understand that this is material for your course, but I suggest you get better sources, not just whatever you can find with a Google search. I would rely more on scholarly material and published books rather than company white papers, as the latter are usually trying to sell something and aren't well researched. --Walter Görlitz (talk) 18:04, 12 September 2011 (UTC)
 * But just so it's not my opinion, here are links to the edits to show what was added:    and . I would appreciate if other editors would like to comment on whether it was removed unjustly or not. I also explained what the problems were, in detail, on Netra Nahar's talk page. Asking why they were removed is somewhat superfluous. --Walter Görlitz (talk) 18:17, 12 September 2011 (UTC)
 * Walter, I tend to agree with you. I am trying to see where Netra is going, but the concepts are not well-endorsed by the testing community.  Netra, I would put what you are calling "gray box" as merely a restatement of "white box", as you are claiming the tester has knowledge of the internals. WikiWilliamP (talk) 20:06, 12 September 2011 (UTC)
 * That is what grey box is, but knowledge of the internals at the level of algorithm or logic, not in terms of access to source code. The article describes that well. --Walter Görlitz (talk) 20:38, 12 September 2011 (UTC)
 * A similar thing was in data mining. I had to revert his edits there because they were highly redundant, improperly researched (e.g. mixing AI, data mining and machine learning) and not properly wiki formatted. I tried to leave a polite comment explaining my reasoning at User talk:Netra Nahar. I have the strong impression that we're seeing here a coursework assignment: India_Education_Program/Courses/Fall_2011/Software_Testing_and_Quality_Assurance ... --Chire (talk) 19:25, 14 September 2011 (UTC)

Software testing levels: beta testing
"Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.[citation needed]" Could this be expanded? This is now quite common practice for web sites and, typically, the larger the site then the longer the duration of the beta testing. Google mail was in beta for 5 years! For citations, how about http://www.slate.com/articles/news_and_politics/recycled/2009/07/why_did_it_take_google_so_long_to_take_gmail_out_of_beta.html or even a much earlier article: http://www.zdnet.com/news/a-long-winding-road-out-of-beta/141230 — Preceding unsigned comment added by 86.19.211.206 (talk) 15:19, 7 October 2011 (UTC)

Manual Testing vs. Human Testing
What are your thoughts on using the term "human testing" or "human performed testing" instead of manual testing? Does it make sense? This in contrast to machine performed testing, robot performed testing or automated testing.

Regards. — Preceding unsigned comment added by Anon5791 (talk • contribs) 22:21, 14 October 2011 (UTC)
 * It's a good idea but not supported in the literature. The distinction is manual vs. automated testing. If you could find sources to back that sort of change, feel free to add it. --Walter Görlitz (talk) 22:30, 14 October 2011 (UTC)

History section
After " Dave Gelperin and William C. Hetzel classified in 1988 the phases and goals in software testing in the following stages" a list follows that extends beyond 1988. More explanation is necessary. Jeblad (talk) 20:28, 27 December 2011 (UTC)

Risk-based testing
Risk-based testing appears to have been written without consideration that it is probably better off as just a brief sentence in this article, if independent reliable sources demonstrate such weight is due. In other words, the article appears to be a neologism, WP:POVFORK, and a bit of a soapbox. Can anyone find sources to justify a brief mention in this article, or maybe sources enough to keep Risk-based testing as an article in itself? --Ronz (talk) 03:15, 31 January 2012 (UTC)
 * There is sufficient amount there. There are many other articles that are shorter or have fewer references (some completely unreferenced) and your choice of target seems misplaced. --Walter Görlitz (talk) 03:58, 31 January 2012 (UTC)
 * WP:OSE, WP:FOC. --Ronz (talk) 23:00, 31 January 2012 (UTC)
 * The subject you want to merge is notable and no one is taking your actions personally, I just don't agree that that article should be merged into this one. Your assertion that the other article needs "independent reliable sources" is flawed as it does have them. So the request to merge it here is premature and misplaced. --Walter Görlitz (talk) 23:21, 31 January 2012 (UTC)
 * Not a neologism as it's well represented on Google searches: http://www.google.com/search?q=%Risk-based+testing": 112,000 results. Risk based testing - Schaefer - Cited by 3, Heuristic risk-based testing - Bach - Cited by 54, Risk-based testing::: Risk analysis fundamentals and … - Amland - Cited by 54. So it's a common term in the field of software testing. --Walter Görlitz (talk) 23:31, 31 January 2012 (UTC)
 * Google searches aren't sources.
 * When I previously searched, what I found appears to be a marketing term for...well, it's hard to tell. Looks like a lot of reinventing the wheel, or just copying ideas from others and putting a new name on it for marketing sake. Maybe as a marketing term it's notable enough. Was it intentionally overlooked for inclusion in this article because of its promotional and sophomoric nature? --Ronz (talk) 02:23, 1 February 2012 (UTC)
 * Nor were they offered as sources, but rather as proof that your theory that this is a neologism is wrong. Your claim that it's a marketing term is WP:OR and full of holes. It wasn't intentionally overlooked. It's not promotional. It's not sophomoric. There are four main camps in software testing (see http://www.testingeducation.org/conference/wtst_pettichord_FSofST2.ppt or you can check it out with a power pass subscription on StickyMinds.com) and the school to which this approach belongs (the context-driven school) is the smallest and has the fewest authors but those authors (Kaner, Bach, Bach, and a few others) are the most highly-respected. They have a few other approaches that are used in different situations. Further discussion of this in the Controversy section and its main article.
 * The editors of this article are primarily in the two larger groups (those who rely on what they call "best practices") and discount these sorts of pragmatic approaches to testing. Not only do they not recognize it or the other context-sensitive activities, they discount them as ineffective, which is why they are not written about. --Walter Görlitz (talk) 02:38, 1 February 2012 (UTC)
 * Conspiracy theories are not a substitute for sources. Sheds light on the behavioral problems though... --Ronz (talk) 18:50, 1 February 2012 (UTC)
 * There are sources. Please comment on the subject not on the editors. --Walter Görlitz (talk) 20:31, 1 February 2012 (UTC)
 * "Please comment on the subject not on the editors" I'm happy to refactor any of my comments per the relevant policies and guidelines. Of course, that doesn't appear to be the issue here.
 * So we've established that Risk-based testing is a pov-fork to get around the views of the editors here. Good to know. --Ronz (talk) 18:23, 2 February 2012 (UTC)
 * You've claimed that risk-based testing is a pov-fork and have not supported your case. Good to know. --Walter Görlitz (talk) 19:29, 2 February 2012 (UTC)
 * Misrepresenting other editors is disruptive. Please stop.
 * The case was made at 02:38, 1 February 2012. If this information is true, then it is a pov-fork. --Ronz (talk) 16:41, 3 February 2012 (UTC)
 * The case was not made in that statement simply indicating a vital branch of software testing uses the term. Others don't agree on the use of the term because it does not fit their method of testing. They use terms of the context-driven school differently. They would use ad hoc testing as a negative while the context-driven school uses it as a positive. They think that exploratory testing is not at all organized and should be abandoned for the use of formal, codified testing. That doesn't mean that they are fringe ideas when a good percentage of the testing community use the methods. There are others, but since you're unaware of the controversies, I've alerted a discussion group who are mostly from the context-driven school about your intentions on the risk-based testing article. If there is little or no response from that over the next few weeks, I'll assent to the merging of the article. If there is a response you can tell them to take it to a proper forum. But I don't think that discussion forum is what you had in mind. --Walter Görlitz (talk) 17:36, 3 February 2012 (UTC)

Testing of multithreaded applications
This article seems to have nothing of the sort. There would be issues of synchronization, hazards, races, and use of semaphores or mutexes which, in addition to being designed more or less correctly, need to be tested. There would be the issue of a fast producer and a slow consumer and how this is handled by the system, or by the application. Sending of pointers to shared data: is it done, and does it work? There would be priority issues of processes or messages, and priority inversion handling. And this would probably only be some of the factors which would need to be tested. Should there be a separate chapter about this in the article? — Preceding unsigned comment added by Aclassifier (talk • contribs) 12:07, 26 March 2012 (UTC)
 * It's covered with race conditions. If you want to specifically find one or more reliable sources that discuss it, we could add it on its own. --Walter Görlitz (talk) 13:45, 26 March 2012 (UTC)
 * I can't find race conditions mentioned. The list I mention above has more than races. Also, the article has lots of chapters without references, so writing a non-referenced chapter about testing with respect to multithreadedness probably should be ok for a start? Øyvind Teig (talk) 07:27, 27 March 2012 (UTC)
 * Then race conditions should be added and feel free to add commentary about testing in multithreaded environments, but it should be supported with WP:RS. --Walter Görlitz (talk) 13:50, 27 March 2012 (UTC)
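
For illustration only, here is a minimal sketch of the kind of test the thread above is asking about: several threads hammer a shared counter, and the test asserts that no increments are lost. The `Counter` class and the `hammer` helper are hypothetical names invented for this example, not something taken from the article or from a cited source.

```python
import threading


class Counter:
    """Hypothetical shared counter whose updates are guarded by a mutex."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is the critical section.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, n_threads=8, n_increments=10_000):
    """Increment the counter from many threads and return the final value."""

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    # With the lock in place, no update may be lost.
    assert hammer(Counter()) == 8 * 10_000
```

A test like this can only show the presence of a race, not its absence: an unguarded counter may still pass on a given run, which is part of why such defects are hard to reproduce.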

Ethics of testing
Is this worth mentioning? There may be requirements outlined in standards (like IEC 61508), but there is no mention of ethics? How does one treat or test a situation that would be very rare? What would the consequences be? How much do we tell to the end user what has been and what has not been tested (in this version)? How do we reply to a question? I don't know much about this, but to me it seems relevant. Should there be a separate chapter about this? Øyvind Teig (talk) 12:16, 26 March 2012 (UTC)
 * You're not really talking about ethics here, but again if you know of reliable sources, we could add a section. --Walter Görlitz (talk) 13:45, 26 March 2012 (UTC)
 * The situations I describe above in my opinion all raise ethical questions. Dreaming up an example: What if you know of a very rare fault in a car braking system but you cannot replicate it in a test? You have just seen it "once". But maybe ethical matters are as relevant to testing as they are to "everything else", and then too difficult to make a separate point of here? Øyvind Teig (talk) 07:38, 27 March 2012 (UTC)
 * I understand the issues. Kaner has commented that life-critical system testing drives the adoption of a lot of testing approaches because if there is a loss of life, and a lawyer somewhere can show that some obscure testing approach could have found the problem that resulted in the loss of life, the company will be held accountable. Your situation is simple: the defect is reported but marked as can't reproduce. --Walter Görlitz (talk) 13:50, 27 March 2012 (UTC)

Non-functional testing
There are two problems with the discussion of non-functional testing: --AlanUS (talk) 18:06, 31 March 2012 (UTC)
 * 1) The description of non-functional testing at the top of the file is fundamentally different from the one given in the introduction to the non-functional testing section itself.
 * 2) The elements underneath the non-functional testing section are not proper subelements of non-functional testing as defined in the introduction to the section.
 * Do you have a suggestion for fixing it? I have not yet looked at it. --Walter Görlitz (talk) 21:35, 31 March 2012 (UTC)

The whole section on "functional" versus "non-functional" is wrong. The distinction alluded to is between verification ("Did we code the thing right?") and validation ("Did we code the right thing?").

Functional refers to the code which is called by the Code Under Test (CUT). Integration refers to the code which calls the CUT. That is probably the single most important distinction in all of software testing, and it's not even part of the vocabulary for most coders.

The reason it's so important is that most people combine integration- and functional-testing. That's actually validation, though most people lazily call that whole enchilada integration-testing. It's the hardest to debug. They should perform integration-testing without functional-testing by faking the code which is called by the CUT, or by using a trivial version of the CUT.

Many people say unit-testing when they mean functional-testing. Proper Unit testing mocks the code called by the CUT so that only the CUT is executed.

-- Cdunn2001 (talk) 17:58, 7 July 2013 (UTC)
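
As a rough sketch of the isolation described in the comment above, the example below replaces the code called by the CUT with a mock so that only the CUT itself runs. The function `total_price` and the `tax_service.rate_for` collaborator are hypothetical names made up for this illustration; only `unittest.mock.Mock` is a real standard-library API.

```python
from unittest.mock import Mock


def total_price(net, country, tax_service):
    """Code under test (CUT): calls a collaborator to look up a tax rate."""
    return round(net * (1 + tax_service.rate_for(country)), 2)


def test_total_price_in_isolation():
    # The collaborator is mocked, so no real tax lookup is executed.
    tax_service = Mock()
    tax_service.rate_for.return_value = 0.5

    assert total_price(100.0, "XX", tax_service) == 150.0
    # Also check how the CUT used the code it calls.
    tax_service.rate_for.assert_called_once_with("XX")


if __name__ == "__main__":
    test_total_price_in_isolation()
```

Swapping the mock for the real collaborator would turn the same check into a test of the CUT together with the code it calls.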

Positive and negative testing
Positive and negative test cases redirects here, but neither is explained in the article. -- Beland (talk) 18:16, 5 October 2012 (UTC)

Stress Testing
Since 2008 there has been a detailed article on stress testing software, 100% devoted to the topic. Stress test (software). The reference here to that specific article has been reverted back 2x to the very general Stress testing article? That's a broad brush article covering hardware, software, financial (bank stress tests), and may soon cover medical/human stress testing (cardiac, voice, labor & delivery, emotional stress testing, etc.). Seems to make no sense that this article which is 100% devoted to software should not point directly to Stress test (software), since the detail reader here is known for sure to be focused on software Rick (talk) 03:34, 25 February 2013 (UTC)
 * Since 2008 it has been at stress testing (software). It was moved about six hours ago to the general article, which was created in 2003. I have no problems pointing it to the software-specific article once it has been returned to its original location. --Walter Görlitz (talk) 04:18, 25 February 2013 (UTC)

The above refers to revert1 and revert2 Where on the page, Software testing, the link: has been twice reverted to the more general:
 * Stress test (software) (specific article 100% on software stress testing)
 * Stress testing (broad, more general article)

Need to pin down what is meant by (it) in:
 * Since 2008 (it) has been at stress testing (software). (It) was moved about six hours ago to the general article, which was created in 2003. I have no problems pointing (it) to the software-specific article once (it) has been returned to its original location.

Also need to pin down exactly what is being referred to in these:
 * Since 2008 it has been at stress testing (software). It was moved about six hours ago to (the general article), which was created in 2003. I have no problems pointing it to (the software-specific article) once it has been returned to its (original location)
 * I'm sorry for my broad use of pronouns. Stress testing (software), 2003 title, was moved to Stress test (software) which is against naming conventions in these projects. Stress testing (software), 2008 title, was created at that location and was not really part of this discussion until you pointed the link here to an admitted poor choice to an article whose move is in dispute. When that dispute is settled, then we can decide where the link here should go. Until then, don't bother. --Walter Görlitz (talk) 07:22, 25 February 2013 (UTC)

Weekend Testers America to edit on this subject 7 September 2013
This article and other related articles may be subject to editing by inexperienced editors as part of an effort to improve the quality of information on the subject of software testing: http://weekendtesting.com/archives/3095

Please be kind.

Cmcmahon (talk) 22:45, 5 September 2013 (UTC) (I am WMF staff but operating here not in my official capacity)
 * The current problem with most of the articles is lack of sources. If the edits come with sources, there will be no problems. If they come without sources, or if they come with bad grammar, there will be reverts. Walter Görlitz (talk) 23:10, 5 September 2013 (UTC)

Monitoring part of Software Testing?
I think monitoring (for example with nagios) is part of software testing. The current wikipedia article does not cover this. There is a section about alpha-testing, then about beta-testing. It is long ago that programs were written, then burned onto a CD/DVD and then sold. Today most software is server based and the programmers are able to care for the software during live execution. Like "DevOps": watching the processes is part of software testing. I am not a native speaker, that's why I don't want to write on the real wiki article. But maybe someone agrees with me and can add something to the real article. — Preceding unsigned comment added by 89.246.192.60 (talk) 19:47, 8 November 2013 (UTC)
 * I am not aware that monitoring is a testing role. It is usually the role of an IT team. Walter Görlitz (talk) 19:56, 8 November 2013 (UTC)


 * Can you back up the claim that monitoring would be part of testing somehow? I find that hard to believe. Slsh (talk) 15:28, 30 January 2014 (UTC)

I think "Is monitoring part of software testing?" has no definite answer. But I hope all agree: It is **related** to software testing. That's why I think some sentences about monitoring (nagios checks) should be included in the page. Up to now I am too new to wikipedia and don't know how to start. But if someone starts, I would love to give feedback. Guettli (talk) 07:25, 6 November 2015 (UTC)

In my context, monitoring is part of testing. The canonical (but maybe not best) reference is this decade-old talk from Ed Keyes, where he says "Sufficiently Advanced Monitoring is Indistinguishable from Testing" (video link). Angryweasel (talk) 23:18, 17 November 2017 (UTC)

One of the poorest written wikipedia articles ever
I am a software author with more than 40 years experience of testing. This article is massively oversized for what is essentially a "simple" process. Most people understand what a knife is used for and would recognize cutting implements of different kinds from the stone age up to the present time. A stone aged "tester" would perform the task of "testing" his product in much the same way as a modern day butcher. Does the cutting implement do what it is supposed to do? If not why not? How can it be fixed? The article differentiates debugging from testing despite the fact that testing is the most obvious way of identifying errors. I believe there should be a history section that clearly identifies at what stage each advance in manual or automatic program validation techniques progressed. The present article would have us believe that there were numerous "arcane" sub divisions of software testing from the outset. This is so not true. There have been paradigm shifts in the art of testing and debugging that are reflected in the commercial tools that have evolved to assist the progress right up to the present day. To some extent the vast number of programming paradigms, languages and hardware platforms has hampered the development of universal testing tools - but the concept of assisted testing has existed since at least the 1970's and is understated. It is a truly dreadful article. — Preceding unsigned comment added by 81.154.101.27 (talk) 09:44, 4 January 2014 (UTC)


 * I think this article is hardly oversized, if you consider the amount of theory, research, books published, tools existing for the process. Do you consider all that to be oversized as well? Debugging is different from testing, although it is a common error to mix up the two. Testing is about finding problems; debugging is a way of diagnosing a problem, finding a root cause for a problem you already know exists. Also, there are a lot of different kinds of testing done by different people. If the product to be tested is of importance, you would have dedicated usability testing, performance testing, system testing and acceptance testing - they should not be done by the same people, but by people that are actually trained for their field of testing. Slsh (talk) 15:26, 30 January 2014 (UTC)

Good programmers are lazy. That's at least my opinion. Yes, there is a big amount of theory, ... but what's the goal of this article? Do in-depth theoretical academic work, or give a good overview? For me the overview is more important than the details. I would like a much shorter article, too. If some parts need more in-depth explanations, then a new page needs to be created. For example "security testing". I guess only 0.0001% of all developers work in an environment which needs security testing. Yes, it is important for some people, but only very few. PS: I talk about "security testing". "Security concerns" is something else. This needs to be done by every developer daily. Guettli (talk) 07:33, 6 November 2015 (UTC)

Certifications are not so controversial as the article claims
Article claims that "Several certification programs exist to support the professional aspirations of software testers and quality assurance specialists. No certification now offered actually requires the applicant to show their ability to test software. No certification is based on a widely accepted body of knowledge.", but what is the actual basis of claiming so? There are certainly others that don't believe this to be true, see for example ISTQB, "The scheme relies on a Body of Knowledge (Syllabi and Glossary) and exam rules that are applied consistently all over the world, with exams and supporting material being available in many languages.". Added citation needed-template. Slsh (talk) 15:34, 30 January 2014 (UTC)
 * I removed your Citation needed tags because it's all discussed in the Kaner reference. The context sensitive school is heavily against any certification despite the existence of documents, the bodies of knowledge do not always agree on terms and definitions. Compare ISTQB with the CSTE or CSQA bodies of knowledge. I suggest you read the Kaner document. Walter Görlitz (talk) 15:51, 30 January 2014 (UTC)
 * Well, a) first of all, Kaner is just one person, even if influential. Is there anything to back up his claims? At the very least, this claim should be marked as controversial, not as The Truth, as there are plenty of opposite views. Or, it should add that "according to Kaner" or "according to context sensitive school". It is not the only view there is. b) There are two Kaner document references in the section we talked about, this, from 2001, and this, from 2003. That's over ten years old already. ISTQB was founded in 2002, and since then, it is active with 47 member boards in 71 countries (source). For such claims, a more up-to-date reference would be needed. c) I've read both documents and I haven't actually seen him claiming the things that are claimed in this article. Can you pinpoint what is it exactly that you're referring to? Slsh (talk) 16:51, 3 March 2014 (UTC)
 * There are at least two others: Bach and Bolton. Probably a dozen, all of whom are members of the context sensitive school, and all of whom are published and recognized. The point is not whether it's controversial or not, it's whether it's referenced. Walter Görlitz (talk) 05:50, 4 March 2014 (UTC)
 * Then the part should say that "according to context sensitive school", with explicit references. The views of the context-sensitive school are even listed in Software testing controversies, so it should be a no-brainer to add the note that the views are controversial and not agreed by all. Slsh (talk) 11:44, 5 March 2014 (UTC)
 * I use them as an example. There are others who feel that way. And we don't indicate who finds certifications as non-controversial, so why should we list who finds them so? Walter Görlitz (talk) 17:19, 5 March 2014 (UTC)
 * I guess you're just too biased on this to see what's wrong with your reasoning. There needs to be a second opinion from someone else who has a more neutral view. Slsh (talk) 09:02, 10 March 2014 (UTC)
 * I guess you're just too biased on this to see what's wrong with your reasoning. Walter Görlitz (talk) 14:44, 10 March 2014 (UTC)
 * Just wanted to make note that even Kaner himself seems to disagree with your claims about his work which I didn't notice until now. But I guess that won't convince you either? Slsh (talk) 20:18, 25 August 2014 (UTC)

Negative testing
Negative test is a disambig. The software meaning links here, but this page doesn't contain the word "negative". --Dan Wylie-Sears 2 (talk) 01:48, 11 April 2014 (UTC)
 * Right. The DaB page states that it's a "test designed to determine the response of the system outside of what is defined. It is designed to determine if the system fails with unexpected input." Walter Görlitz (talk) 02:07, 11 April 2014 (UTC)
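
To make the quoted definition concrete, here is a small sketch of a positive test next to several negative tests. The function `parse_age` and the helper `rejects` are hypothetical and exist only for this illustration; the point is that the negative cases feed the system input outside what is defined and check that it fails in a controlled way.

```python
def parse_age(text):
    """Hypothetical function under test: parse an age between 0 and 150."""
    value = int(text)  # raises ValueError for non-numeric text
    if not 0 <= value <= 150:
        raise ValueError(f"age out of range: {value}")
    return value


def rejects(text):
    """Return True if parse_age refuses the input with a ValueError."""
    try:
        parse_age(text)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    # Positive test: valid input gives the expected result.
    assert parse_age("42") == 42

    # Negative tests: unexpected input must be rejected, not accepted silently.
    for bad in ["", "abc", "12.5", "-1", "151"]:
        assert rejects(bad), bad
```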

combinatorial test design
The article desperately needs a definition of the term or a link to another article in which it is defined. — Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
 * It does, but it doesn't use IBM's term of "Combinatorial Test Design", it uses instead the more common term "all-pairs testing", which is linked in the black-box testing section. Walter Görlitz (talk) 20:46, 21 May 2014 (UTC)
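
As a rough illustration of the idea behind all-pairs testing, the sketch below checks whether a small test suite exercises every pair of parameter values at least once. The helper `uncovered_pairs` and the browser/OS/locale parameters are invented for this example and are not drawn from the article or from IBM's material.

```python
from itertools import combinations, product


def uncovered_pairs(parameters, tests):
    """Return the parameter-value pairs that no test in the suite exercises."""
    names = list(parameters)
    required = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va, vb in product(parameters[a], parameters[b])
    }
    covered = {
        ((a, test[a]), (b, test[b]))
        for test in tests
        for a, b in combinations(names, 2)
    }
    return required - covered


# Hypothetical configuration space: 2 x 2 x 2 = 8 exhaustive combinations.
PARAMETERS = {
    "browser": ["firefox", "chrome"],
    "os": ["linux", "windows"],
    "locale": ["en", "de"],
}

# Four tests are enough to cover every pair of values once.
PAIRWISE_SUITE = [
    {"browser": "firefox", "os": "linux", "locale": "en"},
    {"browser": "firefox", "os": "windows", "locale": "de"},
    {"browser": "chrome", "os": "linux", "locale": "de"},
    {"browser": "chrome", "os": "windows", "locale": "en"},
]

if __name__ == "__main__":
    assert uncovered_pairs(PARAMETERS, PAIRWISE_SUITE) == set()
```

Removing any one of the four tests leaves some pairs uncovered, which the helper reports.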

"Grey-box testing" section defines nothing.
One does not need access to logs or databases to understand an algorithm or internal data structure and vice versa. So, there is nothing that actually distinguishes gray box from white or black box testing. — Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
 * That's because no one practices true "black-box" testing. Most of what passes as "black-box" is actually "grey-box" testing. The distinction is made here. Walter Görlitz (talk) 20:47, 21 May 2014 (UTC)
 * No, it isn't - see above. — Preceding unsigned comment added by 68.183.37.170 (talk • contribs) 21:40, 21 May 2014 (UTC)
 * I don't believe you understand what the definition of grey-box testing is. If you do, please offer one rather than simply negating the one used in the article, which is based on the definition in Testing Computer Software, Second Edition (1993). Walter Görlitz (talk) 01:01, 22 May 2014 (UTC)

Acceptance testing
Is it a level or type of testing? If the answer is "both", then the notions of "level" and "type" overlap and those two sections would have to be combined. — Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
 * Sorry you don't like the ambiguity of the language. There are two types of testing that are commonly called acceptance testing:
 * Acceptance into the test cycles, which then relies on a smoke test, BVT or something similar, and
 * User acceptance test, which is when the client who paid for the work accepts the product.
 * So it's both, and perhaps you can suggest a way to explain that better in the article. Walter Görlitz (talk) 20:50, 21 May 2014 (UTC)

Certification provider spam
I've removed most of the entries in the certification provider section. Wikipedia is WP:NOTDIRECTORY, and the section was getting awfully spammy with all the entries lacking articles or secondary sources. If these testing certifications are actually provided by noteworthy organizations, their inclusion should be supported by either an article, or a WP:SECONDARY source. If no such sources can be found, it becomes impossible to tell the difference between a legitimate service and a certification-mill. Regardless, Wikipedia is not a platform for advertising, which is what this amounted to. I think the few remaining entries should also be removed, unless there are any objections. Grayfell (talk) 20:28, 6 September 2014 (UTC)

NIST study
The NIST study isn't a credible source for the economic estimate of the costs of software defects to the economy. It comes up with weird results, like "on average a minor software error has a cost of four million dollars" or "minor errors can cost more than major ones" (both from Table 6-11). It has unreasonably low sample sizes - fewer than 15 software developers, and even though the user portion of the study had 179 plus 98 respondents, that represents a dismally low response rate that would have resulted in tossing the study in most academic publications. Most crucially, it isn't based on any actual in-house measurements but on a 25-minute survey which asked people to guess what bugs were costing their company.

More details here: https://plus.google.com/u/1/+LaurentBossavit/posts/8QLBPXA9miZ — Preceding unsigned comment added by LaurentBossavit (talk • contribs) 14:32, 13 September 2014 (UTC)


 * It seems that it would make sense to move the NIST information and Laurent Bossavit's commentary to the Controversy section. Thoughts? Yorkyabroad (talk) 13:36, 9 December 2017 (UTC)
 * Sounds good to me. It wouldn't be hard to find a citation to support "It is commonly believed that the earlier a defect is found, the cheaper it is to fix it," but it would be hard to find solid data to back that belief. Faught (talk) 22:49, 11 December 2017 (UTC)
 * I was looking at the arguments about not having a separate controversy section, but the guidance there is about having one-sided topics, and all of the topics, including this one, discusses both sides, so yes, it makes sense to move it. Walter Görlitz (talk) 07:33, 13 December 2017 (UTC)

Editing Needed in the first section of this page
I noticed that the first 3-4 paragraphs in the very first section of this page repeat themselves. If you read it, you'll see what I mean. It could really use to be cleaned up. I would gladly do it, but I don't want to just jump in and take care of it without bringing it up here first, and since I have no idea how long it might take for this process to play out, someone else will probably want to do it, at least if anyone cares about how intelligent the article should appear to be, considering the subject matter. — Preceding unsigned comment added by 184.78.188.225 (talk) 03:50, 14 September 2014 (UTC)

Same sentences used for Unit Testing and Development Testing
I've noticed that two sections use the exact same sentences when discussing two different types of testing:

Unit testing is a software development process that involves synchronized application of a broad spectrum of defect prevention and detection strategies in order to reduce software development risks, time, and costs. It is performed by the software developer or engineer during the construction phase of the software development lifecycle. Rather than replace traditional QA focuses, it augments it. Unit testing aims to eliminate construction errors before code is promoted to QA; this strategy is intended to increase the quality of the resulting software as well as the efficiency of the overall development and QA process.

Depending on the organization's expectations for software development, unit testing might include static code analysis, data flow analysis, metrics analysis, peer code reviews, code coverage analysis and other software verification practices.

Development Testing is a software development process that involves synchronized application of a broad spectrum of defect prevention and detection strategies in order to reduce software development risks, time, and costs. It is performed by the software developer or engineer during the construction phase of the software development lifecycle. Rather than replace traditional QA focuses, it augments it. Development Testing aims to eliminate construction errors before code is promoted to QA; this strategy is intended to increase the quality of the resulting software as well as the efficiency of the overall development and QA process.

Depending on the organization's expectations for software development, Development Testing might include static code analysis, data flow analysis, metrics analysis, peer code reviews, unit testing, code coverage analysis, traceability, and other software verification practices.

Can someone who understands this topic better please clean that up? — Mayast (talk) 22:12, 23 November 2014 (UTC)

Back-to-back testing
I think that back-to-back testing is missing & should be mentioned in this article.--Sae1962 (talk) 14:59, 16 June 2015 (UTC)

External links modified
Hello fellow Wikipedians,

I have just added archive links to one external link on Software testing. Please take a moment to review my edit. If necessary, add after the link to keep me from modifying it. Alternatively, you can add to keep me off the page altogether. I made the following changes:
 * Added archive https://web.archive.org/20150402110525/http://channel9.msdn.com/forums/Coffeehouse/402611-Are-you-a-Test-Driven-Developer/ to http://channel9.msdn.com/forums/Coffeehouse/402611-Are-you-a-Test-Driven-Developer/

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Cheers. —cyberbot II  Talk to my owner :Online 16:54, 27 August 2015 (UTC)

Recent overhaul
The changes made by NoahSussman were too much. The addition of sources was good, but not all meet WP:RS. Elisabeth Hendrickson is, but Kate Falanga is not, and going from 73 references to 36 references isn't an improvement. Removing common terms such as Black-box and white-box testing is incomprehensible. It is too much to review in a single sitting. Walter Görlitz (talk) 14:01, 15 March 2017 (UTC)


 * Walter Görlitz please explain why this wholesale reversion doesn't violate
 * WP:MASSR
 * WP:REVEXP
 * Generalizations like "not all meet WP:RS" do not help the author improve the article as they are not usefully specific. Additionally, "Kate Falanga is not" cites no rule for WP:RS... Why is Kate's work "not"? — Preceding unsigned comment added by Cyetain (talk • contribs) 15:20, 15 March 2017 (UTC)
 * Additionally your comment that "Wholesale removal of common terms is 'simply ignorant'" as well as the vague insinuations about Kate violate WP:NPA and you should consider removal of this derogatory language WP:RPA — Preceding unsigned comment added by Cyetain (talk • contribs) 15:28, 15 March 2017 (UTC)
 * I thought I explained above. My rationale is that she's not a recognized expert in the field. The page is essentially a company blog. Feel free to go to WP:RSN to see if they think it's a good source. I'm sorry if you think it's ignorant, yet you offer even less of a reason to include. Walter Görlitz (talk) 16:49, 15 March 2017 (UTC)
 * It's also terribly kind of you to create an account just to complain. See WP:SPA.
 * I totally missed your incorrect claim that discussion of Falanga is a violation of NPA. Talk pages are where we discuss editors' actions and doing so in a neutral way does not violate NPA. However, if you're suggesting that Falanga and the previous editor are one and the same, then we have a case of WP:COI. However I'm not sure why Falanga would select the user name of "NoahSussman" to edit under. And since I did not discuss NoahSussman but simply pointed out that Falanga's work is not likely a RS, there was no violation of NPA in any sense. Walter Görlitz (talk) 16:56, 15 March 2017 (UTC)


 * What reason do you have to believe Kate isn't a recognized expert?
 * You've misrepresented what I've said. I didn't accuse you of being ignorant... You accused @NoahSussman of being ignorant.
 * You may want to look up Noah before accusing Kate of sock puppetting Noah... another violation of NPA.
 * This isn't a single purpose account. Again please refrain from personal attacks.
 * You've also COMPLETELY failed to explain why this revision isn't a violation of : WP:MASSR & : WP:REVEXP — Preceding unsigned comment added by Cyetain (talk • contribs) 17:58, 15 March 2017 (UTC)
 * I explained the former, didn't call them socks (I couldn't figure out the logic), and I don't feel the need to explain the two things you claim I'm violating because I'm not violating essays. 19:06, 15 March 2017 (UTC)

Walter Görlitz It would be nice if you would consider Noah Sussman's work ongoing, and criticize it or update it point-by-point if necessary, rather than revert these changes wholesale. This page has been a shambles for years, and now that finally someone competent is updating it, the software testing community would appreciate as much support as Wikipedia can give. If it makes a difference, I was the QA Lead at WMF for about three years, and I can vouch that no one in this conversation is a sock puppet. Cmcmahon (talk) 18:16, 15 March 2017 (UTC)
 * I am considering it, which is why I started the discussion, but removing half of the refs is problematic and removing half the content isn't helpful. Walter Görlitz (talk) 19:06, 15 March 2017 (UTC)

As a reminder you have not justified your reversion of Noah's edits, you've simply stated that the revisions weren't in your opinion "helpful." Which specific revisions weren't helpful? Why? If 1/2 of the content is moved to other pages has it been removed? or simply edited? If half the content is moved, wouldn't one expect half of the references to be removed as well?

The language you use here " I am considering it", "It is too much to review in a single sitting." is very reminiscent of WP:OWNBEHAVIOR please remember WP:OWNERSHIP — Preceding unsigned comment added by Cyetain (talk • contribs) 19:34, 15 March 2017 (UTC)

I will readily admit that I made massive edits to see what would happen. I am sympathetic to anyone who would like their work chunked at the smallest grain that is practical :) I will redo the edits in small chunks. HOWEVER the "wholesale removal" is INACCURATE AND WRONG as I MOVED the content in question to a new page, which is linked from the old location of the content. I intend to apply this change again. IT IS NOT REMOVAL OF BLACK AND WHITE BOX TESTING I am simply complying with the "too large" box that I *found in place* on the page. I am trying to follow the extant instructions for improving wikipedia and making the page smaller by extracting list content into a "list of things" page. So I fully expect not to get pushback on that change when I re-implement it in the near future. Thank you and I look forward to continuing the discussion / your thoughts / your further feedback NoahSussman (talk) 10:38, 16 March 2017 (UTC)

"73 references to 36 references isn't an improvement." it is if half the references are to marketing material, out of date, badly written material or material that is all three at once. As is the case here. Too many unreliable / marketing links on this page is a serious credibility problem. Again though I will now challenge one reference at a time rather than attempting any more bulk deletion. No more bulk deletions. But the references on this page blow chunks and I will END THEM NoahSussman (talk) 10:45, 16 March 2017 (UTC)
 * Thanks for discussing.
 * Let's address very long. It's currently 79,017 bytes, including references. The prose is around 57,000 bytes or around 8600 words. Wikipedia:Article size suggests nothing about wholesale ripping out sections, even if the template does. It talks about making "readable-prose". Since most reading is likely done by clicking through to a section, reading the summary or clicking through to an article. I don't have any metrics to support that, but it's been my experience with summary articles like this, both my own and watching friends and co-workers.
 * References may be out-of-date, but in those cases, we don't remove them. That term can mean two things in Wikipedia terms. The first is that the link no longer works or that the data is outdated. I suspect that you're implying the former. Wikipedia:Link rot discusses how to address that problem. In short, if you can find an updated version of the content, update it. If you can't, update the reference with a dead link. If you were suggesting the latter, add a link to new content, however, I'm not sure how a technique can become outdated. I'm not sure which references you thought were WP:REFSPAM or marketing links, but I'd be happy to work through it. I suspect that if many of those links you are talking about were added today, I would remove them as not meeting WP:RS. Walter Görlitz (talk) 14:19, 16 March 2017 (UTC)

"I'm not sure how a technique can become outdated" - well, this is the crux of the problem with this page, I think. The whole page is based on an idea of Software Testing that emerged decades ago, while software development techniques have moved on. Many of us software testers have extended/adapted our methods over the last decade or so, and I agree with Noah that major changes are necessary to the whole thing. I do understand the reluctance to throw away large parts of the page (and references) and I hope we do get a better page out of this discussion. Rutty (talk) 15:00, 16 March 2017 (UTC)
 * But the fundamentals, which is what this article is about, have not changed. Functional testing is still that. Let's put some meat on this. Are you saying that black- and white-box testing are outdated? If you have extended and adapted those techniques, then add those extensions and adaptations in the articles that discuss them, but leave a brief summary here. Walter Görlitz (talk) 15:11, 16 March 2017 (UTC)

Generative Testing, QuickCheck, etc.
I'd like to see some discussion of generative testing, supplemented by a link to the QuickCheck page. RichMorin (talk) 03:29, 28 September 2017 (UTC)

Roles section
Does this section provide value or reference on software testing roles? I'm questioning whether it can be removed or merged into another section.

Furthermore, the list of "roles" in this section are just a snapshot of some software testing titles, and there are so many of these that I don't think listing a few of them would help any reader. — Preceding unsigned comment added by Angryweasel (talk • contribs) 23:22, 17 November 2017 (UTC)
 * It would be better to update the section. SDET has started to be used in some areas. The term "quality analyst" has become synonymous with software tester. The section could easily be expanded. As long as the section doesn't become a WP:COATRACK for terms, it doesn't hurt to keep the section. Walter Görlitz (talk) 23:55, 17 November 2017 (UTC)
 * The terms you mentioned are titles, not roles. What do you think about renaming the section to Testing Job Titles and repurposing it in that direction? As it is, it has nothing to do with Roles. A Role is a description of what someone does - none of the examples fit that description. Angryweasel (talk) 16:50, 18 November 2017 (UTC)
 * I'd be fine with that, or expanding to incorporate roles. Walter Görlitz (talk) 18:07, 20 November 2017 (UTC)

Date format
I don't see a strong precedent for the date format in citations. I see things like "2012-01-13" and "July 1, 2009". Is there a preference? I think "July 1, 2009" is more readable. Faught (talk) 19:23, 21 November 2017 (UTC)
 * MOS:DATE states that the formatting should be unified, but not whether one format or another should be used. MOS:STRONGNAT, generally speaking, states that some subjects have strong ties to a national format. So "international" English subjects and U.S. military would use Day Month Year format (28 August 2015) while American subjects would use Month Day, Year format (August 28, 2015). Canadian subjects may use either but shouldn't change unless there's a reason. However, software testing doesn't have strong national ties to either format and so ISO 8601 format (2015-08-28) is probably the best to use, as long as it's not different between references (which seems to be the case now) and we can agree to it (which is what this discussion could achieve). Walter Görlitz (talk) 20:17, 21 November 2017 (UTC)
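
For anyone normalizing the citation dates with a script, here is a minimal sketch of the three formats mentioned above, rendered from one date. It assumes Python's standard `datetime` module and the default (English) month names; the variable names are illustrative only and not part of any Wikipedia tooling.

```python
from datetime import date

# The example date used in the comment above.
d = date(2015, 8, 28)

# ISO 8601 calendar date (year-month-day, zero-padded).
iso = d.isoformat()

# Day Month Year, with no leading zero on the day.
dmy = f"{d.day} {d:%B} {d.year}"

# Month Day, Year, with no leading zero on the day.
mdy = f"{d:%B} {d.day}, {d.year}"

print(iso)
print(dmy)
print(mdy)
```

Building the day with `d.day` rather than `%d` avoids a leading zero on single-digit days, which neither of the two prose formats uses.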

So I went with the US style that was prevalent in the body of the article. But perhaps you'd want to use ISO format only in the citations? Not sure whether you'd want to make it different just in the citations. Faught (talk) 00:47, 28 November 2017 (UTC)
 * It's fine either way. Walter Görlitz (talk) 00:51, 28 November 2017 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified 2 external links on Software testing. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Corrected formatting/usage for http://www.jacoozi.com/blog/?p=18
 * Added archive https://web.archive.org/web/20090831182649/http://stpcollaborative.com/knowledge/272-were-all-part-of-the-story to http://stpcollaborative.com/knowledge/272-were-all-part-of-the-story

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 08:17, 2 December 2017 (UTC)

National spelling
How should we choose between American English or British English? Does the spelling on any related pages matter?

On Software testing I see American spellings like artifacts, behavior, and unrecognized, and British spellings like artefacts, grey-box, unauthorised, and organisational. Faught (talk) 20:57, 6 December 2017 (UTC)
 * There has been a mix, yes. No real way to decide. There are scripts to standardize for "international", Oxford and Canadian spelling, but nothing for US English. The preference for "grey-box" is my fault. The others are not. We can come to a consensus here and state that it applies to the article by introducing the appropriate template. Walter Görlitz (talk) 21:04, 6 December 2017 (UTC)
 * It would be easiest for me to use US English since that's my native tongue. Faught (talk) 01:16, 7 December 2017 (UTC)
 * Understood. I have three dictionaries in my browsers, British, Canadian and American, and can switch between them. If we can leave it for a few days to see if others chime in with support or objections, we can get a better sense of the direction, but this is what you'll want to add to the article when it comes time: . Of course, this will change each month and year that it's on the talk page or in the archive because of the embedded formula. Walter Görlitz (talk) 01:58, 7 December 2017 (UTC)
 * No idea what would be the fairest choice. I see that the Load testing article has "use American English" already. I'm not sure how to search the rest of the articles in the Software testing category for something similar. Faught (talk) 19:43, 7 December 2017 (UTC)

Mysterious Gelperin/Hetzel reference
Can anyone identify what this reference is? "Regarding the periods and the different goals in software testing, ..." Perhaps the article they co-authored, "The Growth of Software Testing"? If we can't identify what this is referring to, we should delete the reference. Faught (talk) 18:24, 18 December 2017 (UTC)
 * It looks like that article, which is their most-cited article. In it, they state: "Titles such as “test manager,” “lead tester,” “test analyst,” and “test technician” have become common." The paragraph the quotation comes from is talking about the rise of software test engineering as a speciality and using the job titles with "test" as a supporting argument. Yorkyabroad (talk) 21:31, 18 December 2017 (UTC)
 * Is there a publication date and journal or other identifying information for the source?
 * Side note: MOS:LQ. Walter Görlitz (talk) 21:44, 18 December 2017 (UTC)
 * The quotation, which had the strange punctuation in the original article, i.e. list separator inside quotation marks, came from: David Gelperin and Bill Hetzel. The growth of software testing. Communications of the ACM, 31(6):687–695, 1988. Yorkyabroad (talk) 22:49, 18 December 2017 (UTC)
 * Thanks Yorkyabroad - I updated the citation. I used full author names, by the way. The academic style with only first initials drives me nuts. Not sure what the standard is here. — Preceding unsigned comment added by Faught (talk • contribs) 15:49, 19 December 2017 (UTC)
 * For citation style, see cite web, cite journal, cite book or other citation templates. They all support full names where you can select the format, or given and family name. Walter Görlitz (talk) 16:11, 19 December 2017 (UTC)
 * I did separate first and last names. The examples in the templates do imply that first names should be spelled out too. Faught (talk) 17:07, 19 December 2017 (UTC)

Incomplete Dr. Dobbs citation
There's a rather odd citation in the History section: Company, People's Computer (1987). "Dr. Dobb's journal of software tools for the professional programmer". Dr. Dobb's journal of software tools for the professional programmer. M&T Pub. 12 (1–6): 116.

Can anyone intuit a title and author for this article? Faught (talk) 19:19, 20 December 2017 (UTC)


 * So, I used this to practice some Wikipedia searching... The quote, "a successful test is one that finds a bug" was introduced 20 Aug 2008, soon after tagged with citation needed. A citation was added 16 May 2009, which was a search result from Google Books. The citation was tidied up by a bot on 19 June 2010.


 * If you search on the above quote in Google Books, one of the search returns is for a Dr. Dobb's article (1987, volume 12, p. 116) which shows the beginning of the quote highlighted. From the search result, it's not clear which particular article it is or who the author is. The quote is stated as coming from Myers. However, a look in my paper copy of that edition seems to point to it being a misquotation. In chapter 2, The Psychology and Economics of Program Testing, Myers writes as a principle, "A successful test case is one that detects an as-yet undiscovered error."
 * In that chapter, he doesn't appear to use the word, "bug".


 * Re-reading Myers, it seems that the statement in the article, "Although his attention was on breakage", isn't quite right. Myers talks about testing as adding value by finding and removing errors - his emphasis seems to be on finding, rather than breakage. So, in summary, it appears to be a misquote of Myers and so could be safely removed, as Myers is already referenced, plus the sentence might need to be revisited to adjust "attention was on breakage". Yorkyabroad (talk) 00:09, 21 December 2017 (UTC)
 * Excellent detective work! It seems that the text and ref should either be corrected and moved, if it fits in a better section, or simply removed. Walter Görlitz (talk) 00:30, 21 December 2017 (UTC)
 * I changed the quote to match what Yorkyabroad found and changed the citation accordingly. Faught (talk) 15:56, 26 December 2017 (UTC)

revamped the certification section
I published an edit to the Certifications section with several minor improvements, and added certifications from the International Software Certifications Board (better known as QAI).

I deleted two of them: 1) Certified Quality Improvement Associate (CQIA), which is not specific to software, and 2) ISEB, which is not a certification, but hints at the fact that ISEB markets the ISTQB certifications already listed here. If there is any controversy about those deletions, let's add them back without losing the other changes.

The ISTQB offers a richer set of certifications than is indicated here, but the way they organize them makes it difficult to list all the variations as separate certifications. Maybe someone could find a good solution for this page.

One more thing - I want to delete the "Software testing certification types" information entirely. This is already covered at Certification and doesn't need to be hashed out on the software testing page. Any objections?

Also, I want to consider moving most of the first paragraph to the Controversies section. I don't think it has a neutral point of view. Faught (talk) 22:20, 30 January 2018 (UTC)
 * Good call. In my opinion, since certification is not required, a brief mention is all that the article needs. Pointing to the main article will allow a reader to find further information. Walter Görlitz (talk) 01:22, 31 January 2018 (UTC)
 * I have completed two further edits - delegating the certification types, and moving information to the controversy section. I just noticed the separate Software testing controversies article, and though it's a bit of a mess, it seems that the controversies section should migrate there. Faught (talk) 19:32, 4 February 2018 (UTC)

Testing Levels
I added a tag to the following line in the Testing Levels section:

"There are generally four recognized levels of tests: unit testing, integration testing, component interface testing, and system testing."

I do see a few web mentions (mostly from sites selling their wares) defining "the 4 levels" as unit, integration, system, and acceptance. I feel like that may be the better edit, but I'm not certain where the initial reference of those 4 levels comes from. I'll keep digging, but throwing it here in the meantime. For example, there's an explanation on the test-institute dot org website (coincidentally blocked by wikipedia) - but I don't want to reference a site that is focused on selling certifications. I've thumbed through my library of test books, but haven't found the original source yet.

I also found a reference to Component, integration, system, and acceptance testing in _Foundations of Software Testing ISTQB Certification_ by Rex Black and Dot Graham.

I am wondering (out loud) if anything about Testing Levels rises to the level of being worth mentioning in Wikipedia. Angryweasel (talk) 19:49, 3 April 2018 (UTC)


 * It looks like the citation that comes directly after that sentence addresses, partially, your concerns, and the original writer meant for that reference (to SWEBOK v3.0) to stand for both prior sentences. The SWEBOK reference references unit, integration, and system testing specifically at page 4-5. Page 10-3 of the same document seems to explicitly put "acceptance testing" outside of "system level testing," which is solidified with its approach to mentioning only the three levels mentioned. I'll play with this a bit more and see if I can come up with something more (no pun intended) acceptable. Lostraven (talk) 20:02, 13 July 2018 (UTC)


 * Additionally, this reference places component interface testing outside of level testing, as a type of black-box software testing technique. Lostraven (talk) 20:12, 13 July 2018 (UTC)


 * Ultimately I decided to move component testing as a subsection of black-box testing. The literature I'm finding consistently has unit, integration, and system testing, and only some additionally lump in acceptance testing. I'm rewriting the intro section to reflect this, though if even stronger citations can be found to state that acceptance testing is firmly a fourth level, feel free to update my updates. Lostraven (talk) 20:41, 13 July 2018 (UTC)
 * WRT acceptance testing: I think that's a good choice. I've seen acceptance listed as a level, but IMO it's not a level like the other three are. There are a zillion types of testing and acceptance is surely important, but not in the same dimension as unit, integration, system. ... WRT component testing: I think that's a synonym for unit testing. Stevebroshar (talk) 12:01, 26 April 2024 (UTC)

Outsourcing link is clearly commercial
The blurb about outsourcing links to an article touting a particular firm's services and offers little actual evidence for its claims. I suggest it be removed. Sinfoid (talk) 02:52, 7 May 2018 (UTC)
 * As long as it's an article on Wikipedia, there's no reason not to link to it. Walter Görlitz (talk) 03:03, 7 May 2018 (UTC)

urdu 2021 tests
we can find the tests to get overview and asked my students to this overview 103.228.159.104 (talk) 15:47, 6 December 2021 (UTC)

India Education Program course assignment
This article was the subject of an educational assignment supported by Wikipedia Ambassadors through the India Education Program.

The above message was substituted from by PrimeBOT (talk) on 20:08, 1 February 2023 (UTC)

Section on Testability Hierarchy recently removed: a volunteer to review its suitability?
User MrOllie has recently removed my contributions about the Testability Hierarchy arguing citation spam. I would like to ask a volunteer to review the relevance to this article of the Testability Hierarchy section existing before it was reverted by MrOllie at 22:27, 20 February 2024 (UTC). EXPTIME-complete (talk) 23:33, 20 February 2024 (UTC)

In particular, this is the text I would like to introduce again:

Hierarchy of testing difficulty
Based on the number of test cases required to construct a complete test suite in each context (i.e. a test suite such that, if it is applied to the implementation under test, then we collect enough information to precisely determine whether the system is correct or incorrect according to some specification), a hierarchy of testing difficulty has been proposed. It includes the following testability classes:


 * Class I: there exists a finite complete test suite.
 * Class II: any partial distinguishing rate (i.e., any incomplete capability to distinguish correct systems from incorrect systems) can be reached with a finite test suite.
 * Class III: there exists a countable complete test suite.
 * Class IV: there exists a complete test suite.
 * Class V: all cases.

It has been proved that each class is strictly included in the next. For instance, testing when we assume that the behavior of the implementation under test can be denoted by a deterministic finite-state machine for some known finite sets of inputs and outputs and with some known number of states belongs to Class I (and all subsequent classes). However, if the number of states is not known, then it only belongs to all classes from Class II on. If the implementation under test must be a deterministic finite-state machine failing the specification for a single trace (and its continuations), and its number of states is unknown, then it only belongs to classes from Class III on. Testing temporal machines where transitions are triggered if inputs are produced within some real-bounded interval only belongs to classes from Class IV on, whereas testing many non-deterministic systems only belongs to Class V (but not all, and some even belong to Class I). The inclusion into Class I does not require the simplicity of the assumed computation model, as some testing cases involving implementations written in any programming language, and testing implementations defined as machines depending on continuous magnitudes, have been proved to be in Class I. Other elaborated cases, such as the testing framework by Matthew Hennessy under must semantics, and temporal machines with rational timeouts, belong to Class II.

EXPTIME-complete (talk) 09:01, 21 February 2024 (UTC)


 * well... I've never heard of this concept/idea before. So, it seems less than notable, but of course I don't know everything. ... I will say that the topic of this article is rather broad. If all info about software testing was put here, the article would be GINORMOUS which is not readable IMO. Maybe it's too long already. I suggest to exclude all but the most core ideas. Stevebroshar (talk) 11:56, 26 April 2024 (UTC)

Proposed deletions
I'm planning to delete the section Faults and failures since, although not wrong, it is off topic. I'm also planning to delete the paragraph with "software product caters" since it is also far off topic.

@Furkanakkurt8015: wanted to tag you since I see you modified the fault section recently. Hope you don't mind if I ax it. Stevebroshar (talk) 20:27, 28 April 2024 (UTC)