Talk:Bessen/Hunt technique

some things to improve
I am the eponymous Bessen. In my view, this page should be improved in several ways and a few comments are not germane to the topic:

1. The key reference is not up to date. The Bessen-Hunt paper has been published in a peer-reviewed journal. The citation should be

James Bessen and Robert M. Hunt (2007), “An Empirical Look at Software Patents,” Journal of Economics and Management Strategy 16, no. 1, pp. 157-89.

The link to the working paper is still helpful because that version is freely available, so I would recommend keeping that in addition to the new citation.

2. The criticism raised by Hahn and Wallsten is not really appropriate, for three reasons. It refers to a very early version of the paper. The paper has since been substantially revised and now includes careful validation tests of the definition of software patent. And the reviewers at the Journal of Economics and Management Strategy considered this criticism and found it to be no longer an issue for the version that was published. In other words, while valid criticism can be raised, the Hahn-Wallsten paper does not really do so.

3. The unpublished paper by Noel and Schankerman is also not on topic because these authors are not attempting to come up with a general definition of software patent to use in a variety of research, but, instead, they are merely trying to obtain a quick-and-dirty measure to use in their own research. The use of IPC class G06F is plainly NOT a general definition of software patents because it was designed to be a definition of data processing patents that are mostly hardware (anyone can verify this by looking at the sub-classes).

4. The Bessen-Hunt paper compares alternative techniques and discusses their relative merits and weaknesses. This Wikipedia article could be expanded to include that discussion, in brief, and information about the validation tests if some discussion of these issues seems important to this entry. Note, however, that both Bessen-Hunt and also a paper by Hall and MacGarvie ran data analysis using multiple alternative means for identifying software patents. In neither case were there substantial differences in the results of the data analyses. So it is not clear that the criticism is of any significance. However, it has been raised.

5. I would recommend that the discussion of Hahn-Wallsten and Noel-Schankerman be deleted from this article because it does not deal adequately with the real issues that should be raised (i.e., the validity of the technique as demonstrated by empirical tests). If these papers are included, it might be appropriate to mention that they were funded by proponents of software patents (Microsoft and LECG, hired by Microsoft; see the papers' acknowledgements). Jbessen 13:39, 16 November 2007 (UTC)
 * jbessen, Excellent suggestions.  You seem to be right on top of the situation.  Feel free to make changes (with references, of course).  We will help with a bit of after editing if needed.--Nowa (talk) 03:34, 17 November 2007 (UTC)
 * Nowa, it does not seem a good idea to me. This seems to be a clear-cut conflict of interest. We should improve the article ourselves and suggestions from Jbessen should come on the talk page. Thanks. --Edcolins (talk) 11:43, 17 November 2007 (UTC)
 * Other things to bear in mind
 * The working paper is the one people use and refer to, therefore much of the subject of this article SHOULD be analysis and criticism of that version. If we can find discussion of the peer-cited version, great. If not, it's not notable and shouldn't be the focus of this article.
 * Criticism should definitely not be removed. If we start removing criticism of the paper, then notability becomes an issue as well as, obviously, NPOV.
 * Bessen's criticism of the criticism of his paper is off base. This article demonstrably (although not explicitly, since that would be OR) shows that his technique for identifying software patents is off base, since it shows that such patents have been granted since the 70's when everyone seems to think they have only been granted since the 90's. The other papers are quite clear that there are many ways of finding software patents, that none of them is perfect, and that all are useful for different purposes, and this needs to be said.
 * GDallimore (Talk) 11:56, 17 November 2007 (UTC)
 * I would be glad to make suggestions on the Talk page. I believe that what would be most helpful is a brief discussion of the different techniques that have been used to identify software patents and their significance. I provide some text along these lines below.


 * But first, I think GDallimore is off base on a couple points:


 * - The Hahn-Wallsten paper does NOT critique the working paper cited on the page. How could it? It was written before the working paper was. Instead, it criticizes a preliminary version we circulated in 2003. If GDallimore wants to argue that the article should focus on the 2004 version, then this critique is not relevant. The 2004 paper and the 2007 published version both discuss alternative techniques for identifying software patents, run tests to compare them, and run their data analysis using both the Bessen-Hunt technique and the Graham-Mowery technique (mentioned in Hahn-Wallsten).


 * - GDallimore needs to distinguish criticism of the results of the paper from criticism of the technique that is the topic of this article. Noel & Schankerman criticize the results by way of presenting what they see as conflicting evidence, but they do not provide any serious alternative to the technique used. The class of "Electric Digital Data Processing" patents is simply not a class of "software patents," the IPC did not design it as such, economists have criticized Noel & Schankerman for this shortcut approach, and it is simply off-topic on a page that discusses techniques for identifying software patents.


 * - GDallimore is quite right that it might be helpful to discuss and compare alternative techniques (as I do below). But the issue is not that "everyone seems to think they have only been granted since the 90's." It is not at all hard to find software patents granted before 1990 and, as a patent attorney, GDallimore should know that the Benson decision did not (as is popularly believed) bar patents on software. Moreover, the selection techniques based on technology classes tend to find relatively more software patents before 1990, so this is not the important distinction.


 * So here is some sample text:

ALTERNATIVE TECHNIQUES

Any technique such as this makes errors, either failing to identify some software patents or misidentifying as software patents some patents that are not. A technique based on keywords might suffer if patent attorneys draft patents in such a way as to obscure their meaning. Other researchers, such as Graham and Mowery [1], have used techniques based on the technology classes used by the USPTO or by WIPO to select software patents. These classification schemes do not specifically identify software as a technology, so these classes do not necessarily provide a good match. In addition, the Patent Office continually re-classifies patents, so any identification based on these classes is not stable over time.

Bessen and Hunt tested the Graham-Mowery technique and their own technique against databases of patents in which researchers had classified software patents based on reading each patent. The Bessen-Hunt technique had both fewer false positives and fewer false negatives. In addition, Bessen and Hunt ran their data analysis using the Graham-Mowery method and found no significant difference in their results. Hall and MacGarvie [2] also ran their data analysis using different classification techniques and did not find major differences in results. This suggests that although no classification technique is perfect, robust conclusions might not be particularly sensitive to the choice of technique.
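
For readers unfamiliar with this kind of validation test, here is a minimal sketch of how false positives and false negatives can be counted against a hand-read sample. It is an illustration only: the function name, the patent IDs, and the two selections are all invented for the example, and the numbers do not reproduce any result from the papers discussed here.

```python
def error_counts(selected, hand_labeled, sample):
    """Compare a selection technique against a hand-read sample.

    selected     -- patent IDs the technique flags as software patents
    hand_labeled -- patent IDs that human readers judged to be software patents
    sample       -- every patent ID in the hand-read sample
    """
    false_positives = selected - hand_labeled   # flagged, but readers said no
    false_negatives = hand_labeled - selected   # missed, but readers said yes
    non_software = sample - hand_labeled
    return {
        "false_positives": len(false_positives),
        "false_negatives": len(false_negatives),
        "false_positive_rate": len(false_positives) / len(non_software) if non_software else 0.0,
        "false_negative_rate": len(false_negatives) / len(hand_labeled) if hand_labeled else 0.0,
    }


# Hypothetical toy data, made up purely for illustration.
sample = {"P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"}
hand_labeled = {"P1", "P2", "P3", "P4"}
technique_a = {"P1", "P2", "P3", "P5"}   # one hypothetical selection
technique_b = {"P1", "P2", "P5", "P6"}   # another hypothetical selection

result_a = error_counts(technique_a, hand_labeled, sample)
result_b = error_counts(technique_b, hand_labeled, sample)
print(result_a)
print(result_b)
```

A technique is preferable on this test when both of its counts are lower on the same hand-read sample; the sketch says nothing about which real technique that is.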

[1] Graham, Stuart J. H., and David C. Mowery, 2003. “Intellectual Property Protection in the U.S. Software Industry,” in Wesley M. Cohen and Stephen A. Merrill, eds., Patents in the Knowledge-Based Economy, National Research Council, Washington: National Academies Press, pp. 219-58.

[2] Bronwyn H. Hall and Megan MacGarvie. 2007. "The Private Value of Software Patents," working paper http://elsa.berkeley.edu/~bhhall/papers/HallMacGarvie_April07.pdf -- Jbessen (talk) 14:27, 18 November 2007 (UTC)