Talk:Set cover problem

What does it mean to call the decision problem NP-complete and the optimization problem NP-hard? NP-hardness refers to decision problems. Isn't #P-completeness the notion we are after for the optimization part?

NP-complete means that the problem is both NP-hard (at least as hard as every problem in NP) and in NP (meaning a given solution can be verified in polynomial time). We can verify that a solution to the decision problem is valid in polynomial time (by simply checking whether at most k sets are used in the solution, and whether those sets actually do cover U). However, it is more difficult to verify a solution to the optimization problem, as we have to check whether the given solution is in fact the optimal one (which presumably can't be done in polynomial time). This means the optimization problem is not in NP, and therefore cannot be NP-complete. So no, NP-hardness does not always refer to decision problems. —Preceding unsigned comment added by 128.113.55.43 (talk) 02:45, 22 April 2008 (UTC)


 * NP is a class of decision problems, so an optimization problem cannot be in NP *by definition*. Your argument is a red herring (in particular since it is probably difficult to prove that deciding whether a given solution is optimal cannot be done in polynomial time). --Mellum (talk) 12:35, 12 May 2008 (UTC)
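For concreteness, the polynomial-time verification of the decision version described in this thread can be sketched as follows (a minimal illustration in Python; the function name and variable names are my own, not from the article):

```python
def verify_cover(universe, chosen_subsets, k):
    """Polynomial-time check for the decision version of set cover:
    are at most k subsets used, and do they cover the whole universe?"""
    if len(chosen_subsets) > k:
        return False
    # Union of the chosen subsets; empty if nothing was chosen.
    covered = set().union(*chosen_subsets) if chosen_subsets else set()
    return set(universe) <= covered

U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(verify_cover(U, [S[0], S[3]], 2))  # True: {1,2,3} and {4,5} cover U
```

Both checks (counting the sets and taking their union) run in time polynomial in the input size, which is exactly why the decision version is in NP.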

I believe that set packing and hitting set are duals of one another, not set cover and hitting set, as indicated in "related problems." Set cover and hitting set are both minimization problems. —Preceding unsigned comment added by 132.236.215.11 (talk) 16:12, 4 September 2008 (UTC)


 * True. I corrected that and added a template Template:Covering-Packing_Problem_Pairs with the goal to collect all important covering problems together with their LP-dual problems. ylloh (talk) 17:50, 11 March 2009 (UTC)

Hitting set and set cover
There is some discussion related to this page at Talk:Vertex cover; comments welcome. — Miym (talk) 14:49, 11 November 2009 (UTC)

Oops
Thanks for fixing my change Miym, i must have misclicked in some way. —Preceding unsigned comment added by 129.132.57.162 (talk) 14:22, 30 November 2009 (UTC)

m and n
Hi everyone, there is a mistake in this article: the universe should contain n elements instead of m, and the collection of subsets should be of size m instead of n. This way, the approximability and inapproximability sections become correct. — Preceding unsigned comment added by 89.159.247.99 (talk) 20:56, 8 May 2012 (UTC)

NP-complete?
The decision version of set covering is NP-complete, and the optimization version of set cover is NP-hard. Isn't the optimization version NP-complete, too? — Preceding unsigned comment added by 134.95.82.23 (talk) 08:51, 8 March 2013 (UTC)

How can the optimization version not be NP-complete if it can be formulated as a 0-1 integer linear program (which is NP-complete)?


 * NP-completeness is defined for decision problems. The optimization version is NP-hard. (Also note that all problems in NP can be formulated as 0-1 integer linear programs; this is a reduction *to* a hard problem, and showing hardness requires reducing *from* them.) Mlhetland (talk) 10:19, 17 June 2020 (UTC)

Over-emphasis of inapproximability?
There are countless pages on the internet where people ask for algorithms for set cover and get told to use the greedy algorithm, some quoting this article: "Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover". The "best possible" referred to here is only according to criteria interesting to someone involved in complexity theory: it's talking about approximation error in the _worst case_. The practical problems people are trying to solve are usually not pathological, and often the greedy algorithm does not do very well (in terms of fitness of the solution) compared to integer linear programming techniques. Personally, I find the results from complexity theory fascinating, but usually completely irrelevant for the set cover problems that arise as sub-problems in engineering. --Gmaxwell (talk) 05:06, 27 October 2015 (UTC)

The integer linear programming technique is mentioned, so I don't really see your point. Maybe it would be helpful to mention that integer linear programming typically performs well "in practice", but that claim would need a citation (and ideally also an explanation of what it is about the "practical" instances that lets you outperform the NP-hardness). I do feel the current section on inapproximability is unreasonably long. The vast majority of the audience only cares about the punchline that the problem is NP-hard to approximate to within near-optimal factors; they don't need the entire history. Aviad.rubinstein (talk) 17:00, 29 January 2018 (UTC)


 * I'm seeing a lot of results where approximability is achieved in terms of the size of the largest set via the primal-dual method, e.g. here: https://www.cise.ufl.edu/class/cot5442sp13/Notes/PrimalDual.pdf Should this be included? — Preceding unsigned comment added by 2001:6B0:2:2801:A5FF:4396:970D:7204 (talk) 11:53, 28 February 2018 (UTC)
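For readers of this thread who just want the greedy algorithm being discussed, here is a minimal sketch (my own illustrative Python, not from the article): repeatedly pick the subset that covers the most still-uncovered elements.

```python
def greedy_set_cover(universe, subsets):
    """Greedy approximation: at each step, take the subset covering
    the largest number of still-uncovered elements."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(subsets, key=lambda s: len(s & uncovered))
        if not (best & uncovered):
            return None  # no subset helps: the instance is infeasible
        cover.append(best)
        uncovered -= best
    return cover

U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(U, S))  # picks {1,2,3} first, then {4,5}
```

The worst-case approximation guarantees discussed above apply to exactly this procedure; on non-pathological instances it may still be beaten by an exact ILP solver, as noted in the thread.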

Error in inapproximability?
In the inapproximability section, it says that Dinur and Steurer showed it could not be approximated to $$(1-o(1))\ln n$$ unless P=NP, but that's not correct, is it? What they show is that for any $$\epsilon > 0$$, it's NP-hard to approximate it to $$(1-\epsilon)\ln n$$, but $$(1-o(1))\ln n$$ is a strictly looser approximation than that given by any such fixed $$\epsilon$$. To be fair, Dinur and Steurer themselves say that Feige shows that you can't approximate to within a factor of $$(1-o(1))\ln n$$ (given certain assumptions), and that is in itself incorrect. What Feige shows is "that $$(1-o(1))\ln n$$ is a threshold below which SET COVER cannot be approximated efficiently [given certain assumptions]". That is, this is not shown to be a strict bound. Or…? Mlhetland (talk) 10:31, 17 June 2020 (UTC)

Greedy Algorithm (alternatives)
So when looking at the article, one thing that stands out is the use of the greedy algorithm as a general solution for this problem. It does have some good properties. I was implementing a solution and decided against it, in favor of a simpler algorithm that doesn't involve searching for the largest gain. It's a simple algorithm that starts by including all subsets, then considers each subset in turn: it removes the subset and checks whether the universe is still covered; if so it moves on to the next subset, and if not it puts the subset back and moves on. It iterates through the subsets until no more can be removed. I imagine this produces more varied solutions than the greedy algorithm, though not necessarily better ones. — Preceding unsigned comment added by 70.166.250.122 (talk) 15:56, 16 December 2019 (UTC)
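If I understand the procedure described above correctly, it can be sketched like this (my own illustrative Python; the function name and the test instance are made up for the example): start from all subsets and drop each one whose removal still leaves the universe covered.

```python
def prune_cover(universe, subsets):
    """Reverse-delete style heuristic: keep a subset only if removing it
    would leave some element of the universe uncovered."""
    cover = list(subsets)
    i = 0
    while i < len(cover):
        candidate = cover[:i] + cover[i + 1:]
        covered = set().union(*candidate) if candidate else set()
        if set(universe) <= covered:
            cover = candidate  # subset was redundant; stay at index i
        else:
            i += 1             # subset is needed; keep it and move on
    return cover

U = {1, 2, 3}
S = [{1}, {1, 2}, {2, 3}, {3}]
print(prune_cover(U, S))  # [{1, 2}, {3}]
```

Note that the result depends on the order in which subsets are considered, which matches the comment's observation that it produces more varied, but not necessarily better, solutions than the greedy algorithm.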