Talk:Hill climbing

Local search
What is the difference to local search? Stern 08:28, 13 Jun 2005 (UTC)


 * They are the same thing. In Talk:Local search, there's an ongoing discussion of that article's relation to this one. The idea of merging this article with Local search is being raised. --Anthony Liekens 21:43, 11 October 2005 (UTC)

Hill climbing and Convex Problems
In the preamble it is stated that hill climbing is optimal if the problem is convex; but this only holds, if the goal is a minimization rather than a maximization. The rest of the article deals with maximization, however. In this case, hill climbing would be optimal only if the problem is concave, right? In any case, I think there should be a reference justifying this optimality statement. — Preceding unsigned comment added by 129.187.109.199 (talk) 14:24, 26 February 2015 (UTC)

Need a reference on Probabilistic Hill Climbing
The header/title says it (mostly). I still think we need a plain Metropolis Algorithm entry -- not just Metropolis-Hastings. Another user said the same thing (Metropolis is "more commonly used" than Metropolis-Hastings according to that user). In any case Metropolis Algoithm certainly merits an entry. 199.196.144.11 22:16, 21 March 2007 (UTC)
 * Hi, can you explain to me how Metropolis-Hasting is related to Hill Climbing? Thanks, Sander123 08:22, 22 March 2007 (UTC)
 * Hill-climbing can be reduced to: 1- pick a new sample point, 2- Measure the goodness of the new sample point, 3- go to step 1. Suppose you have some distribution that represents your belief about the goodness of points in your search space. It would be nice if you could draw your samples according to your believed-goodness distribution. This way, you will be fully utilizing your belief to climb the hill as fast as possible. (Note that your believed-goodness distribution will change as you gather more samples.) Metropolis is an algorithm that draws samples from an arbitrary distribution. (But it requires that you supply a symmetric candidate distribution that hopefully is at least somewhat close to your believed-goodness distribution. Almost everyone just uses a Gaussian for the candidate distribution.) Metropolis-Hastings is a generalization of Metropolis that can use any candidate distribution (but most people just stick to Metropolis and use a Gaussian because they don't know how to pick a better candidate distribution). Metropolis is commonly used in conjunction with Gibb's sampling to evaluate Bayesian Belief Networks. It is effective for hill-climbing in low-dimensional space (ie with univariate distributions), but in high-dimensional space it suffers severly from the curse of dimensionality and converges extremely slowly. The problem with high-dimensional space is that it tries to sample from your believed-goodness distribution rather than sampling points that are likely to improve your believed distribution. Gradient algorithms, on the other hand, are efficient for hill-climbing in high-dimensional space. For example, Empirical Gradient Ascent measures the gradient a tiny ways out in each dimension to compute the optimal direction to ascend the hill (assuming local linearity). Stochastic Gradient Ascent tends to be even faster than Empirical Gradient Ascent. It picks a random direction, measures the goodness of that direction, backs up, steps forward in the best direction yet found, and repeats. Gradient algorithms that add a momentum term to each dimension tend to perform even better than Stochastic Gradient Ascent.

I'm actually implementing a stochastic hill climbing algorithm and I have no earthly idea what the heck this page is on about.

This page could use an informal, intuitive description of the hill climbing concept. It would be useful for many people who are new to optimization. Eternallyoptimistic 14:19, 26 April 2007 (UTC)
 * I've added a description that avoids the mathematical notation. Sander123 09:52, 21 May 2007 (UTC)

Original
Random-restart hill climbing is a surprisingly effective algorithm in many cases. It turns out that it is often better to spend CPU time exploring the space, than carefully optimizing from an initial condition.[original research?] I agree we need a source, but I agree and I think it is definitely correct. Could someone please find an example to back this up. Good article, by the way. Gigitrix(90.216.108.161 (talk) 14:41, 8 November 2008 (UTC)) (Didn't login)

it is not guaranteed that hill climbing will ever come close to the optimal solution
The article states: When the algorithm cannot see any improvement anymore, it terminates. Ideally, at that point the current solution is close to optimal, but it is not guaranteed that hill climbing will ever come close to the optimal solution.

Shouldn't "ever" be "always"? If an algorithm may not ever even come close to an optimal solution, isn't it a waste? Can someone just run the algorithm on a solution set where all the solutions are optimal? fogus (talk) 20:02, 24 April 2009 (UTC)
 * To clarify the intended meaning, for a particular run of hill climbing on a particular problem, it may never reach a solution that is even approximately optimal (because once it reaches a local optimum it stays there). This is as opposed to methods like random-restart hill climbing, where the search will inevitably find the global optimum, but may take a very long time to do so. Dcoetzee 21:00, 24 April 2009 (UTC)

Greedy vs. Hill Climbing
There is also an article on Greedy algorithms. I can't tell the difference - is one more general than the other? If so, it seems like relevant page should explain that. This comment is also posted on the discussion page there. Thanks. --Jdvelasc (talk) 19:10, 10 September 2009 (UTC)


 * Commented on Talk:Greedy_algorithm. &mdash;ZeroOne ( talk / @ ) 21:43, 2 September 2010 (UTC)


 * Hill climbing is a greedy optimization technique. There are greedy algorithms that do other things besides optimization. For example, a greedy path planner finds a path from point A to point B by always stepping immediately toward B. A greedy packing algorithm packs objects in the configuration that immediately minimizes total space used so far. A greedy diffing tool always assumes the first matching lines it finds between two files correspond with each other.--Headlessplatter (talk) 04:43, 3 September 2010 (UTC)

What is it?
The article's lead section describes one relative quality after another of the algorithm without actually saying what the algorithm is or how it works. Can anyone fix that?--Father Goose (talk) 05:21, 2 December 2010 (UTC)


 * I agree that a lead section should say what it is, not just why it is wonderful. I did a pass over the "mathematical description" section to try to make the description more intuitive. Unfortunately, I am having a hard time figuring out how to explain the concept without sounding technical, so I am not sure how to make the description suitable for a lead section. Perhaps someone else could try another pass at it now?--Headlessplatter (talk) 15:57, 2 December 2010 (UTC)


 * All right, I think I understand it better now, so I've rewritten the lead according to my understanding. Maybe you will want to iteratively improve my solution. ;-) --Father Goose (talk) 22:10, 2 December 2010 (UTC)

"Although more advanced algorithms such as simulated annealing or tabu search may give better results, in some situations hill climbing works just as well. Hill climbing can often produce a better result than other algorithms when the amount of time available to perform a search is limited, such as with real-time systems.". What is the source of this sentence. Is there an article or book I can verify this statement? Rolb2 (talk) 09:43, 9 July 2013 (UTC)


 * Back in 2010, I merely did a copyedit and added a clearer summary of information already in the article. I don't know the source of the info – however, a google search for "hill climbing" "real-time" search produces promising results, as does a Google Books search for the same terms: .--Father Goose (talk) 22:33, 22 July 2013 (UTC)

Bubble sort
I added the counter-claim about bubble sort to trigger someone with more expertise to better characterize the efficiency trade-offs. My sentence doesn't belong where I put it, but without my addition, I regard the lead as promulgating a slanted perspective. There was nothing at all about the problem of edit distance (which can potentially be very large, even exponential). &mdash; MaxEnt 03:50, 18 April 2017 (UTC)