Talk:Consistent hashing

Untitled
After reading this article, I am unable to code a consistent hash. I therefore argue that it has failed to explain the core concept involved; the article has only described its functionality. Toby Douglass (talk) 22:48, 5 July 2012 (UTC)

Confusing K and n notations
I might be the only one, but I find the notation $$K$$ for the number of keys and $$n$$ for the number of nodes confusing. I know each is the first letter of the corresponding word, but usually $$K$$ is used for the number of partitions and $$n$$ for the total number of elements, so one expects $$K < n$$. I think it should be changed either to $$|K|/|n|$$ or to $$N/k$$. I wanted to get some opinions before making this correction. Cmolter (talk) —Preceding undated comment added 16:40, 15 July 2013 (UTC)
 * Slightly related to this, I want to add that it is not clear whether $$n$$ is the number of slots before or after the resize. Additionally, it would be good if this article clarified what "resizing a hash table" means: does it mean "changing the number of slots"? --Abdull (talk) 14:12, 18 July 2013 (UTC)

Additional algorithm
There is an alternative algorithm for consistent hashing that is shorter and much more efficient when there is a large number of servers. The open source Guava hashing library contributed by Google contains an implementation. See consistentHash here: https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/hash/Hashing.java

That code doesn't have an explanation of the algorithm, but a preprint now gives one: http://arxiv.org/ftp/arxiv/papers/1406/1406.2294.pdf

As an author of that preprint, I don't think it is appropriate for me to edit this page myself, but the algorithm seems like appropriate subject matter for it. Johnlamping (talk) 05:02, 13 June 2014 (UTC)
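For readers who land on this thread: the algorithm in question (jump consistent hash) is short enough to show inline. The sketch below is a Python transcription based on the pseudocode in the linked preprint, not the Guava source itself:

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets).

    Uses no memory beyond the two loop variables: the key itself seeds a
    linear congruential generator that decides how far to "jump" forward
    through candidate buckets.
    """
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit LCG step (constant from the preprint); mask keeps Python
        # ints within 64 bits, matching the original's unsigned arithmetic.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        # Jump to the next bucket this key would first land in.
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

The consistency property falls out of the loop structure: growing the table from $$n$$ to $$n+1$$ buckets can only move a key into the new bucket $$n$$, never reshuffle it among the existing ones.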

Technique clarification needed
"Since each bucket is associated with many pseudo-randomly distributed points, the resources that were held by that bucket will now map to many different buckets."


 * Please explain what this (the emphasized part) means. I see that the values for the keys were distributed randomly, but the keys themselves were a contiguous block (the bucket), not random "points" (points on the circle, I assume; the circle analogy is confusing).
 * And how does that remapping happen?

209.49.110.163 (talk) 21:45, 20 August 2014 (UTC)


 * I was also confused by this. The confusing bit is that part of the text says a bucket is associated with many points on the circle, but this text seems to suggest that a bucket is a single point. It might be easier to remove the confusing notion of a bucket, and just call it a machine.

"then walks around the circle until falling into the first bucket it encounters (or equivalently, the first available bucket with a higher angle). The result is that each bucket contains all the resources located between its point and the previous bucket point. — Preceding unsigned comment added by 50.158.132.159 (talk) 17:54, 8 February 2015 (UTC)

Jungleh (talk) 09:40, 9 October 2016 (UTC)
 * I clarified the confusion. It seems that the original text mixed the simplistic one-point-per-bucket technique with the more elaborate technique of having many different points around the circle mapping to the same bucket.
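To make the many-points-per-bucket (or per-machine) technique discussed in this thread concrete, here is a small sketch. The names `HashRing` and `_point`, the SHA-256 hash, the 2^32 ring size, and the 100 points per machine are all invented for illustration, not taken from the article:

```python
import bisect
import hashlib

def _point(label: str) -> int:
    # Hypothetical helper: hash a label to a point on a 2**32-sized "circle".
    return int.from_bytes(hashlib.sha256(label.encode()).digest()[:4], "big")

class HashRing:
    """Each machine owns many pseudo-random points on the circle; a key is
    served by the machine owning the first point at or after the key's point
    (wrapping around)."""

    def __init__(self, machines, points_per_machine=100):
        self.points_per_machine = points_per_machine
        self.ring = []  # sorted list of (point, machine) pairs
        for m in machines:
            self.add(m)

    def add(self, machine):
        # Derive many points per machine by hashing "machine#0", "machine#1", ...
        for i in range(self.points_per_machine):
            bisect.insort(self.ring, (_point(f"{machine}#{i}"), machine))

    def remove(self, machine):
        self.ring = [pm for pm in self.ring if pm[1] != machine]

    def lookup(self, key):
        p = _point(key)
        # First ring entry at or after p; wrap to the start if p is past the end.
        i = bisect.bisect_right(self.ring, (p, "")) % len(self.ring)
        return self.ring[i][1]
```

Removing a machine deletes only its own points, so the keys it held scatter among the arcs of many different remaining machines, while every other key keeps its old assignment.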

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Consistent hashing. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FAQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20120605030524/http://weblogs.java.net/blog/tomwhite/archive/2007/11/consistent_hash.html to http://weblogs.java.net/blog/tomwhite/archive/2007/11/consistent_hash.html

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers. — InternetArchiveBot (Report bug) 10:03, 12 August 2017 (UTC)

Removal of citation needed tag
A citation needed tag was added to the claim that rendezvous hashing was first invented in 1996. A reference is present on the appropriate page, so the tag has been removed from this one. AB1908 (talk) 06:09, 16 April 2020 (UTC)

India Education Program course assignment
This article was the subject of an educational assignment at College Of Engineering Pune supported by Wikipedia Ambassadors through the India Education Program during the 2011 Q3 term. Further details are available on the course page.

The above message was substituted by PrimeBOT (talk) on 20:09, 1 February 2023 (UTC)

I'm confused a bit
I'm confused a bit about the following statement:

"The central idea is, we use a hash function that randomly maps both the BLOB and servers to a unit circle, usually radians. For example, (where  is hash of a BLOB or server's identifier, like IP address or UUID)."

I was under the impression that a hash function is deterministic, so I'm not sure how it would randomly map... Grossb1 (talk) 19:41, 27 July 2023 (UTC)


 * I believe this is actually referring to a deterministic hash whose outputs follow a random-looking uniform distribution, and I have updated the article accordingly in two places. Skyerise (talk) 16:55, 4 August 2023 (UTC)