Talk:Compare-and-swap

Bug in referenced Shann et al. paper
A warning: the cited Shann et al. paper has a bug in the algorithm, and I thought it was my own implementation that was at fault until I saw the Groves paper that tried to prove correctness of the Shann algorithm and found the bug, and presents a fix in that paper. I have added a note besides the link and hopefully save people a debugging nightmare, but I leave the proper formatting of my note to those that know their ways around here better than me. ThVa (talk) 21:38, 24 March 2009 (UTC)

"Maurice Herlihy (1993) proved that fetch-and-add is inferior to compare-and-swap." This should not be the only reason for merge. Although the FAA operation might be inferior in the context of lock free algorithms it still is a locking primitive and it's usage should be shown (somewhere). --Jyke 11:13, 26 August 2005 (UTC)

Further, Fich, Hendler and Shavit have shown that, in terms of memory usage, compare-and-swap can be weaker than fetch-and-add. Implementing fetch-and-add wait-free from compare-and-swap requires at least as many memory locations as threads. Implementing fetch-and-add wait-free from fetch-and-add requires only one. --Chris Purcell 12:02, 11 October 2005 (UTC)
 * LL/SC is better. CPUs with LL/SC only have the weak versions, but that's still good for a lot of algorithms, and also the functionality can be implemented with software if you have CAS.  See a recent C. Evequoz paper. ThVa (talk) 21:38, 24 March 2009 (UTC)

Is this three way merge really necessary?
It looks like these are all distict primitives for atomic synchronization. While they each have advantages and disadvantages, they are distinct. If there is really so much overlap, then we should merge all these articles into a single article on "Hardware Synchronization Primitives" or something like that. (But that article would probably be too big, and would get split into articles like these at some point...) Jamie 01:21, 14 November 2005 (UTC)

I would agree: there doesn't seem to be any real point to merging. Chris Purcell 11:35, 14 November 2005 (UTC)

I agree too : there doesn't seem to be any real point to merging or a very good operating system writer would have to rewrite it completly. TheCric 11:35, 14 February 2006 (UTC)


 * Which three primitives were being discussed to not merge? RJFJR (talk) 22:00, 3 November 2011 (UTC)

CAS useful on uniprocessor systems as well
The article mentions that CAS is only used on multiprocessor systems. It is currently used on uniprocessor systems for user-space applications where disabling the interrupt is not allowed directly and invoking the OS kernel is quite expensive. Catalin 13:54, 12 April 2007 (UTC)


 * I added a comment about this. --JeffreyYasskin (talk) 18:45, 16 December 2007 (UTC)

CAS is both an abstraction and an assembly instruction
Both C++0x and Java  use CAS as the name for an atomic operation on data of various sizes, including larger than a single pointer. On the other hand, the assembly instruction is only reliably available on data sizes up to the pointer size. I'd like to point out the ambiguity in the text, but I'm not sure of the best wording to use. --JeffreyYasskin (talk) 18:45, 16 December 2007 (UTC)
 * You are wrong. Double-pointer-sized CAS is available on at least all x86-64 processors other than very early AMD ones.  If one is doing x86, unless it's on a very old processor, CMPXCHG16B is a given. ThVa (talk) 21:32, 24 March 2009 (UTC)

C code
Am I mistaken or is register a reserved keyword in C? Will that code even compile? .froth. (talk) 00:06, 17 June 2010 (UTC)

Is the C-code really just a pseudocode representation using c syntax? RJFJR (talk) 14:29, 11 November 2011 (UTC)


 * The C code is misleading. A newcomer who doesn't understand atomic ops will think that they can write CAS in C.  You can't.  There's no atomicity there.  Of course someone who's worked with this before will know this and point out where in the article it says this but last I knew assembly programmers aren't the target audience of Wikipedia. 166.250.45.126 (talk) 17:40, 3 April 2012 (UTC)

Reworded intro
In an attempt to provide more context for people new to the CAS instruction, I reworded the intro paragraph. The first sentence now reads, "In computer science, compare-and-swap (CAS) is an atomic CPU instruction used in multithreading to achieve synchronization." I think that links to all the important contextual areas? --Chris Purcell (talk) 11:02, 23 December 2011 (UTC)

Implementation in C
Should we just delete the whole ==Implementation in C== section of the article? Is it really the best way to give a pseudo code description of the CAS operation? See the section above called 'C code' which discusses some of this. RJFJR (talk) 18:01, 13 July 2012 (UTC)


 * One common approach in the literature is to surround the code in an "atomically" block construction, which makes it clear something non-standard is happening. I think having _some_ form of code to present the ideas is a good thing. --Chris Purcell (talk) 16:49, 16 July 2012 (UTC)

Strange text in intro
It says "It compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value". The "only if they are the same" is probably an error, isn't it?


 * Nope, that's right.  --Chris Purcell (talk) 16:00, 29 December 2014 (UTC)

CMPXCHG16B requirement
See Talk:Windows_8.1 for a discussion. - Yuhong (talk) 07:53, 6 August 2017 (UTC)

CS
The article mentions that it traces back to IBM S/370, but S/370 has CS and CDS, not CAS. Which architectures have CAS? Gah4 (talk) 01:49, 20 September 2022 (UTC)

refetch
As noted for the pseudo-code x86 version, it seems that the S/370 (and later) CS and CDS refetch the value from memory when the compare fails. The article doesn't explain this well. It seems that, at least on some versions, this is related to caching. Also, at least for the S/370 version, it does not protect against changed by I/O operations. Gah4 (talk) 12:46, 26 September 2022 (UTC)

Invention of CAS
According to Lynn Wheeler - a key player in the 360/370 world...

MIT Lincoln Labs was the first installation for CP67 and they had it in a multiprocessor configuration. Lincoln Labs returned one of their processors to IBM where it was vectored to IBM's Cambridge Scientific Center (CSC) machine to upgrade it to a two processor multiprocessor.

Charles Arnold Salisbury (CAS) was at CSC when he invented "Compare And Swap" (matching Charles' initials) while he was working on CP67 multiprocessor fine-grain kernel locking (as alternative to 360 "Test & Set").

Lynn was also at CSC working on performance, pathlength, algorithms, paging, scheduling, etc. He had started on the single processor base. While the CSC 360/67 had a 2nd processor added, it didn't increase the memory size from 768kbytes. Consequently, it was somewhat memory limited even when running single processor. For a while, Lynn's paging improvements to single processor provided better throughput than adding the 2nd processor. Once he started working on the same base as Charles with the multiprocessor support it improved throughput. Earthshew (talk) 20:30, 29 July 2023 (UTC)