Talk:X86 Bit manipulation instruction set

Unnamed section
I can't understand the wording of the description of item 1 in the table under BMI1. The reference uses the same wording, although it does not make sense. Matthias291999 (talk) 01:14, 23 January 2016 (UTC)
 * The description of ANDN? Or the comment about CPUID? Carewolf (talk) 11:45, 26 January 2016 (UTC)

Parallel bit deposit and extract examples
All these illustrate is simple right and left shifts, which is NOT what these instructions do. (Yes, in the reductio ad absurdum case they can do a shift, but then why would we need them?) It's actually misleading if the reader assumes this is all they can do. It's bad enough to justify simply deleting the examples. I don't think the editor who constructed them (that's why there's no citation) understood the operators. It's nominally WP:OR, though in the construction of self-evident examples we sometimes have to allow a little liberty. I'd feel better if the examples were taken from published material. Sbalfour (talk) 14:49, 25 October 2019 (UTC)
 * It was written that way so that people can look at the example and immediately tell what is going on. While your examples show more of the power of the instructions, the examples in themselves are not instructive as to what the instructions do (in my opinion). Carewolf (talk) 14:57, 26 October 2019 (UTC)
 * Your example confounds bits, nibbles and bytes, as well as actual numeric quantities with symbolic representations of them. My examples also had some of that. I'm not sure what RGBA8888 and RGBA4444 represent. Are 'R', etc. symbols or literals? Is the quantity to be interpreted as 0xRGBA8888, i.e. a hexadecimal 32-bit number? 'R' and 'G' aren't hexadecimal digits, but even if they're symbolic, PEXT does not translate 0xRGBA8888 to 0xRGBA4444. I presume we're in 32-bit mode (though my examples were 64-bit) as the text stands. You use the mask "111100001111000011110000" twice, a 24-bit quantity; I'll take that as a typo for 0b11110000111100001111000011110000, though any typo here means the operator won't work as described. I've actually run the code, using the ASCII byte codes of your R1..8G1..8B1..8A1..8 as the source. Here is the result:


 * PEXT(0x52474241,0xf0f0f0f0) = 0x00005444
 * PDEP(0x00005444,0xf0f0f0f0) = 0x50404040


 * Even correcting the mask, these don't agree with your "Result" column. Even if everything could be fixed up, the problem remains that the reader may imagine that PEXT and PDEP are merely "nibble-packing/unpacking" operators and miss their generality. PEXT sequentially collates any number of arbitrary-size bitfields of the source, omitting the gaps between them, into the low portion of the destination; PDEP is the inverse, taking any number of arbitrary-size contiguous fields of the source and distributing them sequentially over the destination, with the gaps between the fields being zeroed. I've therefore recast the examples (and run them) into something correct and readable, using hexadecimal numbers uniformly and putting the selector, source and destination fields of the table in the order in which the operands are specified to the operators in GAS, though I think assemblers differ. Sbalfour (talk) 22:19, 28 October 2019 (UTC)
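
The behaviour described above (PEXT collates the masked bit fields into the low bits; PDEP spreads the low bits back over the mask) can be sketched as a bit-at-a-time software model in C. The names `pext32_soft`, `pdep32_soft` and `check_pext_pdep_examples` are hypothetical and are not the hardware instructions or any compiler intrinsic. The sketch only models the 32-bit case and reproduces the two values quoted in the comment, plus one extra illustrative mask with unequal field widths.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of 32-bit PEXT (hypothetical helper, not an intrinsic):
   gather the bits of src selected by mask into the low bits of the result,
   keeping them in their original order. */
static uint32_t pext32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t out_bit = 1;                      /* next result bit to fill */
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u); /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear that mask bit */
    }
    return result;
}

/* Software model of 32-bit PDEP (hypothetical helper, not an intrinsic):
   scatter the low bits of src to the positions selected by mask;
   every position outside the mask is zero. */
static uint32_t pdep32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t in_bit = 1;                       /* next source bit to consume */
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u); /* isolate lowest set mask bit */
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1;                      /* clear that mask bit */
    }
    return result;
}

/* Checks the two values quoted in the comment, then a mask with a 12-bit
   and an 8-bit field (an illustration chosen here, not from the comment). */
static void check_pext_pdep_examples(void)
{
    assert(pext32_soft(0x52474241u, 0xf0f0f0f0u) == 0x00005444u);
    assert(pdep32_soft(0x00005444u, 0xf0f0f0f0u) == 0x50404040u);
    assert(pext32_soft(0x12345678u, 0xff00fff0u) == 0x00012567u);
    assert(pdep32_soft(0x00012567u, 0xff00fff0u) == 0x12005670u);
}
```

Calling `check_pext_pdep_examples()` from a test program is enough to exercise it. The last two assertions show that the fields need not be nibbles: the mask selects one 12-bit field and one 8-bit field.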


 * RGBA is an RGBA color format; the numbers after it refer to how many bits you have for each channel. But your new examples are good too, and more compact. Carewolf (talk) 07:21, 29 October 2019 (UTC)

LZCNT
The text says: "LZCNT is almost identical to the Bit Scan Reverse (BSR) instruction..." [except for flags and zero operands]. That is NOT NOT NOT true, and don't you believe it. BSR returns the index (offset from 0 bit) of the high '1' bit; LZCNT does literally what it says. They're not even close; in fact they are almost inverses of each other: for example LZCNT(0x80000000) = 0; BSR(0x80000000) = 31, because the highest bit set is 31 offset from bit 0, or index 31. LZCNT(0x00000001) = 31; BSR(0x00000001) = 0, because the set bit is offset zero from bit 0. If one checks gcc's __builtin_clz(x) on architectures without ABM or BMI, it codes as 31^BSR(x) (that's basically 31-BSR(x) in a 5 bit field).

And even worse, LZCNT executes as BSR on architectures that don't support LZCNT, and that can lead to some surprises because BSR returns a different result than LZCNT. Sbalfour (talk) 17:22, 13 November 2019 (UTC)

Similarly, TZCNT and BSF are not 'almost identical': there is an analogous inverse relationship between them. Sbalfour (talk) 17:30, 13 November 2019 (UTC)

You are pretty close. Their results are exactly reversed because what they do on a logical level is exactly the same. 32-bit lzcnt(x) is 31 - bsr(x), and 64-bit bsr(x) is 63 - lzcnt(x). They can do the exact same job, except that one of them (BSR) is undefined on a common input (zero); if that weren't the case, we would just use bsr plus a cheap add or subtract instruction. Carewolf (talk) 20:57, 14 November 2019 (UTC)
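
The relation discussed in this section (32-bit lzcnt(x) equals 31 - bsr(x), which is the same as 31 ^ bsr(x) for nonzero x) can be checked with a small software model in C. The names `bsr32_soft`, `lzcnt32_soft` and `check_lzcnt_bsr_relation` are hypothetical; they imitate the instructions' results and are not the instructions or compiler builtins. The zero case is handled as an assumption of the sketch: the model returns 32 for lzcnt of zero, as LZCNT does, and the value returned by the BSR model for zero is meaningless.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit, i.e. the value BSR produces for nonzero x.
   For x == 0 this model returns 0, which is meaningless: the real
   instruction leaves its destination undefined in that case. */
static unsigned bsr32_soft(uint32_t x)
{
    unsigned index = 0;
    while (x >>= 1)
        index++;
    return index;
}

/* Number of leading zero bits; returns 32 for x == 0, as LZCNT does. */
static unsigned lzcnt32_soft(uint32_t x)
{
    unsigned count = 0;
    uint32_t probe = 0x80000000u;              /* start at the top bit */
    while (probe != 0 && (x & probe) == 0) {
        count++;
        probe >>= 1;
    }
    return count;
}

/* Checks the values quoted in this section and the 31 - bsr relation
   for every possible position of the highest set bit. */
static void check_lzcnt_bsr_relation(void)
{
    uint32_t x;

    assert(lzcnt32_soft(0x80000000u) == 0  && bsr32_soft(0x80000000u) == 31);
    assert(lzcnt32_soft(0x00000001u) == 31 && bsr32_soft(0x00000001u) == 0);

    for (x = 1; x != 0; x <<= 1) {
        uint32_t y = x | 1u;                   /* same highest bit, lower bit also set */
        assert(lzcnt32_soft(y) == 31 - bsr32_soft(y));
        assert(lzcnt32_soft(y) == (31 ^ bsr32_soft(y)));
    }
}
```

The subtraction and the XOR agree because bsr(x) is at most 31 and 31 is all ones in five bits, so subtracting from it never borrows.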

bitmanipulation != x86!
err this page says, by omission, "bitmanipulation instruction sets are the sole exclusive domain of Intel and AMD and there does not exist anywhere in the world anywhere throughout the entire history of computing an instruction set other than x86 with bitmanipulation".

it should be blindingly obvious that this is preposterous and false.

PowerISA has had bitmanip since 1993 (popcnt, cntlz, cnttz). the RISCV bitmanip extension is a good resource: a huge amount of research into historic and modern bitmanip ISAs was done for it.

if this page were named "X86 bitmanipulation instructions" it would be perfectly correct. as it stands it is false and misleading information.

one of two things needs to be done:
 * 1) move the page (rename to "x86 bitmanipulation")
 * 2) start adding in other ISAs.

i honestly don't know which is the better course of action. *as it stands* the page is an extremely good body of work.. about x86. Lkcl (talk) 04:42, 18 June 2021 (UTC)


 * For anyone else reading this comment: This has already been fixed (the second option was chosen) LachlanA (talk) 02:15, 23 September 2022 (UTC)

Motivation?
Would it be possible for someone to write a bit about why these operations are useful? Are there particular programming "phrases" that would use them? (For example, are they useful in bounds checking arrays, or in deep learning operations, or in encryption algorithms?) The "equivalent C expressions" are helpful in explaining what each operation does, but do not seem to be things that would typically occur in user code. LachlanA (talk) 02:13, 23 September 2022 (UTC)