Talk:Data structure alignment

untitled
The page gives no reason why this practise is used. --Anonymous

I replaced this, and left out the specific programs mentioned because I could not find any mention of trouble porting those specific programs: "A common problem in computer programming is called word alignement. One way to increase performance, especially in RISC processors which are designed to maximize raw performance is to require data to loaded or stored on a word boundary. So though memory is commonly addressed by 8 bit bytes, loading a 32 bit integer or 64 bit floating point number would be required to be start at every 64 bits on a 64 bit machine. The processor could flag a fault if it were asked to load a number which was not on such a boundary, or call a routine which would effectively figure out which word or words contained the data and extract the equivalent value. This caused difficulty when the team from Mosaic Software ported their Twin Spreadsheet to the 68000 based Atari ST. The Intel 8086 architecture had no such restrictions. It would also cause difficulties in porting Microsoft Office to Windows NT on MIPS and PowerPC for NEC and IBM. Since the software was not written with such restrictions in mind, designers had to set a bit in the O/S to enable non-aligned data. However since this bit was masked with other flags, it was impossible to keep the O/S from faulting on non-aligned data when other modules used the other flags. This may have been a major factor in abandoning Windows NT on non-Intel processors as they failed as platforms for hosting common Windows applications and one more reason for the baffling dominance of the x86 architecture over technologically elegant rivals. "

I hope to have provided sufficient background, but I think the article still needs some work. (but I need to sleep)

Aij 07:03, 29 May 2006 (UTC)

Packed array
What is a packed array? It redirects here, but I could not find it. 85.145.103.163 (talk) 15:52, 30 January 2008 (UTC)


 * In the section Data_Structure_Padding, the article talks about that. 0x6adb015 (talk) 14:07, 3 March 2008 (UTC)

Too verbose/jargon filled
I'm a 4th year Comp Sci student sitting a PhD next year...and don't think the description here is clear enough. It really doesn't explain it in any useful way. I think a diagram, an analogy, or something similar would help. In its current state it's no use to anyone who isn't a computer expert-I come pretty close, and I can't follow it. It needs to be much, much, simpler at first, instead of jumping right into details. —Preceding unsigned comment added by 130.209.241.193 (talk) 11:53, 27 April 2008 (UTC)

I agree, one thing which struck is why is the word datum used instead of data? I've seen the word used before but needed to look it up, only to see that it's apparantly just a synonym to data, which is significantly more well understod. —Preceding unsigned comment added by 195.84.66.198 (talk) 13:31, 7 June 2008 (UTC)


 * I rewrote the introduction, added a short example and removed the tag. Virtlinkt&#x2022;c 22:08, 28 July 2008 (UTC)

..due to how the CPU handles memory
I've read the article and, well I agree with the opinions above, but, in the introduction there's a part that says '...due to how the CPU handles memory'. Well if you Google that you're going to come up with thousands of hits... how 'bout a link or something that could explain the part of "how the CPU handles memory" that it's really important on the "Structure padding"? thanks. (I would do it myself but my knowledge doesn't extent that long :S) —Preceding unsigned comment added by 190.201.223.72 (talk) 01:31, 20 January 2009 (UTC)

Default packing and #pragma pack
The cited source for this section doesn't support the information presented. The linked MSDN article just talks about "default alignment" derived from member type, e.g. 4 bytes for an int. This is very different from /Zp packsize which basically wraps whole source file in #pragma pack(push, N) ... #pragma pack(pop). —Preceding unsigned comment added by 195.146.153.234 (talk) 16:14, 4 October 2010 (UTC)

least common multiple requires at least 2 args
The

Typical_alignment_of_C_structs_on_x86

section contains the sentence:

''It is important to note that the last member is padded with the number of bytes required that the total size of the structure should be a least common multiple of the size of a largest structure member''.

However, from

[| the Least_common_multiple page]

the least common multiple operator (lcm) requires 2 arguments; hence, the cited sentence is confusing because it implies lcm is unary. It's also probably wrong because it makes no mention of the alignment of the largest structure member, it only mentions the size of the largest structure member.

Hence, according to this sentence, if the largest structure member is:

char[1000] carray;

and the complete structure were:

struct A{ char[1000] carray, char last};

then this sentence would require the A::last to have 999 padding characters after it. That's wrong, as shown by the program:

struct A { char carray[1000]; char last; }; #include int main(void) {     std::cout<<"sizeof(A)="<<sizeof(A)<<"\n"; return 0; }   whose output is:

sizeof(A)=1001

In addition, the page:

cxx-data-alignment-portability

claims:

The compiler always assumes that an instance of foo will start at an address aligned to the most strict alignment requirement of all of foo’s members,

which suggests that the aforementioned Typical_alignment_of_C_structs_on_x86 section sentence should be modified to:

''It is important to note that the last member is padded with the number of bytes required that the total size of the structure should be a multiple of the largest alignment of the structure's members''.

In addition, the reference:



claims, on page 13:

Structures and unions assume the alignment of their most strictly aligned component.

Also, the reference:



makes a similar claim on page 3-38:

The alignment of a non-packed structure is the maximum alignment required of any of its fields.

Both claims support the modification suggested above.

— Preceding unsigned comment added by Cppljevans (talk • contribs) 13:27, 16 February 2011 (UTC)

Yes I totally agree. And I have tested an example with a double to make sure that the structure was not sized a multiple of 8. As you say there is a logic error in any case since the largest element may be an arbitrarily sized array. I have made a change to replace 'least common multiple' with multiple and corrected my change to your correct interpretation of the largest alignment of the members not 'the platform' which I asserted initially. I have added 2 (compiled and tested) examples which should clear this up. Kerfuffle090 (talk) 22:50, 18 June 2011 (UTC)

calculation of whole structure alignment and size and all member paddings
Given the following pseudo-c++-code data structure:

struct TheStructure {     Type[1] member[1]; Type[2] member[2]; ...     Type[N] member[N]; };

composed of N members:

member[1]...member[N] of types:

Type[1]...Type[N] , and the following definitions:

unsigned alignof(SomeType) //Purpose: // Returns the alignment of type, SomeType. // Implementation defined. ;   unsigned sizeof(SomeType) //Purpose // Returns the size of type, SomeType. // Implementation defined. ;   unsigned aligned_offset ( unsigned size  , unsigned alignment    ) //Purpose: // Returns minimum offset >= size such that // offset%alignment == 0. //Ensures: // for some unsigned q>=0 //   offset == alignment*q //   size <= offset < size+alignment //Requires: // 1 <= alignment {         unsigned remainder = size%alignment //The following assertions from reference: // http://en.wikipedia.org/wiki/Modulo_operation //         //    r == a - n*floor(a/n) //   0 <= r < n          // //when replaced with the corresponding variable names //used here, become: //         //  q == floor(size/alignment) // remainder == size - alignment*q // 0 <= remainder < alignment //         //These assertions are illustrated by the following //number-line diagram: //         //  |<--- size --->| //                      |<-- remainder -->| // |<-- alignment*q --->| //         ;        unsiged offset = remainder == 0 ? size //In this case, the number-line diagram becomes: //           //  |<--- size ->| //                   -->|<-- remainder // |<-- alignment*q --->| // |<- offset ->| //           //thus: //           //  offset == alignment*q // offset%alignment == 0 //         : size+alignment-remainder //In this case, the above number-line diagram //needs to be augmented to become: //           //  |<--- size --->| //                      |<-- remainder -->| // |<-- alignment*q --->| | //                 alignment-remainder -->|   |<-- // |<- offset --->| //           //thus: //           //  offset == alignment*(q+1) // offset%alignment == 0 //         ;        return offset; } align[i] = alignof(Type[i])//alignment of member[i] for i=1 to N   ; size[i] = offset[i]+sizeof(Type[i]) //Size of TheStructure truncated to just i members. for i=1 to N   ; offset[1] = 0 //offset from start of TheStructure to start of member[1] ; offset[i+1] = aligned_offset(size[i],align[i+1]) //If i<N // the offset from the start of TheStructure to the start of     //  member[i+1] //else if i==N // the offset from the start of TheStructure to the start of    //  the next contiguous TheStructure. IOW, this is   //  sizeof(TheStructure). for i=1 to N   ; The only unknown in the above is align[N+1]; hence, the problem is:

Find:

align[N+1] //alignment for TheStructure constrained_by: (m*align[N+1]+offset[i]) % align[i] == 0 //member[i] is properly aligned //This constraint is labelled the "member_constraint[i]". , for i=1 to N    , for any unsigned m>=0 (align[N+1]+offset[N+1]) % align[N+1] == 0 //If two TheStructure's are contiguous in memory, //the second is properly aligned. //This constraint is labelled the "structure_constraint".

For the particular case when 1==i, the member_constraint[1] is:

(m*align[N+1]+offset[1])%align[1] == 0 , for any unsigned m>=0 After substituting in the definition of offset[1], this becomes:

(m*align[N+1])%align[1] == 0 , for any unsigned m>=0 Any solution has the form: align[N+1] == j*align[1] , for any unsigned j>=1. IOW, the set of solutions for align[N+1] for member_constraint[1] is: soln[1] = {j*align[1]|unsigned j>= 1} For the particular cases when 1=0 After substituting in the definition of offset[i-1], this becomes:

(m*align[N+1]+aligned_offset(size[i-1],align[i]) % align[i] == 0    , for any unsigned m>=0  After substituting in Ensures: assertion from aligned_offset, this becomes:

(m*align[N+1]+align[i]*q) % align[i] == 0 , for any unsigned m>=0 , for some unsigned q>=0 Any solution has the form: align[N+1] == j*align[i] , for any unsigned j>=1. IOW, the set of solutions for align[N+1] for member_constraint[i] for 1= 1} Thus, there's a sequence of solutions, one for each member:

soln[1], soln[2], ... soln[N] For the structure_constraint:

(align[N+1]+offset[N+1]) % align[N+1] == 0 after substituting in the definition of offset[N+1], this becomes: (align[N+1]+aligned_offset(size[N],align[N+1]) % align[N+1] == 0

After substituting in Ensures: assertion from aligned_offset, this becomes:

(m*align[N+1]+align[N+1]*q) % align[N+1] == 0 , for any unsigned m>=0 , for some unsigned q>=0 which, after simplification, becomes: (align[N+1]*(m+q)) % align[N+1] == 0 , for any unsigned m>=0 , for some unsigned q>=0 which is obviously true as long as 0 < align[N+1]. To handle the boundary case when N==0(no members), define: align[0] = 1 soln[0] = {j*align[0]|unsigned j>=1} = {j|unsigned j>=1} So, there are n+1 partial solutions, all of the form:

soln[i] = {j*align[i]|unsigned j>=1} for i=0 to N

To satisfy all of soln[0],soln[1]...soln[N], align[N+1] has to be in the intersection of soln[0],soln[1]...soln[N]. IOW, align[N+1] has to be a multiple of all of align[0],align[1]...align[N]. The least solution satisfying this constraint is the least common multiple of align[0],align[1],align[2],...,align[N].

In summary, using haskell foldl

alignof(TheStructure) = foldl lcm align[0] [align[1],align[2],...,align[N]]

In case N==0, then:

alignof(TheStructure) = foldl lcm align[0] [] == align[0] == 1. In case all of the alignments are of the form:

align[i] = 2**n[i], for some unsigned n[i], where 2**k is 2 raised to the k-th power, then max can be used in place of lcm since lcm(2**k,2**(k+n)) = 2**(k+n).

Cppljevans (talk) 13:05, 17 February 2011 (UTC)

Packing vs endianness
In the section on padding, the following text is given:

''Although use of "packed" structures is most frequently used to conserve memory space, it may also be used to format a data structure for transmission using a standard protocol. However in this usage, care must also be taken to ensure that the values of the struct members are stored with the endianness required by the protocol (often network byte order), which may be different from the endianness used natively by the host machine.''

While it is true that endianness must be addressed when transmitting data across a network, it is not directly related to data structure alignment. The use of different data structure alignment techniques doesn't change the requirements of endianness correctness.

I suggest that the final sentence in this section is removed. It could also be reworded in a way that states that integer data sent across the wire, regardless of padding/alignment, requires endianness handling for portability.

Clc38 (talk) 13:09, 22 November 2011 (UTC)

ARM
ARM has (historically, at least) interesting behaviour here. While single-register stores and multiple-register loads and stores have been forcibly word-aligned, single-register loads may be special in that one word is read, but the lowest two bits of the address cause (for little-endian) a rotate right by 8× their value, e.g.

ADD R0,PC,#12 MVN R1,#&FF ; &FFFFFF00 STR R1,[R0] LDR R0,[R0,#1] MOV PC,R14

results in R0 = &00FFFFFF.

Combined with MOV instructions making use of the barrel shifter, this was widely exploited for sign extension of 8-bit values and reading of 16-bit values.

Dsalt (talk) 00:58, 16 September 2012 (UTC)

The article is too vague to be very useful
For example, what exactly is it referring to when it says the "computer" is doing X?

Does it mean the OS? the CPU? the memory controller? Who exactly is responsible for the decision to read memory in blocks of bytes, and why is it still a concern? — Preceding unsigned comment added by Bumblebritches57 (talk • contribs) 08:15, 23 March 2016 (UTC)

Multiple-of-6 and multiple-of-8 word lengths a consequence of character sizes?
The article currently claims that

"The widespread use of packing on 6 bit character codes in the mainframe era led to machines with word lengths that were multiples of 6 bits; 36-bit machines were particularly common. The introduction of ASCII led to new designs with word lengths that are multiples of 8 bits, 7 bits of code and a parity bit."

Were the 36-bit machines of the 1950's 36-bit because the character codes were 6-bit, or were they 36-bit for other reasons, such as "36-bit floating-point numbers are a good size", or just following in the footsteps of other 36-bit machines, and 6-bit character codes chosen because they were "big enough" and 6 was a factor of 36? Guy Harris (talk) 17:59, 5 October 2017 (UTC)


 * Depends on the platform. 6-bit BCD was also a common reason. Maury Markowitz (talk) 19:30, 9 September 2019 (UTC)


 * So on which 36-bit machines was 6-bit BCD (rather than 4-bit BCD) used - and why? You can pack more 4-bit digits than 6-bit digits into a word. (That's "BCD" as in "binary coding of decimal digits" rather than as in "a character set called BCD that has more than digits in it".) Guy Harris (talk) 20:03, 9 September 2019 (UTC)

On a 32-bit architecture, would a 16-bit value not aligned on a 32-bit boundary be accessed more slowly?
The article currently says:


 * The CPU in modern computer hardware performs reads and writes to memory most efficiently when the data is naturally aligned, which generally means that the data's memory address is a multiple of the data size. For instance, in a 32-bit architecture, the data may be aligned if the data is stored in four consecutive bytes and the first byte lies on a 4-byte boundary.


 * Data alignment refers to aligning elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some padding between structure elements or after the last element of a structure. For example, in a 32-bit machine storing a 16-bit value one can pad out the value with an additional 16-bits to make it naturally align to 32 bits. Alternately, one can pack two 16-bit values into a 32-bit space, which may lead to slower access but use half as much memory.

In what implementations is accessing a 16-bit value that's aligned on a 16-bit boundary, but not aligned on a 32-bit boundary, slower than accessing a 16-bit value aligned on a 32-bit boundary? Guy Harris (talk) 18:52, 5 September 2019 (UTC)


 * If the original data is not packed, and you wish to store it packed, you need to convert from one form to another. Folding the data together is generally a multiple-instruction task, while the unfolding is generally simpler (often an AND followed by some shifts). This is why Pascal included a PACKED keyword, which allowed the programmer to choose whether to pack their data, trading memory for performance. This was almost always used in the context of strings. I recall writing a version of Conway's LIFE on the VAX which stored the board in a character array which ran significantly faster if you didn't PACK it. Maury Markowitz (talk) 19:26, 9 September 2019 (UTC)
 * IIRC TurboPascal did something additional here, but I no longer recall what... something about bitpacked? Maury Markowitz (talk) 19:28, 9 September 2019 (UTC)


 * If you have a 32-bit machine that 1) supports byte addressing and 2) has a "load 16-bit word" instruction (or, for CISC, also has arithmetic/logical instructions that support 16-bit in-memory operands), if you have a structure, aligned on a 32-bit boundary with a 32-bit field, followed by a 16-bit field, followed by another 16-bit field, the exact same code sequence would be used to access the first 16-bit field and the second 16-bit field, so I'm not seeing where accesses to the second field would require more instructions than accesses to the first field.


 * If you replace the 32-bit field in that structure with an 8-bit field, then there's a difference between the case where there's a byte of padding after the 8-bit field and the case where the structure is "packed" and there's no padding byte - either the access to the 16-bit fields might be slowed down due to the fields being unaligned or the access might not even be doable with a single instruction if natural alignment is required. Guy Harris (talk) 19:54, 9 September 2019 (UTC)


 * Lots of machines lacked sub-word addressing, the HP 2100's for instance. It's not like this issue is unknown, and it is one of the reasons C exists at all. Or perhaps your concern is that the example is 32-bits and not some other number? Maury Markowitz (talk) 13:51, 10 September 2019 (UTC)


 * "Lots of machines lacked sub-word addressing" Yes, I'm quite aware of that, but:


 * few do today;
 * without sub-word addressing, even a 16-bit word aligned on a 32-bit boundary would require, at minimum, masking after the load unless the generated code guarantees that the padding is in the upper bits of the 32-bit word containing the 16-bit quantity and that the upper bits are either zero-extended or sign-extended.


 * A better example, which applies to many machines with byte addressing, as well as those without, would be a case with a structure with a 16-bit field followed by a 32-bit field. On a 32-bit word-addressed processor (e.g., Stanford MIPS), padding after the 16-bit field means you don't have to do masking (and possibly shifting) to access either the 16-bit word or the 32-bit word following it; on a 32-bit byte-addressed processor, it means you either 1) don't get a performance hit, if accessing a 32-bit word that's only on a 16-bit boundary is allowed or 2) don't have to do several instructions, if that's not allowed (e.g., on SPARC processors). Guy Harris (talk) 17:10, 10 September 2019 (UTC)

OK, I went with the 16-bit field followed by a 32-bit field as the example, following the previous edit by somebody who apparently doesn't like disputed text. Guy Harris (talk) 18:52, 20 January 2020 (UTC)

Half vs 3/4
The article states:

For example, on a 32-bit machine, a data structure containing a 16-bit value followed by a 32-bit value could have 16 bits of padding between the 16-bit value and the 32-bit value to align the 32-bit value on a 32-bit boundary. Alternatively, one can pack the structure, omitting the padding, which may lead to slower access, but uses half as much memory.

Am I mislead or should it read ``three quarters as much'' in the end? -- — Preceding unsigned comment added by 88.130.51.182 (talk) 21:47, 11 November 2020 (UTC)
 * Fixed. Guy Harris (talk) 22:17, 11 November 2020 (UTC)

Mention C11's _Alignas
I think _Alignas is relevant here. Maybe someone more knowledgeable than me in C could edit the article?

See https://en.cppreference.com/w/c/language/_Alignas — Preceding unsigned comment added by Abd.nh (talk • contribs) 14:00, 13 December 2020 (UTC)

"Padding (computing)" listed at Redirects for discussion
The redirect [//en.wikipedia.org/w/index.php?title=Padding_(computing)&redirect=no Padding (computing)] has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at  until a consensus is reached. Jay 💬 20:23, 21 July 2023 (UTC)