As far as I can tell, the layout of IR vectors (especially those with sub-byte sized elements) has not been described in the documentation. This patch tries to address that using material from the comments in https://reviews.llvm.org/D42100#992315, where the decision appears to have been taken.
Maybe it is just confusing to talk about sub-byte sized elements and the C language here? I mean we need to define how it works for sizes larger than a byte as well. And the IR is source language agnostic (even if the motivation here might originate from C).
We could perhaps just state that the layout is packed. And that the vector could be seen as one large iN scalar (N given by the type store size in bits of the vector), with element zero being in the most significant bits for a big-endian target and in the least significant bits for a little-endian target.
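To make the "one large iN scalar" view concrete, here is a minimal Python sketch of the packing described above. The helper name pack_vector is hypothetical, and the sketch assumes N equals the packed size (element count times element width), so it ignores any padding up to the store size. It models the wording proposed in this comment, not confirmed LangRef text.

```python
def pack_vector(elements, elem_bits, big_endian):
    """Pack vector elements into one integer of len(elements) * elem_bits bits.

    Hypothetical helper: models the proposed packed layout and ignores
    any padding between the type size and the type store size.
    """
    n = len(elements)
    mask = (1 << elem_bits) - 1
    value = 0
    for index, elem in enumerate(elements):
        # Little-endian: element 0 occupies the least significant bits.
        # Big-endian: element 0 occupies the most significant bits.
        position = (n - 1 - index) if big_endian else index
        value |= (elem & mask) << (position * elem_bits)
    return value


# A <4 x i4> vector with elements 1, 2, 3, 4 viewed as an i16.
little = pack_vector([1, 2, 3, 4], 4, big_endian=False)
big = pack_vector([1, 2, 3, 4], 4, big_endian=True)
print(hex(little), hex(big))
```

With these inputs, the little-endian view is 0x4321 and the big-endian view is 0x1234, so element zero lands in the low nibble in the first case and in the high nibble in the second.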
I guess it isn't defined where padding goes if the type size is less than the type store size (e.g. <2 x i6> has a type size that is 12 bits, but the type store size is 16 bits).
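The arithmetic behind the <2 x i6> example can be sketched as follows. The function name vector_sizes is hypothetical, and the sketch assumes the store size is the type size rounded up to a whole number of 8-bit bytes. It only counts the padding bits; it says nothing about where they go, which is the open question here.

```python
def vector_sizes(num_elements, elem_bits):
    """Return (type size, store size, padding), all in bits.

    Hypothetical helper: assumes the store size is the type size
    rounded up to the next multiple of 8 bits.
    """
    type_size = num_elements * elem_bits
    store_size = (type_size + 7) // 8 * 8
    return type_size, store_size, store_size - type_size


# <2 x i6>: 12 bits of data stored in 16 bits, leaving 4 padding bits.
print(vector_sizes(2, 6))
```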
I think it might be good to add a caveat here about bitcasts involving vector types. For example, that bitcast <2 x i8> to i16 puts element zero of the vector in the least significant bits of the i16 for little-endian, while element zero ends up in the most significant bits for big-endian.
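One way to see the bitcast behaviour is to model it as a store of the vector followed by a load of the scalar. The Python sketch below does that for <2 x i8> to i16. The function name is hypothetical, and it assumes element zero is stored at the lowest address, which is the layout under discussion rather than something this snippet proves.

```python
def bitcast_2xi8_to_i16(elements, byteorder):
    """Model bitcast <2 x i8> to i16 as a store followed by a load.

    Hypothetical helper: assumes element 0 sits at the lowest address.
    byteorder is "little" or "big".
    """
    if len(elements) != 2:
        raise ValueError("expected exactly two i8 elements")
    memory = bytes(elements)  # element 0 first, then element 1
    return int.from_bytes(memory, byteorder)


vec = [0x11, 0x22]  # element 0 is 0x11
print(hex(bitcast_2xi8_to_i16(vec, "little")))  # element 0 in the low byte
print(hex(bitcast_2xi8_to_i16(vec, "big")))     # element 0 in the high byte
```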
Hello. Documenting what LLVM does sounds like a good idea in general. Alive agrees with this too, which is a good sign: https://alive2.llvm.org/ce/z/XbkTEz.
Do we know which backends support big endian? Arm and AArch64 do, as do Sparc, PPC, Mips and Lanai. It seems like quite a few do.
I am not sure if it is desired or even acceptable in the language reference, but my experience is that a diagram goes a long way towards explaining this. I've had to teach countless new developers here at IBM about the two vector layouts (since PPC supports both).
Something like this tends to resonate with developers:
Use a <4 x i32> vector as an example:

Memory:             Register(LE):    Register(BE):
0x0 0x4 0x8 0xC     3  2  1  0       0  1  2  3
[A,  B,  C,  D]     [D, C, B, A]     [A, B, C, D]

It shows both the relationship between the numbering of bytes in memory and the vector elements, and the layout of the elements in the register.
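The same <4 x i32> picture can be reproduced with a short Python sketch. The function register_view is hypothetical: it writes the four elements to memory in index order using the target byte order, reads the 16 bytes back as one 128-bit integer, and lists the 32-bit lanes from most significant to least significant, which is how I am reading the register columns.

```python
import struct


def register_view(elements, byteorder):
    """Return the four i32 lanes ordered from most to least significant.

    Hypothetical helper: assumes element 0 is stored at address 0x0 and
    that the register is the 128-bit value loaded from those 16 bytes.
    """
    prefix = "<" if byteorder == "little" else ">"
    memory = struct.pack(prefix + "4I", *elements)  # A at 0x0, B at 0x4, ...
    wide = int.from_bytes(memory, byteorder)
    return [(wide >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


A, B, C, D = 0xA, 0xB, 0xC, 0xD
print(register_view([A, B, C, D], "little"))  # lanes in order D, C, B, A
print(register_view([A, B, C, D], "big"))     # lanes in order A, B, C, D
```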
I like the idea of comparing a vector to a scalar of the same width and stating where the elements are placed in terms of bit significance.