Without this, 64-byte vector types (__m512), specified to be 64-byte aligned in the AVX512 draft SysV ABI, will only be 32-byte aligned.
One might raise a couple of valid concerns:
- this doesn't change the alignment of anything other than clang-generated vector code, so malloc() and friends will still only return 16-byte-aligned memory.
- unaligned accesses have gotten good enough by now, so why care about alignment?
To which I pedantically counter:
- there's precedent: AVX bumped alignment to 32 bytes, so vector users are already familiar with the issue.
- because the spec says so ;)
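
To illustrate the malloc() concern above: heap memory is unaffected by this change, so code that wants 64-byte-aligned buffers still has to ask for them explicitly. A minimal sketch, assuming a C11 libc that provides aligned_alloc(); the helpers is_aligned() and alloc_aligned() are hypothetical names made up for this example, not part of clang or any ABI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p is aligned to `align` bytes. */
static int is_aligned(const void *p, size_t align) {
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical helper: allocate at least `bytes` bytes aligned to `align`.
 * `align` must be a nonzero power of two. The size is rounded up to a
 * multiple of `align`, which C11 aligned_alloc() expects.
 * Release the result with free(). */
static void *alloc_aligned(size_t align, size_t bytes) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t rounded = (bytes + align - 1) / align * align;
    return aligned_alloc(align, rounded);
}
```

A caller that wants a buffer suitable for aligned 64-byte vector loads would use alloc_aligned(64, n * sizeof(float)) instead of malloc(), and could check the result with is_aligned(p, 64).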