D53906 increased p_align of PT_TLS on ARM/AArch64 to 32/64 to make the
TLS layout compatible with Android Bionic. However, this can make glibc
ARM/AArch64 programs using {initial,local}-exec TLS models crash (see
PR41527).
The faulty PT_TLS satisfies p_vaddr%p_align != 0. The remainder is normally
0 but may be non-zero with the D53906 hack in place. The problem is that
we increase the p_align of the PT_TLS after the OutputSection's
addresses are fixed (assignAddress()). It is possible that
p_vaddr%old_p_align == 0 while p_vaddr%new_p_align != 0.
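A small illustration of that last point. The address and the helper name
vaddrRemainder are made up for this example; only the 64-byte AArch64
alignment comes from the description above.

```cpp
#include <cstdint>

// Hypothetical values, chosen only to illustrate the remainder problem.
constexpr uint64_t pVaddr = 0x10010; // PT_TLS p_vaddr, already fixed by address assignment
constexpr uint64_t oldAlign = 16;    // assumed p_align before the bump
constexpr uint64_t newAlign = 64;    // p_align after the AArch64 bump

// Hypothetical helper: remainder of the segment address modulo its alignment.
constexpr uint64_t vaddrRemainder(uint64_t vaddr, uint64_t align) {
  return vaddr % align;
}

static_assert(vaddrRemainder(pVaddr, oldAlign) == 0,
              "aligned with respect to the old p_align");
static_assert(vaddrRemainder(pVaddr, newAlign) == 16,
              "no longer aligned with respect to the new p_align");
```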
When this happens, lld and different ld.so implementations have
different opinions on the offset of the first module. glibc elf/dl-tls.c
computes an offset that satisfies: offset%p_align == p_vaddr%p_align.
This is correct as a basic ELF requirement.
lld, musl (1.1.20 through the latest release, 1.1.22), FreeBSD rtld, and
Bionic place the main TLS block at an incorrect offset:
`alignUp(2*sizeof(void*), main_tls_align)`. This choice doesn't take
p_vaddr%p_align into account.
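The disagreement can be sketched as follows. This is an illustrative model,
not code from glibc or lld: naiveOffset and congruentOffset are hypothetical
names, and the TCB size of 16 assumes 2*sizeof(void*) on a 64-bit target.

```cpp
#include <cstdint>

// Round v up to the next multiple of align.
constexpr uint64_t alignUp(uint64_t v, uint64_t align) {
  return (v + align - 1) / align * align;
}

// The formula quoted above: ignores p_vaddr % p_align entirely.
constexpr uint64_t naiveOffset(uint64_t tcbSize, uint64_t align) {
  return alignUp(tcbSize, align);
}

// Smallest offset >= tcbSize with offset % align == vaddr % align,
// i.e. the congruence that glibc is described as satisfying.
constexpr uint64_t congruentOffset(uint64_t tcbSize, uint64_t vaddr,
                                   uint64_t align) {
  uint64_t rem = vaddr % align;
  uint64_t delta = (rem + align - tcbSize % align) % align;
  return tcbSize + delta;
}

// When p_vaddr is a multiple of p_align, both computations agree.
static_assert(naiveOffset(16, 64) == congruentOffset(16, 0x10000, 64),
              "same offset when the remainder is zero");
// When the remainder is non-zero (hypothetical p_vaddr 0x10010), they differ.
static_assert(naiveOffset(16, 64) != congruentOffset(16, 0x10010, 64),
              "different offsets when the remainder is non-zero");
```

With a zero remainder both functions return the same offset, which is why
forcing p_vaddr%p_align == 0 makes the implementations agree.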
This patch overaligns .tbss to make p_vaddr%p_align == 0.