We were mishandling the case where both __thread_bss and __thread_data sections
were present.
TLVP relocations should be encoded as offsets from the start of __thread_data,
even if the symbol is actually located in __thread_bss. Previously, we wrote the
offset from the start of the containing section, which is wrong: tlv_get_addr()
has no way of knowing which section a given tlv$init symbol is in at runtime.
In addition, this patch ensures that we place __thread_data immediately before
__thread_bss. This is what ld64 does, likely for performance reasons. Zerofill
sections must also be at the end of their segments; we were already doing this,
but now we ensure that __thread_bss occurs before __bss, so that it's always
possible to have it contiguous with __thread_data.
Fixes llvm.org/PR48657.
very nit: swap order of ZEROFILL and REGULAR since REGULAR comes first in the TLV area in memory :)