Tail merge is slow and of low value. With regular string deduplication, we can
just use the return value of StringTableBuilder::add.
There is no noticeable performance increase because, even without deduplication,
__cstring is quite small (7.6 MiB for chromium_framework).
I tried a parallel algorithm in the style of the ELF port
(https://reviews.llvm.org/P8275), but did not see an improvement when linking
chromium_framework.
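
For illustration, here is a minimal sketch of what regular (exact-match)
deduplication means, as opposed to tail merging. SimpleStringTable and
exampleUsage are hypothetical names made up for this sketch. This is not the
StringTableBuilder implementation; it only mirrors the idea that add returns
the offset of the string, and that an identical string added again gets the
same offset. With tail merging, "bar" could instead reuse the tail of "foobar".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a string table builder (not LLVM's class).
// It deduplicates exact matches only; it does no tail merging.
class SimpleStringTable {
public:
  // Appends s plus a NUL terminator if s has not been seen before.
  // Returns the offset of s in the table either way.
  std::size_t add(const std::string &s) {
    auto it = offsets.find(s);
    if (it != offsets.end())
      return it->second; // Duplicate: reuse the existing offset.
    std::size_t off = data.size();
    data += s;
    data.push_back('\0');
    offsets.emplace(s, off);
    return off;
  }

  const std::string &contents() const { return data; }

private:
  std::string data;                                     // Table bytes.
  std::unordered_map<std::string, std::size_t> offsets; // String -> offset.
};

// Small usage example for the sketch above.
inline void exampleUsage() {
  SimpleStringTable table;
  std::size_t a = table.add("foobar");
  std::size_t b = table.add("bar");    // Not merged into the tail of "foobar".
  std::size_t c = table.add("foobar"); // Exact duplicate of the first string.
  assert(a == c);
  assert(a != b);
}
```

The caller can use the returned offset directly, so no separate lookup pass is
needed after the table is built.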