Both sha1 and md5 are considered deprecated, it would be great to support a more recent/secure hashing algorithm to compute build id
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The SHA-1 deprecation, AFIAK, is for digital signature.
ld.lld's --build-id feature, as I know, has no validation tool. It is used for "approximation of true uniqueness across all binaries that might be used by overlapping sets of people" (https://fedoraproject.org/w/index.php?title=RolandMcGrath/BuildID&oldid=16098), not for security purposes.
So I think the description
Both sha1 and md5 are considered deprecated,
does not apply.
That said, adding the option is straightforward/simply. If useful, we can add it, but the justification needs to be higher than "Transport Layer Security has phased out SHA-1, so we follow suit"
Will GNU ld add this option?
Exactly. This patch only provides a way to tick a checkbox on a stupid compliance list. SHA2 provides no security benefit here for the price of slower linking. If we want to use a more modern cryptographic hash function, the primary criterion should be performance, IMO.
I second that opinion :-) Then blake2 (see https://en.wikipedia.org/wiki/BLAKE_(hash_function)) would be a good match. The reference implementation is covered by the Apache Public License 2.0 so that would be fine license wise, and it claims to be "faster than MD5, SHA-1, SHA-2, and SHA-3, on 64-bit x86-64 and ARM architectures"
In 2020-01 I did notice that build-id may be a performance bottleneck. The need to import external code made me unsure whether it is justified.
If you are motivated, you may check the benchmark and check this issue for SHF_STRINGS|SHF_MERGE https://bugs.llvm.org/show_bug.cgi?id=37029
Also note that our SHA1 impl is inefficient. There is room for improvement, e.g. D69295