This is an archive of the discontinued LLVM Phabricator instance.

Support using sha256 as --build-id kind
AbandonedPublic

Authored by serge-sans-paille on Nov 16 2021, 5:48 AM.

Details

Reviewers
MaskRay
Summary

Both sha1 and md5 are considered deprecated, it would be great to support a more recent/secure hashing algorithm to compute build id

Diff Detail

Event Timeline

serge-sans-paille requested review of this revision.Nov 16 2021, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2021, 5:48 AM

No test case yet, I'll add them if the commit makes sense.

The SHA-1 deprecation, AFIAK, is for digital signature.
ld.lld's --build-id feature, as I know, has no validation tool. It is used for "approximation of true uniqueness across all binaries that might be used by overlapping sets of people" (https://fedoraproject.org/w/index.php?title=RolandMcGrath/BuildID&oldid=16098), not for security purposes.
So I think the description

Both sha1 and md5 are considered deprecated,

does not apply.

That said, adding the option is straightforward/simply. If useful, we can add it, but the justification needs to be higher than "Transport Layer Security has phased out SHA-1, so we follow suit"
Will GNU ld add this option?

joerg added a subscriber: joerg.Nov 16 2021, 1:26 PM

Exactly. This patch only provides a way to tick a checkbox on a stupid compliance list. SHA2 provides no security benefit here for the price of slower linking. If we want to use a more modern cryptographic hash function, the primary criterion should be performance, IMO.

Exactly. This patch only provides a way to tick a checkbox on a stupid compliance list. SHA2 provides no security benefit here for the price of slower linking. If we want to use a more modern cryptographic hash function, the primary criterion should be performance, IMO.

I second that opinion :-) Then blake2 (see https://en.wikipedia.org/wiki/BLAKE_(hash_function)) would be a good match. The reference implementation is covered by the Apache Public License 2.0 so that would be fine license wise, and it claims to be "faster than MD5, SHA-1, SHA-2, and SHA-3, on 64-bit x86-64 and ARM architectures"

MaskRay added a comment.EditedNov 17 2021, 6:49 PM

In 2020-01 I did notice that build-id may be a performance bottleneck. The need to import external code made me unsure whether it is justified.

If you are motivated, you may check the benchmark and check this issue for SHF_STRINGS|SHF_MERGE https://bugs.llvm.org/show_bug.cgi?id=37029

Also note that our SHA1 impl is inefficient. There is room for improvement, e.g. D69295

MaskRay requested changes to this revision.Dec 29 2021, 6:45 PM
This revision now requires changes to proceed.Dec 29 2021, 6:45 PM

I created D121531 to use BLAKE3.

Herald added a project: Restricted Project. · View Herald TranscriptMar 12 2022, 12:01 PM

Blake3 adopted instead