This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Fix computation of fuction Hash
ClosedPublic

Authored by serge-sans-paille on May 14 2020, 1:04 PM.

Details

Summary

Previous implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.

Added a new test case that trivially tests that a small function change is
reflected in the hash value.

Not that as this patch fixes the hash computation, it invalidates all hashes
computed before that patch applies, which could be an issue for large build
system that pre-compute the profile data and let client download them as part of
the build process.

Diff Detail

Event Timeline

Update test case

dmajor added a subscriber: dmajor.May 15 2020, 3:53 AM

This patch is correct. A clarification of the description:

Previous implementation was incorrectly passing an integer, that got converted to a pointer, to finalize the hash computation.

Working (an uint64_t) was truncated to an uint8_t, converted to a one-element ArrayRef<uint8_t>, then passed to MD5.update

MaskRay added inline comments.May 17 2020, 11:53 AM
clang/lib/CodeGen/CodeGenPGO.cpp
767–768

Can the using be deleted?

serge-sans-paille edited the summary of this revision. (Show Details)

Updated summary + removed namespace.

serge-sans-paille marked an inline comment as done.May 18 2020, 1:17 AM
hans added a subscriber: hans.May 18 2020, 1:21 AM

Nice fix, and nice test :-)

May I ask how you found this?

May I ask how you found this?

This was a long (and painful) quest when tracking https://bugzilla.redhat.com/show_bug.cgi?id=1827282

hans accepted this revision.May 25 2020, 7:16 AM

lgtm

This revision is now accepted and ready to land.May 25 2020, 7:16 AM
This revision was automatically updated to reflect the committed changes.
ormris added a subscriber: ormris.May 25 2020, 10:20 AM

There's a few test failures related to this change on the PS4 bot. Could you take a look?

http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/68111

serge-sans-paille added a subscriber: bkramer.

@bkramer rightfully reverted this, faster than I could patch it :-) Here is the patch though, with the updated test case. @hans the only meaningful change is that I'm keeping the buggy behavior for hash v1 for compatibility reason.

This revision is now accepted and ready to land.May 25 2020, 11:50 AM
Update profile data hash entries due to hash function update, unless the version
used is V1, in which case we keep the buggy behavior for backward compatibility.
MaskRay accepted this revision.May 25 2020, 12:38 PM

Thanks!

@hans/ @MaskRay I'm starting to wonder if I should bump the version number of the hash function, so that clang could still read profile data generated before that patch?

Bump the version number to be compatible with existing profdata, in a similar fashion to v1/v2 transition.

hans added a comment.May 26 2020, 1:28 AM

Bump the version number to be compatible with existing profdata, in a similar fashion to v1/v2 transition.

Did you forget to include some files? I don't see the bump anywhere.

clang/lib/CodeGen/CodeGenPGO.cpp
757

Maybe explicitly show the conversion to array of uint8_t here, to make it more clear what's going on.

With v3 version + Make cast explicit

hans added a comment.May 26 2020, 4:25 AM

I worry that the hash version bump isn't complete. Doesn't the hash version used need to be read/written with the profile data file somewhere?

clang/lib/CodeGen/CodeGenPGO.cpp
148–150

I guess this needs an update now?

294

Should this be >= PGO_HASH_V2 now? And similarly below?

756

Maybe just "HashVersion < PGO_HASH_V3" would be simpler?

serge-sans-paille marked an inline comment as done.

Update version bump parts

Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2020, 6:10 AM
serge-sans-paille marked 3 inline comments as done.May 26 2020, 6:28 AM
hans accepted this revision.May 26 2020, 7:04 AM

lgtm

This revision was automatically updated to reflect the committed changes.

It looks like clang/test/Profile/Inputs/c-general.profdata.v5 is being read as v6 rather than v5. Can you double check?

It looks like clang/test/Profile/Inputs/c-general.profdata.v5 is being read as v6 rather than v5. Can you double check?

Yep, I'll have a look.

It looks like clang/test/Profile/Inputs/c-general.profdata.v5 is being read as v6 rather than v5. Can you double check?

Thanks for spotting the issue. Should be fixed by 63489c39deeffb24a085b3766c5d5ff76a52fa2f