This is an archive of the discontinued LLVM Phabricator instance.

LTO: Hash type identifier resolutions for WholeProgramDevirt.
ClosedPublic

Authored by pcc on Mar 2 2017, 2:36 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Mar 2 2017, 2:36 PM
mehdi_amini added inline comments.Mar 2 2017, 6:38 PM
llvm/lib/LTO/LTO.cpp
104 ↗(On Diff #90400)

Why do we hash bit-by-bit? Is it because of endianness? Do we care about having the same hash on different endian machine?

pcc added inline comments.Mar 2 2017, 7:40 PM
llvm/lib/LTO/LTO.cpp
104 ↗(On Diff #90400)

It makes the code easier to reason about if you always use the same hash function rather than making the code machine dependent.

mehdi_amini added inline comments.Mar 3 2017, 9:22 AM
llvm/lib/LTO/LTO.cpp
104 ↗(On Diff #90400)

I don't really get why the code here is "easier to reason about" than:

auto AddUint64 = [&](uint64_t I) {
    Hasher.update(&I, sizeof(I));
  };
pcc added inline comments.Mar 3 2017, 9:45 AM
llvm/lib/LTO/LTO.cpp
104 ↗(On Diff #90400)

If you reuse the cache on an opposite endian host you'd need to worry about collisions if all integers are byte swapped. I reasoned through this and figured out that it probably can't happen in practice, but the point is that we can avoid needing to rely on such reasoning.

Also, your code breaks strict aliasing rules. But hey, everyone breaks them so maybe it doesn't matter. I don't care that much so if you feel strongly I'll just change it.

mehdi_amini accepted this revision.Mar 3 2017, 9:46 AM
mehdi_amini added inline comments.
llvm/lib/LTO/LTO.cpp
104 ↗(On Diff #90400)

Right I wrote the code as a snippet.
And no I don't feel strongly about this, but I like to understand what's going on :)

This revision is now accepted and ready to land.Mar 3 2017, 9:46 AM
This revision was automatically updated to reflect the committed changes.