This is an archive of the discontinued LLVM Phabricator instance.

[EarlyCSE] Ensure equal keys have the same hash value
ClosedPublic

Authored by JosephTremoulet on May 29 2019, 9:39 PM.

Details

Summary

The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that select (not (cmp sgt X, Y)), X, Y is
equivalent to select (cmp sgt X, Y), Y, X. Without this change,
however, only the latter is recognized as a form of smin X, Y, so the
two expressions receive different hash codes. This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional not and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of smin X, Y.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMay 29 2019, 9:39 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
spatel added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
352

formatting: result -> Result

llvm/test/Transforms/EarlyCSE/commute.ll
110–112

For this and the related small tests, please commit them with baseline assertions as a preliminary patch, then rebase to show the improvements from this patch.

645–651

How does this bug manifest? Can we cause an assertion failure without this patch?
cc'ing some more potential reviewers to see if anyone has ideas for a smaller than ~1000 line regression test to reproduce the bug. ( @lebedev.ri @efriedma @craig.topper @arsenm ).

JosephTremoulet marked an inline comment as done.Jun 5 2019, 8:59 AM
JosephTremoulet added inline comments.
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

Yes, this same test reproduces an assertion[1] failure without this patch, but it does so nondeterministically -- I see it about every 20-30 runs. The original compiland I was hitting this on would hit every ~2000 runs. If I revert everything except the new assertion I added in isEqual, this test hits the new assertion something like 4 times out of 5 for me.

Happy for ideas how to trim it down / make it more deterministic.

[1] https://github.com/llvm/llvm-project/blob/590b1aee609d30346aee77a699381d21c538dd56/llvm/include/llvm/ADT/DenseMap.h#L404

Fix style, rebase on top of test change

efriedma added inline comments.Jun 5 2019, 11:48 AM
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

It might make sense to write a unit-test that directly checks the result of hashing/equality on SimpleValue? Granted, that would require exposing the SimpleValue API in a header somewhere, I think.

Alternatively, you could add a debugging flag to early-cse to make it do a linear scan over the hashtable entries after each failed hashtable lookup, to make sure it was supposed to fail.

JosephTremoulet marked an inline comment as done.Jun 6 2019, 1:19 PM
JosephTremoulet added inline comments.
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

I started coding up the debugging flag approach, but the hash table in question is a ScopedHashTable, which doesn't provide a way to enumerate the keys (unless I'm just missing it). I'm not convinced that this test merits changing the ScopedHashTable API surface, but I'm happy to do it if there's reason to think there's other merit in adding key/value pair iteration to ScopedHashTable, or something -- thoughts?

I have similar reservations about exposing SimpleValue just for this test. Do we have other examples of things exposed just for testing that I should follow? E.g. would you expect it to be in EarlyCSE.h, or in some new EarlyCSE-private.h or similar?

I could also be convinced to remove or scale back this test. I had two benefits in mind when I wrote it:

  1. It reproduces an assertion failure on current sources
  2. It validates that my changes don't just push the problem one level of looking-through-nots deeper

Regarding #1, if the hashes diverge again for the specific pattern this test repeats, then the selects in the other new tests above would not get CSE'd (except by super-rare hash collision), so the test would fail the CHECK lines there. So we're protected against that particular regression regardless of this 1000-line test. And the tests added above fail on current sources, from the same bug, just with a different symptom (missed optimization opportunity rather than assertion failure).

Regarding #2, I could just scale this down to one instance of the 11-line pattern, and add similar CHECK lines, and again have the coverage.

So I'm leaning toward making this test look a lot like the others, just with an extra "not" or two, but could be convinced to take one of the other approaches if someone cares. Thoughts?

efriedma added inline comments.Jun 6 2019, 4:54 PM
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

If we have some test coverage for the hashing, it isn't that important what we do here, I guess. A debugging flag in EarlyCSE could be helpful to find other issues, but it's probably not worth writing a whole iteration framework for ScopedHashTables just for that.

JosephTremoulet marked an inline comment as done.Jun 6 2019, 8:36 PM
JosephTremoulet added inline comments.
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

Oh, actually, I suppose I could add a flag without changing ScopedHashTable -- since I already have isEqual asserting that the hashes match, it'd be the same effect to have the flag make getHashValue just return a constant (with a touch of refactoring so the assert could still call the "real" hash function).

I'm not sure how to best make use of such a flag .. just use it in the commute.ll test? Other EarlyCSE tests as well? Would it make sense for such a flag to default to "yes run the checks" under #ifdef EXPENSIVE_CHECKS?

  • Add -earlycse-debug-hash flag, shorten new test
JosephTremoulet marked 2 inline comments as done.Jun 11 2019, 7:40 AM
JosephTremoulet added inline comments.
llvm/test/Transforms/EarlyCSE/commute.ll
645–651

Updated to add debug flag (passed in the RUN line in commute.ll and left off everywhere else; can be thrown manually when updating the hash logic) and shorten the test.

spatel added inline comments.Jun 11 2019, 10:28 AM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
83–86

Since the code has an ifdef guard, do we want the ifdef guard on the option itself?
Or just remove the guard at the use, so we can use/test this option even with a release build?

JosephTremoulet marked an inline comment as done.Jun 11 2019, 10:46 AM
JosephTremoulet added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
83–86

My thinking was:

  1. We don't want an ifdef guard on the option itself, because if we had one then running lit on commute.ll with a release build of opt would fail
  2. We do want a guard on the use, to avoid the run-time cost of repeatedly checking the static every time we compute a SimpleValue hash in release builds (and returning the const from the hash function in release builds isn't that valuable since the point of doing it is to exercise the assertion in isEqual that won't exist in release builds anyway)

I could be convinced to change it (especially if there's already some convention for debug-only flags in lit tests), but that was my reasoning.

spatel accepted this revision.Jun 11 2019, 11:00 AM

LGTM - but let's wait a ~day to commit in case anyone else has comments.

This revision is now accepted and ready to land.Jun 11 2019, 11:00 AM
nikic accepted this revision.Jun 11 2019, 11:09 AM

LGTM as well.

llvm/lib/Transforms/Scalar/EarlyCSE.cpp
160

Either this assignment or the one above it can be dropped.

  • Remove useless assignment
JosephTremoulet marked an inline comment as done.Jun 11 2019, 12:13 PM
This revision was automatically updated to reflect the committed changes.
fhahn added a subscriber: fhahn.Jun 14 2019, 3:25 AM

It looks like this patch may have caused https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15236. It would be great if you could have a look.

It looks like this patch may have caused https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15236. It would be great if you could have a look.

That assertion is also seen here:
https://bugs.llvm.org/show_bug.cgi?id=42280

However, as I wrote in that PR, with older opt versions it still crashes now and then, only that it often takes more attempts.
And then we hit a different assertion:

opt: ../include/llvm/ADT/DenseMap.h:404: void llvm::DenseMapBase<llvm::DenseMap<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >, (anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >::moveFromOldBuckets(BucketT *, BucketT *) [DerivedT = llvm::DenseMap<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >, KeyT = (anonymous namespace)::SimpleValue, ValueT = llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, KeyInfoT = llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, BucketT = llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *>]: Assertion `!FoundVal && "Key already in new map?"' failed.

It looks like this patch may have caused https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15236. It would be great if you could have a look.

That assertion is also seen here:
https://bugs.llvm.org/show_bug.cgi?id=42280

However, as I wrote in that PR, with older opt versions it still crashes now and then, only that it often takes more attempts.
And then we hit a different assertion:

opt: ../include/llvm/ADT/DenseMap.h:404: void llvm::DenseMapBase<llvm::DenseMap<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >, (anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >::moveFromOldBuckets(BucketT *, BucketT *) [DerivedT = llvm::DenseMap<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *> >, KeyT = (anonymous namespace)::SimpleValue, ValueT = llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *, KeyInfoT = llvm::DenseMapInfo<(anonymous namespace)::SimpleValue>, BucketT = llvm::detail::DenseMapPair<(anonymous namespace)::SimpleValue, llvm::ScopedHashTableVal<(anonymous namespace)::SimpleValue, llvm::Value *> *>]: Assertion `!FoundVal && "Key already in new map?"' failed.

Taking a look at both. This change had a bugfix and added more checking. The new checking is meant to do exactly what @uabelho describes -- hash/isEqual mismatch would previously have led, nondeterministically and rarely, to the "Key already in new map?" assertion failure, and would now lead, nondeterministically but less rarely, to the new assertion ("getHashValueImpl(LHS) == getHashValueImpl(RHS)"), and with -earlycse-debug-hash would now lead deterministically to the new assertion (for functions in which earlycse processes two expressions that compare equal but hash differently).