This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Fix hashing for MO_ExternalSymbol MachineOperands.
ClosedPublic

Authored by efriedma on May 15 2019, 6:51 PM.

Details

Summary

We were hashing the string pointer, not the string, so two instructions could be identical (isIdenticalTo), but have different hash codes.

This showed up as a very rare, non-deterministic assertion failure rehashing a DenseMap constructed by MachineOutliner. So there's no "real" testcase, just a unittest which checks that the hash function behaves correctly.

I'm a little scared fixing this is going to cause a regression in outlining or MachineCSE, but hopefully we won't run into any issues.

Diff Detail

Repository
rL LLVM

Event Timeline

efriedma created this revision.May 15 2019, 6:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2019, 6:51 PM

Seems good. Would it make sense to also add a test explicitly using llvm::hash_combine, as well as a negative test where hash code comparison fails for a literal and the same literal in a StringRef every time (if it's possible to do reliably)? Just a random suggestion, probably overly paranoid.

Would it make sense to also add a test explicitly using llvm::hash_combine

Using llvm::hash_combine on what, exactly?

as well as a negative test where hash code comparison fails for a literal and the same literal in a StringRef every time

You mean something like ASSERT_NE(SymName1.data(), SymName2.data());?

Would it make sense to also add a test explicitly using llvm::hash_combine

Using llvm::hash_combine on what, exactly?

as well as a negative test where hash code comparison fails for a literal and the same literal in a StringRef every time

You mean something like ASSERT_NE(SymName1.data(), SymName2.data());?

I just realized that this was for specialization of hash_value for MO, I'm blind, disregard the first remark, sorry.

Regarding second, I meant a case of performing hash_combine manually, and then using ASSERT_NE, with first case being a string literal, and second being the same literal but passed in as StringRef. Assuming it's possible to have a string that would never yield identical hash result to hashing the pointer to a literal. I'm not sure if that makes sense as a test or not, thinking about it, probably doesn't aside from "just for the sake of it/out of paranoia". Was just a suggestion, if it doesn't make sense to test it, ignore the remark.

I looked more carefully just as I was about to commit this, and discovered the hashing for MO_RegisterMask is also wrong. I think I'll fix that separately, though.

This revision was not accepted when it landed; it landed in state Needs Review.May 31 2019, 5:06 PM
This revision was automatically updated to reflect the committed changes.