See attached testcase that shows the miscompile / mis-merge. Comparing functions for equality is done via FunctionComparator, which ignores metadata attachments and metadata arguments of instructions, with the intention of ignoring/discarding debug info. That's a problem for @llvm.type.checked.load, and other intrinsics where metadata arguments are actually semantically meaningful (e.g. type id in @llvm.type.checked.load).
Diff Detail
Event Timeline
Shouldn't this be fixed in FunctionComparator?
So that's certainly the right question here. Should it? To me it looks like FunctionComparator is ignoring metadata attachments *on purpose*, and that fits the contract of "determine whether 2 functions will generate machine code with the same behavior".
I'm reasonably sure that ignoring metadata arguments is just an oversight here -- they probably weren't used outside dbg intrinsics when this code was written. We have things like constrained fp intrinsics with rounding/exception mode determined by metadata arguments nowadays, and treating two fp operations with different rounding mode as equivalent would clearly be incorrect.
I think FunctionComparator should be fixed, as the metadata arguments do influence the machine code here (presumably it was assumed that these arguments would only appear on debug intrinsics, where they indeed would not be expected to affect the machine code). Note that the same issue affects llvm.type.test.
Thanks for the feedback! I've changed the approach to teach FunctionComparator to understand metadata in instruction operands/values, and only ignore debug info metadata.
I'm not a fan of how dbg intrinsics are handled here. I think if we don't want to compare these, the principled way to do that would be to skip them entirely, something along these lines:
diff --git a/llvm/lib/Transforms/Utils/FunctionComparator.cpp b/llvm/lib/Transforms/Utils/FunctionComparator.cpp index 326864803d7c..f75b75a57544 100644 --- a/llvm/lib/Transforms/Utils/FunctionComparator.cpp +++ b/llvm/lib/Transforms/Utils/FunctionComparator.cpp @@ -791,8 +791,10 @@ int FunctionComparator::cmpValues(const Value *L, const Value *R) const { // Test whether two basic blocks have equivalent behaviour. int FunctionComparator::cmpBasicBlocks(const BasicBlock *BBL, const BasicBlock *BBR) const { - BasicBlock::const_iterator InstL = BBL->begin(), InstLE = BBL->end(); - BasicBlock::const_iterator InstR = BBR->begin(), InstRE = BBR->end(); + auto WithoutDebugL = BBL->instructionsWithoutDebug(); + auto WithoutDebugR = BBR->instructionsWithoutDebug(); + auto InstL = WithoutDebugL.begin(), InstLE = WithoutDebugL.end(); + auto InstR = WithoutDebugR.begin(), InstRE = WithoutDebugR.end(); do { bool needToCmpOperands = true; @@ -962,7 +964,7 @@ FunctionComparator::FunctionHash FunctionComparator::functionHash(Function &F) { // This random value acts as a block header, as otherwise the partition of // opcodes into BBs wouldn't affect the hash, only the order of the opcodes H.add(45798); - for (auto &Inst : *BB) { + for (auto &Inst : BB->instructionsWithoutDebug()) { H.add(Inst.getOpcode()); } const Instruction *Term = BB->getTerminator();
And then always compare metadata operands on top of that.
@nikic Awesome suggestion, thanks! I didn't realize we have this nice facility to ignore dbginfo instructions. The diff is now really simple, and still fixes the problem.
llvm/lib/Transforms/Utils/FunctionComparator.cpp | ||
---|---|---|
741 | I'm concerned that this is going to cause non-determinism, because addresses are not stable. I think we need to either look into the metadata, or assign numbers (like GlobalNumberState). |
I'm concerned that this is going to cause non-determinism, because addresses are not stable. I think we need to either look into the metadata, or assign numbers (like GlobalNumberState).