This is an archive of the discontinued LLVM Phabricator instance.

[MergeFunctions] Extend FunctionComparator to account for metadata arguments in intrinsics, and only ignore debug info metadata
Needs ReviewPublic

Authored by kubamracek on Oct 30 2021, 8:14 AM.

Details

Summary

See attached testcase that shows the miscompile / mis-merge. Comparing functions for equality is done via FunctionComparator, which ignores metadata attachments and metadata arguments of instructions, with the intention of ignoring/discarding debug info. That's a problem for @llvm.type.checked.load, and other intrinsics where metadata arguments are actually semantically meaningful (e.g. type id in @llvm.type.checked.load).

Diff Detail

Event Timeline

kubamracek created this revision.Oct 30 2021, 8:14 AM
kubamracek requested review of this revision.Oct 30 2021, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 30 2021, 8:14 AM
nikic added a subscriber: nikic.Oct 30 2021, 8:25 AM

Shouldn't this be fixed in FunctionComparator?

Shouldn't this be fixed in FunctionComparator?

So that's certainly the right question here. Should it? To me it looks like FunctionComparator is ignoring metadata attachments *on purpose*, and that fits the contract of "determine whether 2 functions will generate machine code with the same behavior".

nikic added a comment.Oct 30 2021, 8:46 AM

Shouldn't this be fixed in FunctionComparator?

So that's certainly the right question here. Should it? To me it looks like FunctionComparator is ignoring metadata attachments *on purpose*, and that fits the contract of "determine whether 2 functions will generate machine code with the same behavior".

I'm reasonably sure that ignoring metadata arguments is just an oversight here -- they probably weren't used outside dbg intrinsics when this code was written. We have things like constrained fp intrinsics with rounding/exception mode determined by metadata arguments nowadays, and treating two fp operations with different rounding mode as equivalent would clearly be incorrect.

pcc added a comment.Oct 30 2021, 8:50 AM

Shouldn't this be fixed in FunctionComparator?

So that's certainly the right question here. Should it? To me it looks like FunctionComparator is ignoring metadata attachments *on purpose*, and that fits the contract of "determine whether 2 functions will generate machine code with the same behavior".

I think FunctionComparator should be fixed, as the metadata arguments do influence the machine code here (presumably it was assumed that these arguments would only appear on debug intrinsics, where they indeed would not be expected to affect the machine code). Note that the same issue affects llvm.type.test.

kubamracek retitled this revision from [MergeFunctions] Don't merge functions that use @llvm.type.checked.load instrinsics to [MergeFunctions] Extend FunctionComparator to account for metadata arguments in intrinsics, and only ignore debug info metadata.
kubamracek edited the summary of this revision. (Show Details)

Thanks for the feedback! I've changed the approach to teach FunctionComparator to understand metadata in instruction operands/values, and only ignore debug info metadata.

nikic added a comment.Dec 1 2021, 1:59 AM

I'm not a fan of how dbg intrinsics are handled here. I think if we don't want to compare these, the principled way to do that would be to skip them entirely, something along these lines:

diff --git a/llvm/lib/Transforms/Utils/FunctionComparator.cpp b/llvm/lib/Transforms/Utils/FunctionComparator.cpp
index 326864803d7c..f75b75a57544 100644
--- a/llvm/lib/Transforms/Utils/FunctionComparator.cpp
+++ b/llvm/lib/Transforms/Utils/FunctionComparator.cpp
@@ -791,8 +791,10 @@ int FunctionComparator::cmpValues(const Value *L, const Value *R) const {
 // Test whether two basic blocks have equivalent behaviour.
 int FunctionComparator::cmpBasicBlocks(const BasicBlock *BBL,
                                        const BasicBlock *BBR) const {
-  BasicBlock::const_iterator InstL = BBL->begin(), InstLE = BBL->end();
-  BasicBlock::const_iterator InstR = BBR->begin(), InstRE = BBR->end();
+  auto WithoutDebugL = BBL->instructionsWithoutDebug();
+  auto WithoutDebugR = BBR->instructionsWithoutDebug();
+  auto InstL = WithoutDebugL.begin(), InstLE = WithoutDebugL.end();
+  auto InstR = WithoutDebugR.begin(), InstRE = WithoutDebugR.end();
 
   do {
     bool needToCmpOperands = true;
@@ -962,7 +964,7 @@ FunctionComparator::FunctionHash FunctionComparator::functionHash(Function &F) {
     // This random value acts as a block header, as otherwise the partition of
     // opcodes into BBs wouldn't affect the hash, only the order of the opcodes
     H.add(45798);
-    for (auto &Inst : *BB) {
+    for (auto &Inst : BB->instructionsWithoutDebug()) {
       H.add(Inst.getOpcode());
     }
     const Instruction *Term = BB->getTerminator();

And then always compare metadata operands on top of that.

kubamracek updated this revision to Diff 391015.Dec 1 2021, 7:18 AM

@nikic Awesome suggestion, thanks! I didn't realize we have this nice facility to ignore dbginfo instructions. The diff is now really simple, and still fixes the problem.

nikic added inline comments.Dec 1 2021, 7:41 AM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
730

I'm concerned that this is going to cause non-determinism, because addresses are not stable. I think we need to either look into the metadata, or assign numbers (like GlobalNumberState).

ormris removed a subscriber: ormris.Jan 24 2022, 11:14 AM