Bug #63699 shows a hang on arm in instcombine because we do not
propagate known bits for fshl/fshr rotates. We perform the propagation
and add regression test.
Details
- Reviewers
nikic - Commits
- rGeb5d79876a09: [InstCombine] Allow KnownBits to be propagated
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
930–932 | Should the RHS simplification be in an else statement now? |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
926–928 | You still need to return here. I think the only thing that needs to change is the incorrect re-declaration of the LHSKnown / RHSKnown variables. | |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
5 | Don't specify data layout or triple. | |
63 | Please minimize the test -- I don't see how this second function could possibly be related for example. |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
926–928 | Unfortunately that's not the case. If we return then the instruction is not simplified. The problem is that the instruction %or.i = call i32 @llvm.fshl.i32(i32 0, i32 0, i32 16) keeps being checks (infinite loop) and both arguments replaced by a zero. We need to replace this instruction by a zero. That only happens, not when we return, but when we break and go all the way to the block at the end of the function that does the whole instruction simplication: // If the client is only demanding bits that we know, return the known // constant. if (DemandedMask.isSubsetOf(Known.Zero|Known.One)) return Constant::getIntegerValue(VTy, Known.One); return nullptr; Do you see any other way of doing this? We could copy this block into the case statement, but I don't think that makes much sense tbh. | |
930–932 | I was thinking I could do both simplifications if necessary in the same loop thereby avoiding the need for an else statement. | |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
5 | Sure - although I am curious what's the policy here, before submitting this, I did confirm other tests in InstCombine do sometimes specify a triple. | |
63 | I did run llvm-reduce... I wonder if llvm-reduce is not removing functions. I will take a look. |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
926–928 | Okay, I understand the problem now. We are replacing the operand with Constant::getIntegerValue(VTy, LHSKnown.One) here, but it may be that the operand already has that value, in which case we'll go into an infinite loop. I think what you need to do in that case is check whether we're actually going to change something: if (DemandedMaskLHS.isSubsetOf(LHSKnown.Zero | LHSKnown.One)) { Constant *NewLHS = Constant::getIntegerValue(VTy, LHSKnown.One); if (I->getOperand(0) != NewLHS) { replaceOperand(*I, 0, NewLHS); return &I; } } That should make sure we only report a change if it actually happened. | |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
5 | If you specify a triple you also need to specify either REQUIRES or move it into a target-specific directory. It's best to omit triple if it's not strictly necessary. | |
63 | llvm-reduce should be removing functions... |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
---|---|---|
63 | hummm, you're right it does. Which is why in this case the test cannot be reduced further and there's no infinite loop if I manually remove the function bpf_prog_calc_tag. However, I cannot see straightforwardly why that is. |
Update issue to check if we are replacing a constant with itself as per
@nikic suggestion.
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
921–933 | Maybe cleaner is if (DemandedMaskLHS.isSubsetOf(LHSKnown.Zero | LHSKnown.One) && !match(I->getOperand(0), m_SpecificInt(LHSKnown.Zero | LHSKnown.One)) Then you can just keep the original code in the if statement. | |
938 | Do we need to cleanup the constants if we don't actually insert them? |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
921–933 | Thanks - didn't know about m_SpecificInt. But I guess you mean: DemandedMaskLHS.isSubsetOf(LHSKnown.Zero | LHSKnown.One) && !match(I->getOperand(0), m_SpecificInt(LHSKnown.One) right? |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
938 | With the match change you suggested, we don't need to cleanup anymore. |
llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
935 | nit: No else after return. | |
938 | Don't need the separate NewLHS variable now that you use m_SpecificInt. | |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
63 | I don't get an infinite loop with your current test either -- I think your current test might be against something like -O2 rather than -passes=instcombine? Here's a test that works against just instcombine: https://gist.github.com/nikic/a2c0b936686940ae31ae2b23ec9efbcd (Not reduced) |
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
---|---|---|
63 | I wasn't able to reduce it substantially beyond that, but here's a cleaned up test case: declare i1 @llvm.is.constant.i32(i32) define i32 @test(i32 %arg) { entry: %zext = zext i32 %arg to i64 %lshr1 = lshr i64 %zext, 32 %trunc = trunc i64 %lshr1 to i32 %cond = call i1 @llvm.is.constant.i32(i32 %trunc) br i1 %cond, label %else, label %if if: br label %exit else: %fshl = call i32 @llvm.fshl.i32(i32 %trunc, i32 0, i32 0) br label %exit exit: %phi = phi i32 [ %fshl, %else ], [ 0, %if ] %lshr2 = lshr i32 %phi, 1 ret i32 %lshr2 } declare i32 @llvm.fshl.i32(i32, i32, i32) |
Landed - @nikic and @goldstein.w.n thanks for the patient and quick reviews.
llvm/test/Transforms/InstCombine/2023-07-13-arm-infiniteloop.ll | ||
---|---|---|
63 | oh - sorry. Right, I was running on the command line with -O2. Your example works very well though, thanks. |
You still need to return here. I think the only thing that needs to change is the incorrect re-declaration of the LHSKnown / RHSKnown variables.