As per title.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
For reference, the ctpop side of this patch is one of the missing folds noted in PR28668:
https://llvm.org/bugs/show_bug.cgi?id=28668
Just curious, is there a fold or codegen that doesn't work because of the undef?
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1403 ↗ | (On Diff #66724) | I'm not sure where we draw the line on 'auto', but I think the general preference is to use the actual type (Value *) in cases like this. At the least, this should be 'auto *' according to the coding standards. |
1410 ↗ | (On Diff #66724) | Use 'Op0' since we have it now. |
1416–1417 ↗ | (On Diff #66724) | I don't see anything about 'goto' in the coding standard doc, but I was trained to feel nervous anytime I see one. Make this a static helper or lambda instead? |
1421–1438 ↗ | (On Diff #66724) | How about refactoring this whole case into a helper function since it's nearly identical to the cttz case? That would remove the question about the 'goto' too. |
test/Transforms/InstCombine/intrinsics.ll | ||
383 ↗ | (On Diff #66724) | You can remove the label to reuce the test and the CHECKs. |
test/Transforms/InstCombine/intrinsics.ll | ||
---|---|---|
383 ↗ | (On Diff #66724) | typo: |
@spatel , the undef version lead to much better codegen on X86 at least, because bsf/bsr instruction are undefined when the operand is 0.
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66906) | Can't this function simplify to one line of code? II->setArgOperand(1, ConstantInt::getTrue(II->getContext())); Ie, you don't even need to check the current value of the undef operand. If it's already true, setting it again does nothing. |
1414–1452 ↗ | (On Diff #66906) | I was hoping that you could refactor this whole thing, so there's no code duplication. On 2nd look, I see that the existing fold is actually in the wrong place. Since we're able to fold to a constant, I think it belongs in InstSimplify. Can you fix that or add a 'FIXME' comment? I'd still prefer that you refactor this ahead of the functional change. If my first comment about makeCttzCtlzZeroUndef() being one line of code is correct, then it's still just one helper function for this whole thing. Let me know if I'm not seeing this correctly. |
test/Transforms/InstCombine/intrinsics.ll | ||
406 ↗ | (On Diff #66906) | Remove unnecessary label. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66906) | No, you want to patch this only once, not every time, or it'll never terminate. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1414–1452 ↗ | (On Diff #66906) | The bitmask manipulation previous this, while following a similar pattern, are different for cttz and ctlz. This is NOT constant folding in any way. It is making cttz and/or ctlz undefined for 0 when possible as it lead to better codegen, at least on X86. No the function cannot be simplified to a one liner. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1414–1452 ↗ | (On Diff #66989) | I've drafted what I tried to suggest here as D23223. Let me know if that makes sense. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1414–1452 ↗ | (On Diff #66989) | I don't think it really makes the code better. There are more branches and names become ambiguous as they can represent 2 things. Also, this is kind of out of the scope of this diff. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66989) | OK, then: if (match(II->getArgOperand(1), m_Zero())) II->setArgOperand(1, ConstantInt::getTrue(II->getContext())); |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66989) | I'd rather match everything that is NOT true, just in case. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66989) | Sorry - I don't see the difference between the 'm_Zero' version and this? Note that I checked in the refactoring in rL277883. We have too many LLVM bugs due to code duplication, and this function may continue to grow over time. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1240–1253 ↗ | (On Diff #66989) | Alright, I'll rebase on top of it. Not all value are true or false, so matching what is NOT true is not the same as matching false. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1283–1296 ↗ | (On Diff #67132) | Sorry for still not seeing it...can you add test case to show the difference? AFAICT, the undef param of these intrinsics must be an i1 true/false constant? case Intrinsic::ctlz: // llvm.ctlz case Intrinsic::cttz: // llvm.cttz Assert(isa<ConstantInt>(CS.getArgOperand(1)), "is_zero_undef argument of bit counting intrinsics must be a " "constant int", CS); |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1283–1296 ↗ | (On Diff #67132) | I has no idea there was an assertion to make sure of this. I'd still go for the logic "if it is anything but true, make it true" rather than it "if it false, make it true". |
I has no idea there was an assertion to make sure of this. I'd still go for the logic "if it is anything but true, make it true" rather than it "if it false, make it true".
I think that's fair - how about this:
if (KnownOne != 0 || isKnownNonZero(Op0, IC.getDataLayout())) if (!match(II.getArgOperand(1), m_One())) II.setArgOperand(1, ConstantInt::getTrue(II.getContext()));
But I think we need a comment to explain the transform and motivation. Something like
// If the input to cttz/ctlz is known to be non-zero, // then change the 'ZeroIsUndef' parameter to 'true' // because we know the zero behavior can't affect the result.
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1143 ↗ | (On Diff #67463) | I'd use IRBuilder's getTrue here. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1141 ↗ | (On Diff #67479) | Shouldn't this return the updated instruction to signal to the callers that a change was made? If we do that, then we don't need to explicitly add to the worklist either, right? |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1141 ↗ | (On Diff #67479) | No as we don't want to go though replacing the instruction by itself and we especially don't want to replace false by true in the program. |
No as we don't want to go though replacing the instruction by itself and we especially don't want to replace false by true in the program.
I don't understand the 2nd comment. For the first, I'm not suggesting that you replace the instruction by itself.
Looking around InstCombine, the common pattern I see is:
I.setOperand(1, X); return &I;
This adheres to the 'visit' function comments:
// Visitation implementation - Implement instruction combining for different // instruction types. The semantics are as follows: // Return Value: // null - No change was made // I - Change was made, I is still valid, I may be dead though // otherwise - Change was made, replace I with returned instruction
I think we'd have to change the helper function signature to return an Instruction* for this to work, so I'll defer to @majnemer on whether the code here is an acceptable substitute. I have no further comments for this patch.
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1473–1475 ↗ | (On Diff #67974) | In the case where no change is made, we want to continue on in this function: if (Instruction *I = foldCttzCtlz(*II, *this)) return I; break; |