extend simplifyWithOpReplaced to look at more than one instruction for Xor instruction.
Diff Detail
Unit Tests
Event Timeline
Have you verified that this fixes the original bug as well? I feel that should also be added as a test case.
We shouldn't restrict this to just xor, but generally recurse the operand replacement.
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4289 | This would be the place to recursively call simplifyWithOpReplaced(). |
The original bug need another patch, we should change the phi into select, so I'll add that later.
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4289 | Thanks, apply your comment |
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4276 | This isn't what I had in mind. Why can't we do the recursive call in here? |
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4276 | I think there is conflict on the solution. |
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4276 | You need to keep track whether an operand has been replaced or not. Previously this was just done by is_contained, but now you would have to check the return value of the recursive simplifyWithOpReplaced. If there is no replacement, the following code can be skipped. |
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4280 | yes, but there is some regression. Does it make sense to extend this after we find some cases showed this is beneficial? a) the vector compare may has scalar operand, which will crash in above line 4268, such as func2 in file llvm/test/Transforms/LoopVectorize/same-base-access.ll. %17 = insertelement <4 x i32> %16, i32 %13, i64 3 %18 = icmp slt <4 x i32> %17, <i32 4, i32 4, i32 4, i32 4> b) there is many performance regression when enable isa<SExtInst>(V)) and isa<ZExtInst>(V)) , such as case lshr_out_of_range2 in file llvm/test/Transforms/InstCombine/shift.ll. c) When I disable the isa<SExtInst>(V)) and isa<ZExtInst>(V)), there are still some cases change because select instruction, where I'm also not sure if it's beneficial or not. @@ -224,8 +224,8 @@ define i4 @PR45762(i3 %x4) { ; CHECK-NEXT: [[T7:%.*]] = zext i3 [[T4]] to i4 ; CHECK-NEXT: [[ONE_HOT_16:%.*]] = shl nuw i4 1, [[T7]] ; CHECK-NEXT: [[OR_69_NOT:%.*]] = icmp eq i3 [[X4]], 0 -; CHECK-NEXT: [[UMUL_231:%.*]] = select i1 [[OR_69_NOT]], i4 0, i4 [[T7]] -; CHECK-NEXT: [[SEL_71:%.*]] = shl i4 [[ONE_HOT_16]], [[UMUL_231]] +; CHECK-NEXT: [[UMUL_231:%.*]] = shl i4 [[ONE_HOT_16]], [[T7]] +; CHECK-NEXT: [[SEL_71:%.*]] = select i1 [[OR_69_NOT]], i4 -8, i4 [[UMUL_231]] ; CHECK-NEXT: ret i4 [[SEL_71]] |
address some comment
1、whether any replacement happened
2、 add condition to guard MaxRecurse == 0
3、delete the check for BinaryOperator, but add the following 3 type to avoid some regression
**if (isa<SelectInst>(I) || isa<CmpInst>(I) || isa<LoadInst>(I))**
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4285 | Could you explain a bit more about why this is necessary for avoiding a regression? |
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4285 | Thanks for your attrention.
%1 = icmp eq i177 %L, -1, %L = load i177, ptr %A, align 4,i177 -1 %B = select i1 %1, i177 0, i177 %L, %L = load i177, ptr %A, align 4,i177 -1 It is not allowed to deduce the value of %B is i177 0 when we recursive try the selection operand %1 with i1 true. we usual try to replace the compare operands, refer to simplifySelectWithICmpEq.
|
I've ended up implementing this myself in https://github.com/llvm/llvm-project/commit/3d199d086e076f0b9b90d4c59f2226a417a639b5. Additionally, I've landed the following changes to mitigate optimization regressions:
https://github.com/llvm/llvm-project/commit/cd1dcd2c956188521e668e77eec1f8913c01b644
https://github.com/llvm/llvm-project/commit/dc2b2ae7dc333f9c3769785fa147c7872adb9bba
https://github.com/llvm/llvm-project/commit/21827268ada2ee62eaee49fcfa1133ed06a63d25
As we're now doing recursive calls, you need to guard against MaxRecurse == 0 here.