This pass may modify the Cmp operands. However, the flag reg may be used by both the branch and CSEL.
Modifying CMP will have side effect on CSEL.
Details
Diff Detail
Event Timeline
For example:
%cmp = icmp sgt i32 %0, 0 %m = select i1 %cmp, i64 %v, i64 0 store i64 %m, i64* %p br i1 %cmp, label %lor.lhs.false, label %land.lhs.true
will become
...
cmp w8, #1
csel x8, x0, xzr, gt // becomes x8 = (w8 > 1) ? x0 : 0
str x8, [x1]
b.lt .LBB0_2
lib/Target/AArch64/AArch64ConditionOptimizer.cpp | ||
---|---|---|
177–189 | This looks overly complicated to me. From the previous clause, we only care about AArch64::NZCV so I think we can skip the outer loop entirely and just look at that register. Then we've got 2 conditions:
This has the added advantage that it'll be correct if we ever do decide to do something with the other compares, where exactly the same issues occur. The code in this particular switch case really will be issues specific to the actual CMP instructions we're seeing. I don't think we need the special case for Bcc in the new scheme either: I is always before the first terminator, so we should never encounter a Bcc. |
Thanks for reviewing, Tim.
I move the LiveIn check to runOnMachineFunction and piggyback NZCV use check into the same loop as finding CMP because it goes backwards as well.
lib/Target/AArch64/AArch64ConditionOptimizer.cpp | ||
---|---|---|
357–361 | Wouldn't this be better off at the top of findSuitableCompare? We wouldn't need the fairly weird logic to look through both BBs and would get the short-circuiting automatically. |
This looks overly complicated to me. From the previous clause, we only care about AArch64::NZCV so I think we can skip the outer loop entirely and just look at that register.
Then we've got 2 conditions:
This has the added advantage that it'll be correct if we ever do decide to do something with the other compares, where exactly the same issues occur. The code in this particular switch case really will be issues specific to the actual CMP instructions we're seeing.
I don't think we need the special case for Bcc in the new scheme either: I is always before the first terminator, so we should never encounter a Bcc.