Enhance to fold csel into csinc instruction.
Fix https://github.com/llvm/llvm-project/issues/53071
Details
Diff Detail
Event Timeline
fix to support the combine of long long type and
failure of Transforms/LoopVectorize/vplan-widen-select-instruction.ll
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14670 | the type for AArch64CC need be MVT::i32 |
Adding some potential reviewers.
I think it would be good to add the tests with the current (baseline) CHECK lines, so it is easy to see how this patch changes existing codegen.
It would be good to add a link to the motivating bug in the patch description:
https://github.com/llvm/llvm-project/issues/53071
reformat the code according the Pre-merge checking
ok , done, related commit https://reviews.llvm.org/rGdc01fb1d726a8faad7e78cbf580300a4005c6a23
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14635 | This might be easier to read as: Does the a!=0/a==0 matter? Or is it valid for all conditions? | |
14648 | Why does this only trigger from a CopyFromReg? | |
14651 | Should we be checking the input CC (operand 2) is NE? It should probably be using AArch64CC::CondCode without needing to go back to ISD::CondCode, possibly using AArch64CC::getInvertedCondCode to invert the condition. |
Some of the previous comments don't appear to have been addressed. Only looking for a CopyFromReg looks suspicious, and there is no testing for csneg that I can see.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | I don't think this will only be EQ or NE. It can be any of the predicates. The condition code is usually obtained via static_cast<AArch64CC::CondCode>(OpCC->getZExtValue()) or something like it. |
As all case now are csel related, so remove csneg and it will be processed in another new patch
Thanks. And the CopyFromReg? This is currently converting add(csel(C, 1, cc), CopyFromReg(x)) into csinc(add(x, C), x). I'm not sure why it would be limited to a CopyFromReg. I would expect to convert any add(csel(C, 1, cc), y) (or add(y, csel(C, 1, cc)) as add is commutative).
In order to be profitable this seems to also be relying on the fact that the add(x, C) will turn into an ADDri, which wouldn't be true for all constants. It might be neutral for larger constants? It's probably worth adding a test or two.
1、Handle commutivity
2、prevent large const value, and add related test case
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14635 | valid for all conditions, Both of them are add in test file aarch64-isel-csinc.ll | |
14648 |
this may be can expand later, but now I don't have such case.
As it is already restricted scalar Integer type, I think it would be commutative add? | |
14651 | Yes, added checking conditon. and thanks for you detail advice of using AArch64CC::getInvertedCondCode . | |
14652 | done, delete the assert after use API static_cast<AArch64CC::CondCode>(OpCC->getZExtValue()) |
Thanks for detail advice, I updated in three point
1、Handle commutivity
2、prevent large const value , and add related test case
3、delete the condition CopyFromReg
Thanks for the updates - a couple of last comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | I don't think it matters if the RHS has more than 1 use, it should still be OK to transform as we only use the value in another expression. | |
14655–14658 | AArch64CC::CondCode AArch64CC = static_cast<AArch64CC::CondCode>(CSel.getConstantOperandVal(2)); |
retry the commit with condition RHS.hasOneUse()
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | I had delete the condition !RHS.hasOneUse(), but recently the upstream may has some regression, it doen't finish the precommit build checking. so may I need wait a moment to figure out that issue ? | |
14655–14658 | done, thanks |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | Hmm. That sounds suspicious. What was the problem? That would usually imply an infinite loop in DAG combine, but I'm not sure how csinc(add(x, y), x, cc) could get back to add(x, csel(y, 1, cc)). Or why the uses of y matter in that case. I was about to OK the revision, but it would be good to make sure that nothing problematic is happening. |
Now, I can make true the x86 debian fail is brought in by other patch already submited in the upstream,
as it still have the same fail at the commit node Diff 10 403452 after rolling back the change of delete the condition !RHS.hasOneUse(),
and it is successfully at the commit node Diff 8 402198 , which has same change.
Is it reasonable?
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | hi, dmgreen In fact, I can't reproduce the failure base on my local building with delete the condition !RHS.hasOneUse(), so it may be not this change cause the regression, as you can see the other person's commit don't also pass building recently. Just to be sure, do you think it is ok to roll back the change of delete the condition !RHS.hasOneUse(), and submit it firstly ? I'll resubmit the change of delete the condition !RHS.hasOneUse() after the upstream is stable with double checking. The building log is finally came out record in https://reviews.llvm.org/harbormaster/build/215411/, and detail info can be found in x86 debian fail libarcher :: races/critical-unrelated.c libarcher :: races/lock-nested-unrelated.c libarcher :: races/lock-unrelated.c libarcher :: races/parallel-simple.c libarcher :: races/task-dependency.c libarcher :: races/task-taskgroup-unrelated.c libarcher :: races/task-taskwait-nested.c libarcher :: races/task-two.c libarcher :: task/task_late_fulfill.c |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14652 | They sound like they should be unrelated, coming from a x86 machine. I would recommend the version without the !RHS.hasOneUse() check, unless we have some evidence that it is really this check that is breaking something. But I don't think it should do. |
This might be easier to read as:
b + CSEL (c, 1, cc) => CSINC(b+c, b, cc)
b + CSNEG (1, c, cc) => CSINC(b-c, b, !cc)
Does the a!=0/a==0 matter? Or is it valid for all conditions?