Transform a == 0.0 ? 0.0 : x to a == 0.0 ? a : x and a != 0.0 ? x : 0.0 to a != 0.0 ? x : a to avoid materializing 0.0 for FCSEL, since it does not have to be materialized beforehand for FCMP, as it has a form that implies 0.0 as an operand.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Wouldn't this work with any constant? Should we consider it for the common DAGCombiner?
It might, but AArch64 doesn't materialize 0.0 explicitly when comparing against it (e.g., fcmp d0, #0.0), so it was my initial focus. Also, because it seems to be the most common value with CSEL.
I'll investigate it further.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4083 ↗ | (On Diff #72075) | Isn't the goal to check these options less and look at the DAG node flags instead? DAG.getFlags()->hasNoSignedZeros() |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4083 ↗ | (On Diff #72075) | Yes, thank you! |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4083 ↗ | (On Diff #72075) | Am I missing something or SelectionDAG has no such method? Not all SDValue have SDNodeFlags either... |
Thinking more about this suggestion, I think that this change improves code generation only if the comparison instruction has a form with an immediate operand encoding, which is very target dependent (e.g., AArch64). Otherwise, if the target has to materialize the constant for the comparison, it can be used again by the selection (e.g., x86).
I couldn't find or devise an method to query such conditions generically and I'd appreciate suggestions. If so, I believe that this change has to remain specific to AArch64 for this round.
Please, see https://reviews.llvm.org/D24808#554492 and let me know if you need more information.
LGTM. I think the patch looks good as it is.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4058 ↗ | (On Diff #72711) | The change from Evandro looks identical to this transform. |
That answers why not move into the DAGCombiner, not why not make it any constant, right?
That comment also has the answer for "why not make it any constant" as well. Here is what Evandro said:
if the target has to materialize the constant for the comparison, it can be used again by the selection
I think that it answers both questions. Again, unless the target cam perform the comparison against a constant implicit in the corresponding instruction, then this transform provides no improvement. IOW, only if the target does not have to materialize the constant for the comparison does it make sense to not materialize it for the selection. Otherwise, if it has to be materialzied for one, it will likely be around to be used for the other.
Well, that, and the fact that there's a special zero variant for fpcmp that I just found out. :)
I agree this is at least a place as good as any. I'm not sure how this could be done at the DAGCombiner level, being such target-specific transformation.
LGTM. Thanks!
I stated as much in the patch summary above:
FCMP, as it has a form that implies 0.0 as an operand.
Thank you.