This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Avoid materializing 0.0 when generating FP CSEL
ClosedPublic

Authored by evandro on Sep 21 2016, 9:39 AM.

Details

Summary

Transform a == 0.0 ? 0.0 : x to a == 0.0 ? a : x and a != 0.0 ? x : 0.0 to a != 0.0 ? x : a to avoid materializing 0.0 for FCSEL, since it does not have to be materialized beforehand for FCMP, as it has a form that implies 0.0 as an operand.

Diff Detail

Repository
rL LLVM

Event Timeline

evandro updated this revision to Diff 72075.Sep 21 2016, 9:39 AM
evandro retitled this revision from to [AArch64] Avoid materializing 0.0 when generating FP CSEL.
evandro updated this object.
evandro set the repository for this revision to rL LLVM.
evandro added a subscriber: llvm-commits.
MatzeB added a subscriber: MatzeB.Sep 21 2016, 10:01 AM

Wouldn't this work with any constant? Should we consider it for the common DAGCombiner?

Wouldn't this work with any constant? Should we consider it for the common DAGCombiner?

It might, but AArch64 doesn't materialize 0.0 explicitly when comparing against it (e.g., fcmp d0, #0.0), so it was my initial focus. Also, because it seems to be the most common value with CSEL.

I'll investigate it further.

gberry added a subscriber: gberry.Sep 21 2016, 12:16 PM
gberry added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4083 ↗(On Diff #72075)

Isn't the goal to check these options less and look at the DAG node flags instead?
If so, this might be better written as follows?

DAG.getFlags()->hasNoSignedZeros()
evandro marked an inline comment as done.Sep 21 2016, 12:34 PM
evandro added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4083 ↗(On Diff #72075)

Yes, thank you!

mcrosier edited edge metadata.Sep 21 2016, 1:13 PM

Please include full context.

evandro marked 2 inline comments as done.Sep 21 2016, 1:33 PM
evandro added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4083 ↗(On Diff #72075)

Am I missing something or SelectionDAG has no such method? Not all SDValue have SDNodeFlags either...

Wouldn't this work with any constant? Should we consider it for the common DAGCombiner?

Thinking more about this suggestion, I think that this change improves code generation only if the comparison instruction has a form with an immediate operand encoding, which is very target dependent (e.g., AArch64). Otherwise, if the target has to materialize the constant for the comparison, it can be used again by the selection (e.g., x86).

I couldn't find or devise an method to query such conditions generically and I'd appreciate suggestions. If so, I believe that this change has to remain specific to AArch64 for this round.

evandro updated this revision to Diff 72711.Sep 27 2016, 2:36 PM
evandro updated this object.

What about @MatzeB's suggestion to generalise it to any constant?

What about @MatzeB's suggestion to generalise it to any constant?

Please, see https://reviews.llvm.org/D24808#554492 and let me know if you need more information.

sebpop added a subscriber: sebpop.Oct 18 2016, 11:36 AM

Wouldn't this work with any constant? Should we consider it for the common DAGCombiner?

Thinking more about this suggestion, I think that this change improves code generation only if the comparison instruction has a form with an immediate operand encoding, which is very target dependent (e.g., AArch64). Otherwise, if the target has to materialize the constant for the comparison, it can be used again by the selection (e.g., x86).

LGTM. I think the patch looks good as it is.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4058 ↗(On Diff #72711)

The change from Evandro looks identical to this transform.
I don't see why we would need to do what Evandro proposes in the DAGcombine when this one is doing it at lowering.

Please, see https://reviews.llvm.org/D24808#554492 and let me know if you need more information.

That answers why not move into the DAGCombiner, not why not make it any constant, right?

Please, see https://reviews.llvm.org/D24808#554492 and let me know if you need more information.

That answers why not move into the DAGCombiner, not why not make it any constant, right?

That comment also has the answer for "why not make it any constant" as well. Here is what Evandro said:

if the target has to materialize the constant for the comparison, it can be used again by the selection

Please, see https://reviews.llvm.org/D24808#554492 and let me know if you need more information.

That answers why not move into the DAGCombiner, not why not make it any constant, right?

I think that it answers both questions. Again, unless the target cam perform the comparison against a constant implicit in the corresponding instruction, then this transform provides no improvement. IOW, only if the target does not have to materialize the constant for the comparison does it make sense to not materialize it for the selection. Otherwise, if it has to be materialzied for one, it will likely be around to be used for the other.

rengolin accepted this revision.Oct 18 2016, 12:47 PM
rengolin added a reviewer: rengolin.

I think that it answers both questions.

Well, that, and the fact that there's a special zero variant for fpcmp that I just found out. :)

I agree this is at least a place as good as any. I'm not sure how this could be done at the DAGCombiner level, being such target-specific transformation.

LGTM. Thanks!

This revision is now accepted and ready to land.Oct 18 2016, 12:47 PM

Well, that, and the fact that there's a special zero variant for fpcmp that I just found out. :)

I stated as much in the patch summary above:

FCMP, as it has a form that implies 0.0 as an operand.

Thank you.

FCMP, as it has a form that implies 0.0 as an operand.

You did. Completely my fault. Sorry! :)

This revision was automatically updated to reflect the committed changes.