This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Generate 8.1-m CSINC, CSNEG and CSINV instructions.
ClosedPublic

Authored by dmgreen on Aug 20 2019, 10:05 AM.

Details

Summary

Arm 8.1-M adds a number of related CSEL instructions, including CSINC, CSNEG and CSINV. These choose between two values given the content in CPSR and a condition, performing an increment, negation or inverse of the false value.

This adds some selection for them, either from constant values or patterns. It doesn't include CSEL directly, which is currently not always making code better. It is still useful, but we will have to check more carefully where it should and shouldn't be used.

Code by Ranjeet Singh and Simon Tatham, with some modifications from me.

Diff Detail

Repository
rL LLVM

Event Timeline

dmgreen created this revision.Aug 20 2019, 10:05 AM
dmgreen edited the summary of this revision. (Show Details)Aug 20 2019, 10:20 AM

Could you give a brief description of why this is better than just using an "it" block?

llvm/test/CodeGen/Thumb2/csel.ll
21 ↗(On Diff #216172)

There's an opportunity to save another two bytes converting mvn->movs.

llvm/test/CodeGen/Thumb2/mve-vcmpfr.ll
769 ↗(On Diff #216172)

This is cset, right? Why aren't we printing the alias?

dmgreen marked an inline comment as done.Aug 24 2019, 3:31 AM

Sorry for the delay. I found a problem which was proving difficult to reduce, but I'm now pretty sure that is just exposing some problem in loloops.

Could you give a brief description of why this is better than just using an "it" block?

The tests shows this comes up more than I had expected. In general, we are replacing a IT; MVN/ADD/RSB with a single CSINV/CSINC/CSNEG, which can also use zr and includes a "free move". For codesize the IT pair will be either 4 bytes or 6, depending on the instruction needed and what registers are used. The CS* will always be 4, and includes that free move. For cycle times, we are talking about cortex-m here so I would expect the IT may or may not be free depending on if the surrounding instructions can be dual issued. The CS* should always take a single cycle I would presume. So generally smaller or the same, which is what my other testing also shows.

llvm/test/CodeGen/Thumb2/mve-vcmpfr.ll
769 ↗(On Diff #216172)

It looks like this is because the instruction has an implicit CPSR use, which the Alias isn't accounting for. It's only checking for 4 operands, but during codegen the MI instruction will have 5.

I'll split this into a separate patch. It's not a big change but it's probably better split out from here.

dmgreen marked an inline comment as done.Aug 24 2019, 3:57 AM
dmgreen added inline comments.
llvm/test/CodeGen/Thumb2/csel.ll
21 ↗(On Diff #216172)

D66701 for this one.

dmgreen updated this revision to Diff 217016.Aug 24 2019, 4:01 AM

Rebase and some extra csel tests

samparker added inline comments.Sep 2 2019, 8:23 AM
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
46 ↗(On Diff #217016)

Freaky way to control code gen! Do we really need this if code size is better and performance is on par or better..?

llvm/lib/Target/ARM/ARMISelLowering.cpp
4829 ↗(On Diff #217016)

Reduce the nesting here.

4853 ↗(On Diff #217016)

It's not obvious to me why CSINC is excluded here, could you add a comment?

4854 ↗(On Diff #217016)

But FalseVal = TrueVal?

dmgreen updated this revision to Diff 218382.Sep 2 2019, 10:23 AM
dmgreen marked 5 inline comments as done.
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
46 ↗(On Diff #217016)

I think you are right. I've removed the option.

llvm/lib/Target/ARM/ARMISelLowering.cpp
4854 ↗(On Diff #217016)

It does it the other way around in the future (D66701). I've moved that change here, and I think the swap makes it obvious that we are inverting the two conditions, even if we are about to throw one away.

Cheers. Would you mind adding/changing a couple of tests in csel.ll so that we test conditions other than sgt?

dmgreen updated this revision to Diff 218406.Sep 3 2019, 1:17 AM

More tests!

samparker accepted this revision.Sep 3 2019, 2:14 AM

Cheers! LGTM.

This revision is now accepted and ready to land.Sep 3 2019, 2:14 AM
This revision was automatically updated to reflect the committed changes.