This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Improve v8.1-A code-gen for atomic load-subtract
ClosedPublic

Authored by olista01 on Jan 24 2018, 6:57 AM.

Details

Summary

Armv8.1-A added an atomic load-add instruction, but not a load-subtract
instruction. Our current code-generation for atomic load-subtract always
inserts a NEG instruction to negate its argument, even if it could be
folded into a constant or another instruction.

This patch adds a lowering step early in SelectionDAG that converts a
load-subtract operation into a subtract and a load-add, allowing the normal
DAG optimisations to work on it.
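
To illustrate the rewrite in source-level terms, here is a minimal C++ sketch (not code from the patch) showing that a load-subtract returns the same old value, and leaves the same result in memory, as a load-add of the negated operand. The helper names `load_sub_direct`, `load_sub_as_add` and `check_forms_agree` are hypothetical, and `std::atomic` merely stands in for the DAG nodes the patch operates on.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the operation as the source program writes it.
inline std::uint64_t load_sub_direct(std::atomic<std::uint64_t> &mem,
                                     std::uint64_t v) {
    return mem.fetch_sub(v); // returns the value held before the update
}

// Hypothetical helper: the rewritten form, an explicit subtract from zero
// followed by a load-add. Unsigned wrap-around makes 0 - v well-defined.
inline std::uint64_t load_sub_as_add(std::atomic<std::uint64_t> &mem,
                                     std::uint64_t v) {
    std::uint64_t negated = 0 - v;
    return mem.fetch_add(negated);
}

// Checks that both forms return the same old value and store the same result.
inline void check_forms_agree(std::uint64_t initial, std::uint64_t v) {
    std::atomic<std::uint64_t> a{initial};
    std::atomic<std::uint64_t> b{initial};
    std::uint64_t old_a = load_sub_direct(a, v);
    std::uint64_t old_b = load_sub_as_add(b, v);
    assert(old_a == old_b);
    assert(a.load() == b.load());
}
```

Once the negation is an ordinary subtract rather than something implied by the instruction pattern, it is visible to the generic optimisations.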

I've left the old tablegen patterns in because they are still needed for
GlobalISel.

Some of the tests in this patch are copied from D35375 by Chad Rosier (which was abandoned).

Diff Detail

Repository
rL LLVM

Event Timeline

olista01 created this revision. Jan 24 2018, 6:57 AM

Ping (also for the related D42478).

On the previous review (D35375), Geoff Berry suggested doing this in DAG Combine, so that further combines can apply, and suggested not limiting this to constants: https://reviews.llvm.org/D35375#810045

Did you consider those points?

lib/Target/AArch64/AArch64ISelLowering.cpp
7394 ↗(On Diff #131261)

Is there no (easy) way to do this in tablegen? I would prefer a tablegen pattern over C code. Even though this C looks nice and tidy.

Geoff Berry suggested to do this in DAG Combine

DAG combine runs after this code, so it is being used to do the optimisations we want. In the *_neg_imm test cases it folds a constant and a sub into a different constant, and in the *_neg_arg test cases it folds two SUBs into nothing.
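
As a rough illustration of those two folds (a sketch in plain C++, not the DAG combiner itself, with hypothetical helper names `neg` and `add_operand_for_sub`): the operand handed to the load-add is `0 - v`. When `v` is a negative constant the result is simply another constant, and when `v` is itself `0 - x` the two subtractions cancel. The specific values below are made up and are not taken from the test cases.

```cpp
#include <cstdint>

// Hypothetical helper: negation written as a subtract from zero.
// Unsigned arithmetic keeps wrap-around well-defined.
constexpr std::uint32_t neg(std::uint32_t v) {
    return static_cast<std::uint32_t>(std::uint32_t{0} - v);
}

// Hypothetical helper: the operand a load-add needs in order to behave
// like a load-subtract of v.
constexpr std::uint32_t add_operand_for_sub(std::uint32_t v) {
    return neg(v);
}

// Negative-immediate case: subtracting -1 becomes adding the constant 1.
static_assert(add_operand_for_sub(static_cast<std::uint32_t>(-1)) == 1u,
              "constant and subtract fold into a different constant");

// Negated-argument case: subtracting (0 - x) becomes adding x.
static_assert(add_operand_for_sub(neg(5u)) == 5u,
              "two subtractions cancel");
```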

Is there no (easy) way to do this in tablegen? I would prefer a tablegen pattern over C code. Even though this C looks nice and tidy.

That's what the current code does (I've left the patterns in for now since they are still needed for GlobalISel). The problem with this is that the patterns get used at the end of the SelectionDAG pass (to emit MachineInstrs), so we'd have to re-implement the folding of subtraction instructions at that level. The alternative would be to add extra DAG patterns which match when the operand is a negative immediate, a subtraction, etc., but I didn't do that as it would be duplicating the optimisations done by DAGCombine in a narrower context.

christof accepted this revision. Feb 12 2018, 5:48 AM

Thanks for the clarification. Looked at the tests and now see what this is doing.
The code looks good to me.
Thanks

This revision is now accepted and ready to land. Feb 12 2018, 5:48 AM
This revision was automatically updated to reflect the committed changes.