This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Distribute MVE post-increments
ClosedPublic

Authored by dmgreen on Apr 9 2020, 11:09 AM.

Details

Summary

This adds some extra processing into the Pre-RA ARM load/store optimizer to detect and merge MVE loads/stores and adds of the same base. This we don't always turn into a post-inc during ISel, and due to the DAG nature of the graph we don't always know an order to use for the nodes, not knowing which nodes to make post-inc and which to use the new post-inc of. After ISel, we have an order that we can use to post-inc the following instructions.

So this looks for Loads/Store with a starting offset of 0, and an add/sub from the same base, and a number of other loads/stores. We then do some checks and convert the load/store into a postinc variant. Any loads/stores after it have the offset subtracted from their immediates. For example:

LDR #4           LDR #4
LDR #0           LDR_POSTINC #16
LDR #8           LDR #-8
LDR #12          LDR #-4
ADD #16

It only handles MVE loads/stores at the moment. Normal loads/store will be added in a followup patch, they just have some extra details to ensure that we keep generating LDRD/LDM successfully.

Diff Detail

Event Timeline

dmgreen created this revision.Apr 9 2020, 11:09 AM
samparker added inline comments.Apr 14 2020, 8:17 AM
llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
2538

Looks like these helpers should be extracted to ARMBaseInstrInfo, I also have a helper in ARMLowOverheadLoops to get this immediate. And I would have thought that there's already something to legal addressing modes somewhere?

2614

can be assigned from getBaseOp... on initialisation.

2616

Could increment be assigned multiple times? And if so, does that matter?

2617

exit early on inverse instead?

2670

Do we need this condition..?

dmgreen updated this revision to Diff 257758.Apr 15 2020, 10:32 AM
dmgreen marked 6 inline comments as done.
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
2538

I haven't seen anything for addressing modes exactly. They all seems to be spread over ARMISelDAGToDAG, if anywhere.

I'll try and move what I can. The getBaseOperandIndex above is really "shouldTryDistributePostInc" more than anything else, so I've left it here. There appears to be another one for getting add/sub offsets in this very file, but is also testing other things to do with reg usages and predicates.

2616

Hmm. I remember thinking about that and figuring it was OK. Every time I tried, MRI->use_nodbg_instructions seemed to be in order, so I think it would be OK. I'm not sure if that is really guaranteed though.

I'll add a check to make sure there is only one. We can always make it more relaxed in the future if needed.

2670

Like I said above in the comment this is to be on the safe side, and only handle instructions where either BaseAccess dominates Use or Use dominated BaseAccess.

SjoerdMeijer added inline comments.Apr 20 2020, 6:33 AM
llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
2677

Do we always want to run this? I.e., it looks like this is always run as a first step and can generate negative offsets, so was wondering if that can impact code-size.

dmgreen marked an inline comment as done.Apr 20 2020, 7:13 AM
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
2677

I think so far it will only end up removing instructions, so can always be run. It should only ever make things smaller. And I don't believe there is a performance impact in MVE of a postinc load into an offset load.

With normal loads like LDR, which I will hopefully add in another patch, there might be some times when it is better to create LDRD or LDM and we have to be more careful. That's not something that we have to worry about yet though.

Sorry, I thought I had approved this last week... I'm happy if @SjoerdMeijer has no other comments.

SjoerdMeijer accepted this revision.Apr 20 2020, 8:50 AM

Yeah, nice one, LGTM too.
Just some extra nits that don't need another review.

llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp
1389

nit: not a huge fan of macros, also if you grep for an opcode this makes it more difficult to find, but okay, probably fine here. Perhaps only a more descriptive name than DOMVE?

2509

typo: distrubute

2579

typo: postin -> postinc

2595

nit: since we don't use Offset:

if (!Use.getOperand(BaseOp + 1).getImm())
2630

nit: I might be missing something obvious, but I don't know what's meant with "-ve increments"

2677

okidoki, cheers

This revision is now accepted and ready to land.Apr 20 2020, 8:50 AM
dmgreen updated this revision to Diff 259237.Apr 22 2020, 4:33 AM

Thanks for the review. These are those updates.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2020, 6:28 AM