This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Add a store-merging optimization pass and enable for AArch64.
ClosedPublic

Authored by aemerson on Sep 1 2021, 10:42 PM.

Details

Summary

This is a first attempt at a constant value consecutive store merging pass, a counterpart to the DAGCombiner's store merging optimization.

The high level goals of this pass:

  1. Have a simple and efficient algorithm. As close to linear time as we can get. Thus, prioritizing scalability of the algorithm over merging every corner case we can find. The DAGCombiner's store merging code has been the source of compile time and complexity issues in the past and I wanted to avoid that.
  2. Don't introduce any new data structures for ordering memory operations. In MIR, we don't have the concept of chains like we do in the DAG, and the instruction order is stricter than enforcing ordering with graph edges. Although I considered adding something similar, I couldn't justify the overhead.

The pass is current split into 3 main parts. The main store merging code focuses on identifying candidate stores and managing the candidate group that's under consideration for merging. Analyzing addressing of stores is a potentially complex part and for now there's just a basic implementation to identify easy cases. Finally, the other main bit of complexity is the alias analysis, which tries to follow the same logic as the DAG's AA.

Currently this implementation only supports merging of constant stores. Stores of arbitrary variables are technically possible with a very small change, but the DAG chooses not to do this. Doing so here makes most code worse since there's extra overhead in merging values into wider registers.

On AArch64 -Os, this optimization results in very minor savings on CTMark.

Diff Detail

Event Timeline

aemerson created this revision.Sep 1 2021, 10:42 PM
aemerson requested review of this revision.Sep 1 2021, 10:42 PM
aemerson updated this revision to Diff 370165.Sep 1 2021, 10:43 PM
arsenm added inline comments.Sep 2 2021, 10:04 AM
llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
39

Why Optional? Isn't 0 naturally good enough?

llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
66

Don't know what you would want TTI for in a MIR pass, but this is unused

115

Demorgan this?

189

This function seems to largely be recreating MachineInstr::mayAlias

198

Optional seems like overkill, 0 size is naturally a missing size

272

Similar to the change in 9d720dcb89e8da4d12aa1832d74614adc6aa2c82, should this use getModRefInfo instead of isNoAlias?

299

Cache this in the class?

306

This should be a const reference

338

Cache this in the class?

382

Should directly pass WideValueTy, the memory operands track the type now

397–400

Invert condition and unindent the if case?

625

Assuming the pointer size is 64, need to query for this

631

I guess this is just ignoring alignment?

648–650

Wouldn't the target just not add the pass to the pipeline?

paquette added inline comments.Sep 2 2021, 11:12 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
158

Will there ever be a combination of opcodes where Base0Def->getOpcode() != Base1Def->getOpcode()?

Would it be simpler to check something like...

unsigned Opc = Base0Def->getOpcode();
if (Opc !=  Base1Def->getOpcode())
  return false;

if (Opc == TargetOpcode::G_FRAME_INDEX)
   ...

?

404

Wonder if we should actually have remarks for these as well? IIRC people sometimes want remarks from SDAG. Maybe this is something we can improve on with GISel.

paquette added inline comments.Sep 2 2021, 11:20 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
285

might as well be const?

319

is the commented out code here intentional?

aemerson added inline comments.Sep 2 2021, 11:40 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
189

There's some overlap but it doesn't look functionally the same, we're also calling out to our own analysis in this implementation as well via aliasIsKnownForLoadStore().

272

Maybe, I'll take look at that.

319

Nope, I'll remove that. We may support that in future though.

631

Yes, for now we generate stores with the alignment of the original store, so it's conservative.

aemerson added inline comments.Sep 2 2021, 11:44 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
272

Actually that seems to be only accepting Instruction* so we can't use it here.

Matt added a subscriber: Matt.Sep 2 2021, 12:02 PM
aemerson updated this revision to Diff 370367.Sep 2 2021, 12:34 PM

Address most comments.

jroelofs added inline comments.Sep 3 2021, 1:30 PM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
406

should this be in a MORE.emit([&](){ ... }); lambda?

515
llvm/test/CodeGen/AArch64/GlobalISel/store-merging.ll
180

I think it would be valuable to have another version of this one without the store to %ptr2, to check that it gets merged into a single store.

aemerson updated this revision to Diff 370695.Sep 3 2021, 8:45 PM

Fix hard coding pointer size to use DL. Address Jon's comments.

paquette accepted this revision.Sep 8 2021, 3:12 PM

I think I basically only have style nits at this point.

llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
354

pull Stores[0] out into a variable?

356

maybe Stores.size() as well?

488

why not llvm::any_of?

532

pull C.Stores[0] out into a variable?

This revision is now accepted and ready to land.Sep 8 2021, 3:12 PM
This revision was landed with ongoing or failed builds.Nov 15 2021, 9:10 PM
This revision was automatically updated to reflect the committed changes.
uabelho added inline comments.Nov 15 2021, 11:22 PM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

I noticed that this buildbot failed on this include:
https://lab.llvm.org/buildbot/#/builders/110/builds/7989

We see similar problems downstream in our own private bots:

06:37:16 ../lib/CodeGen/GlobalISel/LoadStoreOpt.cpp:39:10: fatal error: 'any' file not found
06:37:16 #include <any>
06:37:16          ^~~~~
06:37:16 1 error generated.

We get that when compiling with clang 8.0

aemerson added inline comments.Nov 16 2021, 12:37 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

Is this a problem with your C++ library? It should be there as part of C++17 support.

fhahn added a subscriber: fhahn.Nov 16 2021, 12:45 AM
fhahn added inline comments.
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

I think LLVM only requires C++14, not C++17, so this probably needs to use something else.

uabelho added inline comments.Nov 16 2021, 12:51 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

From llvm/docs/CMake.rst:

**CMAKE_CXX_STANDARD**:STRING
  Sets the C++ standard to conform to when building LLVM.  Possible values are
  14, 17, 20.  LLVM Requires C++ 14 or higher.  This defaults to 14.

and when I get the compilation error I see

-std=c++14

on the command line so we do compile with C++14.

fhahn added inline comments.Nov 16 2021, 4:01 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

Looks like Amara already removed the include and the bot is back to green: https://lab.llvm.org/buildbot/#/builders/110/builds/7995

Thanks!

uabelho added inline comments.Nov 16 2021, 4:27 AM
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
40

Yep, thanks!