This is an archive of the discontinued LLVM Phabricator instance.

[ARM] generate correct code for armv6-m XO big stack operations
ClosedPublic

Authored by stuij on Jun 30 2023, 9:46 AM.

Details

Summary

The ARM backend codebase is dotted with places where armv6-m will generate
constant pools. Now that we can generate execute-only code for armv6-m, we need
to make sure we use the movs/lsls/adds/lsls/adds/lsls/adds pattern instead of
these.

Big stacks is one of the obvious places. In this patch we take care of two
sites:

  1. take care of big stacks in prologue/epilogue
  2. take care of save/tSTRspi nodes, which implicitly fixes emitThumbRegPlusImmInReg which is used in several frame lowering fns

Diff Detail

Unit TestsFailed

Event Timeline

stuij created this revision.Jun 30 2023, 9:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 9:46 AM
stuij requested review of this revision.Jun 30 2023, 9:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 9:46 AM
efriedma added inline comments.Jun 30 2023, 10:42 AM
llvm/lib/Target/ARM/ARMAsmPrinter.cpp
1361

I'd prefer to actually verify the tLSLri has the expected shift amount/source/destination.

llvm/test/CodeGen/ARM/large-stack.ll
24

The generated code here is ridiculously inefficient; do you have some plan for generating code that's closer to optimal?

stuij updated this revision to Diff 536745.Jul 3 2023, 6:25 AM

address review comment

stuij marked an inline comment as done.Jul 3 2023, 6:37 AM
stuij added inline comments.
llvm/test/CodeGen/ARM/large-stack.ll
24

Luckily armv6-M XO is a partner-requested feature. The current timeline is that we will first make it functionally work (which seems reasonable to me). Then we will benchmark and compare it with XO turned off. We will discuss the result and will optimise where needed.

efriedma accepted this revision.Jul 3 2023, 8:53 AM

LGTM with a couple minor comments

llvm/lib/Target/ARM/ARMAsmPrinter.cpp
1145

For the instructions where it's relevant, can you add an assertion that the source and destination registers are equal?

1360

+=?

This revision is now accepted and ready to land.Jul 3 2023, 8:53 AM
stuij updated this revision to Diff 536841.Jul 3 2023, 10:30 AM
stuij marked an inline comment as done.

address review comments

stuij marked an inline comment as done.Jul 3 2023, 10:42 AM
stuij added inline comments.
llvm/lib/Target/ARM/ARMAsmPrinter.cpp
1360

yea, I went back and forth on this! Using addition which is analogous but won't represent what the processor will actually do felt more wrong than expressing what we expect it to do. But really I'm fine either way. Changed it.

efriedma added inline comments.Jul 3 2023, 10:54 AM
llvm/lib/Target/ARM/ARMAsmPrinter.cpp
1360

My primary concern with my review comments is making sure if someone messes with tMOVi32imm expansion, this code with either continue to work correctly, or assert that it's seeing something unexpected.

efriedma accepted this revision.Jul 3 2023, 10:55 AM
This revision was landed with ongoing or failed builds.Jul 4 2023, 2:40 AM
This revision was automatically updated to reflect the committed changes.
stuij added a comment.Jul 4 2023, 3:30 AM

This came back with buildbot failures as the C in FileCheck wasn't uppercased. To my surprise I figured out that on MacOS HFS+ is often configured to be case-agnostic.

I pushed a fix.