This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Move ARMCodegenPrepare to TypePromotion
ClosedPublic

Authored by samparker on Oct 29 2019, 3:59 AM.

Details

Summary

Convert ARMCodeGenPrepare into a generic type promotion pass by:

  • Removing the insertion of arm specific intrinsics to handle narrow types, as we weren't using these.
  • Removing ARMSubtarget references.
  • Now query a generic TLI object to know which types should be promoted and what they should be promoted to.
  • Move all codegen tests into Transforms folder and testing using opt and not llc, which is how they should have been written in the first place... This had the added advantage of running Alive2 on them.

Diff Detail

Event Timeline

samparker created this revision.Oct 29 2019, 3:59 AM

I guess this step make sense: there was very little ARM specific about this, so why not promote this to a generic codegen pass. It's a generic optimisation, so other targets could benefit from this, but not sure if that then requires some evidence before moving this code.

Some nits inlined.

lib/CodeGen/TypePromotion.cpp
935 ↗(On Diff #226866)

If TLI is not available, could we also just bail here?
Does that avoid the constant and check below, see next inline comment?

962 ↗(On Diff #226866)

This one.

It probably makes sense to verify that this actually helps some other target, yes; there's no point to generalizing it if it isn't useful on other targets. I would guess it has similar benefits on other non-x86 targets, though.

craig.topper added inline comments.
lib/CodeGen/TypePromotion.cpp
47 ↗(On Diff #226866)

So this pass is off by default on ARM?

samparker marked an inline comment as done.Oct 30 2019, 3:02 AM
samparker added subscribers: asb, atanasyan.

Thanks both, yes I'll get some numbers for AArch64 as well as more for ARM... @asb and @atanasyan would you be interested in getting some numbers for your targets?

lib/CodeGen/TypePromotion.cpp
47 ↗(On Diff #226866)

We've had it on downstream for about a year now and I've been bug fixing. I now want to enable it by default, but also generalise it first.

samparker updated this revision to Diff 227060.Oct 30 2019, 3:32 AM
  • Now bailing if pass config isn't available.
  • Rebased onto the mono repo.
Herald added a project: Restricted Project. · View Herald TranscriptOct 30 2019, 3:32 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
craig.topper added inline comments.Nov 1 2019, 11:38 PM
llvm/lib/CodeGen/TypePromotion.cpp
209

Are these two lines needed? The pointer element type should match the result type of the load right? Only asking because this code will need to be updated if we ever move to opaque pointers.

Sorry for the delay on this, setting up running benchmarks on my machine took far longer than I expected! I had forgotten that aarch64 already addressing the uxts by enabling cmps to perform, so these results are just for arm. This pass was originally to target CoreMark, but its helpful in SPEC too:

benchmarkexec_time
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test2.6%
External/SPEC/CFP2006/450.soplex/450.soplex.test-2.9%
External/SPEC/CINT2006/400.perlbench/400.perlbench.test-1.4%
External/SPEC/CINT2006/403.gcc/403.gcc.test-0.9%
SPEC2006 Geomean-0.3%

I had hoped that xalancbmk would show an improvement, as the pass removes almost half of the unsigned extends from the code, but its execution time was highly variable - even on aarch64. I also wasn't expecting an FP benchmark to be affected, but soplex has two thirds of its uxts removed.

Ping @asb @atanasyan

@hfinkel From a quick look at the power isa, this could also be useful to you?

samparker marked an inline comment as done.Nov 6 2019, 3:25 AM
samparker added inline comments.
llvm/lib/CodeGen/TypePromotion.cpp
209

Yeah, this looks odd.... I'll remove it, thanks.

@asb and @atanasyan would you be interested in getting some numbers for your targets?

Yes, it would be very interesting. Sorry for delay with the answer.

samparker updated this revision to Diff 228376.Nov 8 2019, 12:22 AM

Removed pointer type check.

SjoerdMeijer accepted this revision.Nov 29 2019, 7:18 AM

Hi Sam, about the AArch64 results: I assume you targeted the A32 ISA, can you confirm that? Because that would mean we have 2 data points, at least 2 ISA for which this is beneficial: T32 and A32.

I've checked the latest PowerISA spec Version 3.0 B (https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv), and for the cmp instruction I don't see that extends can be folded into it like in A64 that would make this transformation less or not so beneficial; I don't see extends for register RB. Similarly, I've checked MIPS, and don't see them either there.

So, I think we have 2 data points, and enough suspicion that it is beneficial for other non-X86 targets too, like Eli said. I think that is enough.

Please wait a couple of working days with committing just in case people still want to run numbers.

This revision is now accepted and ready to land.Nov 29 2019, 7:18 AM

Yes, I ran using the armv8 triple on a big boy box and then thumb2 for our embedded targets.

This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/ARM/CGP/arm-cgp-signed.ll