Page MenuHomePhabricator

[ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
AbandonedPublic

Authored by courbet on Apr 5 2019, 8:05 AM.

Details

Summary

This opens up numerous possibilities for optimizations.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
  • Move all tests out of CodeGen.
  • Update PowerPC tests.
  • Add EarlyCSE to addMemcmpPasses().
spatel added a comment.May 6 2019, 6:17 AM

The code changes look like what I expected, but I'd still like to hear from at least 1 other reviewer that we are not violating some high-level principle.

llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll
69–71

Why are we generating bswap for a big-endian target?

courbet updated this revision to Diff 198281.May 6 2019, 8:03 AM
courbet marked an inline comment as done.

Fixing RUN lines for PowerPC opt tests.

courbet added inline comments.May 6 2019, 8:09 AM
llvm/test/Transforms/ExpandMemCmp/PowerPC/memcmpIR.ll
69–71

Thanks for the catch. We're not, but contrary to llc, opt does not seem to get the data layout from the target, so I/m now explicitly specifying the data layout on the RUN line.

The general approach of expanding memcmp in opt makes sense, I think. The placement seems a little on the early side, but that's probably okay given we don't really have interesting optimizations for memcmp calls besides expanding them.

Not sure about the extra EarlyCSE invocation; it's not free.

llvm/test/Other/opt-O2-pipeline.ll
143

How hard would it be to preserve the domtree?

courbet planned changes to this revision.May 6 2019, 11:37 PM
courbet marked an inline comment as done.

Thanks Eli.

llvm/test/Other/opt-O2-pipeline.ll
143

I think it should not be too hard because we merely add blocks in a diamond in the middle of the graph, so the change is quite local. I'll have a look at that.

RKSimon added a subscriber: RKSimon.

@courbet Please can you ensure you have EXPENSIVE_CHECKS enabled in all your builds before going any further - you've been breaking the buildbots for well over a week now and you still haven't fixed the underlying issue.

@courbet Please can you ensure you have EXPENSIVE_CHECKS enabled in all your builds before going any further - you've been breaking the buildbots for well over a week now and you still haven't fixed the underlying issue.

Thanks for the suggestion (I can't believe I lived without this for so long) - I could finally reproduce and I have a fix (D62193).

courbet updated this revision to Diff 201689.May 28 2019, 9:07 AM
courbet marked an inline comment as done.

Make ExpandMemCmp preserve the DomTree.

ychen added a subscriber: ychen.Jun 20 2019, 3:52 PM

Ping. Any other concerns ?

spatel added inline comments.Jun 24 2019, 10:12 AM
llvm/include/llvm/Analysis/TargetTransformInfo.h
622–623 ↗(On Diff #201689)

Can we take/use the Function's OptSize attribute as a preliminary step for this patch to reduce the number of diffs?

courbet updated this revision to Diff 206390.Jun 25 2019, 1:46 AM
courbet marked an inline comment as done.

Split off options refactoring to r364281, rebase.

spatel added inline comments.Jun 25 2019, 6:49 AM
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
750–755

Why/how are we checking x86 asm in an IR transform test file?

I don't think there's a good way to do end-to-end testing now within the regression test dir.

We would be better off creating real end-to-end (C source --> x86 asm) tests within test-suite? That way, we can be sure that no passes anywhere in the pipeline are interfering with our memcmp patterns.

courbet marked an inline comment as done.Jun 25 2019, 7:13 AM
courbet added inline comments.
llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
750–755

Right, I think I messed up updating the tests, sorry. The intent was to check for IR here. Will fix.

courbet updated this revision to Diff 206445.Jun 25 2019, 7:46 AM

Remove asm testing.

spatel accepted this revision.Jun 25 2019, 8:22 AM

LGTM - see inline for a few more nits.

I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.

I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).

llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp
308–314

Formatting is off here - line that fits 80-col is split, and line that doesn't fit is not split.

435

Formatting - split line.

509

Formatting - split line.

652

Formatting - split line.

llvm/test/Transforms/PhaseOrdering/X86/memcmp.ll
6

Update stale comment:
'This tests interaction between the MergeICmp and ExpandMemCmp IR transform passes.'

This revision is now accepted and ready to land.Jun 25 2019, 8:22 AM
xbolva00 added inline comments.
llvm/lib/Transforms/Scalar/ExpandMemCmp.cpp
252–253

Unused parameter?

courbet updated this revision to Diff 206614.Jun 26 2019, 2:14 AM

Rebase on r364384.

courbet marked 6 inline comments as done.Jun 26 2019, 2:19 AM

I encourage adding small memcmp tests to test-suite as a follow-up, so we know that things won't break going forward.

I'm working on it.

I haven't seen the DomTreeUpdater API before now, so I'm assuming the tests are verifying that we made the correct updates (and the earlier rL361239 has survived in trunk).

That was my first go at it too. Tests do break when I remove the updates, so I'm guessing they do :D

courbet updated this revision to Diff 206629.Jun 26 2019, 4:44 AM

clang-format

For information, I had to roll this back at it breaks sanitizers. Sanitizer passes insert some nobuiltin attributes on all memcmp calls to prevent expansion and still be able to intercept them (maybeMarkSanitizerLibraryCallNoBuiltin), and these passes are added last in opt:

PMBuilder.addExtension(PassManagerBuilder::EP_OptimizerLast, addMemorySanitizerPass);

so they now run after ExpandMemCmp.

The fix would be to move addMemcmpPasses(MPM); after addExtensionsToPM(EP_ScalarOptimizerLate, MPM);. I'll do this after I get the test-suite benchmarks in, so that we can validate the interactions.

test-suite benchmarks: https://reviews.llvm.org/D64082

Full benchmark results for reference (haswell, 10 runs):

The cells highlighted in blue are the statistically significant ones.

@courbet What's happening with this patch?

I rolled back due to some bot breakages that I was not able to see were related or not. I have not come back to it yet, will do.

RKSimon requested changes to this revision.Sep 6 2019, 7:45 AM

reopening until the regressions have been investigated

This revision now requires changes to proceed.Sep 6 2019, 7:45 AM
courbet updated this revision to Diff 219339.Sep 9 2019, 6:12 AM
  • Move mem passes after sanitizer passes.
  • Move codegen test llvm/test/CodeGen/AArch64/bcmp-inline-small.ll to Transforms/ now that AArch64 expands memcmps.
  • Rebase on r371397.

Sorry, this change broke the buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/17395, I reverted it (r371507). Please ensure that you run ninja check-all before committing.

We also detected some compile time regressions in the CTMark subset of the test suite, with lencod regressing by around 4-5%. I haven't fully bisected but the commit list was short and this seemed to be the only suspicious change.

nikic added a subscriber: nikic.Oct 23 2019, 12:07 PM

@courbet what's happening with this patch?

courbet abandoned this revision.Nov 22 2019, 2:43 AM

@courbet what's happening with this patch?

I'm going to abandon it for now. This interferes with the sanitizers in hard to fix ways, and I don't have the bandwith to come back to it. sorry.