This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] pass UnrollLoops/VectorizeLoop/VectorizeSLP in CGOpts down to pass builder in ltobackend
ClosedPublic

Authored by wmi on Jan 7 2020, 11:03 PM.

Details

Summary

Currently UnrollLoops/VectorizeLoop/VectorizeSLP in CGOpts are not passed down to pass builder in ltobackend when new pass manager is used. This is inconsistent with the behavior when new pass manager is used and thinlto is not used. Such inconsistency causes slp vectorization pass not being enabled in ltobackend for O3 + thinlto right now. This patch fixes that.

Diff Detail

Event Timeline

wmi created this revision.Jan 7 2020, 11:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 7 2020, 11:03 PM
wmi updated this revision to Diff 236766.Jan 7 2020, 11:09 PM
wmi updated this revision to Diff 236767.Jan 7 2020, 11:12 PM

update flag in the test.

I think the old PM has some issues as well, although more in the opposite direction. The code in LTOBackend.cpp:runOldPMPasses is simply hardcoding loop and slp vectorization flags to true, but presumably should use whatever is passed through the lto Config as well. Not sure about LoopsInterleaved, ah looks like it is set to the value of the internal option which is enabled by default. That and DisableUnrollLoops should also use whatever is passed through here (similar to how they are set in clang BackendUtil.cpp when configuring the old PM).

clang/lib/CodeGen/BackendUtil.cpp
1440

Looks like the PTO constructor sets this to true, so we were getting lucky here.

1443

Looks like this only affects the values passed to LoopVectorizePass. Is there a way to test that it is being enabled properly now?

1444

Ditto.

1445

Is this enabled by default at -O3? I looked at clang but couldn't tell where that would be happening. If so, then llvm/test/Other/new-pm-thinlto-defaults.ll should be fixed to check that it is added to the postlink pipeline at -O3 to make sure we don't regress here again.

Thinking through this some more - this patch fixes the distributed ThinLTO backend case, but not the in-process ThinLTO case. For that, both lld and gold plugin will need to set this up. And since they don't have the clang options, they will just need to set to some reasonable default, probably either hardwired on like in the old PM or set based on the CGOptLevel they set from linker options. However, they both default to -O2 unless a special plugin or lld linker option is set, so we'd probably want to enable these passes there at O2+.

wmi added a comment.Jan 8 2020, 9:05 AM

Thinking through this some more - this patch fixes the distributed ThinLTO backend case, but not the in-process ThinLTO case. For that, both lld and gold plugin will need to set this up. And since they don't have the clang options, they will just need to set to some reasonable default, probably either hardwired on like in the old PM or set based on the CGOptLevel they set from linker options. However, they both default to -O2 unless a special plugin or lld linker option is set, so we'd probably want to enable these passes there at O2+.

Yes, opt using new pass manager -- NewPMDriver.cpp needs to be fixed too, similar as opt using old PM which is set based on Opt Level.

wmi added a comment.Jan 8 2020, 9:11 AM

Teresa, do we have existing tests where I can add some check to see if some passes are executed for the lld and gold plugin changes?

In D72386#1810284, @wmi wrote:

Teresa, do we have existing tests where I can add some check to see if some passes are executed for the lld and gold plugin changes?

I don't see any. There is a test for each that -debug-pass-manager and new-pass-manager are passed through correctly:
test/tools/gold/X86/new-pm.ll
lld/test/ELF/lto/new-pass-manager.ll

Perhaps those can be extended to actually check the passes that get enabled, similar to llvm/test/Other/new-pm-thinlto-defaults.ll?

ormris added a subscriber: ormris.Jan 8 2020, 10:21 AM
wmi updated this revision to Diff 237170.Jan 9 2020, 1:08 PM
wmi marked 2 inline comments as done.

Enable loop and slp vectorization in O2/O3 (Os for opt as well) in gold plugin, lld, opt and llvm-lto2. Add related tests.

There are still some issues like we cannot override the default setting in O2/O3 using internal flags like -vectorize-loops, -vectorize-slp, and the issues like those setting are now hard coded in old pass manager which Teresa mentioned. I leave them as it is for the moment and let this patch just handle the major issue of slp being disabled in thinlto backend.

wmi added inline comments.Jan 9 2020, 1:09 PM
clang/lib/CodeGen/BackendUtil.cpp
1443

For vectorization, given a function with vectorizable loop, I test it by specifying -force-vector-width=2 and -force-vector-interleave=1 and checking whether the IR will contain llvm.loop.isvectorized metadata at the end of the pipeline.

Similarly for interleaving, I test it by specifying -force-vector-width=1 and -force-vector-interleave=2 and checking whether the IR will contain llvm.loop.isvectorized metadata at the end of the pipeline.

1445

Yes, -vectorize-slp will be added as cc1 option when O2/O3 is specified for clang.

I fixed llvm/test/Other/new-pm-thinlto-defaults.ll, but llvm/test/Other/new-pm-thinlto-defaults.ll depends on the setting in opt instead of clang, so I also enabled slp and loop vectorization in O2, O3 and Os in opt.

Thanks! Looks pretty good, I just have a few comments and questions.

clang/test/CodeGen/thinlto-slp-vectorize-pm.c
4

Can you clarify the difference between -vectorize-slp and -mllvm -vectorize-slp=false? Is this related to you comment about not being able to use the internal options to disable it? Worth a comment in that case. Ditto for the vectorize-loops tests below.

lld/test/ELF/lto/slp-vectorize-pm.ll
12

I think the above testing was cloned from another existing lld test - can it be removed from this one since not related to vectorization?

llvm/include/llvm/LTO/Config.h
132

add doxygen comment

llvm/test/Other/new-pm-defaults.ll
256

Does clang run this at -Os?

llvm/test/tools/gold/X86/slp-vectorize-pm.ll
13

Similar comment to new lld test - I think the above lines are from another test and can be removed from here?

wmi marked 3 inline comments as done.Jan 9 2020, 2:07 PM
wmi added inline comments.
clang/test/CodeGen/thinlto-slp-vectorize-pm.c
4

-vectorize-slp is a cc1 option and will be added automatically when O2/O3/Os/Oz is available. Once -vectorize-slp is enabled, "-mllvm -vectorize-slp=false" won't disable it currently. I added "-mllvm -vectorize-slp=false" here in the test to ensure the slp vectorization is executed because the -vectorize-slp cc1 flag is passed down, not because -mllvm -vectorize-slp is enabled.

llvm/test/Other/new-pm-defaults.ll
256

Yes it does. But surprisingly I find clang add "-vectorize-slp" at Oz as well but doesn't not add "-vectorize-loops" at Oz. This is another inconsistency with opt old pass manager, which enable neither slp nor vectorize-loops at Oz. I am not sure how it should behave.

tejohnson added inline comments.Jan 9 2020, 2:15 PM
clang/test/CodeGen/thinlto-slp-vectorize-pm.c
4

Why add "-mllvm -vectorize-slp=false" to the invocation lines though? If it is to illustrate that the internal option cannot override the cc1 option, then add comments to that effect.

llvm/test/Other/new-pm-defaults.ll
256

Hmm, I think probably best to be consistent with the old PM on this for now, and add a comment where you enable these differently. This should presumably be reevaluated, but probably best if the two are in sync for now.

wmi marked 2 inline comments as done.Jan 9 2020, 2:28 PM
wmi added inline comments.
clang/test/CodeGen/thinlto-slp-vectorize-pm.c
4

I add it to the invocation line because if the default value of "-mllvm -vectorize-slp" is true, even without this patch, the test will still pass. That is why I want to add "-mllvm -vectorize-slp=false" explicitly to ensure the patch is effective.

llvm/test/Other/new-pm-defaults.ll
256

Ok, thanks. I will add a comment at the place where it sets this.

wmi updated this revision to Diff 237212.Jan 9 2020, 2:59 PM

Address Teresa's comment.

tejohnson accepted this revision.Jan 9 2020, 4:51 PM

A few minor suggestions for comment tweaks, but lgtm otherwise.

clang/test/CodeGen/thinlto-slp-vectorize-pm.c
12

maybe add "by default" to the end of the last sentence for clarity? Ditto elsewhere.

llvm/lib/Passes/PassBuilder.cpp
1905

Do you mean "consistent with old pass manager invoked via opt"? It currently reads a little confusing, because the old PM is also invoked via clang.

This revision is now accepted and ready to land.Jan 9 2020, 4:51 PM
wmi marked 2 inline comments as done.Jan 9 2020, 8:56 PM

Thanks for the review!

clang/test/CodeGen/thinlto-slp-vectorize-pm.c
12

Done.

llvm/lib/Passes/PassBuilder.cpp
1905

Fixed.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 9 2020, 9:14 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
xbolva00 added inline comments.
lld/ELF/LTO.cpp
96

COFF?

Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: vporpo, pmatos, asb. · View Herald Transcript
tejohnson added inline comments.May 27 2022, 8:17 AM
lld/ELF/LTO.cpp
96

This came up recently on discourse: https://discourse.llvm.org/t/unintended-lto-configuration-differences-between-elf-and-coff/62636

This patch is a couple years old now and @wmi isn't working actively on LLVM anymore, maybe someone working on COFF who can test it there can send a patch to make this consistent?