Page MenuHomePhabricator

[Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops.
ClosedPublic

Authored by fhahn on Mar 30 2020, 7:11 AM.

Details

Summary

Currently Clang does not respect -fno-unroll-loops during LTO. During
D76916 it was suggested to respect -fno-unroll-loops on a TU basis.

This patch uses the existing llvm.loop.unroll.disable metadata to
disable loop unrolling explicitly for each loop in the TU if
unrolling is disabled. This should ensure that loops from TUs compiled
with -fno-unroll-loops are skipped by the unroller during LTO.

This also means that if a loop from a TU with -fno-unroll-loops
gets inlined into a TU without this option, the loop won't be
unrolled.

Due to the fact that some transforms might drop loop metadata, there
potentially are cases in which we still unroll loops from TUs with
-fno-unroll-loops. I think we should fix those issues rather than
introducing a function attribute to disable loop unrolling during LTO.
Improving the metadata handling will benefit other use cases, like
various loop pragmas, too. And it is an improvement to clang completely
ignoring -fno-unroll-loops during LTO.

If that direction looks good, we can use a similar approach to also
respect -fno-vectorize during LTO, at least for LoopVectorize.

In the future, this might also allow us to remove the UnrollLoops option
LLVM's PassManagerBuilder.

Diff Detail

Event Timeline

fhahn created this revision.Mar 30 2020, 7:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2020, 7:11 AM
Herald added a subscriber: zzheng. · View Herald Transcript

I think this is a good approach, rather than a per-function attribute, since as mentioned this will be preserved through inlining.
@dexonsmith, does that seem reasonable to you? I missed the original patch and agree with you that we don't want to fix this in LTO by passing the option through to LTO.

I think this is a good approach, rather than a per-function attribute, since as mentioned this will be preserved through inlining.
@dexonsmith, does that seem reasonable to you? I missed the original patch and agree with you that we don't want to fix this in LTO by passing the option through to LTO.

SGTM!

fhahn added a comment.Apr 6 2020, 8:16 AM

ping.

@tejohnson are you happy with this approach, given that it sounds good to @dexonsmith as well?

tejohnson accepted this revision.Apr 6 2020, 8:55 AM

yep, LGTM

This revision is now accepted and ready to land.Apr 6 2020, 8:55 AM

Note that loop-metadata is best-effort only and may be forgotten in the optimization pipeline.

Do we also need an equivalent to -Xclang -disable-O0-optnone?

Personally, I don't like to the optnone approach: There have been many post on llvm-dev using clang -emit-llvm and being surprised that opt has no effect.

fhahn added a comment.Apr 6 2020, 11:12 AM

Note that loop-metadata is best-effort only and may be forgotten in the optimization pipeline.

Agreed, that can be a potential issue (I tried to note that in the description), but I think that's pretty much the same issue we have with the loop related pragmas. Ideally there would be even more incentive now to fix the offending transforms.

Do we also need an equivalent to -Xclang -disable-O0-optnone?

Personally, I don't like to the optnone approach: There have been many post on llvm-dev using clang -emit-llvm and being surprised that opt has no effect.

Ah yes, optnone is a common pitfall :(

IIUC we don't need a patch similar like this one for optnone, as it already gets added to the function attributes (for -O0) and has an option to disable adding it (-Xclang -disable-O0-optnone) on a per-TU basis.

LGTM, since it continues current practice. optnone will always be the more annoying.

IIUC we don't need a patch similar like this one for optnone, as it already gets added to the function attributes (for -O0) and has an option to disable adding it (-Xclang -disable-O0-optnone) on a per-TU basis.

My question was the other way around: Do we need something like -xclang -disable-fno-unroll-loops-metadata.

I documented for Polly how to get the IR for further processing. There is clang -Xclang -disable-O0-optnone and another is clang -O1 -Xclang -disable-llvm-passes. Both avoid the optnone attribute, but will yield different results with -fno-unroll-loops. Which is the 'correct' way?

Meinersbur accepted this revision.Apr 7 2020, 4:37 AM
This revision was automatically updated to reflect the committed changes.