This is an archive of the discontinued LLVM Phabricator instance.

[clang] Add support for optimize function attribute
AbandonedPublic

Authored by steplong on Jun 3 2022, 11:34 AM.

Details

Summary

This attribute is similar to GCC's function attribute, but it doesn't support
compiling functions with arbitrary optimization options. This patch only
supports "-Os", "-Oz", "-Ofast", "-O0" because they map to existing
attributes cleanly. We don't intend to support arbitrary options, like GCC,
with this attribute.

i.e. __attribute__((optimize("O0")))

Diff Detail

Event Timeline

steplong created this revision.Jun 3 2022, 11:34 AM
Herald added a project: Restricted Project. · View Herald Transcript
steplong requested review of this revision.Jun 3 2022, 11:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2022, 11:34 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
steplong added inline comments.Jun 3 2022, 11:39 AM
clang/lib/CodeGen/CGCall.cpp
2170

I don't think this is the most ergonomic way. Let me know if you have a better idea of doing this

aaron.ballman added inline comments.Jun 6 2022, 7:24 AM
clang/include/clang/Basic/Attr.td
2265

Something along these lines adds a "fake" argument to the attribute. This means the parsed attribute doesn't care about this argument (so users don't supply it when writing the attribute themselves), but the semantic attribute (OptimizeAttr) will have a member to track the fake argument value and will require extra information when creating the attribute.

This effectively will add:

enum OptLevelKind {
  O0,
  O1,
  O2,
  O3,
  O4
} OptLevel;

to the semantic attribute.

2267

You should add some rudimentary documentation for this, and probably point to the GCC docs for further information.

clang/include/clang/Basic/DiagnosticSemaKinds.td
3019–3020

I'm not 100% in love with the list of valid options in my suggested wording (I had originally listed the valid values manually and that was not much better).

One thing I think is important is that this be a warning rather than an error. Users will pass "-f" strings here which are supported by GCC; they should just be able to ignore the warning in Clang as being harmless, but if it's an error, the user has to change the function signatures in ways that are kind of annoying.

clang/lib/CodeGen/CGCall.cpp
2170

I'll sprinkle some comments around about how I'd investigate handling this.

Based on those suggestions, here you would be able to ask for OptimizeAttr->getOptLevel() and it will return the mapped enumeration value, which should clean this code up to not require string checking.

Btw, hasAttr() followed by getAttr() is generally a code smell (same smell as isa followed by cast). You should switch to logic more like:

if (const auto *OA = TargetDecl->getAttr<OptimizeAttr>()) {

}

so that we only have to traverse the list of attributes once instead of twice.

2172–2173
clang/lib/Sema/SemaDeclAttr.cpp
4841–4850

Then, in here, you can parse the -O<whatever> the user passed as a string, and convert it to an OptimizeAttr::OptLevelKind enumerator and store that in the semantic attribute.

This allows you to map things like -Og to whatever -O level that actually represents, or do any other kind of mapping that works for you.

One question we should probably figure out is whether we also want to support clang-cl optimization strings or not. e.g., __attribute__((optimize("/O0"))) with a slash instead of a dash. Since we're already going to be doing parsing from here anyway, I feel like it might not be a bad idea to also support those. FWIW, here's the list from the user's manual:

/O0                     Disable optimization
/O1                     Optimize for size  (same as /Og     /Os /Oy /Ob2 /GF /Gy)
/O2                     Optimize for speed (same as /Og /Oi /Ot /Oy /Ob2 /GF /Gy)
/Ob0                    Disable function inlining
/Ob1                    Only inline functions which are (explicitly or implicitly) marked inline
/Ob2                    Inline functions as deemed beneficial by the compiler
/Od                     Disable optimization
/Og                     No effect
/Oi-                    Disable use of builtin functions
/Oi                     Enable use of builtin functions
/Os                     Optimize for size
/Ot                     Optimize for speed
/Ox                     Deprecated (same as /Og /Oi /Ot /Oy /Ob2); use /O2 instead
/Oy-                    Disable frame pointer omission (x86 only, default)
/Oy                     Enable frame pointer omission (x86 only)
/O<flags>               Set multiple /O flags at once; e.g. '/O2y-' for '/O2 /Oy-'

(Not all of these would be supported, like enable use of builtin functions, etc.) WDYT?

steplong updated this revision to Diff 434639.Jun 6 2022, 4:12 PM
  • Added docs for attribute
  • Changed attribute to use Enum
  • Optsize includes -Og, -Oz, and -Ofast
  • Change to warning instead of error
steplong added inline comments.Jun 6 2022, 4:14 PM
clang/lib/Sema/SemaDeclAttr.cpp
4841–4850

Hmm I don't think it's necessary to get pragma optimize to work, but it shouldn't be hard to add the parsing logic for some of these strings

I think this is getting close -- mostly just nits at this point.

clang/include/clang/Basic/AttrDocs.td
3463
3465–3467
clang/lib/CodeGen/CodeGenModule.cpp
1932–1934

Coding style nit.

clang/lib/Sema/SemaDeclAttr.cpp
4841–4850

Definitely agreed on the MSVC command line switches, so if those are a burden, feel free to skip them. For the other flags, those seem more important to support because they're also flags present in GCC so users are more likely to expect them to work.

4850

Given that most of the strings we're dealing with are either literals or a StringRef, I think you should use llvm:: StringMap instead of unordered_map. (We tend to avoid STL containers because their performance characteristics are often different than what we need as a compiler.)

4858

Probably meant to remove this?

4868
steplong updated this revision to Diff 435264.Jun 8 2022, 11:16 AM
  • Fix up docs and comments
  • Fix failing pragma-attribute-supported-attributes-list.test
  • Remove debug print
  • Change to StringMap
steplong edited the summary of this revision. (Show Details)Jun 8 2022, 11:18 AM

Two final (I hope) nits! One is in the code, the other is that this should have a release note for the new attribute. Assuming precommit CI pipeline passes, then I think this is all set.

clang/lib/Sema/SemaDeclAttr.cpp
4860

Sorry for missing this before, but on both of the calls to Create here, you should be passing in Arg so that the AST retains full source fidelity (this matters for things like pretty printing).

steplong updated this revision to Diff 435313.Jun 8 2022, 1:42 PM
  • Add Arg when creating Attr
steplong edited the summary of this revision. (Show Details)Jun 8 2022, 1:43 PM

Two final (I hope) nits! One is in the code, the other is that this should have a release note for the new attribute. Assuming precommit CI pipeline passes, then I think this is all set.

Haha, no worries! I appreciate the review

aaron.ballman accepted this revision.Jun 9 2022, 9:43 AM

LGTM, but please add the release note when landing. Thanks!

This revision is now accepted and ready to land.Jun 9 2022, 9:43 AM
steplong updated this revision to Diff 435680.Jun 9 2022, 2:10 PM
  • Added release notes on attribute
aeubanks requested changes to this revision.Jun 9 2022, 3:12 PM
aeubanks added a subscriber: aeubanks.

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

This revision now requires changes to proceed.Jun 9 2022, 3:12 PM

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, this does affect codegen, so I'm not sure if it's purely a clang frontend thing. Maybe someone else can confirm. The motivation behind this was to add support for MSVC's pragma optimize in D125723. https://docs.microsoft.com/en-us/cpp/preprocessor/optimize?view=msvc-170

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, does the pass manager have to support anything here? The only Clang codegen changes are for emitting IR attributes that we already emitted based on command line flags/other attributes, so I had the impression this would not be invasive for the backend at all.

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, this does affect codegen, so I'm not sure if it's purely a clang frontend thing. Maybe someone else can confirm. The motivation behind this was to add support for MSVC's pragma optimize in D125723. https://docs.microsoft.com/en-us/cpp/preprocessor/optimize?view=msvc-170

adding optsize/minsize/optnone attributes to functions is fine and is already handled in optimizations, but being able to specify -O[0-3] would require a lot of new complexity in the pass manager which would likely not be worth it
I think D125723 is fine as is

I agree with @aeubanks , this feature requires major changes to pass manager and I see no value to land this currently.

I see that somebody may prefer “opt for size”, but this is exposed via “minsize” attribute so I see no strong need for optimize(“-Os”)

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, does the pass manager have to support anything here? The only Clang codegen changes are for emitting IR attributes that we already emitted based on command line flags/other attributes, so I had the impression this would not be invasive for the backend at all.

if we're allowing individual functions to specify that they want the -O1 pipeline when everything else in the module should be compiled with -O2, that's a huge change in the pass manager. but perhaps I'm misunderstanding the point of this patch

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, does the pass manager have to support anything here? The only Clang codegen changes are for emitting IR attributes that we already emitted based on command line flags/other attributes, so I had the impression this would not be invasive for the backend at all.

if we're allowing individual functions to specify that they want the -O1 pipeline when everything else in the module should be compiled with -O2, that's a huge change in the pass manager. but perhaps I'm misunderstanding the point of this patch

That makes sense. The MSVC pragma allows 4 options, "stgy":

ParameterOnOff
gDeprecatedDeprecated
sAdd MinSizeRemove MinSize (I think this would be difficult to do if -Os is passed on the cmdline)
tAdd -O2 (We can't support -O2 with this attribute so ignore)Add Optnone
yAdd frame-pointers (We can't support -f arguments with the attribute in this patch so we are ignoring this)No frame-pointers (Same thing as on)

For our use case, I think we only really see #pragma optimize("", off) and #pragma optimize("", on), so I'm not opposed to abandoning this patch and just supporting the common use case for now. I think #pragma optimize("", on) would just mean do nothing and apply whatever is on the command line and #pragma optimize("", off) would mean add optnone to the functions below it

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, does the pass manager have to support anything here? The only Clang codegen changes are for emitting IR attributes that we already emitted based on command line flags/other attributes, so I had the impression this would not be invasive for the backend at all.

if we're allowing individual functions to specify that they want the -O1 pipeline when everything else in the module should be compiled with -O2, that's a huge change in the pass manager. but perhaps I'm misunderstanding the point of this patch

That makes sense. The MSVC pragma allows 4 options, "stgy":

ParameterOnOff
gDeprecatedDeprecated
sAdd MinSizeRemove MinSize (I think this would be difficult to do if -Os is passed on the cmdline)
tAdd -O2 (We can't support -O2 with this attribute so ignore)Add Optnone
yAdd frame-pointers (We can't support -f arguments with the attribute in this patch so we are ignoring this)No frame-pointers (Same thing as on)

For our use case, I think we only really see #pragma optimize("", off) and #pragma optimize("", on), so I'm not opposed to abandoning this patch and just supporting the common use case for now. I think #pragma optimize("", on) would just mean do nothing and apply whatever is on the command line and #pragma optimize("", off) would mean add optnone to the functions below it

looks good to me, I agree that we should just honor whatever optimization level the file is compiled with with t

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

Hmm, does the pass manager have to support anything here? The only Clang codegen changes are for emitting IR attributes that we already emitted based on command line flags/other attributes, so I had the impression this would not be invasive for the backend at all.

if we're allowing individual functions to specify that they want the -O1 pipeline when everything else in the module should be compiled with -O2, that's a huge change in the pass manager. but perhaps I'm misunderstanding the point of this patch

I guess I'm not seeing what burden is being added here (you may still be correct though!)

The codegen changes basically boil down to:

if (!HasOptnone) {
  if (CodeGenOpts.OptimizeSize || HasOptsize) // This was updated for || HasOptsize
    FuncAttrs.addAttribute(llvm::Attribute::OptimizeForSize);
  if (CodeGenOpts.OptimizeSize == 2)
    FuncAttrs.addAttribute(llvm::Attribute::MinSize);
}

and

bool HasOptimizeAttrO0 = false;                                     // NEW
if (const auto *OA = D->getAttr<OptimizeAttr>())
  HasOptimizeAttrO0 = OA->getOptLevel() == OptimizeAttr::O0;

// Add optnone, but do so only if the function isn't always_inline.
if ((ShouldAddOptNone || D->hasAttr<OptimizeNoneAttr>() ||
     HasOptimizeAttrO0) &&                                          // NEW
    !F->hasFnAttribute(llvm::Attribute::AlwaysInline)) {
  B.addAttribute(llvm::Attribute::OptimizeNone);

For my own education, are you saying that these changes are specifying the entire -O<N> pipeline?

The patch looks for O0 in the attribute, but the other O<N> values are noops. We do some mapping though:

OptimizeAttr::OptLevelKind Kind = OA->getOptLevel();
HasOptnone = HasOptnone || (Kind == OptimizeAttr::O0);
HasOptsize = Kind == OptimizeAttr::Os || Kind == OptimizeAttr::Og ||
             Kind == OptimizeAttr::Ofast || Kind == OptimizeAttr::Oz;

but I'm failing to understand why this would be a concern for the backend given that we already support setting the LLVM IR flags based on other attributes. e.g., why is [[clang::optnone]] okay but [[gnu::optimize("O0")]] a problem?

xbolva00 added a comment.EditedJun 10 2022, 12:02 PM

[[gnu::optimize("O0")]] is okay but [[gnu::optimize("O3")]] is not gonna work without major changes. Not sure why we should deliver half-broken new attribute.

Some strong motivation/use cases why somebody needs to compile some functions with -O2 and some with -O3?

I agree with @aeubanks , this feature requires major changes to pass manager and I see no value to land this currently.

I see that somebody may prefer “opt for size”, but this is exposed via “minsize” attribute so I see no strong need for optimize(“-Os”)

Hmm.. We expose minsize attribute in C, (like -Oz), but "optsize" is not exposed (like -Os). I will try to create a patch for exposing it.

[[gnu::optimize("O0")]] is okay but [[gnu::optimize("O3")]] is not gonna work without major changes. Not sure why we should deliver half-broken new attribute.

I don't see it as half-broken given that it's explicitly documented by GCC as "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."

Some strong motivation/use cases why somebody needs to compile some functions with -O2 and some with -O3?

This patch leaves O1-O4 as noops, and really only does anything with O0 and the letter-based ones, so I don't know that we need to do that exercise until we want to make them actually do something. Similarly, we don't allow any of the -f flags like GCC does.

I don't insist on keeping this attribute specifically, but I like the fact that it gives us *one* attribute that we can use as a basis for all these various other ways of spelling optimization hints (optnone, #pragma optimize, minsize, etc). I think I'd ultimately like to see all of those other attributes as alternate spellings of this one and they just map to the appropriate internal "level".

(I'm totally fine with us not supporting optimization hints that the backend would struggle with, this is more about how we model the notion of a per-function user-provided optimization hint without adding a new attribute every time.)

"The optimize attribute should be used for debugging purposes only. It is not suitable in production code."

Until they (users) start and any change in pipeline may surprise them.

Personally I am bigger fan of more targeted attributes like we have noinline / noipa proposed but stalled / and then we could have new ones to disable vectorizers, LICM, unroller, etc..

Yes, we could claim that attribute((optimize("-fno-slp-vectorize") then maps exactly to attribute((noslp)).

Still, I would like to hear some motivation words other than "gcc" has it.

"The optimize attribute should be used for debugging purposes only. It is not suitable in production code."

Until they (users) start and any change in pipeline may surprise them.

Too bad for them? I guess my sympathy button is broken for users who use things in production code that are documented as not being suitable for production code. :-D

Personally I am bigger fan of more targeted attributes like we have noinline / noipa proposed but stalled / and then we could have new ones to disable vectorizers, LICM, unroller, etc..

Yes, we could claim that attribute((optimize("-fno-slp-vectorize") then maps exactly to attribute((noslp)).

Still, I would like to hear some motivation words other than "gcc" has it.

What I want to avoid is the continued proliferation of semantic attributes related to optimizations that are otherwise controlled by command line flags. We have optnone, minsize, Stephen's original patch for the MSVC pragma added another one, you're talking about adding optsize, etc. All of these are semantically doing "the same thing", which is associating some coarse granularity optimization hints with a function definition that would otherwise be even more coarsely controlled via the command line. Having multiple semantic attributes makes supporting this more fragile because everywhere that wants to care about coarse-grained optimizations has to handle the combinatorial matrix of ways they can be mixed together and as that grows, we will invariably get it wrong by forgetting something.

What I don't have a strong opinion on is what attributes we surface to users so they can spell them in their source. I have no problem exposing GCC's attributes, and MSVC's attributes, and our own attributes in whatever fun combinations we want to allow. What I want is that all of those related attributes are semantically modeled via ONE attribute in the AST. When converting these parsed optimization attributes into semantic attributes, I want us to map whatever information is in the parsed attribute onto that single semantic attribute. When we merge attributes on a declaration, I want that one attribute to be updated instead of duplicated with different values. So at the end of the day, when we get to CodeGen, we can query for the one attribute and its semantic effects instead of querying for numerous attributes and trying to decide what to do when more than one attribute is present at that point.

That's why I pushed Stephen to make this patch. The fact that it also happens to expose a feature from GCC that is very closely related to what he's trying to do for the MSVC pragma was a nice added bonus.

This leaves a few questions:

  1. Are you opposed to exposing #pragma optimize? (https://docs.microsoft.com/en-us/cpp/preprocessor/optimize?view=msvc-170) If yes, I think Stephen should run an RFC on Discourse to see if there's general agreement.
  2. Are you opposed to the direction of condensing the optimization semantic attributes (the things in the AST) down into one? If yes, I'd like to understand why better.
  3. Are you still opposed to exposing a neutered form of the GCC optimize attribute as a parsed attribute (the thing users write in their source)? If yes, that's fine by me, but then I'd still like to see most of this patch land, just without a way for the user to spell the attribute themselves. We can adjust the semantic attribute's enumeration to cover only the cases we want to support.
  4. Or are you opposed to the notion of having one semantic attribute to control all of this and you prefer to see multiple individual semantic attributes and all that comes along with them in terms of combinations?

I can't speak for @xbolva00 but the only part I'm against is the user-visible feature of

  • Added preliminary support for GCC's attribute optimize, which allows functions to be compiled with different optimization options than what was specified on the command line.

which implies that we're on the way to support per-function optimization levels (which we aren't)
the internal clang representation changes are all fine

and even for the MSVC pragma #pragma optimize("t", on), what are we supporting if the user compiles their code with -O0? because right now we won't optimize anything with -O0

Are you opposed to exposing #pragma optimize? (https://docs.microsoft.com/en-us/cpp/preprocessor/optimize?view=msvc-170) If yes, I think Stephen should run an RFC on Discourse to see if there's general agreement.

No, I like it, seems more useful and general than "pragma clang optimize on/off"

Are you opposed to the direction of condensing the optimization semantic attributes (the things in the AST) down into one? If yes, I'd like to understand why better.

No :)

Are you still opposed to exposing a neutered form of the GCC optimize attribute as a parsed attribute (the thing users write in their source)? If yes, that's fine by me, but then I'd still like to see most of this patch land, just without a way for the user to spell the attribute themselves. We can adjust the semantic attribute's enumeration to cover only the cases we want to support.

Not entirely opposed, GCC optimize attribute could partially work fine, O0 maps to optnone, Os to optsize, Oz to minsize. I am more worried about next steps, see below.

Or are you opposed to the notion of having one semantic attribute to control all of this and you prefer to see multiple individual semantic attributes and all that comes along with them in terms of combinations?

Not strongly opposed, just some concerns how this could work together with LLVM. Example: attribute((optimize("-fno-licm"))) -> 'optimize="no-licm" '? This could work. Possibly also allow this form: attribute((optimize(no-licm, optsize))) ?

What about current attributes? Should/can we drop them and use for example 'optimize="no-ipa,no-clone" '? Not strongly opposed, but probably a lot of work.

But to use different pipeline for different functions (here I mean -O1, O2, O3) is a major change to LLVM pass manager and I think this use case does not justify it.

But where I think this feature could be very useful in following case from gcc test suite where there is some FP computation..
Imagine you compile you program with -ffast-math and then you have function:

__attribute__ ((optimize ("no-associative-math"))) double
fn3 (double h, double l) /* { dg-message "previous definition" } */
{
  return h + l;
}

So in this case, codegen would just drop llvm attribute "reassoc".

xbolva00 added inline comments.Jun 10 2022, 2:04 PM
clang/test/CodeGen/attr-optimize.c
5

No support for

__attribute__ ((__optimize__ (0)))

? GCC supports it

steplong added inline comments.Jun 10 2022, 2:14 PM
clang/test/CodeGen/attr-optimize.c
5

Nope, I only added support for one argument and only strings. I think gcc supports expressions, multiple args, -f args, and -O args. I wasn't sure how to implement it in Attr.td without making heavy changes.

xbolva00 added inline comments.Jun 11 2022, 2:45 AM
clang/test/CodeGen/attr-optimize.c
17

For -Os, clang adds optsize. For -Oz, clang adds optsize and minsize. So tests should check it, maybe currently this is broken in your patch?

https://godbolt.org/z/dEsffoeW4

steplong updated this revision to Diff 436443.Jun 13 2022, 9:27 AM
  • Add llvm::Attribute::MinSize when OptimizeAttr::Oz
  • Add test for checking minsize

I can't speak for @xbolva00 but the only part I'm against is the user-visible feature of

  • Added preliminary support for GCC's attribute optimize, which allows functions to be compiled with different optimization options than what was specified on the command line.

which implies that we're on the way to support per-function optimization levels (which we aren't)
the internal clang representation changes are all fine

Ah, would you be okay if we retained the user-facing feature but more clearly documented (in release notes and documentation) the differences from GCC and that we do not currently intend to close that gap?

and even for the MSVC pragma #pragma optimize("t", on), what are we supporting if the user compiles their code with -O0? because right now we won't optimize anything with -O0

@steplong -- what are your thoughts on this?

Or are you opposed to the notion of having one semantic attribute to control all of this and you prefer to see multiple individual semantic attributes and all that comes along with them in terms of combinations?

But to use different pipeline for different functions (here I mean -O1, O2, O3) is a major change to LLVM pass manager and I think this use case does not justify it.

Thanks for clarifying! I'd be fine changing the internal enumeration for the attribute to represent a better subset of what we intend to implement support for (rather than making it look like we intend to support O1-O4). Would that work for you (and you as well @aeubanks)?

Would that work for you (and you as well @aeubanks)?

yes :)

and even for the MSVC pragma #pragma optimize("t", on), what are we supporting if the user compiles their code with -O0? because right now we won't optimize anything with -O0

@steplong -- what are your thoughts on this?

Hmm, I think I'm ok with ignoring the pragma when "-O0". In the case of "t", at the moment, we are just going to honor whatever is passed on the commandline. I think with what the patch looks like now, we'll be supporting the pragma optimize like:

ParameterOnOff
gDeprecatedDeprecated
sAdd OptimizeAttr::OsAdd Optnone (Not sure if this makes sense)
tDo nothingAdd Optnone
yDo nothingDo nothing

and even for the MSVC pragma #pragma optimize("t", on), what are we supporting if the user compiles their code with -O0? because right now we won't optimize anything with -O0

@steplong -- what are your thoughts on this?

Hmm, I think I'm ok with ignoring the pragma when "-O0". In the case of "t", at the moment, we are just going to honor whatever is passed on the commandline. I think with what the patch looks like now, we'll be supporting the pragma optimize like:

ParameterOnOff
gDeprecatedDeprecated
sAdd OptimizeAttr::OsAdd Optnone (Not sure if this makes sense)
tDo nothingAdd Optnone
yDo nothingDo nothing

I think that works for me.

clang/docs/ReleaseNotes.rst
331–333

And we can clarify in the release note that we're not intending to fully support this attribute.

clang/include/clang/Basic/Attr.td
2267–2268

Assuming this also addresses @aeubanks 's concerns, I think we should remove O1 through O4 and maybe consider renaming the other enumerations to be less about the command line option and more about the effects. e.g., Fast, MinSize, NoOpts etc. We'll still do the mapping from O0 and whatnot to these values (within SemaDeclAttr.cpp) but this should hopefully clarify that the semantics we're after are not really pipeline semantics.

clang/include/clang/Basic/AttrDocs.td
3462

And we can add to the documentation that we don't intend to fully support the GCC semantics and further comment about ignoring O1 through O4, etc.

steplong updated this revision to Diff 436551.Jun 13 2022, 1:49 PM
  • Remove -Og, and all non-zero optimization levels
  • Fix up docs
  • Modified tests to reflect -Og and non-zero opt level change
steplong retitled this revision from [clang] Add initial support for gcc's optimize function attribute to [clang] Add support for optimize function attribute.Jun 13 2022, 2:11 PM
steplong edited the summary of this revision. (Show Details)
aeubanks added inline comments.Jun 13 2022, 2:29 PM
clang/include/clang/Basic/AttrDocs.td
3469

something about optimize(-Os) still depending on the file's overall optimization level (e.g. clang -O0 won't do anything) would be good

xbolva00 added a comment.EditedJun 13 2022, 2:35 PM

If you compile file with -Ofast and use optimise(-Os) for F - I would expect no fast math flags for function F but I am worried a bit that only “optsize” is added and no fast math flags are removed. Plesse verify and/or add such test.

jdoerfert added a comment.EditedJun 13 2022, 2:45 PM

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

We actually should *not* make this a clang frontend only thing. It is confusing and not helpful. That said, we have code to integrate this into the new PM already as we were planning on proposing something along these lines too. We didn't manage to get to it during last years GSoC but the code could be used as a basis still.

+1 for RFC
strong preference for proper integration of this into the new PM.

@tarinduj has the code in his repo, newest version, I think is: https://github.com/tarinduj/llvm-project/blob/a24b1d1b2033b6bb17b7ad1c58b15d34e078fdb8/llvm/include/llvm/IR/PassManager.h#L464

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

We actually should *not* make this a clang frontend only thing. It is confusing and not helpful. That said, we have code to integrate this into the new PM already as we were planning on proposing something along these lines too. We didn't manage to get to it during last years GSoC but the code could be used as a basis still.

+1 for RFC
strong preference for proper integration of this into the new PM.

I'm not opposed to an RFC to extend this functionality, but it seems to me that we have incremental progress already with this patch and landing this patch unblocks the work @steplong was originally doing for the MSVC pragma. Do you have a concern if we move forward with this less-functional form so that work isn't held up on an RFC for the more fully functional form?

IIRC in the past there was a strong preference to not have the pass manager support this sort of thing
if you want to support this, there should be an RFC for how the optimization part of this will work as it may require invasive changes to the LLVM pass manager

(if this is purely a clang frontend thing then ignore me)

We actually should *not* make this a clang frontend only thing. It is confusing and not helpful. That said, we have code to integrate this into the new PM already as we were planning on proposing something along these lines too. We didn't manage to get to it during last years GSoC but the code could be used as a basis still.

+1 for RFC
strong preference for proper integration of this into the new PM.

I'm not opposed to an RFC to extend this functionality, but it seems to me that we have incremental progress already with this patch and landing this patch unblocks the work @steplong was originally doing for the MSVC pragma. Do you have a concern if we move forward with this less-functional form so that work isn't held up on an RFC for the more fully functional form?

I was thinking about this again and I am more and more unsure about this feature. -Os/-Oz is something more than just some attribute.

Look here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp and see lines (conditions) with SizeLevel and OptLevel.

I was thinking about this again and I am more and more unsure about this feature. -Os/-Oz is something more than just some attribute.

That's not stopped us from exposing attributes like minsize (and you proposed optsize as well) which get described in the same terms as -Os/-Oz in our documentation, so this ship has somewhat sailed.

(FWIW, I don't have a strong opinion on the attributes we expose here. I do have strong opinions about increasing the maintenance burdens by adding more related attributes without carefully considering their semantic and code gen interactions.)

That's not stopped us from exposing attributes like minsize (and you proposed optsize as well) which get described in the same terms as -Os/-Oz in our documentation, so this ship has somewhat sailed.

Well, there is no promise that attribute matches -O flag in the documentation, indeed. But I understand your point and not a issue in practise.

Look here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/PassManagerBuilder.cpp and see lines (conditions) with SizeLevel and OptLevel.

Second look, I think it is possible to rework those lines to respect attributes more; like for example

MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1, PrepareForLTO));

in such way, that in the pass itself, we would do something like OptionXYZ = F.isMinSize() ? 0 : -1;

steplong added a comment.EditedJun 14 2022, 10:45 AM

I'm open to tabling this and just implementing support for an empty optimization list for the pragma (i.e. #pragma optimize("", on | off)). For our use case, at the moment, we only see the pragma being used this way. on would be a noop (honor command-line) and off would mean add optnone to the functions below the pragma

I'm open to tabling this and just implementing support for an empty optimization list for the pragma (i.e. #pragma optimize("", on | off)). For our use case, at the moment, we only see the pragma being used this way. on would be a noop (honor command-line) and off would mean add optnone to the functions below the pragma

If that meets your needs, that is certainly a way to get you unblocked. I'm fine with that approach if you are.

I appreciate your patience while we figure out the right approach here; we don't usually have this many false starts when working through a feature review. :-)

I appreciate your patience while we figure out the right approach here; we don't usually have this many false starts when working through a feature review. :-)

No worries, I appreciate the community doing the due diligence.

xbolva00 added a comment.EditedJun 14 2022, 11:46 AM

Also note that there is a '#pragma GCC optimize' pragma. After this patch, it should not be hard to implement it.

https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html

FWIW, I think we should have these attributes as spelled here, just w/ proper pass manager integration which then requires an RFC.
That said, I'm not opposed to this as an incremental step, albeit confusing until the PM support is integrated if we allow O1/2/3/fast

steplong abandoned this revision.Jun 24 2022, 6:48 AM