This is an archive of the discontinued LLVM Phabricator instance.

[clang-cl] Provide separate flags for all the /O variants
ClosedPublic

Authored by hans on Sep 19 2018, 7:28 AM.

Details

Summary

This provides better help text in "clang-cl /?".

Also it cleans things up a bit: previously "/Od" could be handled either as a separate flag aliased to "-O0", or by the main optimization flag processing in TranslateOptArg. With this patch, all the flags get aliased back to /O so they're handled by TranslateOptArg.

Thanks to Nico for the idea!

Diff Detail

Event Timeline

hans created this revision.Sep 19 2018, 7:28 AM
compnerd added inline comments.
test/Driver/Xarch.c
5

I know that this isn’t your doing, but while you’re here, would you mind replacing the temp files with pipes and grep with FileCheck?

thakis added inline comments.Sep 19 2018, 8:10 AM
include/clang/Driver/CLCompatOptions.td
129

I took a stab at this myself for a bit, and did a few things differently (but most the same way).

Here's how they /? output looks in my local binary:

/O1                     Minimize size, equivalent to /Og /Os /Oy /Ob2 /Gs /GF /Gy
/O2                     Maximize speed, equivalent to /Og /Oi /Ot /Oy /Ob2 /Gs /GF /Gy
/Ob0                    Disable function inlining
/Ob1                    Only inline functions explicitly or implicitly marked inline
/Ob2                    Allow compiler to inline all functions it deems beneficial
/Od                     Disable optimization
/Og                     No effect
/Oi-                    Disable use of builtin functions
/Oi                     Enable use of builtin functions
/Os                     Optimize for size
/Ot                     Optimize for speed
/Ox                     Deprecated, use O2 instead; equivalent to /Og /Oi /Ot /Oy /Ob2
/Oy-                    Don't omit frame pointers (default)
/Oy                     Omit frame pointers (x86 only)
/O<value>               Set several /O flags at once; '/O2y-' is the same as '/O2 /y-' for example

Since we reference them now, I also added these:

  /GF                     Enable string pooling (default)
...
  /Gs                     Same as /Gs0
  /Gs<value>              Set stack probe size (default 4096)
...
  /Gy-                    Don't put each function in its own section (default)
  /Gy                     Put each function in its own section

As far as I can tell we do not make /O1 and /O2 imply /Gs -- we keep it at the default value of 4096 (?) Fixing this probably means increasing chrome's size (?).

cl.exe doesn't support O0; I wonder why rnk added it. I asked on the revision that added it.

The name after def isn't needed; maybe we should omit the names of flags we don't reference?

Here's my local diff, mix and match as you see fit:

Index: include/clang/Driver/CLCompatOptions.td
===================================================================
--- include/clang/Driver/CLCompatOptions.td	(revision 342334)
+++ include/clang/Driver/CLCompatOptions.td	(working copy)
@@ -85,16 +85,21 @@
   HelpText<"Assume thread-local variables are defined in the executable">;
 def _SLASH_GR : CLFlag<"GR">, HelpText<"Enable emission of RTTI data">;
 def _SLASH_GR_ : CLFlag<"GR-">, HelpText<"Disable emission of RTTI data">;
+def _SLASH_GF : CLIgnoredFlag<"GF">,
+  HelpText<"Enable string pooling (default)">;
 def _SLASH_GF_ : CLFlag<"GF-">, HelpText<"Disable string pooling">,
   Alias<fwritable_strings>;
-def _SLASH_GS : CLFlag<"GS">, HelpText<"Enable buffer security check">;
+def _SLASH_GS : CLFlag<"GS">,
+  HelpText<"Enable buffer security check (default)">;
 def _SLASH_GS_ : CLFlag<"GS-">, HelpText<"Disable buffer security check">;
-def _SLASH_Gs : CLJoined<"Gs">, HelpText<"Set stack probe size">,
+def: CLFlag<"Gs">, HelpText<"Same as /Gs0">,
+  Alias<mstack_probe_size>, AliasArgs<["0"]>;
+def _SLASH_Gs : CLJoined<"Gs">, HelpText<"Set stack probe size (default 4096)">,
   Alias<mstack_probe_size>;
 def _SLASH_Gy : CLFlag<"Gy">, HelpText<"Put each function in its own section">,
   Alias<ffunction_sections>;
 def _SLASH_Gy_ : CLFlag<"Gy-">,
-  HelpText<"Don't put each function in its own section">,
+  HelpText<"Don't put each function in its own section (default)">,
   Alias<fno_function_sections>;
 def _SLASH_Gw : CLFlag<"Gw">, HelpText<"Put each data item in its own section">,
   Alias<fdata_sections>;
@@ -109,18 +114,50 @@
   Alias<I>;
 def _SLASH_J : CLFlag<"J">, HelpText<"Make char type unsigned">,
   Alias<funsigned_char>;
+
+// FIXME: cl.exe doesn't understand this, not sure why clang-cl does.
 def _SLASH_O0 : CLFlag<"O0">, Alias<O0>;
-// /Oy- is handled by the /O option because /Oy- only has an effect on 32-bit.
-def _SLASH_O : CLJoined<"O">, HelpText<"Optimization level">;
-def _SLASH_Od : CLFlag<"Od">, HelpText<"Disable optimization">, Alias<O0>;
-def _SLASH_Oi : CLFlag<"Oi">, HelpText<"Enable use of builtin functions">,
-  Alias<fbuiltin>;
-def _SLASH_Oi_ : CLFlag<"Oi-">, HelpText<"Disable use of builtin functions">,
-  Alias<fno_builtin>;
-def _SLASH_Os : CLFlag<"Os">, HelpText<"Optimize for size">, Alias<O>,
-  AliasArgs<["s"]>;
-def _SLASH_Ot : CLFlag<"Ot">, HelpText<"Optimize for speed">, Alias<O>,
-  AliasArgs<["2"]>;
+
+def _SLASH_O : CLJoined<"O">,
+  HelpText<"Set several /O flags at once; "
+           "'/O2y-' is the same as '/O2 /y-' for example">;
+
+// FIXME: Looks like this doesn't actually imply /Gs in clang-cl yet.
+def _SLASH_O1 : CLFlag<"O1">,
+  HelpText<"Minimize size, equivalent to /Og /Os /Oy /Ob2 /Gs /GF /Gy">,
+  Alias<_SLASH_O>, AliasArgs<["1"]>;
+def _SLASH_O2 : CLFlag<"O2">,
+  HelpText<"Maximize speed, equivalent to /Og /Oi /Ot /Oy /Ob2 /Gs /GF /Gy">,
+  Alias<_SLASH_O>, AliasArgs<["2"]>;
+def _SLASH_Ox : CLFlag<"Ox">,
+  HelpText<"Deprecated, use O2 instead; equivalent to /Og /Oi /Ot /Oy /Ob2">,
+  Alias<_SLASH_O>, AliasArgs<["x"]>;
+def _SLASH_Od : CLFlag<"Od">, HelpText<"Disable optimization">,
+  Alias<_SLASH_O>, AliasArgs<["d"]>;
+
+def: CLFlag<"Ob0">, HelpText<"Disable function inlining">,
+  Alias<_SLASH_O>, AliasArgs<["b0"]>;
+def: CLFlag<"Ob1">,
+  HelpText<"Only inline functions explicitly or implicitly marked inline">,
+  Alias<_SLASH_O>, AliasArgs<["b1"]>;
+def: CLFlag<"Ob2">,
+  HelpText<"Allow compiler to inline all functions it deems beneficial">,
+  Alias<_SLASH_O>, AliasArgs<["b2"]>;
+def: CLFlag<"Og">, HelpText<"No effect">,
+  Alias<_SLASH_O>, AliasArgs<["g"]>;
+def: CLFlag<"Oi">, HelpText<"Enable use of builtin functions">,
+  Alias<_SLASH_O>, AliasArgs<["i"]>;
+def: CLFlag<"Oi-">, HelpText<"Disable use of builtin functions">,
+  Alias<_SLASH_O>, AliasArgs<["i-"]>;
+def: CLFlag<"Os">, HelpText<"Optimize for size">,
+  Alias<_SLASH_O>, AliasArgs<["s"]>;
+def: CLFlag<"Ot">, HelpText<"Optimize for speed">,
+  Alias<_SLASH_O>, AliasArgs<["t"]>;
+def: CLFlag<"Oy">, HelpText<"Omit frame pointers (x86 only)">,
+  Alias<_SLASH_O>, AliasArgs<["y"]>;
+def: CLFlag<"Oy-">, HelpText<"Don't omit frame pointers (default)">,
+  Alias<_SLASH_O>, AliasArgs<["y-"]>;
+
 def _SLASH_QUESTION : CLFlag<"?">, Alias<help>,
   HelpText<"Display available options">;
 def _SLASH_Qvec : CLFlag<"Qvec">,
@@ -326,10 +363,8 @@
 def _SLASH_FC : CLIgnoredFlag<"FC">;
 def _SLASH_Fd : CLIgnoredJoined<"Fd">;
 def _SLASH_FS : CLIgnoredFlag<"FS">;
-def _SLASH_GF : CLIgnoredFlag<"GF">;
 def _SLASH_kernel_ : CLIgnoredFlag<"kernel-">;
 def _SLASH_nologo : CLIgnoredFlag<"nologo">;
-def _SLASH_Og : CLIgnoredFlag<"Og">;
 def _SLASH_openmp_ : CLIgnoredFlag<"openmp-">;
 def _SLASH_permissive_ : CLIgnoredFlag<"permissive-">;
 def _SLASH_RTC : CLIgnoredJoined<"RTC">;

FWIW the recommendation against /Ox in my version is because of https://github.com/ulfjack/ryu/pull/70#issuecomment-412168459

hans added a comment.Sep 20 2018, 3:08 AM

Sorry, I didn't realize we both set off to do this in parallel. I've incorporated your changes into my patch.

test/Driver/Xarch.c
5

r342636

hans updated this revision to Diff 166252.Sep 20 2018, 3:08 AM

Uploading new diff.

Sorry, I didn't realize we both set off to do this in parallel. I've incorporated your changes into my patch.

No worries, I didn't do anything I wouldn't have done for reviewing this :-)

Thoughts on "As far as I can tell we do not make /O1 and /O2 imply /Gs -- we keep it at the default value of 4096 (?) Fixing this probably means increasing chrome's size (?)."?

include/clang/Driver/CLCompatOptions.td
94

Want to add "(default 4096)" here?

114

Move FIXME down one so it's next to the O0 flag?

117

Nit: Just "optimize for size" / "optimize for speed" is shorter

122

Why not alias this to _SLASH_O, d like the rest?

127

nit: I liked the old help text for the previous 4 more, it was more concise

128

nit: With this punctuation it looks ambiguous to me if the parenthetical refers to /O2 or /Ox.

thakis added inline comments.Sep 20 2018, 11:09 AM
include/clang/Driver/CLCompatOptions.td
120

Also, can we try to keep these at 80 columns?

The other flags in this file put HelpText before Alias and AliasArg (see my diff), but your order is fine. The 80 columns help trying to keep the help text concise, so it encourages better UI in this case :-)

hans marked 7 inline comments as done.Sep 25 2018, 6:07 AM

Thoughts on "As far as I can tell we do not make /O1 and /O2 imply /Gs -- we keep it at the default value of 4096 (?) Fixing this probably means increasing chrome's size (?)."?

The -mstack-probe-size option and /Gs was added later than the /O options (in r226601) so that's why it wasn't hooked up to /O1 and /O2 originally.

But yes, making /O1 and /O2 imply /Gs seems like a bad idea. That would mean emitting __chkstk in every function, increasing both size and slowing down execution. Even MSVC's docs say "/Gs0 activates stack probes for every function call that requires storage for local variables. This can have a negative impact on performance." (and "If the /Gs option is specified without a size argument, it is the same as specifying /Gs0").

Actually, trying this out with MSVC, I don't see any __chkstk calls with /O1, or with eg. /Gs1 for that matter:

a.cc:
int f(int x) { volatile int a[128] = {0}; return a[0]; 

cl /c /O1 a.cc && dumpbin /disasm a.obj

...

?f@@YAHH@Z (int __cdecl f(int)):
  00000000: 55                 push        ebp
  00000001: 8B EC              mov         ebp,esp
  00000003: 81 EC 00 02 00 00  sub         esp,200h
  00000009: 68 00 02 00 00     push        200h
  0000000E: 8D 85 00 FE FF FF  lea         eax,[ebp-200h]
  00000014: 6A 00              push        0
  00000016: 50                 push        eax
  00000017: E8 00 00 00 00     call        _memset
  0000001C: 8B 85 00 FE FF FF  mov         eax,dword ptr [ebp-200h]
  00000022: 83 C4 0C           add         esp,0Ch
  00000025: 8B E5              mov         esp,ebp
  00000027: 5D                 pop         ebp
  00000028: C3                 ret

Maybe MSVC started ignoring /Gs but didn't update the docs?

hans updated this revision to Diff 166862.Sep 25 2018, 6:07 AM

Addressing all comments.

thakis accepted this revision.Sep 25 2018, 6:33 AM

lgtm, thanks!

Maybe file a bug on figuring out the /Gs story and add a FIXME linking to it. Weird.

include/clang/Driver/CLCompatOptions.td
116

very nit: Maybe s/many/several/ or s/many/multiple/

This revision is now accepted and ready to land.Sep 25 2018, 6:33 AM

Actually, trying this out with MSVC, I don't see any __chkstk calls with /O1, or with eg. /Gs1 for that matter:

(.guess the bug should also cover to eventually not alias /Gs to /Gs0 and/or /Gs to -mstack-probe-size either if cl doesn't do that)

hans added a comment.Sep 25 2018, 7:08 AM

Actually, trying this out with MSVC, I don't see any __chkstk calls with /O1, or with eg. /Gs1 for that matter:

(.guess the bug should also cover to eventually not alias /Gs to /Gs0 and/or /Gs to -mstack-probe-size either if cl doesn't do that)

Filed https://bugs.llvm.org/show_bug.cgi?id=39074

include/clang/Driver/CLCompatOptions.td
116

Ah, you noticed that. I switched to many for the 80-col limit. But multiple is better, so let's use that.

This revision was automatically updated to reflect the committed changes.