Enable IR-level instrumentation -fprofile-generate or -fprofile-generate=
When applying profile data: -fprofile-use=/path/to/profdata
Some basic comments.
|483 ↗||(On Diff #62138)|
This help text is stale.
|486 ↗||(On Diff #62138)|
You should probably update this help text to say something about applying.
|3474 ↗||(On Diff #62138)|
Call this PGOTrainArg for consistency.
|3496 ↗||(On Diff #62138)|
Call this PGOApplyArg for consistency.
Thanks for doing this. I like the patch in general. Hers are my minor feedbacks:
(1) It seems the original options of -fprofile-instr-generate and -fprofile-instr-use will be kept. Is so, what is the guideline to use these two sets of options. Also, we need some documentations for the new user-visible
(2) One difference b/w -fprofile-instr-generate/-fprofile-instr-use and the new options is -fpgo-train= can coexist with -fpgo-apply=<file>. I don't think they work as intended in current implementation. I would suggest to not allow it.
(3) I would request a driver option to specify the profile name. While LLVM_PROFILE_FILE environment variable can do that, a command line option is gonna be useful (at least in our setup).
Sorry I'm not sure I understand what you're asking, but it's my understanding that we would simply detail the lack of interoperability between the existing set of flags and the ones added by this patch.
Also, we need some documentations for the new user-visible
I'm happy to add some info to the Clang UserManual in this patch if everyone is happy with the flag names.
(2) One difference b/w -fprofile-instr-generate/-fprofile-instr-use and the new options is -fpgo-train=can coexist with -fpgo-apply=<file>. I don't think they work as intended in current implementation. I would suggest to not allow it.
Sure, I've updated the patch to not allow this.
|507 ↗||(On Diff #62527)|
Why is it called 'default' output? The default output is the one used why user does not specify any name.
Why not just -fpgo-train-output=... ?
|3560 ↗||(On Diff #62527)|
I think it is better to make the selector 'source' vs 'cfg'.
|3560 ↗||(On Diff #62527)|
So would -fpgo-train=cfg enable icall instrumentation or not?
My thinking is that down the road we will have one flag for each independent instrumentation capability (and of course some are mutually incompatible). This mirrors what the sanitizers do. For example, we would have:
We can then have -fpgo-train=optimizer enable e.g. -fpgo-train=optimizer-cfg,optimizer-icall.
Since instrumentation can have different overheads or different runtime requirements, users may want to disable some instrumentation.
I should have brought it up earlier, but I forgot. I think a better (and simpler) proposal is to make -fprofile-generate and -fprofile-use turn on IR based PGO.
-fprofile-generate and -fprofile-use options were introduced by Diego (cc'ed) recently for GCC option compatibility. At that time, IR based PGO was not yet ready, so they were made aliases to FE instrumentation options -fprofile-instr-generate and -fprofile-instr-use. Now I think it is time to make it right. The documentation only says that these two options are gcc compatible options to control profile generation and use, but does not specify the underlying implementation. It is also likely that Google is the only user of this option. If a user using this option really want FE instrumentation, there is always -fprofile-instr-generate to use. It also more likely that IR instrumentation is what user wants when using GCC compatible options (as they behave more similarly).
|3493 ↗||(On Diff #63367)|
Is this check needed?
|3513 ↗||(On Diff #63367)|
Merge this with ProfileUseArg.
|3531–3532 ↗||(On Diff #63367)|
|3535 ↗||(On Diff #63367)|
|3539 ↗||(On Diff #63367)|
|3543 ↗||(On Diff #63367)|
if ((ProfileGenerateArg || ProfileTrainArg) && ProfileUseArg)
|3568 ↗||(On Diff #63367)|
It may cause some user confusion since there is no obvious distinction between "profile generate" and "profile instr generate" from a user perspective. But we can avoid that with improved documentation.
My main concern is that the existing documentation already says that -fprofile-generate behaves identically to -fprofile-instr-generate http://clang.llvm.org/docs/UsersManual.html#cmdoption-fprofile-generate
However, I think it is reasonable to change this behavior (and of course the documentation), as users of this option are likely using it for compatibility with GCC and so they likely don't care about the specifics of clang FEPGO.
We probably want to at least leave a small note in the documentation regarding this change of behavior.
Please also update user manual: docs/UserManual.rst
|3596 ↗||(On Diff #63634)|
This should probably split out as a follow up.
|90 ↗||(On Diff #63634)|
Probably change this test into the check of disallowed option combination : -fprofile-instr-generate -fprofile-generate
|10 ↗||(On Diff #63634)|
Please add a FIXME here. I noticed that there is a bug in IR PGO implementation that -fprofile-generate still invokes overrider api in instrumentation which is unnecessary.