This is an archive of the discontinued LLVM Phabricator instance.

[CMake] Add support for generating profdata for clang from training files
ClosedPublic

Authored by beanz on Dec 11 2015, 1:02 PM.

Details

Summary

This patch adds support for using LIT to drive generating PGO profile data for clang.

This first pass implementation should work on Linux and Unix based platforms. If you build clang using CMake with LLVM_BUILD_INSTRUMENTED=On the CMake build generates a generate-profdata target that will use the just-built clang to build any test files (see hello_world.cpp as an example). Each test compile will generate profraw files for each clang process. After all tests have run CMake will merge the profraw files using llvm-profdata.

Future opportunities for extension:

  • Support for Build->Profile->Build bootstrapping
  • Support for linker order file generation using a similar mechanism and the same training data
  • Support for Windows

Diff Detail

Repository
rL LLVM

Event Timeline

beanz updated this revision to Diff 42561.Dec 11 2015, 1:02 PM
beanz retitled this revision from to [CMake] Add support for generating profdata for clang from training files.
beanz updated this object.
beanz added a subscriber: cfe-commits.
silvas edited edge metadata.Dec 11 2015, 6:17 PM

Can you elaborate on why this patch is restricted to unix?

Also, this probably requires some documentation in
http://llvm.org/docs/CMake.html
(we don't really have an analogous page just for clang currently, so the llvm one is probably the best place right now)

utils/perf-training/CMakeLists.txt
21 ↗(On Diff #42561)

Can you write a tiny pure-Python helper script for clear-profraw and another for generate-profdata? That will be more portable.

If this is the only thing blocking windows support, I think that shooting for windows support on the initial patch is worth a shot.

vsk edited edge metadata.Dec 11 2015, 6:52 PM

Thanks! Interesting approach. I think @cmatthews would appreciate a generate-profdata-from-lnt target if you're up for it :).

Comments inline --

utils/perf-training/CMakeLists.txt
2 ↗(On Diff #42561)

Afaict, lines 2-8 look like they're provided by configure_lit_site_cfg (AddLLVM.cmake).

8 ↗(On Diff #42561)

Is it possible for this line to interact poorly with the one in clang/test/CMakeLists.txt? E.g something weird like if CMAKE_CFG_INTDIR is a substring of LLVM_BUILD_MODE?

17 ↗(On Diff #42561)

Can we build llvm-profdata too?

25 ↗(On Diff #42561)

We should use a llvm-profdata built out of the current source. Does the llvm_find_program function help with this?

33 ↗(On Diff #42561)

Sean's comment about relying on find applies here too.

utils/perf-training/lit.cfg
7 ↗(On Diff #42561)

This duplicates code in tools/clang/test/lit.cfg. Can we lift it into a shared module?

utils/perf-training/lit.site.cfg.in
3 ↗(On Diff #42561)

I don't know too much about what's going on here. It seems weird to check in auto generated files.. why do we need to do that?

beanz added a comment.Dec 14 2015, 9:14 AM

Sean,

The reason for restricting to Unix is two fold. (1) the shell script goop, which I can replace with python and (2) I don't have a windows box to test on, so I didn't want people to think it worked.

If I replace the shell goop with Python this will probably "Just Work" everywhere, so I can do that.

I also agree about your documentation point. That will need to be a separate patch.

More comments inline.

utils/perf-training/CMakeLists.txt
2 ↗(On Diff #42561)

configure_lit_site_cfg doesn't set CLANG_TOOLS_DIR, which needs to be set and uses this code. We should probably clean this all up so we don't need this duplication, but that is a separate problem.

8 ↗(On Diff #42561)

It won't interact with the other directory because CMake does level-based scoping. The tests are processed before this code, and the directory pops back out to the root before traversing this.

17 ↗(On Diff #42561)

You actually don't want that. You want your llvm-profdata to match the profile runtime you're building with. Since this builds with the profile runtime coming with the compiler you're using (which is usually the system compiler) you'll want a system profdata tool.

I do plan to support bootstrap PGO generation where if you do a 3-stage build. In that world stage1 will be your toolchain to use to build the instrumented compiler and stage1 would provide llvm-profdata.

21 ↗(On Diff #42561)

I'll give it a go today.

25 ↗(On Diff #42561)

No, we really don't want to do that. See my comment above (Note: I had implemented that and it doesn't work).

utils/perf-training/lit.cfg
7 ↗(On Diff #42561)

This code probably isn't needed. Since it should always use just-built clang, I can probably just use lit.util.which directly instead of this wrapper. I'll make that change.

utils/perf-training/lit.site.cfg.in
3 ↗(On Diff #42561)

This file is actually the input to the auto-generation. It needs to be checked in. That comment is all over the lit.*.in files so that people know the outputted files are auto-generated.

This probably ins't super important anymore because we don't support in-source builds, but back in the day it was because you'd end up with all these auto-generated lit.site.cfg files all over you source directory.

beanz updated this revision to Diff 42761.Dec 14 2015, 1:22 PM
beanz edited edge metadata.

Updates based on feedback from Sean and Vedant.

  • Should work on Windows now
  • Cleaned up lit config code

Sean,

The reason for restricting to Unix is two fold. (1) the shell script goop, which I can replace with python and (2) I don't have a windows box to test on, so I didn't want people to think it worked.

If I replace the shell goop with Python this will probably "Just Work" everywhere, so I can do that.

The lack of a windows box is totally understandable. Should Work is fine. I'll probably give this a shot on windows some time and we'll catch anything then.

utils/perf-training/perf-helper.py
14 ↗(On Diff #42761)

typo: Proofraw

28 ↗(On Diff #42761)

The explicit return isn't needed if you aren't returning anything (but I think you want to return 0).

33 ↗(On Diff #42761)

You seem to be using the return value of this function (and clean) as though they returned a value, but they don't seem to. Did you mean return 1 or whatever for these return's?

36 ↗(On Diff #42761)

I think you can use subprocess.call or subprocess.check_call here (they are just sugar API around Popen, but they cover the common cases like this case).

40 ↗(On Diff #42761)

I don't think you need this variable. Everywhere you set it you could just do sys.exit(<the thing you were setting return_code to>)

44 ↗(On Diff #42761)

Cute eval, but probably better to just use an explicit dictionary. Personally, I don't think we need the error handling (this script is never invoked by hand in regular duty, I don't think).

Something like this:

def main():
    COMMANDS = {'merge': merge, 'clean': clean}
    f = COMMANDS[sys.argv[1]]
    sys.exit(f(sys.argv[2:]))

That is one of the beauties (IMO) of python. The native safety of the language and "sufficiently readable" behavior when an error occurs (e.g. sys.argv[1] doesn't exist, invalid command is specified, etc.; you will get a stack trace showing the line where the error occurred and a readable error message) means that you can actually go quite far for scripts that are not meant to be explicitly and frequently "human-invoked" as their primary purpose.

I don't think it's worth going overboard for a quick script like this.

beanz updated this revision to Diff 42891.Dec 15 2015, 12:30 PM

Cleaning up the python helper as per Sean's suggestions.

Thanks. This LGTM but I'd wait for Duncan or Justin to sign off on it; they're likely to have more higher-level thoughts.

utils/perf-training/perf-helper.py
36 ↗(On Diff #42891)

I don't think you need the PIPE arguments (or maybe look into check_output if you want the output)

This revision was automatically updated to reflect the committed changes.