Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
817 | I'm a little concerned that this is randomly determined. Could we think about using a function attribute instead that opts functions into groups by name? Or is the intent to have "sampling" in the coverage information instead of full coverage? |
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
817 | The intention is to have something deterministic across builds but yeah sampled-ish across the entire codebase. By iterating over the groups with different builds we are hoping to cover every function eventually. |
The idea seems fine.
By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once.
How large the overhead is? This is somewhat surprising to me.
clang/include/clang/Driver/Options.td | ||
---|---|---|
1343 | Joined. Separate is for a space between the option name and its value. | |
1345 | It is "N groups", not "N functions". | |
clang/lib/CodeGen/CodeGenFunction.cpp | ||
814 | For one group, the branch is skipped, which does not seem correct | |
815 | const ArrayRef<uint8_t> FuncName(..., ...) | |
820 | Omit braces around one-line simple statements. | |
clang/test/CodeGen/xray-function-groups.cpp | ||
4 | You can omit -x c++ -std=c++11 . They are insignificant. Then consider packing more options on one line to reduce the total number of lines. |
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
814 | Should we check or assert XRaySelectedFunctionGroups > 0 && XRaySelectedFunctionGroup < XRaySelectedFunctionGroups or the former at minimum? |
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
814 | Such checks are usually done in Driver. In CodeGen, as long as it doesn't crash, there is no need imposing stricter checks. |
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
814 | If there is only 1 group (the default value), then all functions that would be otherwise instrumented are. So there should be no need to do anything. |
It is the binary size overhead, not the runtime overhead, that we are limited on (when deployed on Android).
clang/lib/Frontend/CompilerInvocation.cpp | ||
---|---|---|
1136 | Errors here are not necessary. The driver has reported errors. |
This looks good from my viewpoint. One thing remains unanswered (https://reviews.llvm.org/D87953#2284071) is how the overhead is so large that you want to have multiple groups.
Deploying N versions of an executable can bring more challenges to the build system and deployment system.
I am also concerned about some internal functions whose names are optional (in ELF symbol table they don't really need a name).
I'd want to know how you weigh the tradeoff and decide to go this route.
This is a larger stuff that I want @dberris to sign off.
clang/lib/CodeGen/CodeGenFunction.cpp | ||
---|---|---|
815 | You can use some variants of makeArrayRef |
Thanks @MaskRay - I tried to answer that question in https://reviews.llvm.org/D87953#2286430. At present we are deploying instrumentation to an arbitrary subset of our application using the instruction threshold. I would like to make the selection of how many and which functions more deterministic, and be able to instrument different subsets over time. The overhead we are concerned with is purely binary size as we are deploying to Android devices. We are using features from my previous XRay patches to omit the function index already, but the sheer number of sleds and size of the associated xray_instr_map are the limiting factor of how much we can instrument in any given app release. For our use case it is fine to gradually over a period of weeks work our way across the entire app group by group.
Thanks for clarification. It is the size overhead, instead of the runtime overhead:) This makes sense. You probably want to enhance the test to check internal linkage (e.g. static functions). You can change bar or yarr instead of introducing new functions.
I probably have asked too much... but it'd be nice if you also add some documentation to llvm/docs/XRay*.rst That helps users :)
There's two tensions I'm trying to resolve in my mind with this patch:
- Is there a way to make this more deterministic, instead of using a random sampling mechanism based on the name?
- This looks like it's a subset of a more general facility to group functions by "module" or some logical grouping mechanism. Will this be superseded by a feature that allows for example to provide filters in the driver to define groupings? If so, wouldn't something like that be more flexible for your needs (group names in the IR)?
I can update the docs @MaskRay, not a problem. I'll tweak the test a little too per your comment.
@dberris I think if we wanted to control it in a non-'random' (the crc32 was chosen so it would be consistent even though it's not really predictable up front) way we'd have to emit some lists of function attributes and use the -fxray-attr-list=FILE option to really dial that in. If we wanted entire modules we could have the build system apply xray's flags to each module differently, but we are unlikely to want to instrument every function of a module at the same time so we are back to having to use instruction thresholds to cut down on it. Ultimately I think those two suggestions are legitimate alternative use cases but don't, for what we're trying to do, replace this strategy.
Okay, I think I understand the motivation a little bit more now (i.e. sampling seems to be a useful thing to do anyway, regardless of whether we have the other facilities).
Joined.
Separate is for a space between the option name and its value.