Adds a top-down order to the ModuleInliner. A top-down inlining order ensures
that all inlining decisions remain possible (depending on what the
InliningAdvisor chooses), e.g., callsites in callees can be independently
Put a period at the end, like "CD.".
Are you duplicating much of InlineOrder because you use NodeCallCount as cache and do not care about updating the priority while the priority queue is consumed? I'm wondering if there is a good way for you to take what you need without duplicating the rest.
Maybe "how many places" instead of "how many times". Otherwise, we might sound like we are talking about profiling counts.
Capitalize the first letter and put a period at the end: "Check which call sites caller has least calls."
Please rename this to LeftCount.
This applies to other variables in the same function.
How do you ensure the top down inlining when you have a run of single edges with NodeCallCount alone? Here, each one of B, C, D, etc is called from exactly one place, but there is a clear order in the call graph.
    A
   / \
  B   C
  |   |
  D   E
  |   |
  F   G
  |   |
  H   I
Put a period at the end, like "functions.".
How does this work? It looks like NodeCallCount keeps changing while you are pushing all call sites at the beginning of the inlining pass (see ModuleInliner.cpp:145). So, a call site inserted at the beginning of the population loop and another call site inserted toward the end of the population loop are based on totally different values of NodeCallCount even though you haven't done any inlining at all. Unless you call std::make_heap at the end of the initial population loop, I am not sure if call sites are in any particular order that makes sense.
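To make the concern concrete, here is a minimal sketch. It assumes that the count for a callee is bumped as each call site is pushed and that the comparator reads the live counts. `CallSite`, `FewestCallsFirst`, `StaleKeyDemo`, and `populate` are hypothetical names for illustration only, not the patch's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call site: (caller name, callee name).
using CallSite = std::pair<std::string, std::string>;
using CountMap = std::map<std::string, int>;

// Returns the recorded count for F, or 0 if F was never recorded as a callee.
int callCount(const CountMap &Counts, const std::string &F) {
  auto It = Counts.find(F);
  return It == Counts.end() ? 0 : It->second;
}

// Puts the call site whose caller has the fewest recorded calls on top.
// It reads the live map, so keys of queued entries can change under the heap.
struct FewestCallsFirst {
  const CountMap *Counts;
  bool operator()(const CallSite &L, const CallSite &R) const {
    return callCount(*Counts, L.first) > callCount(*Counts, R.first);
  }
};

struct StaleKeyDemo {
  CountMap NodeCallCount; // Callee -> number of static call sites seen so far.
  std::vector<CallSite> Heap;

  FewestCallsFirst order() const { return FewestCallsFirst{&NodeCallCount}; }

  void push(const CallSite &CS) {
    Heap.push_back(CS);
    std::push_heap(Heap.begin(), Heap.end(), order());
    // Assumption: the count is updated during population. This silently
    // changes the key of every queued call site whose caller is CS.second.
    ++NodeCallCount[CS.second];
  }

  bool isHeap() const { return std::is_heap(Heap.begin(), Heap.end(), order()); }
  void rebuild() { std::make_heap(Heap.begin(), Heap.end(), order()); }

  // Key (caller's count) of the call site currently on top of the heap.
  int topKey() const {
    assert(!Heap.empty() && "no call sites queued");
    return callCount(NodeCallCount, Heap.front().first);
  }
};

// Pushes B->D, C->E, F->G, and finally A->B, which raises the key of B->D.
StaleKeyDemo populate() {
  StaleKeyDemo D;
  const std::vector<CallSite> Sites = {
      {"B", "D"}, {"C", "E"}, {"F", "G"}, {"A", "B"}};
  for (const CallSite &CS : Sites)
    D.push(CS);
  return D;
}
```

In this sketch, B->D is still on top after population even though its caller B now has one recorded call and A has none. Calling `rebuild()` (that is, `std::make_heap`) after the population loop restores the heap property.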
Put some comment like:
// A map from a function to the number of call sites it is statically called from.
Please add a new line. The message "No newline at end of file" is a bit distracting now and when somebody adds more lines to the file and examines the diff.
You make a good point, I forgot to decrease the call count when we pop the Caller node. This should be addressed now.
I've provided a more in-depth explanation as a response to one of your later comments.
The idea is basically that we only ever consider callers that appear the least number of times in our "NodeCallCount". However, as we pop calls, we also remove them from our "NodeCallCount". So, in the above example:
    A
   / \
  B   C
  |   |
  D   E
  |   |
  F   G
  |   |
  H   I
If we decide not to inline A->B or A->C, once popped we would effectively end up with counts that represent a call graph like:
  B   C
  |   |
  D   E
  |   |
  F   G
  |   |
  H   I
Which would then consider B->D or C->E, and so on...
If the calls do get inlined we would end up with counts for a similar call graph:
  AB  AC
  |   |
  D   E
  |   |
  F   G
  |   |
  H   I
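The not-inlined case described above can be sketched as follows. This is a simplified model under stated assumptions: it uses a linear scan instead of the heap, breaks ties by insertion order, and never inlines. `Edge` and `popOrderWithoutInlining` are hypothetical names, not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call site: (caller name, callee name).
using Edge = std::pair<std::string, std::string>;

// Pops call sites one at a time, always picking one whose caller currently
// has the smallest count, and decrements the callee's count on every pop.
std::vector<Edge> popOrderWithoutInlining(std::vector<Edge> Pending) {
  // A map from a function to the number of pending call sites that call it.
  std::map<std::string, int> NodeCallCount;
  for (const Edge &E : Pending)
    ++NodeCallCount[E.second];

  std::vector<Edge> Order;
  while (!Pending.empty()) {
    // std::min_element returns the first minimum, so ties are broken by
    // position in Pending (an assumption of this sketch).
    auto Best = std::min_element(
        Pending.begin(), Pending.end(), [&](const Edge &L, const Edge &R) {
          return NodeCallCount[L.first] < NodeCallCount[R.first];
        });
    Edge E = *Best;
    Pending.erase(Best);
    assert(NodeCallCount[E.second] > 0 && "callee count out of sync");
    --NodeCallCount[E.second]; // The popped call no longer counts.
    Order.push_back(E);
  }
  return Order;
}
```

Feeding the example graph in bottom-up order (G->I, F->H, E->G, D->F, C->E, B->D, A->C, A->B) still pops A->C, C->E, E->G, G->I, A->B, B->D, D->F, F->H in this model, so each caller is visited before the call sites inside its callee.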
The bulk of the call sites on the heap aren't necessarily in any particular order, only the top of the heap is. Since the heap is only ever accessed by popping, the top is the only value that really needs to be in the right place.
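As a small illustration of that property, the sketch below drains a min-heap of plain integer keys. It assumes the keys do not change while they sit in the heap. `drainMinHeap` is a hypothetical helper, not code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Builds a min-heap over Keys and returns the keys in the order they pop.
std::vector<int> drainMinHeap(std::vector<int> Keys) {
  std::make_heap(Keys.begin(), Keys.end(), std::greater<int>());
  // The vector is heap-ordered here, which does not imply it is sorted.
  assert(std::is_heap(Keys.begin(), Keys.end(), std::greater<int>()));

  std::vector<int> Popped;
  while (!Keys.empty()) {
    // pop_heap moves the current minimum to the back and re-heapifies the rest.
    std::pop_heap(Keys.begin(), Keys.end(), std::greater<int>());
    Popped.push_back(Keys.back());
    Keys.pop_back();
  }
  return Popped;
}
```

Only the top element is guaranteed to be in place at any moment, yet the pop sequence comes out in ascending order as long as the keys stay fixed.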
Do you mean duplication from PriorityInlineOrder?
Most of the duplication is to keep track of InlineHistory and to manage which CallBases are currently being considered (heap). I'm not sure which parts we would be able to remove.
We could replace the Heap with a vector or set and not worry about the order, but I'm not sure if that's the kind of change you were suggesting.
Thanks for the review, @hiraditya. I've come to realize this patch might not fit well with the other inline-order options. Instead, I think it might make more sense to implement the functionality using custom plugins.
I opened a separate PR in this direction https://reviews.llvm.org/D140637. I would appreciate it if you could take a look at that.