Instructions that contribute only to the computation of invariants are considered to be 'ephemeral'; the inliner needs to understand that these are free.
Diff Detail
Event Timeline
Rebased (and completely rewritten, except for the test case). In the original attempt I'd made ephemeral values an analysis pass. However, I've reconsidered because a) it would be really hard to preserve (we'd need to audit essentially all callers of RAUW, etc.) and is only used by a few passes and b) since it would often be invalidated it would interrupt the loop-level pass manager is undesirable ways (and it would need to be a module/cgscc-level pass for use by the inliner). So here I've just written a set of functions in CodeMetrics that return the set of ephemeral values which can be used by CodeMetrics directly, or the caller in some other way, as appropriate.
include/llvm/Analysis/CodeMetrics.h | ||
---|---|---|
97 | Use a SmallPtrSetImpl<const Value*>? | |
lib/Analysis/CodeMetrics.cpp | ||
56 | SmallPtrSet is significantly more efficient than DenseSet, please use it everywhere. | |
71 | nit: invert and use 'continue' to reduce indentation | |
76–77 | I think we have an ->operands() that will let you write this as a range loop. Even if not, auto would help. | |
78 | Why being "ephemeral" is equivalent to being safe to speculate isn't immediately obvious to me. might be nice to add a comment at least ... | |
lib/Analysis/IPA/InlineCost.cpp | ||
1110 | Oof, so, we can't do it this way in the inline cost analysis for terrible reasons. The reason why inline cost analysis ends up duplicating so much code from code metrics is because an essential property of inline cost analysis is that the amount of work done should be bounded by the constant threshold, not by the size of the function (or basic block). This is important because we may end up doing a linear number of calls to compute the inline cost where the linear factor would go quadratic if we do a linear amount of work here. =/ This is particularly problematic for ephemeral values because their very nature is defined backwards. I'm happy with this as an initial implementation but it should have a "FIXME" to revisit this before we start emitting them really heavily. |
lib/Analysis/CodeMetrics.cpp | ||
---|---|---|
78 | The essential property of an "ephemeral" instruction is that it will go away (no actual code will be generate corresponding to the operation it represents). If something is not safe to speculate, then it will not really go away because it might have side effects. Does that make sense? | |
lib/Analysis/IPA/InlineCost.cpp | ||
1110 | Alright; we'll need to think of a better way. What if we did this only once per function, and cached the result. When a function is inlined, the inliner updates a ephemeral value map so the list of instructions in the caller can be updated. Now it is possible for inlining to introduce new ephemeral values in the caller, but only from things that used to be function arguments so we could then do a more-limited propagation. The problem is that the scan is already linear, so I'd be ruining your complexity guarantee as soon as the functionality is committed regardless of the number of invariants. |
Used SmallPtrSet instead of DenseSet (and some control flow cleanup as per Chandler's review).
I don't spot any obvious bugs, but I don't feel comfortable signing off on this code either. You need to find a reviewer who is familiar with this code.
Also, your tests need improvement. Some points mentioned inline. You're testing for the impact on assume, but not for the impact on the other intrinsics.
lib/Analysis/CodeMetrics.cpp | ||
---|---|---|
48 | Most of this change is driven by the assume intrinsic changes, but this actually changes behaviour for existing code as well. It might be good to separate this infrastructure into two distinct changes and get the addition of the other intrinsics more widely reviewed. (i.e. despite the title and description, this *isn't* specific to assume/invariant) I'm mostly suggesting this for risk management and blame purposes. | |
57 | A useful optimization might be to seed the EphValues set with every single use value in WorkSet. This would help cases like: With your current implementation, if you consider 'a' first, you'll discard it because 'c' is not known ephemeral. You'll eventually revisit it (through 'c''s operands), but this is wasted work. You could even go a step further and make the worklist include only items just found to be ephemeral. You'd essentially be growing the frontier of ephemeral values back through the value graph. This might be a more obvious algorithm. Worth noting is that neither algorithm handles phi cycles. Both are conservative, not optimistic. (In the classic data flow sense.) I think that's fine for now, but worth noting explicitly. | |
lib/Analysis/IPA/InlineCost.cpp | ||
903 | It looks like the previous check is now redundant with this one? It might be worth leaving it in place for clarity, but if so, an explicit comment might be warranted. | |
test/Transforms/Inline/ephemeral.ll | ||
24 | This test appears really fragile w.r.t. our current inlining heuristics. Is there a way you could restructure this to avoid that? If not, add some comments explaining how to adjust the test if the inlining heuristic change. You also need to add some form of negative example to show that the ephemeral logic actually contributes to the inlining. | |
test/Transforms/LoopUnroll/ephemeral.ll | ||
8 | Same comments as above w.r.t. test structure. Also, a comment explaining the test approach would be helpful. It looks like you're relying on the return to be constant folded after unrolling? |
Thanks again for the review!
lib/Analysis/CodeMetrics.cpp | ||
---|---|---|
48 | You're right, but the blame will be clear regardless. This needs to be committed before we actually start *generating* @llvm.assume in Clang for anything (and before anyone else should generate them either). As it turns out, most of the people interested in this feature are the people who best understand the impact on costing for the other intrinsics too. Thus, I'm not concerned. | |
57 | Ack; yep. I actually had it that way originally, but made it so that only one place in the code inserted things into EphValues while I was debugging something, and then forgot to change it back. You're correct, that should be noted. | |
test/Transforms/Inline/ephemeral.ll | ||
24 | I could hard code the current threshold into the RUN line (that having been said, I adapted this from a similar test that will also need to be changed should the heuristic be altered). I don't understand your second comment. If you don't account for the ephemeral values, then the extra instructions in this example prevent inlining. If you do account for them, then the inlining happens. | |
test/Transforms/LoopUnroll/ephemeral.ll | ||
8 | Hey, the threshold is even hard-coded in this one. ;) No, I'm adding enough extra instructions such that unrolling will be prevented unless you account for the fact that those extra instructions are ephemeral (and, thus, free). But you're right about one thing... some comments are needed. :-) |
lib/Analysis/IPA/InlineCost.cpp | ||
---|---|---|
903 | It is redundant, only if the EphValues set is fully populated. I don't want the rest of the infrastructure to depend on that (at least for now). I'll add a comment. |
lib/Analysis/CodeMetrics.cpp | ||
---|---|---|
48 | That said, the next version of this patch will handle only @llvm.assume (because they'll be an easy way to find them, thus decreasing the cost). |
Addressed some review comments (others are no longer relevant because we only look at @llvm.assume via the AssumptionTracker). Thanks!
Meta comment on patch structure: I think you'd do well to separate most of the per-pass changes into separate changes. Get the infrastructure in, with one key example. Then, fix the passes which want to use assumptions multiple times, then trace through the pass order fixing preservation and use at each pass as encountered.
LGTM. I'd still prefer you have each of the specific passes reviewed by someone knowledgeable of those sections.
lib/Analysis/CodeMetrics.cpp | ||
---|---|---|
65 | This code still has issues where you'll miss eph values based on visit order. This isn't a correctness problem, just possible missed optimizations. | |
74 | Arguably, we should stop looking at eph values outside the loop during the search, but that's minor. They're likely to have other uses as well. | |
test/Transforms/Inline/ephemeral.ll | ||
24 | For the second comment, you need a test case that shows the inliner not inling unless the values are ephemeral. Your current tests could be made to pass by unconditional inlining. Arguably, this is an inliner test, not an ephemeral value test and might already exists. |
Use a SmallPtrSetImpl<const Value*>?