Committed in r296282 with a couple of changes:
- Made it so that parameter definitions have a 'child' relation with their function/method (instead of 'containedBy'). This allows looking up parameters of a function by looking up their children.
- Given that parameters show up in parent-child hierarchies, I think it is a bit better to make them to a separate 'SymbolKind' instead of a sub-kind. This preserves the property that 'variable' symbol kind does not ever enter a parent-child hierarchy.
This is the first patch I've contributed here, so I'm not familiar with the whole process. I assume this code is ready to land? When exactly does it get merged into master, and is there something else that I still need to do to make that happen?
@davide Thanks for taking a quick look !
Also, I think the test from the PR (after some polishing, maybe) is good to add. If somebody removes these lines at least all (or a subset of) the bots will timeout and we'll notice the regression.
I actually forgot to recommit this one :(
The issue with the original patch was that I folded the call inside the assert so ConstantFoldTerminator() was compiled away in !DEBUG mode.
Oh, well. Your reasoning is correct (actually, was my original reasoning when I committed the patch).
Looks good. I will push shortly.
if BP is not correct, it is better to improve static branch prediction. We explicitly added a threshold for the cost based analysis result to kick in just to be conservative when the branch probability is not biased enough. Even for the long chain case, tail dup is enabled for 50/50 case, but the real profile is 40/60, taildup will hurt performance. I don't see the reason to by pass the branch prob + cost analysis by just looking at the shape.
btw, down the road you may want to have this pass really know in detail the encoded length of each instruction on x86. There are quite a few *single instructions* that would be beneficial from a code size perspective to outline (if the outlined function is set to have alignment of 1). A quick analysis of an LLD binary (which contains all of LLVM linked in for LTO) shows there is over 5% code size savings just from outlining single instructions (since many x86 instructions encode to be larger than a CALL instruction which is 5 bytes). About half of the benefit (so about 2-3% of the total on this test case) comes from instructions that reference the stack via %rsp (mostly zeroing out stack slots), which could still be outlined if the offset was rewritten.
Are there other cases that you've noticed?