Initial version of indirect call optimization. The pass transforms profitable indirect call/invoke instructions into direct call/invoke instructions guarded by icmp's. The current heurustics approximate ones in "Eliminating virtual function calls in C++ programs" by Aigner and Holzle (ECOOP'96). In addition, we allow up to two hottest targets to be promoted. This patch will use indirect call metadata generated by an incoming clang patch.
Reduce the if nesting level with
if (MD->getNumOperands() < 4)
Use TotalICCount in hotness selection does not make sense -- it is not a program level metric but varies module by module. For non-LTO case, this end up boosting indirect calls in modules with relatively small total ic count.
To save memory usage, very soon the in-memory value profile data for indirect calls will no longer use C strings -- that means the meta-data for indirect call will also need to use md5 of the target names. This means a md5 to function map will need to be created at the start of the module pass.
Use a wrapper function to get Function from MD + OperandIndex. As mentioned above, do not assume the target is in string form.
The target string is the profile key of the target function. For static functions, they are different from their mangled names -- so this code does not work for static functions which are indirectly called. This is the reason why this pass needs to establish a key to function mapping first.
Well, not necessarily, right? I can see BFI and/or BPI adding the new indirect_call_targets MD based on static heuristics.
This is not updating IC. Why not just pass it as 'Instruction *IC'?
The code just removed IC. This looks wrong, or at the very least confusing. The code in this function seems to do a whole sequence of transformations. Maybe it would be easier to follow if it was factored out a bit?
Perhaps, factor out the block builders for the diamond selecting whether to call the direct function or the indirect one.
Hi David, Diego, thanks for the comments. Sorry for the delayed response.
How do you suggest this to be rephrased?
The function updates IC.
The function updates IC and its metadata by removing the peeled target data out of it.
We assume that this pass is invoked at whole-program LTO. For non-LTO, TotalCCount does not make sense.
Could you point an example of a md5 to function map and its use somewhere else in LLVM?
The interface exists:
Note that the returned function name already strips off the prefix, so you can use it directly to find the 'Function' object.
Hi Ivan, Betul,
Are there plans to continue this work? I started playing with this patch but looks like that the metadata was changed during the review on the clang side so this patch needs an update.
I can probably figure this out but was curious if you guys had something ready that would already do the trick.