This is an archive of the discontinued LLVM Phabricator instance.

[PGO, clang] Comment how function pointers for indirect calls are mapped to function names
ClosedPublic

Authored by anemet on Mar 26 2016, 2:05 AM.

Diff Detail

Event Timeline

anemet updated this revision to Diff 51713.Mar 26 2016, 2:05 AM
anemet retitled this revision from to [PGO] Comment how function pointers for indirect calls are mapped to function names.
anemet updated this object.
anemet added reviewers: bogner, davidxl.
anemet added a subscriber: cfe-commits.
anemet retitled this revision from [PGO] Comment how function pointers for indirect calls are mapped to function names to [PGO, clang] Comment how function pointers for indirect calls are mapped to function names.Mar 26 2016, 12:08 PM
anemet updated this object.
davidxl added inline comments.Mar 26 2016, 1:17 PM
lib/CodeGen/CodeGenPGO.cpp
758

CodeGenPGO::valueProfile is a common API which can also be used for other kinds of value profile, so the comments belong to the callsite of this method in CGCall.cpp.

Suggested wording:

For indirect function call value profiling, the addresses of the target functions are profiled by the instrumented code. The target addresses are written in the raw profile data and converted to target function name's MD5 hash by the profile reader during deserialization.

anemet added inline comments.Mar 28 2016, 10:00 AM
lib/CodeGen/CodeGenPGO.cpp
758

Hi David,

Thanks for taking a look.

I don't mind rewording the comment if you have some specific issues but your version drops many of the details that was painful for me to discover. I carefully worded my comment to describe some of the design details for indirect profiling that are not covered elsewhere:

  1. the remapping from function pointer to hashes of function names happens during profdata merging

    It was surprisingly hard to figure out what the PGO file contained for indirect call targets: function pointers or func name hashes? And of course as stated, the answer is -- it depends.
  1. the remapping happens pretty deep in the infrastructure code during deserializing of the rawdata

    I was looking at several more logical places to find where this happens and this was a bit unexpected location for this functionality.
  1. how do we know the function addresses necessary for the mapping from function address -> func name -> hash

    The entire code to populate the FunctionAddr using macro magic in InstrProfiling::getOrCreateRegionCounters is pretty hard to follow. I much rather have an overview of the logic somewhere centrally.

From the above list I feel that your version dropped 1 and 3 and only kept 2. Of course, feel free to add more context to my comments and fix inaccuracies/oversimplifications but I would want want to drop any of the points mentioned above.

Thanks.

davidxl added inline comments.Mar 28 2016, 10:11 AM
lib/CodeGen/CodeGenPGO.cpp
758

Actually bullet 1 is not dropped from my suggested version: each time when a raw profile data is read, the reader interface will covert the address into MD5 -- profile merging is simply a user of the reader.

About bullet 3, it is ok to add it if you think it is useful. ( I did not suggest it because ProfData is needed here is not for the purpose of conversion, but to pass the context for the runtime to find/set the counter arrays.)

anemet added inline comments.Mar 28 2016, 10:40 AM
lib/CodeGen/CodeGenPGO.cpp
758

Actually bullet 1 is not dropped from my suggested version: each time when a raw profile data is read, the reader interface will covert the address into MD5 -- profile merging is simply a user of the reader.

Sure but that is still pretty implicit. I'd like to describe this in terms of the well established phases of PGO: instrumentation, profile merge, recompile with profile data.

How about:

For indirect function call value profiling, the addresses of the target functions are profiled by the instrumented code. The target addresses are written in the raw profile data and converted to target function name's MD5 hash by the profile reader during deserialization.  Typically, this happens when the the raw profile data is read during profile merging.

About bullet 3, it is ok to add it if you think it is useful. ( I did not suggest it because ProfData is needed here is not for the purpose of conversion, but to pass the context for the runtime to find/set the counter arrays.)

I am not completely sure I understand what you're saying here but if you mean that the comment does not really apply to the surrounding code then that I guess is expected. I wanted to give a high-level overview of *what* we collect at run-time, and *how* we map that into function names hashes that are then used in the IR.

I can also put this in the function comment if you think that's better.

anemet added inline comments.Mar 28 2016, 12:31 PM
lib/CodeGen/CodeGenPGO.cpp
758

David,

CodeGenPGO::valueProfile is a common API which can also be used for other kinds of value profile, so the comments belong to the callsite of this method in CGCall.cpp.

How would you actually feel about moving this comment to InstrProfData.inc as well, just before the definition of IPVK_IndirectCallTarget?

I think that's a better place in terms of its centrality since this applies to both early and late instrumentation. What do you think?

Adam

davidxl edited edge metadata.Mar 28 2016, 12:37 PM
davidxl added a subscriber: davidxl.

What I meant is that putting the comments in InstrProfData.inc file, and
update the one in CodeGenPGO.cpp to reference that.

David

What I meant is that putting the comments in InstrProfData.inc file, and
update the one in CodeGenPGO.cpp to reference that.

Sounds good. You mean CGCall.cpp instead of CodeGenPGO.cpp though, correct?

So to summarize, I'll move this comment to InstrProfData.inc and just have a reference to it in CGCall.cpp before the call to valueProfile.

OK?

Sure.

thanks,

David

anemet accepted this revision.Mar 28 2016, 3:11 PM
anemet added a reviewer: anemet.

Was accepted by davidxl.

This revision is now accepted and ready to land.Mar 28 2016, 3:11 PM
This revision was automatically updated to reflect the committed changes.