Page MenuHomePhabricator

[CSSPGO] Enable loading MD5 CS profile.
ClosedPublic

Authored by hoy on Aug 18 2021, 4:37 PM.

Details

Summary

Adding the compiler support of MD5 CS profile based on pervious context split work D107299. A MD5 CS profile is about 40% smaller than the string-based extbinary profile. As a result, the compilation is 15% faster.

There are a few conversion from real names to md5 names that have been made on the sample loader and context tracker side to get it work.

Diff Detail

Event Timeline

hoy created this revision.Aug 18 2021, 4:37 PM
hoy requested review of this revision.Aug 18 2021, 4:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2021, 4:37 PM
hoy updated this revision to Diff 367360.Aug 18 2021, 4:45 PM

Updating D108342: [CSSPGO] Enable md5-based CS profile.

hoy retitled this revision from [CSSPGO] Enable md5-based CS profile. to [CSSPGO] Enable loading MD5 CS profile..Aug 18 2021, 4:53 PM
hoy edited the summary of this revision. (Show Details)
hoy added reviewers: wenlei, wlei, wmi.

The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.

Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?

llvm/lib/ProfileData/SampleProfReader.cpp
762

As we discussed, use SampleProfileReader::ProfileIsCS instead.

766

OrderedNames -> OrderedContexts

994

Can we move FunctionSamples::UseMD5 = useMD5() to be before readImpl() inside read() and then avoid this extra UseMD5 setting?

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
487

Remove copies:

MD5Names.emplace_back();
getRepInFormat(Location.second, FunctionSamples::UseMD5, MD5Names.back());
llvm/lib/Transforms/IPO/SampleProfile.cpp
987

Good catch here and below.

llvm/tools/llvm-profgen/CSPreInliner.cpp
42–43

I think this would break preinliner when building profiled call graph - ContextTracker.getFuncNameFor would fail.

Add a test case for md5 pre-inliner? perhaps just add md5 version for an existing test case.

hoy marked 3 inline comments as done.Mon, Aug 30, 1:29 PM

The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.

Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?

Yes, that's the way we were using previously. There is an optimization we can do before it comes to the writer for pseudo probe. Pseudo probe comes with GUID so we can use that inside in the profile generator and unwinder. I think @wlei had a change for that which we can land separately.

llvm/lib/ProfileData/SampleProfReader.cpp
994

Actually useMD5 relies on readImpl() to be set. It is until the reader actually reads the SecNameTable we know we are reading a md5 profile or not. The reader checks the SecFlagMD5Name flag to see if SecNameTable is filled with real names or md5 names.

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
487

Sounds good.

llvm/tools/llvm-profgen/CSPreInliner.cpp
42–43

md5 is not enabled with llvm-profgen for now. Will a test when it is enabled.

It also depends on how we achieve md5 profile generation. If we rely on the writer to convert real names to md5, then CSPreliner should see real names here. Otherwise, it can see md5 codes and we need to build a guid map for it.

hoy updated this revision to Diff 369540.Mon, Aug 30, 1:30 PM

Addressing Wenlei's comment.

wlei added a comment.Mon, Aug 30, 1:39 PM

The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.

Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?

Yes, that's the way we were using previously. There is an optimization we can do before it comes to the writer for pseudo probe. Pseudo probe comes with GUID so we can use that inside in the profile generator and unwinder. I think @wlei had a change for that which we can land separately.

Yes, it's in https://reviews.llvm.org/D107779 (TODO: change the writer to convert MD5 back to std::string), I can continue with it when we check in this.

wenlei added inline comments.Mon, Aug 30, 1:44 PM
llvm/lib/ProfileData/SampleProfReader.cpp
994

Ok.. Taking another look, where in reader do we depend on FunctionSamples::UseMD5 though? With the current definition (pasted below) for useMD5(), we could check useMD5() within reader, then we avoid this special case here?

Similar to ProfileIsCS, in general it's more consistent to use reader flag while we're in reader.

bool useMD5() override { return MD5StringBuf.get(); }
llvm/tools/llvm-profgen/CSPreInliner.cpp
42–43

Sounds good, add a TODO here then?

The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.

Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?

Yes, that's the way we were using previously. There is an optimization we can do before it comes to the writer for pseudo probe. Pseudo probe comes with GUID so we can use that inside in the profile generator and unwinder. I think @wlei had a change for that which we can land separately.

Okay, but is it not working at at all without Lei's optimization, or is it too slow? Asking because I'm wondering why you had to use llvm-profdata to generate md5 profile, instead of just let writer take care of md5 from llvm-profgen directly.

hoy added a comment.Mon, Aug 30, 2:07 PM

The work on llvm-profgen side is not included. Currently I'm relying on llvm-profdata to get a md5 cs profile.

Is there blocker for this to be done in llvm-profgen? I imagine if llvm-profdata can generate md5 cs profile, llvm-profgen should be no different by reusing the writer to handle this?

Yes, that's the way we were using previously. There is an optimization we can do before it comes to the writer for pseudo probe. Pseudo probe comes with GUID so we can use that inside in the profile generator and unwinder. I think @wlei had a change for that which we can land separately.

Okay, but is it not working at at all without Lei's optimization, or is it too slow? Asking because I'm wondering why you had to use llvm-profdata to generate md5 profile, instead of just let writer take care of md5 from llvm-profgen directly.

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

llvm/lib/ProfileData/SampleProfReader.cpp
994

Good question. The dependency is from using FunctionSamples::getGUID when loading func profiles. That method checks if FunctionSamples::UseMD5 is set.

// Assume the input \p Name is a name coming from FunctionSamples itself.
// If UseMD5 is true, the name is already a GUID and we
// don't want to return the GUID of GUID.
static uint64_t getGUID(StringRef Name) {
  return UseMD5 ? std::stoull(Name.data()) : Function::getGUID(Name);
}

We can just use std::stoull(Name.data()) instead to avoid this change.

hoy updated this revision to Diff 369550.Mon, Aug 30, 2:18 PM

Updating D108342: [CSSPGO] Enable loading MD5 CS profile.

wenlei accepted this revision.Mon, Aug 30, 7:03 PM

lgtm, thanks!

This revision is now accepted and ready to land.Mon, Aug 30, 7:03 PM

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

hoy added a comment.Mon, Aug 30, 9:53 PM

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

I see. It should be straightforward to add md5 support to llvm-profgen. I just added here.

hoy updated this revision to Diff 369621.Mon, Aug 30, 9:54 PM

Adding md5 support to llvm-profgen.

hoy updated this revision to Diff 369622.Mon, Aug 30, 9:55 PM

Updating D108342: [CSSPGO] Enable loading MD5 CS profile.

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

I see. It should be straightforward to add md5 support to llvm-profgen. I just added here.

Ok, thanks. Yeah this is what I'm trying to understand - why not adding it here since it should be one-liner. :)

Can you add md5 test case for llvm-profgen too? (We also need to address preinliner with md5 asap, ok for a later patch)

llvm/tools/llvm-profgen/ProfileGenerator.cpp
32

Similar to llvm-profdata, call out in description that this is only meaningful for -extbinary.

hoy updated this revision to Diff 369623.Mon, Aug 30, 10:09 PM

Updating D108342: [CSSPGO] Enable loading MD5 CS profile.

hoy added a comment.EditedMon, Aug 30, 10:10 PM

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

I see. It should be straightforward to add md5 support to llvm-profgen. I just added here.

Ok, thanks. Yeah this is what I'm trying to understand - why not adding it here since it should be one-liner. :)

Can you add md5 test case for llvm-profgen too? (We also need to address preinliner with md5 asap, ok for a later patch)

A test is already added to noinline-cs-noprobe.test. Do you mean in another form? Since md5 profile is not text, I'm not sure what's a good way to check its content.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
32

Done.

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

I see. It should be straightforward to add md5 support to llvm-profgen. I just added here.

Ok, thanks. Yeah this is what I'm trying to understand - why not adding it here since it should be one-liner. :)

Can you add md5 test case for llvm-profgen too? (We also need to address preinliner with md5 asap, ok for a later patch)

A test is already added to noinline-cs-noprobe.test. Do you mean in another form? Since md5 profile is not text, I'm not sure what's a good way to check its content.

Ok, I didn't notice that one. But looking at it, I think we could check the profile summary (detailed summary) is the same as non-md5 profile?

hoy added a comment.Tue, Aug 31, 8:32 AM

I had to use llvm-profdata simply because of the support of md5 in llvm-profgen was not ready. It shouldn't be too slow now for llvm-profgen to use real names with the context split work. We could give Lei's approach a shot to see how much win can be get.

What I was wondering is what does it take for md5 support to be ready for llvm-profgen? I thought it's close to a no-op as llvm-profgen use the same writer, and the handling md5 is in the writer. Of course we can use Lei's change to make it faster, but that's just optimization.

I see. It should be straightforward to add md5 support to llvm-profgen. I just added here.

Ok, thanks. Yeah this is what I'm trying to understand - why not adding it here since it should be one-liner. :)

Can you add md5 test case for llvm-profgen too? (We also need to address preinliner with md5 asap, ok for a later patch)

A test is already added to noinline-cs-noprobe.test. Do you mean in another form? Since md5 profile is not text, I'm not sure what's a good way to check its content.

Ok, I didn't notice that one. But looking at it, I think we could check the profile summary (detailed summary) is the same as non-md5 profile?

That should work.

hoy updated this revision to Diff 369710.Tue, Aug 31, 8:33 AM

Updating D108342: [CSSPGO] Enable loading MD5 CS profile.

wmi added inline comments.Tue, Aug 31, 8:58 AM
llvm/lib/ProfileData/SampleProfReader.cpp
762

Since Function in non context profile can also be represented using SampleContext, can we merge the case for cs and non-cs profile cases here?

785–786

It can be collapsed into one if.
if (useMD() && !FuncGuidsToUse.count(std::stoull(FuncName.data())) ||

   (!FuncsToUse.count(FuncName) &&
            (!Remapper || !Remapper->exist(FuncName))))
continue;
llvm/lib/Transforms/IPO/SampleProfile.cpp
987

+1 for the good catch.

hoy added inline comments.Tue, Aug 31, 9:21 AM
llvm/lib/ProfileData/SampleProfReader.cpp
762

The processing of cs profile is a bit different from non-cs in that cs profile needs to be loaded in some order that profiles should be loaded for not only functions in the current module, but also for the callee functions out of current module. Note that OrderedContexts used only in the cs case diverges from non-cs case. I'm also changing how to load profiles for cs profiles in another patch which makes them diverged even more.

785–786

Currently in both Md5 and string case, FuncsToUse contains the function real names computed by FunctionSamples::getCanonicalFnName. We could change that to include md5 names for the md5 case to unify the two paths. However, the md5 name string for a function needs to be stored somewhere before added to FuncsToUse which only takes a stringRef. That seems a bit complicated. What do you think?

wmi added inline comments.Tue, Aug 31, 10:49 AM
llvm/lib/ProfileData/SampleProfReader.cpp
762

Ok, if we can come up with a solution to differentiate them in a query function or shrink the guarded range of ProfileIsCS, that will be ideal. I thought a bit but didn't get it right.

785–786

Ok, then can we have

if (useMD() && !FuncGuidsToUse.count(std::stoull(FuncName.data())) ||
    (!useMD() && !FuncsToUse.count(FuncName) &&
            (!Remapper || !Remapper->exist(FuncName))))
continue;
hoy added inline comments.Tue, Aug 31, 10:56 AM
llvm/lib/ProfileData/SampleProfReader.cpp
762

Sure, let me see if I can merge them with upcoming work that is based on a presorted func offset table.

785–786

Sounds good.

hoy updated this revision to Diff 369742.Tue, Aug 31, 10:57 AM

Addressing Wei's comment.

hoy edited the summary of this revision. (Show Details)Wed, Sep 1, 8:44 AM
wmi accepted this revision.Wed, Sep 1, 8:56 AM

LGTM.

This revision was landed with ongoing or failed builds.Wed, Sep 1, 9:20 AM
This revision was automatically updated to reflect the committed changes.