This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO] split context string - compiler changes
ClosedPublic

Authored by hoy on Aug 19 2021, 6:54 PM.

Details

Reviewers
wmi
wenlei
wlei
Summary

This is the compiler part of the context split work. Please refer to D107299 for the original whole patch.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
wmi added a comment.Aug 23 2021, 8:35 PM

LGTM, thanks.

This revision is now accepted and ready to land.Aug 23 2021, 8:35 PM

Thanks for working on split context representation. This really makes context first class citizen in the framework, and the speed up it brought us is also very nice.

llvm/include/llvm/ProfileData/SampleProf.h
406

nit: having the word Type in type name does not add value, and the class also doesn't have anything specific for sample. Suggest we call it CallSiteLocation.

There're a few other others with Type in its name too.

423

LineOffset 0 can be a valid call site location given this is offset not absolute line number.

443–444

SampleContextRefType makes people think of SampleContext&, how about SampleContextStorageType -> SampleContextFrameVector, SampleContextRefType -> SampleContextFrames?

Accordingly SampleContext::getFullContext->SampleContext::getContextFrames.

449–458

This needs update.

One thing I was looking for a clarification from comments somewhere but didn't find it: whether context-less profile would have a non-empty FullContext which only contains leaf? From the code it seems answer is no? But then this is not consistent since FullContext of CS profile contains leaf frame.

Or would it be possible to remove leaf frame in FullContext for CS profile too? leaf doesn't need call site location either.

551–552

Can we remove this and replace all its call with getName?

575

Use std::ostringstream to help repeated string append?

581

Seems std::string toString() const is the conventional name for string converter.

592

Sanity check/assert to make sure the new name does not conflict leaf frame in FullContext? Or is this only allowed for non-CS case?

598

Should we assert that CState is not unknown? i.e. even if Context only has one frame, this is still considered context profile (something like [foo] before, as opposed to foo).

648

Clarify in comment whether this is empty for non-CS profile.

648–655

This sequence of copy, edit and compare could be simplified?

if (ThisContext.back().CallerName != ThatContext.back().CallerName)
  return false;

Even better, would it be possible to canonicalize context, so location of leaf frame is always (0, 0)?

llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
93–107

This is a comparer only as it doesn't provide sorting directly. So ProfileComparer?

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
67

One step further, ContextFramesToRemove?

307–309

Curious what triggered this order change?

hoy marked 2 inline comments as done.Aug 24 2021, 3:30 PM
hoy added inline comments.
llvm/include/llvm/ProfileData/SampleProf.h
406

CallSiteLocation sounds good.

423

Good catch. Switched to using INT32_MAX as an invalid line offset.

443–444

Sounds good.

449–458

The new context representation should be consistent with the old one for CS profile, i.e, it can contain a leaf frame only. The sample reader creates such leaf-only context. For non-CS profile, the context string will be empty.

Excluding leaf frame from FullContext should also work and it is cleaner by design. But the current way is more natural to implement, for example the IsPrefixOf operation, and some operations in llvm-profgen that appends the leaf probe into an existing calling context. Also the constructor takes in an arrayRef of the underlying vector, which, if separated, will need some special handling in the reader.

551–552

Yes, done.

575

Done.

581

Changed to toString

592

Assert added. This supposed to use by non-CS case.

598

Sounds good, assert added.

648

Done.

648–655

The API can be used to compare A:1 @ B with A:1 @ B:2 @ C, where we have to drop the line number of B:2.

The location of leaf frame was (0,0) previously, now I'm changing it to (-1,0). But the that context can still have a valid line number like B:2 where we have to drop the line number for comparision.

llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
93–107

Sounds good.

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
67

Renamed to ContextFramesToRemove.

307–309

The context of an inlined or merged node may be out-of-sync with its tree path so getContextFor will AV in that case. This can happen when functions on a tree path are not processed in top-down order, due to recursions.

hoy updated this revision to Diff 368502.Aug 24 2021, 5:19 PM
hoy marked 2 inline comments as done.

Addressing Wenlei's comment.

hoy updated this revision to Diff 368507.Aug 24 2021, 5:27 PM

Updating D108433: [CSSPGO] split context string - compiler changes

hoy updated this revision to Diff 368508.Aug 24 2021, 5:29 PM

Updating D108433: [CSSPGO] split context string - compiler changes

wenlei added inline comments.Aug 24 2021, 5:53 PM
llvm/include/llvm/ProfileData/SampleProf.h
449–458

makes sense, thanks for clarification. though it is indeed inconsistent with non-CS profile in the sense that FullContext doesn't contain leaf for non-CS, but contain leaf for CS profile.

648–655

Ok, if we have to ignore the location part of B, can we simplified the copy, edit and compare sequence as suggested above still?

(maybe i'm missing something, but it looks to me that you answered the 2nd question only?)

hoy added inline comments.Aug 24 2021, 6:03 PM
llvm/include/llvm/ProfileData/SampleProf.h
648–655

Do you mean something like below?

if (ThisContext.back().CallerName != ThatContext.back().CallerName)
  return false;

if (ThisContext.back().hasLineLocation()) 
   return ThisContext.back().linelocation == ThatContext.back().linelocation 
   
return true;
hoy added inline comments.Aug 24 2021, 6:04 PM
llvm/include/llvm/ProfileData/SampleProf.h
648–655

return true --> return ThisContext.drop_back() == ThatContext.drop_back()

wenlei added inline comments.Aug 24 2021, 6:09 PM
llvm/include/llvm/ProfileData/SampleProf.h
648–655

Yes if actually expect ThisContext (the prefix) to have location for its leaf (i thought the prefix's leaf won't have location as shown in your example), otherwise we could omit the location compare too.

hoy added inline comments.Aug 24 2021, 6:17 PM
llvm/include/llvm/ProfileData/SampleProf.h
648–655

That's true with the current uses of this function. I'm adding an assert for that and removing the line location check. Supporting the location compare is slow, especially this function is frequently used in profile reading.

modimo added inline comments.Aug 24 2021, 6:20 PM
llvm/include/llvm/ProfileData/SampleProf.h
543

The implicit operator here seems unnecessary given you have getContextFrames below providing the same thing. Is this needed?

612

The == and != operators are not always opposites. If another instance of this object has the same State/Name/FullContext == and != will both evaluate as true. Should != check on all of these fields/be a direct negation of ==?

hoy updated this revision to Diff 368525.Aug 24 2021, 6:26 PM

Updating D108433: [CSSPGO] split context string - compiler changes

hoy added inline comments.Aug 24 2021, 6:44 PM
llvm/include/llvm/ProfileData/SampleProf.h
543

It's needed for an implicit type cast where a SampleContext object is used as a SampleContextFrames object.

612

Can you elaborate more? Why would != return true for two objects having the same State/Name/FullContext?

wenlei added inline comments.Aug 24 2021, 9:26 PM
llvm/include/llvm/ProfileData/SampleProf.h
426

This one can now be removed as well.

hoy updated this revision to Diff 368553.Aug 24 2021, 9:54 PM

Removing hasLineLocation.

wlei added inline comments.Aug 24 2021, 10:02 PM
llvm/include/llvm/ProfileData/SampleProf.h
695

is this more heavy operator than direct implementation of >(like <).

say if two contexts are very long and only the last frame is different.

it will take 2 * n for (*this != That) plus !(*this < That), rather than n.

hoy added inline comments.Aug 24 2021, 10:06 PM
llvm/include/llvm/ProfileData/SampleProf.h
423

On the second thought, sticking with LineOffset 0 for leaf frame is more natural. We have a bench places in the compiler/reader/writer/llvm-profgen that uses 0 as leaf line location.

By default, SampleContext should represent a valid call context with the leaf frame, so most of the time we should just automatically strip off the leaf line location., such as when doing prefix check, getting the full context string. However there is a case in llvm-profgen where SampleContext is used to represent intermediate context for pseudo probes when the leaf probe is not ready. This causes diversion, especially in the full context string printing. Given that's the only place we diverge, I'm adding a flag to getContextString to tolerate that, instead of introducing the hasLineLocation API.

So with the latest change, by default, we should assume the last element of the context vector is the leaf frame, except in the special llvm-profgen case. Does that make sense?

wenlei added inline comments.Aug 24 2021, 10:09 PM
llvm/include/llvm/ProfileData/SampleProf.h
26

I don't see DJB hash used anywhere, remove as well?

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
307–309

Could you share a concrete example for such case? I was expecting context to be in-sync with its position in trie, because when we promote a node in trie, we also adjust its context accordingly. Does it also happen before this change?

wenlei added inline comments.Aug 24 2021, 10:35 PM
llvm/include/llvm/ProfileData/SampleProf.h
423

I think as long as we don't use 0 line offset to indicate no line info, it's fine.

Actually if we use CallSiteLocation to represent leaf, this really isn't a call site (call site always has a line location), but rather a frame. The comment is also inaccurate: "Represent a callsite with caller function name and line location".

We have typedef std::pair<std::string, LineLocation> FrameLocation; in llvm-profgen, once we have string buffers owning the underlying strings, that FrameLocation can be merged too.

Btw, I hope we can unify string converter to be toString. Right now we have CallSiteLocation::str(), SampleContext::getContextString() and SampleContext::toString.

SampleContext::toString properly dispatches for CS and non-CS case, but SampleContext::getContextString() doesn't. How about merge the two and let toString take a boolean?

wlei added inline comments.Aug 24 2021, 10:58 PM
llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
98

I remember you mentioned this is the bottleneck, so how about adding another check to filter more cases, specifically use the frame size(or context string size). like the above code.
I'm also curious why here it changed to use std::set instead of SmallVector, I guess it will change to semantic of the original code if the context has two same frame function name.

hoy added inline comments.Aug 24 2021, 11:06 PM
llvm/include/llvm/ProfileData/SampleProf.h
26

Removed.

423

Right CallSiteLocation is really for a callsite, we just take advantage of it to represent a leaf frame in SampleContext, and we intentionally ignore the line location for leaf frame. This isn't ideal, but easier to implement. Alternatively, the LineLocation field can be made an Optional field, but that adds the complexity most of the time we don't need.

The FrameLocation type is being replaced by D108437.

I'm renaming CallSiteLocation::str() as toString. SampleContext::getContextString() is a static function that works on an ArrayRef object. This is mostly for the sake of intermediate contexts (without leaf probe) in llvm-profgen. SampleContext::toString is not supposed to work with intermediate contexts. I can move getContextString out of SampleContext if it is confusing here.

695

Good point. I just removed the > operator which is only used by sorting profiles for dumping. The sorting will use < instead.

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
307–309

Unfortunately I don't seem to find that extreme case after adjusting the context promotion order, i.e, with a comparer. On the other hand, if we are not going to do anything for an inlined or merged context, we can bail out earlier without taking time to find its context node?

hoy added inline comments.Aug 24 2021, 11:11 PM
llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
98

This is not the bottleneck. The bottleneck is in sample reader when trying to load contexts out of the current module. There we sort all contexts in the profile first and selectively load the contexts which appear to be a subtree of a given context.

Using ordered set here is to enforce an order of promoting the contexts of a function later. We have seen cases that different promoting order led to different codegen, especially when recursion is involved.

hoy updated this revision to Diff 368558.Aug 24 2021, 11:15 PM

Addressing comments.

wenlei added inline comments.Aug 24 2021, 11:20 PM
llvm/include/llvm/ProfileData/SampleProf.h
423

Ok, keeping static function getContextString is fine.

Given we have SampleContextFrames, SampleContextFrameVector, perhaps this CallSiteLocation should be renamed to ContextFrame to reflect what it really is?

llvm/lib/Transforms/IPO/SampleContextTracker.cpp
307–309

Avoid context trie look up sounds good. Just wanted to make sure we don't have lurking issues as I don't see how they can be out of sync - it could be some other issues.

hoy added inline comments.Aug 24 2021, 11:28 PM
llvm/include/llvm/ProfileData/SampleProf.h
423

ContextFrame sounds good. How about SampleContextFrame?

wenlei added inline comments.Aug 24 2021, 11:40 PM
llvm/include/llvm/ProfileData/SampleProf.h
423

Sounds good too, though it may look too similar to SampleContextFrames, which can cause confusion. But either way is fine.

hoy updated this revision to Diff 368561.Aug 24 2021, 11:47 PM

Renaming CallsiteLocation to SampleContextFrame.

wenlei accepted this revision.Aug 25 2021, 8:58 AM

lgtm, thanks

wlei accepted this revision.Aug 25 2021, 9:02 AM
wlei added inline comments.
llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
98

I see, thanks for the clarification.

modimo added inline comments.Aug 25 2021, 10:31 AM
llvm/include/llvm/ProfileData/SampleProf.h
612

Ah I missed that it was comparing the dereferenced this which is invoking the == operator directly. Thought it was comparing the pointers.

hoy updated this revision to Diff 369560.Aug 30 2021, 2:48 PM

Updating D108433: [CSSPGO] split context string - compiler changes

hoy closed this revision.Aug 31 2021, 4:19 PM