This is an archive of the discontinued LLVM Phabricator instance.

[MemProf] Context disambiguation cloning pass [patch 2/3]
ClosedPublic

Authored by tejohnson on Jan 3 2023, 8:25 PM.

Details

Summary

Performs cloning on the CallsiteContextGraph (not on the IR or summary
index), in order to uniquely identify the allocation behavior of an
allocation call given its context. In order to do this, the graph is
recursively traversed starting from the allocation nodes, until we
identify a point where the allocation behavior is unambiguous (the edges
have a single allocation type). Nodes are then cloned as we unwind the
recursion. We try to perform the minimal amount of cloning required to
disambiguate the contexts.

The follow-on patch will contain the support for applying the cloning to
the IR.

Depends on D140908 and D145836.

Diff Detail

Event Timeline

tejohnson created this revision.Jan 3 2023, 8:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 8:25 PM
tejohnson requested review of this revision.Jan 3 2023, 8:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2023, 8:25 PM

I've attached the post-cloning dot file (and a png of it) for llvm/test/Transforms/PGHOContextDisambiguation/basic.ll.

Enna1 added a subscriber: Enna1.Jan 8 2023, 6:59 PM

Rebase on top of patch 1 changes.

Small fix made while testing on a large application

snehasish added inline comments.Jan 13 2023, 11:29 AM
llvm/lib/Transforms/IPO/PGHOContextDisambiguation.cpp
634

Should we use shared_ptr for edges so that we don't have to manually track ownership?

1959

nit: constexpr

2006

Still trying to understand this but it wasn't clear to me how we terminate this loop if we only increment the iterator here?

tejohnson marked 2 inline comments as done.Mar 24 2023, 7:03 PM

Finished rebasing the patch on the earlier commits.

llvm/lib/Transforms/IPO/PGHOContextDisambiguation.cpp
2006

See the calls to moveEdgeTo*CalleeClone below which takes the iterator. The current edge is erased from the iterator which is thus advanced. Mostly we terminate the loop when after enough cloning the Node has a single alloc type (see the break up near the top of the loop). Added a comment above the loop, and above the calls to moveEdgeTo* below.

tejohnson updated this revision to Diff 508269.Mar 24 2023, 7:05 PM

Rebase on committed patches 1a/1b, address comments

snehasish added inline comments.Apr 10 2023, 2:49 PM
llvm/lib/Passes/PassBuilder.cpp
124 ↗(On Diff #508269)

I don't see anything used from this header in this file?

llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
211 ↗(On Diff #508269)

Should we assert that Clone->CloneOf is unset?

1181 ↗(On Diff #508269)

I think this var shadows the LastNode var outside the loop nest. Is this intentional? Should we rename this var if not?

1942 ↗(On Diff #508269)

nit: "iterate on that"?

1946 ↗(On Diff #508269)

Should we bound the scope of CallerEdges just to make it clear to the reader and avoid accidental use?

A c++20 init-statement would be nice but we don't allow c++20 in LLVM yet.

1949 ↗(On Diff #508269)

Can we also drop this Edge from the Node->CallerEdges set if both Caller and Callee are set to null?

1957 ↗(On Diff #508269)

The code below allows 1 caller?

1966 ↗(On Diff #508269)

Can we make this functor a static function defined after allocTypeToUse? AllocTypesMatch doesn't depend on any function local state and it contains another functor; reducing the nesting would make it easier to read. It is also defined a bit further away from the use.

1994 ↗(On Diff #508269)

I think EnumeratedArray [1] is a good fit for this use case. I think it would also provide some checks against out of bounds access if the AllocType enum class changed.

[1] llvm-project/llvm/include/llvm/ADT/EnumeratedArray.h

1998 ↗(On Diff #508269)

We only need to capture AllocTypeCloningPriority here. Do you know if a more permissive capture has any overhead?

2013 ↗(On Diff #508269)

nit: cases

2018 ↗(On Diff #508269)

auto CallerEdge would make the code below a little easier to read.

2027 ↗(On Diff #508269)

nit: CalleeEdgeAllocTypesForCallerEdge.reserve(Node->CalleeEdges.size()); to avoid reallocations.

tejohnson marked 11 inline comments as done.Apr 20 2023, 10:53 AM
tejohnson added inline comments.
llvm/lib/Passes/PassBuilder.cpp
124 ↗(On Diff #508269)

This turns out to be a merge issue. We actually do need the header (otherwise we get errors because of the use in PassRegistry.def which is included from this file), but I already have it included in the properly alphabetized location just above here at line 120. Removed this one.

llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
1181 ↗(On Diff #508269)

Not intentional, but thankfully not causing a bug. I simply removed LastNode and use CurNode directly in the one use below.

1949 ↗(On Diff #508269)

It would have been removed. Added assert.

1957 ↗(On Diff #508269)

Fixed wording - I mean only a single caller.

1994 ↗(On Diff #508269)

I don't think an EnumeratedArray will work, because index 3 is not an actual AllocationType enum value - it is the OR of the NotCold and Cold enumeration values (AllocTypes is a uint8_t as a result). As an aside I did not want to add it as an enum value because as we extend AllocationTypes with different types (as documented in the definition, all need to be powers of 2 to allow the ORing) you could presumably have various combinations, and it didn't seem practical to encode all of them in the enum itself.

Instead I added an "All" type to the enum that is documented as needing to be the OR of all values, and I use that in an assert here.

1998 ↗(On Diff #508269)

I don't believe it adds overhead, just expresses intent to the frontend. In any case, turns out constexpr don't need to be captured, so I could remove the "&" completely.

tejohnson marked 4 inline comments as done.

Address comments

snehasish accepted this revision.Apr 20 2023, 12:51 PM

lgtm

llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
1994 ↗(On Diff #508269)

Sounds good, thanks for the explanation.

1998 ↗(On Diff #508269)

TIL that constexpr doesn't need to be captured, makes sense in retrospect.

This revision is now accepted and ready to land.Apr 20 2023, 12:51 PM
This revision was landed with ongoing or failed builds.Apr 21 2023, 2:32 PM
This revision was automatically updated to reflect the committed changes.
tejohnson added inline comments.Apr 21 2023, 3:44 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
1998 ↗(On Diff #508269)

Well...it turns out it is not so simple. I got bot failures with other compilers that gave errors like:

'AllocTypeCloningPriority' cannot be implicitly captured because no default capture mode has been specified

I was able to reproduce an error building with gcc. Unfortunately, clang complains if I try to explicitly capture this constexpr variable. Therefore I am back to the earlier version that just had a default "&" capture.

tejohnson reopened this revision.Apr 21 2023, 3:52 PM

Will update shortly with couple fixes for bot failures when building with other compilers. I'll wait until the bots settle down and I finish going through all the failures to make sure there aren't any other issues that show up before I push again.

This revision is now accepted and ready to land.Apr 21 2023, 3:52 PM
tejohnson updated this revision to Diff 515945.Apr 21 2023, 3:54 PM

Fixes for bot failures

aganea added a subscriber: aganea.Apr 22 2023, 2:41 PM
aganea added inline comments.
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)

Hello @tejohnson! Quite note - This patch fails on Windows with MSVC. Historically, MSVC has troubles capturing constexpr variables in lambdas (on L2018). Changing to const locally works. See diagnostics:

FAILED: lib/Transforms/IPO/CMakeFiles/LLVMipo.dir/MemProfContextDisambiguation.cpp.obj
C:\PROGRA~1\MIB055~1\2022\PROFES~1\VC\Tools\MSVC\1433~1.316\bin\HostX64\x64\cl.exe  /nologo /TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_LIBCPP_ENABLE_ASSERTIONS -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\git\llvm-project\stage1_msvc_debug\lib\Transforms\IPO -IC:\git\llvm-project\llvm\lib\Transforms\IPO -IC:\git\llvm-project\stage1_msvc_debug\include -IC:\git\llvm-project\llvm\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:preprocessor /Zc:__cplusplus /Oi /bigobj /permissive- /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd5105 -wd4324 -w14062 -we4238 /MTd /Zi /Ob0 /Od /RTC1  /EHs-c- /GR- -std:c++17 /showIncludes /Folib\Transforms\IPO\CMakeFiles\LLVMipo.dir\MemProfContextDisambiguation.cpp.obj /Fdlib\Transforms\IPO\CMakeFiles\LLVMipo.dir\LLVMipo.pdb /FS -c C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2018): error C2326: 'bool CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones::<lambda_1>::operator ()(const std::shared_ptr<CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextEdge> &,const std::shared_ptr<CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextEdge> &) const': function cannot access 'AllocTypeCloningPriority'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1946): note: while compiling class template member function 'void CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones(CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *,llvm::DenseSet<const CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *,llvm::DenseMapInfo<ValueT,void>> &)'
        with
        [
            ValueT=const CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *
        ]
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1941): note: see reference to function template instantiation 'void CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones(CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *,llvm::DenseSet<const CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *,llvm::DenseMapInfo<ValueT,void>> &)' being compiled
        with
        [
            ValueT=const CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextNode *
        ]
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1938): note: while compiling class template member function 'void CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones(void)'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2139): note: see reference to function template instantiation 'void CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones(void)' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2127): note: while compiling class template member function 'bool CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::process(void)'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2159): note: see reference to function template instantiation 'bool CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::process(void)' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(470): note: see reference to class template instantiation 'CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2019): error C2326: 'bool CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::identifyClones::<lambda_1>::operator ()(const std::shared_ptr<CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextEdge> &,const std::shared_ptr<CallsiteContextGraph<ModuleCallsiteContextGraph,llvm::Function,llvm::Instruction *>::ContextEdge> &) const': function cannot access 'AllocTypeCloningPriority'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2018): error C2326: 'bool CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones::<lambda_1>::operator ()(const std::shared_ptr<CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextEdge> &,const std::shared_ptr<CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextEdge> &) const': function cannot access 'AllocTypeCloningPriority'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1946): note: while compiling class template member function 'void CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones(CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *,llvm::DenseSet<const CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *,llvm::DenseMapInfo<ValueT,void>> &)'
        with
        [
            ValueT=const CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *
        ]
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1941): note: see reference to function template instantiation 'void CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones(CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *,llvm::DenseSet<const CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *,llvm::DenseMapInfo<ValueT,void>> &)' being compiled
        with
        [
            ValueT=const CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextNode *
        ]
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(1938): note: while compiling class template member function 'void CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones(void)'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2139): note: see reference to function template instantiation 'void CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones(void)' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2127): note: while compiling class template member function 'bool CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::process(void)'
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2176): note: see reference to function template instantiation 'bool CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::process(void)' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(515): note: see reference to class template instantiation 'CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>' being compiled
C:\git\llvm-project\llvm\lib\Transforms\IPO\MemProfContextDisambiguation.cpp(2019): error C2326: 'bool CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::identifyClones::<lambda_1>::operator ()(const std::shared_ptr<CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextEdge> &,const std::shared_ptr<CallsiteContextGraph<IndexCallsiteContextGraph,llvm::FunctionSummary,IndexCall>::ContextEdge> &) const': function cannot access 'AllocTypeCloningPriority'
tejohnson added inline comments.Apr 22 2023, 2:55 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)

Thanks for the report. I didn't see any bot failure reports - do you know if this showed up on one?

Just to confirm, I should remove the constexpr and make it:

const unsigned AllocTypeCloningPriority[] = ...

?

tejohnson added inline comments.Apr 22 2023, 3:26 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)

Also, just to confirm, is this is happening with the most recent version of the patch that was committed (a104e27030587507a711cef0e2b0ddb447fe68fe)? A number of compilers complained about the lambda capture of this constexpr before I changed it slightly.

aganea added inline comments.Apr 22 2023, 6:49 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)

Yes to both questions. From quickly browsing through the builders configs it seems none of them use MSVC 2022, where the issue happens.

tejohnson added inline comments.Apr 23 2023, 3:04 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)
aganea added inline comments.Apr 23 2023, 5:55 PM
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
2004 ↗(On Diff #515979)

Thank you!