This is an archive of the discontinued LLVM Phabricator instance.

[PartialInlining] Enable recursive partial inlining.
Closed, Public

Authored by rudkx on Oct 20 2022, 2:44 PM.


Details

Reviewers

etiotto
chandlerc
silvas
aeubanks
efriedma
fhahn

Commits

rGe96925ce0b03: [PartialInlining] Enable recursive partial inlining.

Summary

It seems unnecessarily limiting to disallow recursive partial
inlining, and there are clearly cases where it can benefit
code by avoiding a function call and potentially enabling
other transformations like dead argument elimination
in cases where an argument is only used prior to the early-out
test at the top of the function.

The pass already properly rewrites the recursive calls
within the body of the freshly cloned function, so the only
change here is removing the bail-out when recursion is
detected.
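As a rough illustration of the shape being discussed, here is a hand-written C++ sketch of a recursive function with an early-out test, next to a version where the recursive call site has been partially inlined by hand. The names `original`, `entry`, and `outlined_body` are hypothetical, the functions return a count only so the two versions can be compared, and the way the outlined remainder calls back into `entry` is an assumption; the pass may arrange the calls differently.

```cpp
#include <cassert>

// Original shape: an early-out test, then a body that recurses.
// Returns how many times the body ran (a stand-in for real work).
int original(int i) {
  if (i < 0)
    return 0;                  // early-out test
  return 1 + original(i - 1);  // body, including the recursive call
}

int entry(int i);

// Hypothetical outlined remainder. It is only reached after the
// early-out test has already failed at the call site, so i >= 0 here.
int outlined_body(int i) {
  return 1 + entry(i - 1);
}

// Entry function after the recursive call site was partially inlined by
// hand: the callee's early-out test is evaluated in place, and only the
// remainder of the callee is reached through a call.
int entry(int i) {
  if (i < 0)
    return 0;
  int sub = i - 1;
  if (sub < 0)                 // inlined copy of the callee's early-out test
    return 1;
  return 1 + outlined_body(sub);
}
```

Both versions run the body the same number of times for any input; the difference is that the last level of recursion no longer costs a call.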

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rudkx created this revision. Oct 20 2022, 2:44 PM

Herald added a project: Restricted Project. Oct 20 2022, 2:44 PM

Herald added subscribers: ormris, hiraditya.

rudkx requested review of this revision. Oct 20 2022, 2:44 PM

Herald added a subscriber: llvm-commits.

rudkx added a reviewer: etiotto. Oct 20 2022, 2:54 PM

Harbormaster completed remote builds in B193336: Diff 469364. Oct 20 2022, 3:35 PM

rudkx added reviewers: chandlerc, silvas. Oct 25 2022, 1:31 PM

Can one of the listed reviewers please review, or suggest alternative reviewers?

This was the best list I could come up with based on recent active history in this file that wasn't code clean up, and based on CODE_OWNERS.TXT.

Thanks!

xbolva00 added reviewers: aeubanks, efriedma. Nov 11 2022, 12:41 PM

are you seeing that this is beneficial in practice?

llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll
3	can you remove the simplifycfg run so we can more accurately see what the pass is doing?

In D136383#3922395, @aeubanks wrote:

are you seeing that this is beneficial in practice?

Yes, I have an internal test case where this moves the only references to a pointer argument out of the function. That allows dead argument elimination to remove the argument, which in turn lets SROA split the stack data that was being passed indirectly and move it off the stack into registers.
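The internal test case is not shown, so the following is only a simplified, non-recursive C++ sketch of the mechanism: when an argument is read only by the early-out test, moving that test to the call site leaves a remainder that no longer needs the argument. `Config`, `process`, `process_body`, and `caller_after` are hypothetical names, and the transformation is done by hand rather than by the pass.

```cpp
#include <cassert>

struct Config {
  bool enabled;  // hypothetical field, read only by the early-out test
};

// Before: `cfg` is used only by the early-out test at the top.
int process(const Config *cfg, int x) {
  if (!cfg->enabled)
    return 0;          // early-out: the only use of cfg
  return x * 2 + 1;    // stands in for a larger body
}

// After (conceptually): the outlined remainder has no use for `cfg`,
// so the argument can be dropped from its signature.
int process_body(int x) {
  return x * 2 + 1;
}

// A call site after the early-out test has been inlined into the caller.
int caller_after(const Config *cfg, int x) {
  if (!cfg->enabled)
    return 0;
  return process_body(x);
}
```

Once the pointer is no longer passed to the remainder, later passes are free to treat the pointed-to data as local to the caller.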

Update test case to not use --simplify-cfg.

rudkx marked an inline comment as done. Nov 11 2022, 3:29 PM

rudkx added inline comments.

llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll
3	Done. Thanks for taking a look!

Harbormaster completed remote builds in B197319: Diff 474875. Nov 11 2022, 4:13 PM

aeubanks added a reviewer: fhahn. Nov 13 2022, 2:38 PM

The obvious concern for making inlining more aggressive is potential codesize growth. Does the cost model prevent us from cloning the function multiple times? How much growth are we talking about for practical code?

llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll
2	update_test_checks.py has a flag --include-generated-funcs you might want to use here.

ChuanqiXu added a subscriber: ChuanqiXu. Nov 13 2022, 11:04 PM

rudkx marked an inline comment as done. Nov 16 2022, 8:22 AM

rudkx added inline comments.

llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll
2	I tried using that flag, and unfortunately it seems to generate check lines that FileCheck fails to match. The check lines look fine to me, so it may be a FileCheck bug. I've had similar issues in the past with hand-written lines, where FileCheck seems to either get hung up on trailing text on a line or have a problem with variable captures that look identical but that it somehow doesn't think match.

In D136383#3924032, @efriedma wrote:

The obvious concern for making inlining more aggressive is potential codesize growth. Does the cost model prevent us from cloning the function multiple times? How much growth are we talking about for practical code?

Hi @efriedma. The majority of the body of the function will get cloned exactly once. The early-exit portion of the function (leading to the first branch) will get cloned once per call-site. Since the only change here is to also allow partial inlining to happen for recursive functions, I would expect code growth in general to be quite small over just having this enabled for non-recursive functions. In other words, I would expect if there were a code growth problem as a result of partial inlining we'd already be seeing it since non-recursive functions are going to be the much more common case.

Having said that, I was hoping to try something like a bootstrap build of clang to see if there was any noticeable impact, but the instructions I found online for doing those builds don't seem to work. In an internal test suite I found that this only fired a handful of times (most likely because the code shape that this pass detects is very specific, and the cost model is pretty conservative).

If you still have concerns I'd be happy to gather some data if you have a suggestion of what you think would be meaningful.
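The cloning behaviour described in this reply (body cloned once, early-exit portion cloned once per call site) can be written as a back-of-the-envelope estimate. This is an assumed simplification, not the pass's cost model: the function name, the parameters, and the size units are all hypothetical, and it ignores the call instruction that each inlined copy replaces as well as any later cleanup.

```cpp
#include <cassert>

// Rough upper-bound estimate of added code size, in arbitrary units
// (for example, instruction counts). Hypothetical helper, not LLVM API.
long estimated_growth(long early_exit_size, long body_size, long call_sites) {
  // The outlined body is emitted once; the early-exit test is copied
  // into every call site that gets partially inlined.
  return body_size + early_exit_size * call_sites;
}
```

With a small early-exit test, the per-call-site term stays small even for many call sites, which matches the expectation that growth is dominated by the single copy of the body.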

Not sure how the CMake 2-stage bits work, but you can always manually "bootstrap": build clang, then do a separate clang build with -DCMAKE_CXX_COMPILER=/path/to/built/clang++ .

In D136383#3967176, @efriedma wrote:

Not sure how the CMake 2-stage bits work, but you can always manually "bootstrap": build clang, then do a separate clang build with -DCMAKE_CXX_COMPILER=/path/to/built/clang++ .

Yes good point. I gave that a try and I see no difference in size in the text section of clang, so it looks like it's either not hitting at all, or it's not causing enough code growth to round to the next page.

In that case, I'm not really concerned here; LGTM

This revision is now accepted and ready to land. Dec 5 2022, 9:49 AM

Great, thanks again @efriedma.

Rebase on top of tree.

This revision was landed with ongoing or failed builds. Dec 5 2022, 11:10 PM

Closed by commit rGe96925ce0b03: [PartialInlining] Enable recursive partial inlining. (authored by rudkx).

This revision was automatically updated to reflect the committed changes.

rudkx added a commit: rGe96925ce0b03: [PartialInlining] Enable recursive partial inlining.

Harbormaster completed remote builds in B201281: Diff 480326. Dec 6 2022, 2:33 AM

Revision Contents

Path	Size
llvm/lib/Transforms/IPO/PartialInlining.cpp	10 lines
llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll	31 lines

Diff 480339

llvm/lib/Transforms/IPO/PartialInlining.cpp

 bool PartialInlinerImpl::run(Module &M) {
   bool Changed = false;
   while (!Worklist.empty()) {
     Function *CurrFunc = Worklist.back();
     Worklist.pop_back();
 
     if (CurrFunc->use_empty())
       continue;
 
-    bool Recursive = false;
-    for (User *U : CurrFunc->users())
-      if (Instruction *I = dyn_cast<Instruction>(U))
-        if (I->getParent()->getParent() == CurrFunc) {
-          Recursive = true;
-          break;
-        }
-    if (Recursive)
-      continue;
-
     std::pair<bool, Function *> Result = unswitchFunction(*CurrFunc);
     if (Result.second)
       Worklist.push_back(Result.second);
     Changed |= Result.first;
   }
 
   return Changed;
 }
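For readers unfamiliar with the removed check, here is a small self-contained C++ model of what it tested: a function counts as directly recursive if any instruction that uses it lives inside that same function. `ToyFunction` and `ToyInstruction` are hypothetical stand-ins written for this sketch; they are not the LLVM classes, and the real code walks `users()` and `getParent()` on LLVM IR objects instead.

```cpp
#include <cassert>
#include <vector>

struct ToyFunction;

// Toy stand-in for an instruction: it only remembers which function
// contains it.
struct ToyInstruction {
  const ToyFunction *parent;
};

// Toy stand-in for a function: it only tracks the instructions that
// reference it (for example, call sites).
struct ToyFunction {
  std::vector<const ToyInstruction *> users;
};

// Mirrors the removed bail-out condition: true if some user of `f` is
// located inside `f` itself.
bool isDirectlyRecursive(const ToyFunction &f) {
  for (const ToyInstruction *user : f.users)
    if (user->parent == &f)
      return true;
  return false;
}
```

Before this patch, a function for which this condition held was skipped; after it, such functions are handed to `unswitchFunction` like any other.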

llvm/test/Transforms/PartialInlining/recursive_partial_inlining.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -partial-inliner -skip-partial-inlining-cost-analysis -S < %s | FileCheck %s
define void @_Z26recursive_partial_inliningi(i32 noundef %i) local_unnamed_addr {
; CHECK-LABEL: @_Z26recursive_partial_inliningi(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[I:%.*]], 0
; CHECK-NEXT:    br i1 [[CMP]], label [[COMMON_RET2:%.*]], label [[IF_END:%.*]]
; CHECK:       common.ret2:
; CHECK-NEXT:    ret void
; CHECK:       if.end:
; CHECK-NEXT:    [[SUB:%.*]] = add nsw i32 [[I]], -1
; CHECK-NEXT:    [[CMP_I:%.*]] = icmp slt i32 [[SUB]], 0
; CHECK-NEXT:    br i1 [[CMP_I]], label [[_Z26RECURSIVE_PARTIAL_INLININGI_1_EXIT:%.*]], label [[CODEREPL_I:%.*]]
; CHECK:       codeRepl.i:
; CHECK-NEXT:    call void @_Z26recursive_partial_inliningi.1.if.end(i32 [[SUB]])
; CHECK-NEXT:    br label [[_Z26RECURSIVE_PARTIAL_INLININGI_1_EXIT]]
; CHECK:       _Z26recursive_partial_inliningi.1.exit:
; CHECK-NEXT:    br label [[COMMON_RET2]]
;
entry:
  %cmp = icmp slt i32 %i, 0
  br i1 %cmp, label %common.ret2, label %if.end

common.ret2:                                      ; preds = %entry, %if.end
  ret void

if.end:                                           ; preds = %entry
  %sub = add nsw i32 %i, -1
  tail call void @_Z26recursive_partial_inliningi(i32 noundef %sub)
  br label %common.ret2
}