This is an archive of the discontinued LLVM Phabricator instance.

Add support for function attribute "notail"
ClosedPublic

Authored by ahatanak on Sep 16 2015, 7:27 PM.

Details

Summary

This patch adds support for a new IR function attribute "notail". The attribute is used to disable tail call optimization on calls to functions marked with the attribute.

This attribute is different from the existing attribute "disable-tail-calls", which disables tail call optimizations on all call sites within the marked function.

The patch to add support for the corresponding source-level function attribute is here:
http://reviews.llvm.org/D12922

Diff Detail

Repository
rL LLVM

Event Timeline

ahatanak updated this revision to Diff 34961.Sep 16 2015, 7:27 PM
ahatanak retitled this revision from to Add support for function attribute "notail".
ahatanak updated this object.
ahatanak added a subscriber: llvm-commits.
sanjoy added a subscriber: sanjoy.EditedSep 16 2015, 7:35 PM

Does this mean LLVM will not longer be able to generate indirect tail calls (tail calls when the target function is not known statically); because the target function /could/ have been marked notail?

For instance, clang 3.7.0 optimizes

typedef int (g)(void *, int);

int f(g* ptr, int x) {
  return ptr(ptr, x);
}

into

	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 10
	.globl	_f
	.align	4, 0x90
_f:                                     ## @f
	.cfi_startproc
## BB#0:
	pushq	%rbp
Ltmp0:
	.cfi_def_cfa_offset 16
Ltmp1:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Ltmp2:
	.cfi_def_cfa_register %rbp
	popq	%rbp
	jmpq	*%rdi                   ## TAILCALL
	.cfi_endproc


.subsections_via_symbols

[Edit] And this is now illegal since ptr can be a function in another translation unit marked notail.

vsk added a subscriber: vsk.Sep 16 2015, 7:36 PM

Could you add a small test to Bitcode/compatibility.ll in the function attributes section? E.g

; CHECK: define void @f.notail() #35
define void @f.notail() notail {
  ret void
}
; CHECK: attributes #35 = { notail }

No, tail call optimization doesn't have to be disabled on indirect calls.

No, tail call optimization doesn't have to be disabled on indirect calls.

What if the indirect callee is marked with notail? Isn't the whole point of notail that functions marked notail won't have tail calls to them?

Perhaps I'm missing something here, so it helps to be concrete -- in the C example I gave, what's the expected behavior of the program if the function pointer passed to f via ptr is a notail function?

I think the notail should be on the call instruction, not on the callee; as some sort of "inverse" of musttail. Then we can teach the optimizer to never codegen a notail call as a tail call.

ahatanak updated this revision to Diff 34964.Sep 16 2015, 10:04 PM
  • Added a small test to Bitcode/compatibility.ll.
  • Changed attr-notail.ll to demonstrate attribute "notail" overrides the "tail" marker but doesn't override the "musttail" marker.

No, tail call optimization doesn't have to be disabled on indirect calls.

An IR example of the issue I'm trying to point out is this:

define i32 @caller(i32 %a) {
entry:
  %p = alloca i32(i32)*
  store i32(i32)* @callee, i32(i32)** %p

  %ptr = load i32(i32)*, i32(i32)** %p
  %call = call i32 %ptr(i32 %a)
  ret i32 %call
}

declare i32 @callee(i32) #0

attributes #0 = { notail }

If you pass the above through opt -tailcallelim -mem2reg (opt built with this patch) you'll get

define i32 @caller(i32 %a) {
entry:
  %call = tail call i32 @callee(i32 %a)
  ret i32 %call
}

; Function Attrs: notail
declare i32 @callee(i32) #0

attributes #0 = { notail }

which is what you're trying to avoid with notail, if I understand this change correctly.

No, tail call optimization doesn't have to be disabled on indirect calls.

What if the indirect callee is marked with notail? Isn't the whole point of notail that functions marked notail won't have tail calls to them?

Yes, that's correct. We want to avoid doing tail call optimization on a call site if the compiler knows it is a call to a notail function. If the call is an indirect call, it's not always possible to know that.

Perhaps I'm missing something here, so it helps to be concrete -- in the C example I gave, what's the expected behavior of the program if the function pointer passed to f via ptr is a notail function?

It should have no effect. If the compiler cannot determine a call site is a call to a nontail function, it doesn't block tail call optimization.

I think the notail should be on the call instruction, not on the callee; as some sort of "inverse" of musttail. Then we can teach the optimizer to never codegen a notail call as a tail call.

I think attaching notail to the call instruction is more limiting than attaching it to the callee function in some cases. Suppose we are compiling a code like this with -O3:

int f(g* ptr, int x) {

return ptr(ptr, x);

}

int __attribute((notail)) foo2(void*, int);

int foo1(int a) {

return f(foo2, a);

}

After all the optimization passes (including the inliner) are run, the IR will look like this:

define i32 @foo1(i32 %a) #0 {
entry:

%call.i = call i32 @foo2(i8* bitcast (i32 (i8*, i32)* @foo2 to i8*), i32 %a) #2
ret i32 %call.i

}

If the notail attribute was on the call instruction, there would be no way to tell tail call optimization shouldn't be done on the call to foo2.

So, in your example, you wouldn't be able to block tail call optimization if the attribute was on the call site. Is that correct?

To elaborate on my previous comment, the notail attributes should block tail call optimization on as many call sites as possible, but it's okay if it can't block some of the indirect call sites if the compiler cannot determine which function is being called.

It should have no effect. If the compiler cannot determine a call site is a call to a nontail function, it doesn't block tail call optimization.

So this (and what you said below) is fine as long as not obeying notail won't result in an
incorrect program; and notail on a function only prevents tail calls
to that function on a best effort basis. Is that the direction of
this patch? If so, then please clarify that on the langref.

Otherwise (i.e. if you *require* notail to prevent tail calls for
correctness) I don't see how you can get avoid making all indirect
calls non-tail calls.

I think the notail should be on the call instruction, not on the callee; as some sort of "inverse" of musttail. Then we can teach the optimizer to never codegen a notail call as a tail call.

I think attaching notail to the call instruction is more limiting than attaching it to the callee function in some cases. Suppose we are compiling a code like this with -O3:

int f(g* ptr, int x) {

return ptr(ptr, x);

}

int __attribute((notail)) foo2(void*, int);

int foo1(int a) {

return f(foo2, a);

}

After all the optimization passes (including the inliner) are run, the IR will look like this:

define i32 @foo1(i32 %a) #0 {
entry:

%call.i = call i32 @foo2(i8* bitcast (i32 (i8*, i32)* @foo2 to i8*), i32 %a) #2
ret i32 %call.i

}

If the notail attribute was on the call instruction, there would be no way to tell tail call optimization shouldn't be done on the call to foo2.

That's correct. Preventing tail call on a best effort basis is good enough for the use cases we care about. I'll update the langref to clarify that.

ahatanak updated this revision to Diff 37188.Oct 12 2015, 3:12 PM

This patch makes changes to add the "notail" marker to call instructions instead of using a function attribute.

rnk added a subscriber: rnk.Nov 4 2015, 3:30 PM

Seems reasonable. Where is the LangRef change for this?

ahatanak updated this revision to Diff 39283.Nov 4 2015, 4:00 PM

Added description in LangRef.

Let me know if you think there are changes I can make to improve it.

rnk accepted this revision.Nov 4 2015, 4:05 PM
rnk added a reviewer: rnk.

lgtm

This revision is now accepted and ready to land.Nov 4 2015, 4:05 PM
spatel added a subscriber: spatel.Nov 5 2015, 8:56 AM
spatel added inline comments.
lib/Bitcode/Writer/BitcodeWriter.cpp
2132–2134 ↗(On Diff #39283)

Hi Akira -

Can you give these bitfields proper names in a struct or enum in LLVMBitCodes.h? It took me a while to understand why we have this encoding (no code comments...).

The other reason I ask is because I was about to swipe bit 16 myself. :)

I think that's the only backwards-compatible way to add fast-math-flags to a call ( PR21290 ). We can't use the usual method of tacking an optional field to the end of the record because the record length is unknown for a call with varargs.

ahatanak added inline comments.Nov 5 2015, 7:04 PM
lib/Bitcode/Writer/BitcodeWriter.cpp
2132–2134 ↗(On Diff #39283)

Hi Sanjay.

I've made the changes you suggested in my local branch, but I think they should be in a follow-up patch to separate the changes related to notail from the changes related to the bitfield names.

These are the enums I defined in LLVMBitCodes.h:

enum CallMarkersFlags {

CALL_TAIL = 0,
CALL_CCONV = 1,
CALL_MUSTTAIL = 14,
CALL_EXPLICIT_TYPE = 15,
CALL_NOTAIL = 16

};

spatel added inline comments.Nov 6 2015, 8:28 AM
lib/Bitcode/Writer/BitcodeWriter.cpp
2132–2134 ↗(On Diff #39283)

Yes, a follow-up NFC patch sounds right. Thanks!

This revision was automatically updated to reflect the committed changes.