This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/lib/CodeGen/
-
lib/
-
CodeGen/
2/3
CGCall.cpp
3/5
CGClass.cpp
-
CodeGenFunction.h

Differential D137872

Implement lambdas with inalloca parameters by forwarding to function without inalloca calling convention.
ClosedPublic

Authored by akhuang on Nov 11 2022, 3:19 PM.

Download Raw Diff

Details

Reviewers

rnk
efriedma

Commits

rG015049338d7e: Try to implement lambdas with inalloca parameters by forwarding without use of…

Summary

Lambda functions with inalloca parameters (which are used in 32-bit Windows) currently aren't supported because the call operator delegates to another function and forwards its arguments, and inalloca parameters can't be forwarded. (The same issue exists for variadic arguments.)

This patch emits the lambda call operator again with a different calling convention without inallocas. Then the original call operator and static invoker delegate to this new function. We also modify arrangeLLVMFunctionInfo to be able to generate a different CGFunctionInfo for delegate calls that doesn't use inalloca.

Fixes https://github.com/llvm/llvm-project/issues/28673.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit TestsFailed

	Time	Test
	70 ms	x64 debian > Clang.CodeGenCXX::inalloca-lambda.cpp
	730 ms	x64 debian > XRay-x86_64-linux.TestCases/Posix::basic-filtering.cpp

Event Timeline

akhuang created this revision.Nov 11 2022, 3:19 PM

Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2022, 3:19 PM

akhuang requested review of this revision.Nov 11 2022, 3:19 PM

Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2022, 3:19 PM

Herald added a subscriber: cfe-commits. · View Herald Transcript

Harbormaster completed remote builds in B197317: Diff 474872.Nov 11 2022, 3:49 PM

akhuang mentioned this in D136998: Try to implement lambdas with inalloca parameters by inlining the call operator function..Nov 11 2022, 4:30 PM

I'm not quite sure I understand what's happening here. Does this actually avoid generating two copies of the function body if both the call operator and the conversion are used?

Clean up existing code and add code to make the call operator also call the new function.

cleanup

In D137872#3925723, @efriedma wrote:

I'm not quite sure I understand what's happening here. Does this actually avoid generating two copies of the function body if both the call operator and the conversion are used?

That was intended but missing; patch is now updated to make both the invoker and the call operator call the new function.

akhuang added reviewers: rnk, efriedma.Nov 18 2022, 4:43 PM

Harbormaster completed remote builds in B198584: Diff 476634.Nov 18 2022, 5:36 PM

rnk added inline comments.Nov 21 2022, 11:52 AM

clang/include/clang/CodeGen/CGFunctionInfo.h
571 ↗	(On Diff #476634)	This is an amusing name, but we should try to find something better. :) Can we build on this flag to support re-writing a varargs prototype to receive a va_list? If so, we should plan for that, and name this to cover both use cases. Elsewhere in clang we use the terminology "delegate call" see EmitDelegateCallArg. Could this be something like `ForDelegation` or `ForDelegateCall` or `IsDelegateCall` or `IsDelegateTarget` or something like that? Or Thunk? I think I like `ForDelegateCall`, but I'm not sure. Does anyone know what a "chain call" is? It wasn't clear to me immediately.
clang/lib/CodeGen/CGCall.cpp
194	This still requires boolean parameters in order. I think LLVM tends to use flag enums for this, something like: enum class FnInfoOptions { None = 0, IsInstanceMethod = 1 << 0, IsChainCall = 1 << 1, IsDelegateCall = 1 << 2, }; Call sites look like: arrangeLLVMFunctionInfo(RetTy, FnInfoOptions::None, None, FTNP->getExtInfo()...) arrangeLLVMFunctionInfo(RetTy, FnInfoOptions::IsInstanceMethod, None, FTNP->getExtInfo()...)
clang/lib/CodeGen/CGClass.cpp
3012	I believe this is only necessary for x86 platforms, so we should check the architecture too to avoid the extra work on x64.
3156	We should use a name which demangles correctly, as we do for `__invoke`. This is a good first draft, though.
clang/test/CodeGenCXX/inalloca-lambda.cpp
4 ↗	(On Diff #476634)	Just to make the IR slightly nicer, give this an `int` member.
8 ↗	(On Diff #476634)	Please expand the testing further to include a case where the same lambda is called directly and goes through function pointer decay, something like this: https://gcc.godbolt.org/z/q7x3ov5e1 This is to ensure that we don't end up double-emitting the implementation of the lambda.

add to test case, modify name mangling, change fn info opts enum type

Harbormaster completed remote builds in B199044: Diff 477289.Nov 22 2022, 3:41 PM

rnk added inline comments.Nov 22 2022, 3:55 PM

clang/lib/CodeGen/CGCall.cpp
735	We could avoid static_cast by defining the standard flag operators, similar to what we do in this macro: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/DebugInfo/CodeView/CodeView.h#L52 Alternatively, just define operator\| and operator&.
clang/lib/CodeGen/TargetInfo.cpp
1190 ↗	(On Diff #477289)	Please update the old parameter name
1841 ↗	(On Diff #477289)	As a simplification, you can put `\|\| isDelegateCall` here, we're basically pretending that this returns RAA_Indirect.

Address comments

Harbormaster completed remote builds in B199832: Diff 478324.Nov 28 2022, 1:56 PM

akhuang retitled this revision from Try to implement lambdas with inalloca parameters by forwarding without use of inallocas. to Implement lambdas with inalloca parameters by forwarding to function without inalloca calling convention..Dec 5 2022, 2:47 PM

akhuang edited the summary of this revision. (Show Details)

ping @efriedma, do you mind looking at this again? thanks!

clang/include/clang/CodeGen/CGFunctionInfo.h
571 ↗	(On Diff #476634)	Yeah, that naming makes sense. chain call: https://reviews.llvm.org/D6332?
clang/lib/CodeGen/CGClass.cpp
3156	Fixed to replace different part of the mangled name with `__impl`. Don't know if it would be better to use the actual mangling functions, but the string replace here seems simple enough.

whoops, I've left this here for a while.. @efriedma, are you able to review it?

I'm having a bit of trouble following how exactly the thunk creation is working here... do we generate different code depending on whether the call operator and/or the static invoker are referenced? Why is the function in EmitLambdaInAllocaCallOpFn not getting defined using the normal CodeGenModule machinery?

clang/lib/CodeGen/CGCall.cpp
770	I think you need to pass isDelegateCall to CGFunctionInfo::Profile.

In D137872#4327615, @efriedma wrote:

I'm having a bit of trouble following how exactly the thunk creation is working here... do we generate different code depending on whether the call operator and/or the static invoker are referenced?

So I think it used to be that the static invoker calls the call operator which contains the body of the lambda? And now both the static invoker and the call operator are delegating to this new __impl function. In EmitLambdaStaticInvokeBody it first calls EmitLambdaInAllocaCallOpFn to make the new call operator (which is populated using EmitLambdaDelegatingInvokeBody). And then inside EmitLambdaDelegatingInvokeBody it generates the new impl function and calls it in the body. Not sure if that answers the question, I agree the code is a bit roundabout.

Why is the function in EmitLambdaInAllocaCallOpFn not getting defined using the normal CodeGenModule machinery?

Do you mean why it's not using CodeGenModule machinery to generate the new call op body or why we're changing the original call op body?

In D137872#4348314, @akhuang wrote:

In D137872#4327615, @efriedma wrote:

I'm having a bit of trouble following how exactly the thunk creation is working here... do we generate different code depending on whether the call operator and/or the static invoker are referenced?

So I think it used to be that the static invoker calls the call operator which contains the body of the lambda? And now both the static invoker and the call operator are delegating to this new __impl function. In EmitLambdaStaticInvokeBody it first calls EmitLambdaInAllocaCallOpFn to make the new call operator (which is populated using EmitLambdaDelegatingInvokeBody). And then inside EmitLambdaDelegatingInvokeBody it generates the new impl function and calls it in the body. Not sure if that answers the question, I agree the code is a bit roundabout.

Why is the function in EmitLambdaInAllocaCallOpFn not getting defined using the normal CodeGenModule machinery?

Do you mean why it's not using CodeGenModule machinery to generate the new call op body or why we're changing the original call op body?

I mean why it's not using the CodeGenModule machinery. I understand there are three function bodies involved. But I would expect that when you "emit" the lambda call operator, it emits the entry point for the call and for the call impl, and when you emit the static invoker, it just emits a call to the call impl.

Or... hmm. Is this actually erasing the existing definition of the call operator, if it was already emitted? That probably doesn't actually work in general; once a function is emitted, you can't re-emit it. Emitting a function emits other related stuff like static variables; you can't EmitFunctionBody() more than once. Either you need to decide when it's first emitted, or you need to rewrite the signature of the definition and splice the existing body in.

move call operator emission into the path in GenerateCode
formatting

remove print statements

In D137872#4357268, @efriedma wrote:

In D137872#4348314, @akhuang wrote:

In D137872#4327615, @efriedma wrote:

I'm having a bit of trouble following how exactly the thunk creation is working here... do we generate different code depending on whether the call operator and/or the static invoker are referenced?

So I think it used to be that the static invoker calls the call operator which contains the body of the lambda? And now both the static invoker and the call operator are delegating to this new __impl function. In EmitLambdaStaticInvokeBody it first calls EmitLambdaInAllocaCallOpFn to make the new call operator (which is populated using EmitLambdaDelegatingInvokeBody). And then inside EmitLambdaDelegatingInvokeBody it generates the new impl function and calls it in the body. Not sure if that answers the question, I agree the code is a bit roundabout.

Why is the function in EmitLambdaInAllocaCallOpFn not getting defined using the normal CodeGenModule machinery?

Do you mean why it's not using CodeGenModule machinery to generate the new call op body or why we're changing the original call op body?

I mean why it's not using the CodeGenModule machinery. I understand there are three function bodies involved. But I would expect that when you "emit" the lambda call operator, it emits the entry point for the call and for the call impl, and when you emit the static invoker, it just emits a call to the call impl.

Or... hmm. Is this actually erasing the existing definition of the call operator, if it was already emitted? That probably doesn't actually work in general; once a function is emitted, you can't re-emit it. Emitting a function emits other related stuff like static variables; you can't EmitFunctionBody() more than once. Either you need to decide when it's first emitted, or you need to rewrite the signature of the definition and splice the existing body in.

good point, I realized I can just put the call operator emission where the call op is normally emitted, which removes the thing where I was trying to create a new call op while the static invoker body is being emitted.

more cleanup

Harbormaster completed remote builds in B237610: Diff 529755.Jun 8 2023, 4:37 PM

Fix ordering in the test case

Harbormaster completed remote builds in B237628: Diff 529780.Jun 8 2023, 5:08 PM

efriedma added inline comments.Jun 10 2023, 2:45 PM

clang/lib/CodeGen/CGClass.cpp
3174	Is there any way we can use GenerateCode as the entrypoint here, instead of calling EmitFunctionBody directly? Without this patch, EmitFunctionBody has exactly one caller, so this makes the control flow and special cases more complicated to reason about. For example, there's some special coroutine handling in GenerateCode.
clang/lib/CodeGen/CodeGenFunction.cpp
1472 ↗	(On Diff #529780)	Does this pass the correct value of "this"? EmitLambdaStaticInvokeBody creates a new alloca to represent "this", but it's already an argument to the function. Granted, it only matters in really obscure cases, but still.

akhuang added inline comments.Jun 14 2023, 11:21 AM

clang/lib/CodeGen/CGClass.cpp
3174	Changed to using GenerateCode -- had to add another case to the path in GenerateCode to make sure we emit different things for the call op body inside __impl and the call op body inside the call op function.
clang/lib/CodeGen/CodeGenFunction.cpp
1472 ↗	(On Diff #529780)	That's true, the "this" won't be passed correctly. Actually, would it be fine to just emit the original call op body? (so that the same function body is emitted twice -- once in the call op and once in __impl).

Call GenerateCode to emit __impl function body

efriedma added inline comments.Jun 14 2023, 12:00 PM

clang/lib/CodeGen/CodeGenFunction.cpp
1472 ↗	(On Diff #529780)	Not completely sure what you're asking... but as I've mentioned, we can't EmitFunctionBody() the body of a function more than once.

Harbormaster completed remote builds in B238886: Diff 531435.Jun 14 2023, 2:35 PM

Emit call op which forwards %this argument

clang/lib/CodeGen/CodeGenFunction.cpp
1472 ↗	(On Diff #529780)	Oh, nevermind then. I think this version passes "this" from the call op.

update test case

Harbormaster completed remote builds in B238998: Diff 531571.Jun 14 2023, 7:05 PM

efriedma added inline comments.Jun 15 2023, 5:04 PM

clang/lib/CodeGen/CodeGenFunction.cpp
1470 ↗	(On Diff #531571)	I'm not sure `__impl` is unique enough that user code will never use it. (I mean, it's reserved, but we don't actually forbid using it.) Can we use the isDelegateCall() bit?

Use isDelegateCall to check whether function is actual call op fn or not

clang/lib/CodeGen/CodeGenFunction.cpp
1470 ↗	(On Diff #531571)	yep, that makes more sense.

LGTM

This revision is now accepted and ready to land.Jun 16 2023, 4:33 PM

Harbormaster completed remote builds in B239524: Diff 532291.Jun 16 2023, 6:17 PM

I'm really happy to see this limitation finally addressed! It's quite a bit of embarrassing tech debt that affects folks making Windows x86 builds.

rebase and clang format

This revision was landed with ongoing or failed builds.Jun 20 2023, 5:31 PM

Closed by commit rG015049338d7e: Try to implement lambdas with inalloca parameters by forwarding without use of… (authored by akhuang). · Explain Why

This revision was automatically updated to reflect the committed changes.

akhuang added a commit: rG015049338d7e: Try to implement lambdas with inalloca parameters by forwarding without use of….

Harbormaster completed remote builds in B240116: Diff 533079.Jun 20 2023, 7:38 PM

akhuang added a reverting change: rG8ed7aa59f489: Revert "Try to implement lambdas with inalloca parameters by forwarding without….Jun 22 2023, 11:42 AM

For those following along, this was relanded in https://reviews.llvm.org/D154007

Revision Contents

Path

Size

clang/

lib/

CodeGen/

CGCall.cpp

4 lines

CGClass.cpp

95 lines

CodeGenFunction.h