This is an archive of the discontinued LLVM Phabricator instance.

[clang][CodeGen] Refactor CreateTempAlloca function nest. NFC.
Needs ReviewPublic

Authored by wingo on Aug 20 2021, 6:01 AM.

Details

Summary

It used to be that there were three layers to create temp alloca
instructions in CodeGenFunction. The lowest level was named
CreateTempAlloca and returned an LLVM instruction (1):

llvm::AllocaInst* CreateTempAlloca(llvm::Type *Ty);

(Leaving off the name argument and array size from the prototype, for
brevity.)

The next level applied frontend-specified alignment to the alloca and
returned an address, but left the value in the alloca address space (2):

Address
CreateTempAllocaWithoutCast(llvm::Type *Ty, CharUnits Align);

Finally the normal function returns an Address, but also makes sure that
the result is in LangAS::Default (3):

Address
CreateTempAlloca(QualType Ty, CharUnits Align);

This is a bit confusing since functions (1) and (3) share a name but
have different behavior, and function (2) has a funny name.
Furthermore, the implementation of function (2) actually calls
function (1), making it seem to the reader like there is a loop in the
call graph.

This patch refactors to remove function (1) and replace code that uses
it with calls to function (2), returning an Address instead of an IR
instruction. This also removes some places in which the frontend wasn't
specifying the alignment of its allocas.

This patch also changes function (2) to explicitly take an address space
argument, which should generally be the alloca address space. There is
usually one target-specified alloca address space, but in the future,
the WebAssembly target may alloca variables in multiple address spaces.
The function name is changed from CreateTempAllocaWithoutCast to
CreateTempAllocaInAS, indicating that the result is left in the given
AS.

Finally, we also replace uses of the somewhat-deprecated
CreateDefaultAlignTempAlloca with CreateTempAllocaInAS, passing the
result of calling the new CodeGenFunction::PreferredAlignmentForIRType
method as the alignnment.

As a result of this patch, a number of llvm::Value* variables are
changed to be Address instead. This allows a more simplified codegen,
as the IR builder doesn't need to take an additional alignment argument.

Diff Detail

Event Timeline

wingo created this revision.Aug 20 2021, 6:01 AM
wingo requested review of this revision.Aug 20 2021, 6:01 AM
Herald added a project: Restricted Project. · View Herald Transcript
wingo added a comment.Aug 20 2021, 6:04 AM

Sooooo... besides the refactor, this is getting closer to where I'm going in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though still NFC. I think you can see where I would replace getASTAllocaAddressSpace with getAllocaAddressSpace(QualType Ty), and possibly (depending on the source language) avoid casting the resulting alloca to LangAS::Default. WDYT, is this sort of thing OK?

wingo added inline comments.Aug 20 2021, 6:07 AM
clang/lib/CodeGen/CGBuilder.h
115

it's the change to always return an Address from CreateTempAlloca that makes these methods unnecessary.

clang/lib/CodeGen/CGGPUBuiltin.cpp
116–122

this is an open question -- there could be a bug here in the existing code.

wingo added a subscriber: tlively.Aug 23 2021, 2:26 AM

Sooooo... besides the refactor, this is getting closer to where I'm going in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though still NFC. I think you can see where I would replace getASTAllocaAddressSpace with getAllocaAddressSpace(QualType Ty), and possibly (depending on the source language) avoid casting the resulting alloca to LangAS::Default. WDYT, is this sort of thing OK?

Taking this patch as perhaps a better generic discussion point, @rjmccall graciously gave some general feedback on this approach (thank you!!!):

I'm not sure that I agree with your overall plan, though:

  • The WebAssembly operand stack is not a good match for an address space at the language level because it's not addressable at all. If you can't meaningfully have a pointer into the address space, then you don't really need this in the type system; it's more like a declaration modifier at best.
  • Allocating local variables on the operand stack ought to be a very straightforward analysis in the backend. There's not much optimization value in trying to do it in the frontend, and it's going to be problematic for things like coroutine lowering.
  • The security argument seems pretty weak, not because security isn't important but because this is not really an adequate basis for getting the tamper-proof guarantee you want. For example, LLVM passes can and do introduce its own allocas and store scalars into them sometimes. Really you need some sort of "tamper-proof" *type* which the compiler can make an end-to-end guarantee of non-tamper-ability for the values of, and while optimally this would be implemented by just keeping values on the operand stack, in the general case you will need to have some sort of strategy for keeping things in memory.

Thanks for thinking about this! Indeed I started out with the goal of not going deep into clang and if it's possible to avoid going too deeply, that would be better for everyone involved. I am starting to think however that it may be unavoidable for me at least.

So, I am focusing on WebAssembly global and local variables; the WebAssembly operand stack is an artifact of the IR-to-MC lowering and AFAICS doesn't have any bearing on what clang does -- though perhaps I am misunderstanding what you are getting at here. The issue is not to allocate locals on the operand stack, but rather to allocate them as part of the "locals" of a WebAssembly function. Cc @tlively on the WebAssembly side.

I agree that the security argument is weak: it's something but it's not the real motivation.

The main motivator is the ability to have "reference type" (externref/funcref) locals and globals at all. Reference-typed values can't be stored to linear memory. They have no size and no byte representation -- they are opaque values from the host. However, WebAssembly locals and globals can define storage locations of type externref or funcref. The storage locations for WebAssembly locals and globals are not in linear memory, and are not addressable by pointer at run-time -- accesses to them are always by immediate.

Currently, clang always produces LLVM IR that allocates C++ globals and locals in linear memory. LLVM may transform some of these to WebAssembly globals or locals at its discretion. This strategy works because all values for the initial set of types supported by WebAssembly can be stored to linear memory; what you can do with a WebAssembly global or local was a subset of what you could do with linear-memory globals and alloca locals.

However, with reference types (merged into the spec earlier this year), this is no longer the case -- there are now types representable in WebAssembly globals/locals which can't be represented in linear memory.

Because of the limitations in how WebAssembly globals and locals can be used, reference-typed values have associated semantic restrictions in the front-end. If I declare a C++ local of type externref (which must be allocated to a WebAssembly local), I can't take its address:

void f() {
  externref_t x = g();
  h(&x); // error
}

Similarly I can't put an externref in an aggregate type that itself is allocated in linear memory:

// global
struct { int x; externref_t y; } z; // error

But, if we add a generic OpenCL-like address space attribute, that would allow the user to declare some variables to be in alternate address spaces. Then we can apply the ACLE SVE semantic restrictions to these values also, and add on an additional restriction preventing address-of. That way users get to make off-heap definitions, and if they misuse them, they get comprehensible errors. LLVM IR and WebAssembly lowering is ready for these alternate-address-space allocations.

// global
struct { int x; externref_t y; } z __attribute__((wasm_var)); // ok

The builtin externref_t and funcref_t types would probably already have this attribute. (I don't have a complete clang patchset yet, so if you prefer to wait to see what things look like, this is perfectly ok.) But because in the future there will be more kinds of reference types, and that we might want to have have aggregate types which include both number and reference types, it seems that there are two separable concerns here: one about applying the semantic restrictions for WebAssembly global and local storage locations, and another concern about handling "opaque" values (externref) which doesn't impose additional Sema/ restrictions.

The restrictions needed for WebAssembly globals and locals are essentially the same, and they lower to the same LLVM IR address space for the WebAssembly target, hence I would propose a single wasm_var attribute instead of wasm_global and wasm_local. This can change though if it's confusing.

Finally, I would note that it would be useful from an ABI point of view to be able to define named WebAssembly globals (but not locals) in C, if e.g. an external interface expects that a module export an i32 global with name foo. So this patch-set has that use-case also.

Regarding coroutine lowering, I can see how that can be challenging; would it be reasonable to restrict continuations to not include saved off-heap locals, for now? If there were such a local, it would be a compilation error.

OK, lots of words. Thanks for reading. What do you think about this (ab)use of LangAS? If there is a better way to cross reference types with C++, pointers are very much welcome. I will have a better idea what the end size of the patch-set is within a couple weeks; I guess I would propose to continue posting the series and hope that the end set of core changes is stomache-able, and add you to Cc as I go.

wingo updated this revision to Diff 368098.Aug 23 2021, 7:20 AM

Rebase to no longer require Address default constructor.

+ JF, who knows something about Web Assembly, or can at least drag in the right people

Sooooo... besides the refactor, this is getting closer to where I'm going in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though still NFC. I think you can see where I would replace getASTAllocaAddressSpace with getAllocaAddressSpace(QualType Ty), and possibly (depending on the source language) avoid casting the resulting alloca to LangAS::Default. WDYT, is this sort of thing OK?

Taking this patch as perhaps a better generic discussion point, @rjmccall graciously gave some general feedback on this approach (thank you!!!):

I'm not sure that I agree with your overall plan, though:

  • The WebAssembly operand stack is not a good match for an address space at the language level because it's not addressable at all. If you can't meaningfully have a pointer into the address space, then you don't really need this in the type system; it's more like a declaration modifier at best.
  • Allocating local variables on the operand stack ought to be a very straightforward analysis in the backend. There's not much optimization value in trying to do it in the frontend, and it's going to be problematic for things like coroutine lowering.
  • The security argument seems pretty weak, not because security isn't important but because this is not really an adequate basis for getting the tamper-proof guarantee you want. For example, LLVM passes can and do introduce its own allocas and store scalars into them sometimes. Really you need some sort of "tamper-proof" *type* which the compiler can make an end-to-end guarantee of non-tamper-ability for the values of, and while optimally this would be implemented by just keeping values on the operand stack, in the general case you will need to have some sort of strategy for keeping things in memory.

Thanks for thinking about this! Indeed I started out with the goal of not going deep into clang and if it's possible to avoid going too deeply, that would be better for everyone involved. I am starting to think however that it may be unavoidable for me at least.

So, I am focusing on WebAssembly global and local variables; the WebAssembly operand stack is an artifact of the IR-to-MC lowering and AFAICS doesn't have any bearing on what clang does -- though perhaps I am misunderstanding what you are getting at here. The issue is not to allocate locals on the operand stack, but rather to allocate them as part of the "locals" of a WebAssembly function. Cc @tlively on the WebAssembly side.

By "operand stack" I mean the innate, unaddressable stack that the WebAssembly VM maintains in order to make functions reentrant. I don't know what term the VM spec uses for it, but I believe "operand stack" is widely accepted terminology for the unaddressable stack when you've got this kind of dual-stack setup. And yes, VM "locals" would go there.

The main motivator is the ability to have "reference type" (externref/funcref) locals and globals at all. Reference-typed values can't be stored to linear memory. They have no size and no byte representation -- they are opaque values from the host. However, WebAssembly locals and globals can define storage locations of type externref or funcref.

I see. I think you need to think carefully about the best way to represent values of these types in LLVM IR, because it probably cannot just be "treat them as a normal value, emit code a certain way that we know how to lower, and hope nothing goes wrong". It seems to me that you probably need a new IR type for it, since normal types aren't restricted from memory and tokens can't be used as parameters or return values.

Hopefully, someone had a plan for this when they introduced that WebAssembly extension.

But, if we add a generic OpenCL-like address space attribute, that would allow the user to declare some variables to be in alternate address spaces. Then we can apply the ACLE SVE semantic restrictions to these values also, and add on an additional restriction preventing address-of. That way users get to make off-heap definitions, and if they misuse them, they get comprehensible errors. LLVM IR and WebAssembly lowering is ready for these alternate-address-space allocations.

Again, I'm not sure you're getting anything at all from the address space side of this. The restrictions on these variables prevent any of the general address-space logic from applying. In a language sense, it's more like a storage class than an address space.

Regarding coroutine lowering, I can see how that can be challenging; would it be reasonable to restrict continuations to not include saved off-heap locals, for now? If there were such a local, it would be a compilation error.

I suppose you would have to.

+ JF, who knows something about Web Assembly, or can at least drag in the right people

Sooooo... besides the refactor, this is getting closer to where I'm going in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though still NFC. I think you can see where I would replace getASTAllocaAddressSpace with getAllocaAddressSpace(QualType Ty), and possibly (depending on the source language) avoid casting the resulting alloca to LangAS::Default. WDYT, is this sort of thing OK?

Taking this patch as perhaps a better generic discussion point, @rjmccall graciously gave some general feedback on this approach (thank you!!!):

I'm not sure that I agree with your overall plan, though:

  • The WebAssembly operand stack is not a good match for an address space at the language level because it's not addressable at all. If you can't meaningfully have a pointer into the address space, then you don't really need this in the type system; it's more like a declaration modifier at best.
  • Allocating local variables on the operand stack ought to be a very straightforward analysis in the backend. There's not much optimization value in trying to do it in the frontend, and it's going to be problematic for things like coroutine lowering.
  • The security argument seems pretty weak, not because security isn't important but because this is not really an adequate basis for getting the tamper-proof guarantee you want. For example, LLVM passes can and do introduce its own allocas and store scalars into them sometimes. Really you need some sort of "tamper-proof" *type* which the compiler can make an end-to-end guarantee of non-tamper-ability for the values of, and while optimally this would be implemented by just keeping values on the operand stack, in the general case you will need to have some sort of strategy for keeping things in memory.

Thanks for thinking about this! Indeed I started out with the goal of not going deep into clang and if it's possible to avoid going too deeply, that would be better for everyone involved. I am starting to think however that it may be unavoidable for me at least.

So, I am focusing on WebAssembly global and local variables; the WebAssembly operand stack is an artifact of the IR-to-MC lowering and AFAICS doesn't have any bearing on what clang does -- though perhaps I am misunderstanding what you are getting at here. The issue is not to allocate locals on the operand stack, but rather to allocate them as part of the "locals" of a WebAssembly function. Cc @tlively on the WebAssembly side.

By "operand stack" I mean the innate, unaddressable stack that the WebAssembly VM maintains in order to make functions reentrant. I don't know what term the VM spec uses for it, but I believe "operand stack" is widely accepted terminology for the unaddressable stack when you've got this kind of dual-stack setup. And yes, VM "locals" would go there.

@wingo, are there cases where it is useful to declare variables as living in WebAssembly locals and not in the VM stack? I'm having trouble coming up with a case where leaving that up to the backend is not enough. We clearly need a way to prevent values from being written to main memory (AS 0), but it's not clear to me that we need a way to specifically allocate locals for them.

The main motivator is the ability to have "reference type" (externref/funcref) locals and globals at all. Reference-typed values can't be stored to linear memory. They have no size and no byte representation -- they are opaque values from the host. However, WebAssembly locals and globals can define storage locations of type externref or funcref.

I see. I think you need to think carefully about the best way to represent values of these types in LLVM IR, because it probably cannot just be "treat them as a normal value, emit code a certain way that we know how to lower, and hope nothing goes wrong". It seems to me that you probably need a new IR type for it, since normal types aren't restricted from memory and tokens can't be used as parameters or return values.

Hopefully, someone had a plan for this when they introduced that WebAssembly extension.

Yes, we had a plan :) In WebAssembly, reference types are essentially opaque pointers that cannot be dereferenced or stored into main memory. They can, however, be stored in WebAssembly globals and tables, which are modeled as LLVM global pointers and global arrays in other address spaces. At the IR level, reference types are modeled as pointers into a non-integral AS that themselves live in a non-integral AS. If the optimizer ever spills a local reference-typed value to memory, we are able to discover and correct that in the backend. I believe we are currently assuming that the optimizer will never introduce a store of a reference-typed value into a global main memory location, though.

But, if we add a generic OpenCL-like address space attribute, that would allow the user to declare some variables to be in alternate address spaces. Then we can apply the ACLE SVE semantic restrictions to these values also, and add on an additional restriction preventing address-of. That way users get to make off-heap definitions, and if they misuse them, they get comprehensible errors. LLVM IR and WebAssembly lowering is ready for these alternate-address-space allocations.

Again, I'm not sure you're getting anything at all from the address space side of this. The restrictions on these variables prevent any of the general address-space logic from applying. In a language sense, it's more like a storage class than an address space.

Using address spaces lets us model loads and stores of reference-typed values from and to globals and tables. I don't think it makes sense to present these concepts as "address spaces" to C/C++ users, but that's what we're using at the IR level.

Regarding coroutine lowering, I can see how that can be challenging; would it be reasonable to restrict continuations to not include saved off-heap locals, for now? If there were such a local, it would be a compilation error.

I suppose you would have to.

+ JF, who knows something about Web Assembly, or can at least drag in the right people

Sooooo... besides the refactor, this is getting closer to where I'm going in https://lists.llvm.org/pipermail/cfe-dev/2021-July/068559.html, though still NFC. I think you can see where I would replace getASTAllocaAddressSpace with getAllocaAddressSpace(QualType Ty), and possibly (depending on the source language) avoid casting the resulting alloca to LangAS::Default. WDYT, is this sort of thing OK?

Taking this patch as perhaps a better generic discussion point, @rjmccall graciously gave some general feedback on this approach (thank you!!!):

I'm not sure that I agree with your overall plan, though:

  • The WebAssembly operand stack is not a good match for an address space at the language level because it's not addressable at all. If you can't meaningfully have a pointer into the address space, then you don't really need this in the type system; it's more like a declaration modifier at best.
  • Allocating local variables on the operand stack ought to be a very straightforward analysis in the backend. There's not much optimization value in trying to do it in the frontend, and it's going to be problematic for things like coroutine lowering.
  • The security argument seems pretty weak, not because security isn't important but because this is not really an adequate basis for getting the tamper-proof guarantee you want. For example, LLVM passes can and do introduce its own allocas and store scalars into them sometimes. Really you need some sort of "tamper-proof" *type* which the compiler can make an end-to-end guarantee of non-tamper-ability for the values of, and while optimally this would be implemented by just keeping values on the operand stack, in the general case you will need to have some sort of strategy for keeping things in memory.

Thanks for thinking about this! Indeed I started out with the goal of not going deep into clang and if it's possible to avoid going too deeply, that would be better for everyone involved. I am starting to think however that it may be unavoidable for me at least.

So, I am focusing on WebAssembly global and local variables; the WebAssembly operand stack is an artifact of the IR-to-MC lowering and AFAICS doesn't have any bearing on what clang does -- though perhaps I am misunderstanding what you are getting at here. The issue is not to allocate locals on the operand stack, but rather to allocate them as part of the "locals" of a WebAssembly function. Cc @tlively on the WebAssembly side.

By "operand stack" I mean the innate, unaddressable stack that the WebAssembly VM maintains in order to make functions reentrant. I don't know what term the VM spec uses for it, but I believe "operand stack" is widely accepted terminology for the unaddressable stack when you've got this kind of dual-stack setup. And yes, VM "locals" would go there.

@wingo, are there cases where it is useful to declare variables as living in WebAssembly locals and not in the VM stack? I'm having trouble coming up with a case where leaving that up to the backend is not enough. We clearly need a way to prevent values from being written to main memory (AS 0), but it's not clear to me that we need a way to specifically allocate locals for them.

Right, I think this is one of the key questions here. Right now this seems to be entirely type-directed: it's a special property of a couple of builtin types that you can't take the address of their objects and those objects can only live in specific places. Having it be type-directed, but only to the point of saying that certain types *must* use certain address spaces, and then imposing all the other restrictions on those types as novel restrictions on those address spaces, feels like it's adding complexity to the language concept of address spaces without much benefit.

The main motivator is the ability to have "reference type" (externref/funcref) locals and globals at all. Reference-typed values can't be stored to linear memory. They have no size and no byte representation -- they are opaque values from the host. However, WebAssembly locals and globals can define storage locations of type externref or funcref.

I see. I think you need to think carefully about the best way to represent values of these types in LLVM IR, because it probably cannot just be "treat them as a normal value, emit code a certain way that we know how to lower, and hope nothing goes wrong". It seems to me that you probably need a new IR type for it, since normal types aren't restricted from memory and tokens can't be used as parameters or return values.

Hopefully, someone had a plan for this when they introduced that WebAssembly extension.

Yes, we had a plan :) In WebAssembly, reference types are essentially opaque pointers that cannot be dereferenced or stored into main memory. They can, however, be stored in WebAssembly globals and tables, which are modeled as LLVM global pointers and global arrays in other address spaces. At the IR level, reference types are modeled as pointers into a non-integral AS that themselves live in a non-integral AS. If the optimizer ever spills a local reference-typed value to memory, we are able to discover and correct that in the backend. I believe we are currently assuming that the optimizer will never introduce a store of a reference-typed value into a global main memory location, though.

Hmm. This seems like it's abusing the inner address space to create a primitive opaque user-defined type in LLVM IR, but I don't have a compelling argument why you shouldn't do it this way. I guess my only real objection to the overall approach is vague hand-wringing about this idea of having strong representational invariants and then just working around passes that break them in the backend.

But, if we add a generic OpenCL-like address space attribute, that would allow the user to declare some variables to be in alternate address spaces. Then we can apply the ACLE SVE semantic restrictions to these values also, and add on an additional restriction preventing address-of. That way users get to make off-heap definitions, and if they misuse them, they get comprehensible errors. LLVM IR and WebAssembly lowering is ready for these alternate-address-space allocations.

Again, I'm not sure you're getting anything at all from the address space side of this. The restrictions on these variables prevent any of the general address-space logic from applying. In a language sense, it's more like a storage class than an address space.

Using address spaces lets us model loads and stores of reference-typed values from and to globals and tables. I don't think it makes sense to present these concepts as "address spaces" to C/C++ users, but that's what we're using at the IR level.

Yeah, at some point I'm willing to accept that this is your best option at the IR level, but I want to not jam this into the user-facing language unless it's really the right design approach.

John.

wingo added a subscriber: sbc100.EditedAug 24 2021, 1:36 AM

Thanks again John & Thomas for your thoughts.

I don't think it makes sense to present these concepts as "address spaces" to C/C++ users, but that's what we're using at the IR level.

Yeah, at some point I'm willing to accept that this is your best option at the IR level, but I want to not jam this into the user-facing language unless it's really the right design approach.

I am absolutely not married to the particular approach in this patch series and am happy to explore the design space :)

@wingo, are there cases where it is useful to declare variables as living in WebAssembly locals and not in the VM stack? I'm having trouble coming up with a case where leaving that up to the backend is not enough. We clearly need a way to prevent values from being written to main memory (AS 0), but it's not clear to me that we need a way to specifically allocate locals for them.

No, there are no cases that I know of. From the IR & backend POV, the point as you say is to prevent values from being written to main memory. It doesn't matter if they are in named locals or just temporaries.

The issue is that clang always lowers local variables as alloca's. Clang needs to lower them to alloca's in AS 1 for reference types. Without optimization, LLVM will lower an alloca in AS 1 to a WebAssembly local. SSA conversion in SROA could lift it to an SSA variable which may avoid the local, in some cases. So we are not specifically allocating a local for them, I agree that isn't the right way to express the requirement.

Also, being able to annotate non-reference-types with a wasm_var address space is not really part of the requirements. It certainly has no utility for variables with automatic storage duration. @sbc100 did mention that it could be useful for non-reference-typed globals, though, for ABI reasons.

Right now this seems to be entirely type-directed: it's a special property of a couple of builtin types that you can't take the address of their objects and those objects can only live in specific places. Having it be type-directed, but only to the point of saying that certain types *must* use certain address spaces, and then imposing all the other restrictions on those types as novel restrictions on those address spaces, feels like it's adding complexity to the language concept of address spaces without much benefit.

Yeah I could be getting this wrong here.

To expand a bit, it's a special property of a class of types -- currently externref and funcref but future WebAssembly specifications will include reference types like (struct i32 externref). So there is a concept to factor out here that isn't specific to just externref and funcref.

@rjmccall do you think some kind of type attribute would make more sense? If there is something already existing that I can take as a model, that would be helpful of course.

The utility we get from LangAS right now is (1) that it's a clear path to get alloca in AS 1 in codegen and (2) that it's a concept that Sema/ can query in order to detect errors and signal them to the user. I am sure there are other ways to do this though!