This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly][CodeGen] IR support for WebAssembly local variables
ClosedPublic

Authored by wingo on Apr 22 2021, 11:35 PM.

Details

Summary

This patch adds TargetStackID::WasmLocal. This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there. Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Diff Detail

Event Timeline

wingo created this revision.Apr 22 2021, 11:35 PM
wingo requested review of this revision.Apr 22 2021, 11:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2021, 11:35 PM
wingo updated this revision to Diff 341100.Apr 28 2021, 2:02 AM

Rebase, and add A0:u addrspace qualifier

Nice! This turns out to be relatively simple. It would be good to add more tests with reference typed locals mixed in with normal locals, etc. Also, what happens when one of the alloca values is used by something other than a load or a store? Is it just an ISel failure at that point?

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
102

I would name this WebAssemblylocal_get to match WebAssemblybr_table, although admittedly the naming convention for these custom nodes is very weird.

llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td
29 ↗(On Diff #341100)

Unintentional whitespace change?

llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.h
107 ↗(On Diff #341100)

I believe this should work fine as well. In fact this, whole pattern could be reduced to this:

auto Results = FrameIndex2Local.insert({FrameIndex, Locals.size()});
if (Results.second)
  addLocal(VT);
return Results.first->second + Params.size();
llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h
35 ↗(On Diff #341100)

Could even do this to make it clearer that this is meant to be used as an unsigned.

llvm/test/CodeGen/WebAssembly/externref-alloca.ll
3–4 ↗(On Diff #341100)

Would it be possible to use the same address space(s) for locals and globals? In general we can use an fixed, finite number of address spaces, but I would prefer to use as few as possible so there is less to remember :)

wingo added a comment.Apr 29 2021, 1:54 AM

Nice! This turns out to be relatively simple. It would be good to add more tests with reference typed locals mixed in with normal locals, etc.

Thanks for the feedback! Agreed with regards to tests obviously. Following up directly as regards the most important question, "is it actually robust":

Also, what happens when one of the alloca values is used by something other than a load or a store? Is it just an ISel failure at that point?

Let us assume that because of source restrictions, all allocas are static (in the sense of AllocaInst::isStaticAlloca()). The alloca's are in a non-integral address space so the only operation you can do on them is to pass the pointer value around, or load from it, or store to it. (If I missed a potential use, I am happy to be corrected.) We handle the load and store cases. Source restrictions prevent passing locals by-reference to other functions, so I think we are good there as well. I have no idea what might happen if the source restrictions aren't respected though -- good thing the SVE people have done most of the work on the front-end already because getting a coherent story there would be complicated otherwise!

There is one "hole in the racket", as apparently we say in french -- externref/funcref locals which have no uses. You wouldn't want these leaving isel as live stack objects in the MachineFrameInfo. I am thinking it may be more robust to run a quick pre-pass through the stack objects after the static alloca's are collected in FunctionLoweringInfo::set to eagerly transform them to locals and mark them dead, even if there are no uses.

wingo retitled this revision from [WebAssembly][CodeGen] Allow for externref/funcref local variables to [WebAssembly][CodeGen] IR support for WebAssembly local variables.May 4 2021, 1:24 AM
wingo edited the summary of this revision. (Show Details)
wingo updated this revision to Diff 342669.May 4 2021, 1:25 AM

Remove reftypes

wingo added a comment.May 4 2021, 1:28 AM

Rebased on top of https://reviews.llvm.org/D101608, and removed reftypes support, as it turns out that the notion of WebAssembly locals in IR is sufficiently general that externref/funcref can be added in a followup. WDYT @tlively, @dschuff , @pmatos ?

wingo updated this revision to Diff 342698.May 4 2021, 4:22 AM

Fix clang datalayout expectations

Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2021, 4:22 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
tlively added inline comments.May 4 2021, 5:18 PM
llvm/include/llvm/CodeGen/MIRYamlMapping.h
351

Is there a good test to demonstrate this change in?

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
34–36

Why do this in two places instead of just once comprehensively in the hook? It would be good to explain that in the comment, too.

63–64

Can there be aggregate components? What do we do with them?

wingo updated this revision to Diff 343367.May 6 2021, 5:17 AM

Rename "object" to "managed", and add test for stack id

wingo updated this revision to Diff 344323.May 11 2021, 1:41 AM

Update in response to parent commit changes

wingo edited the summary of this revision. (Show Details)May 21 2021, 4:47 AM
wingo updated this revision to Diff 346990.May 21 2021, 5:01 AM
wingo marked 4 inline comments as done.

Address feedback

llvm/include/llvm/CodeGen/MIRYamlMapping.h
351

Done in ir-locals-stackid.ll

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
34–36

👍

63–64

Hoo, it is a good question. I think the high-level answer is that in the same way as you can allocate an i32 to a local, or an externref to a local, you could allocate a struct { i32, externref }. The leaves of that aggregate would be numbered from 0 to N, assigned to consecutive locals of the appropriate primitive types, and accesses would compiled down to direct access to the members. Probably the first person who tries this is going to run into some interesting cases and I think specifically the local.get index mapping isn't quite yet ready. But that would be the idea. WDYT?

Gentle ping here to @tlively :)

tlively accepted this revision.May 27 2021, 12:37 PM

LGTM with that one last comment on the test.

llvm/test/CodeGen/WebAssembly/ir-locals.ll
18–21

It might be good to get the value from an external function call rather than from an argument to prevent this transformation from happening.

This revision is now accepted and ready to land.May 27 2021, 12:37 PM
wingo added inline comments.May 28 2021, 2:05 AM
llvm/test/CodeGen/WebAssembly/ir-locals.ll
18–21

Tx for feedback. Actually it doesn't really matter where the value comes from; store-to-load forwarding happens regardless. If it came from a call it would be allocated a local via the explicit locals pass, as a value with more than one use.

However I was able to inhibit this transformation with an opaque call, so I'll do that.

wingo updated this revision to Diff 348467.May 28 2021, 2:09 AM

Adapt test to insert compiler barrier.

This revision was landed with ongoing or failed builds.May 28 2021, 2:10 AM
This revision was automatically updated to reflect the committed changes.
wingo reopened this revision.May 28 2021, 4:29 AM
This revision is now accepted and ready to land.May 28 2021, 4:29 AM
wingo updated this revision to Diff 348493.May 28 2021, 4:29 AM

Fix build for RISC-V and AMDGPU.

Is it just me or does having a backend-specific type in target-independent code feel wrong? It feels like there should be a space for target-specific TargetStackIDs...

wingo added a comment.May 28 2021, 6:19 AM

Is it just me or does having a backend-specific type in target-independent code feel wrong? It feels like there should be a space for target-specific TargetStackIDs...

Hoo, good question. More support for target-specific handling of stack IDs would be great. However in this case the concept is not purely wasm-specific; I can imagine other targets that might have a similar treatment of locals (if we had a .net target, or a jvm target, or so). The idea is that there is a separate stack consisting of named locals that may not be addressable by pointers to main memory. In earlier drafts of this patch the name was more generic ("Object", then "Managed") but you know, our words in this area are quite overloaded. So instead I went with something quite specific (WasmLocal) to avoid the general question -- but I do think the concept is not specific, even if it doesn't apply to any other target currently in tree. WDYT?

Is it just me or does having a backend-specific type in target-independent code feel wrong? It feels like there should be a space for target-specific TargetStackIDs...

Hoo, good question. More support for target-specific handling of stack IDs would be great. However in this case the concept is not purely wasm-specific; I can imagine other targets that might have a similar treatment of locals (if we had a .net target, or a jvm target, or so). The idea is that there is a separate stack consisting of named locals that may not be addressable by pointers to main memory. In earlier drafts of this patch the name was more generic ("Object", then "Managed") but you know, our words in this area are quite overloaded. So instead I went with something quite specific (WasmLocal) to avoid the general question -- but I do think the concept is not specific, even if it doesn't apply to any other target currently in tree. WDYT?

Well, except all the logic for it is in the backend, only the parser and the definition are in target-independent code, so even if another backend were to reuse it it would have its own completely separate logic for it, and thus there's no benefit to reusing a shared name for the thing over each target defining its own? Or would some of the code in the wasm backend be refactored out into CodeGen?

wingo added a comment.May 31 2021, 1:17 AM

Is it just me or does having a backend-specific type in target-independent code feel wrong? It feels like there should be a space for target-specific TargetStackIDs...

Hoo, good question. More support for target-specific handling of stack IDs would be great. However in this case the concept is not purely wasm-specific; I can imagine other targets that might have a similar treatment of locals (if we had a .net target, or a jvm target, or so). The idea is that there is a separate stack consisting of named locals that may not be addressable by pointers to main memory. In earlier drafts of this patch the name was more generic ("Object", then "Managed") but you know, our words in this area are quite overloaded. So instead I went with something quite specific (WasmLocal) to avoid the general question -- but I do think the concept is not specific, even if it doesn't apply to any other target currently in tree. WDYT?

Well, except all the logic for it is in the backend, only the parser and the definition are in target-independent code, so even if another backend were to reuse it it would have its own completely separate logic for it, and thus there's no benefit to reusing a shared name for the thing over each target defining its own? Or would some of the code in the wasm backend be refactored out into CodeGen?

This is the case for all TargetStackID enumerated values except TargetStackID::Default -- ScalableVector and SGPRSpill have 0 uses outside target backends, though the ScalableVector name is used in RISC-V and AMDGPU. NoAlloc has one mention outside the target backends but no real logic. Still, as a reader who hacks neither on RISC-V nor AMDGPU I find it useful that they share the same stack ID definition for scalable vectors -- lets me compare things a bit easier.

If there is a change to be made in this area (moving these enumerated values elsewhere) it would have to be in a followup I think and would need consultation with the other backends that use these values.

This revision was landed with ongoing or failed builds.May 31 2021, 1:42 AM
This revision was automatically updated to reflect the committed changes.
wingo reopened this revision.May 31 2021, 1:56 AM

Yarrrgh, broke the shared library build. Fix incoming...

This revision is now accepted and ready to land.May 31 2021, 1:56 AM
wingo updated this revision to Diff 348758.May 31 2021, 3:17 AM

Fix shared-library build by moving util function to WebAssemblyFrameLowering.

wingo added a comment.May 31 2021, 3:18 AM

Quick r? to @tlively -- OK with having the new util function in WebAssemblyFrameLowering? It didn't work to have it in the utils library because the function needs the vtable of WebAssemblyFunctionInfo.

tlively accepted this revision.May 31 2021, 9:59 PM

Quick r? to @tlively -- OK with having the new util function in WebAssemblyFrameLowering? It didn't work to have it in the utils library because the function needs the vtable of WebAssemblyFunctionInfo.

Yes, this sounds reasonable to me.

And regarding the target-specific additions to target-independent code, this addition also seems similar to how we have target-specific MVTs in the target-independent MVT list. Being able to add names (but usually not logic) for target-specific concepts to target-independent namespaces makes it much simpler to handle target-independent and target-specific things in a uniform way in the backends.

This revision was landed with ongoing or failed builds.Jun 1 2021, 2:34 AM
This revision was automatically updated to reflect the committed changes.