This polymorphic type accepts any WebAssembly reference type, at the
moment externref or funcref.
Details
- Reviewers
tlively
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
@tlively This is early days of the intrinsics implementation, so it's not ready for review.
However, I wanted to share to ensure that the progress of the implementation is public and to avoid going into a design dead end.
At the moment, I am fighting tablegen, which doesn't like the table.grow pattern due to:
Type set is empty for each HW mode: possible type contradiction in the pattern below (use -print-records with llvm-tblgen to see all expanded records).
I am still investigating, but the overall implementation will allow us to use table.grow and table.size at the IR level through the intrinsics. A few notes:
- As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.
- An intrinsic for ref.null is required so that we can create a constant to pass to table.grow to initialize the new slots on a table but also for other general uses.
- In order to specify the types (params and return) in the intrinsics definition, I needed to add a bunch of definitions for llvm_externref_ty and llvm_funcref_ty and plug them the correct types in other enumerations.
Hoping I can make peace with TableGen and sort the current problem. I feel like this should be relatively straightforward once I pass the TableGen hurdle.
P.S.: Pay no attention to the test llvm/test/CodeGen/WebAssembly/table-grow.ll whose contents are not there yet, and was committed by mistake.
Could we bypass/delay a lot of this complexity by specifying the initial version of intrinsic for table.grow as "grow_table_with_null" .. then we don't need ref.null intrinsic or these new types? (at least not yet).
If the motivation is to delay the complexity, then just having grow_table_with_null wouldn't suffice because we would still need the new types to define the table.size and table.grow param types. I also worry that doing something half-baked now will come to haunt us later if we then do it "properly" and have to keep old intrinsics for the sake of backwards compatibility.
Isn't "got with null" likely to be by far the most likely thing to be used? I'm tempted to call that __table_grow and go with __table_grow_with_default_elem for the less-used one.
Intrinsics actually can be polymorphic. For an example, see the definition of int_wasm_sub_sat_signed, which is polymorphic over all vector types (although not all vector types are actually supported in the backend). I don't know how much work it would be to support that kind of polymorphism for reference types, though.
- An intrinsic for ref.null is required so that we can create a constant to pass to table.grow to initialize the new slots on a table but also for other general uses.
Would it be more trouble than it's worth to try to get a constant zero to mean ref.null?
You mean the polymorphism achieve through llvm_anyvector_ty? So I guess the way to go about it would be to try to add a wasm_anyreftype_ty... I will take a look.
Yes, exactly. There's also the even more permissive llvm_any_ty. It looks like this would involve updating llvm/include/llvm/CodeGen/ValueTypes.td and llvm/include/llvm/Support/MachineValueType.h to add a new rAny value type that would be used by tablegen in llvm/utils/TableGen/{CodeGenDAGPatterns,CodeGenTarget,IntrinsicEmitter}.cpp.
That makes sense. Thanks for the pointers.
I am sort of going the rabbit hole of trying to understand how far I need to go to have llvm_externref_ty and llvm_funcref_ty in an intrinsic. This seems to require changes in a wide range of files - assuming it would be similar (if not worse) with a polymorphic type. Unfortunately, it is still now compiling as there are further changes required.
This feels like going into the rabbit hole of adding a new type as mentioned in https://www.llvm.org/docs/ExtendingLLVM.html#adding-a-new-type
Unsure this is necessary, but it feels like it might be if we need to specify a reference type for the intrinsic. On the other hand it's not great as I am starting to touch things related to bc encoding for the new ref types.
I keep thinking that we should be able to model these as pointers to something instead of new types, however the llvm_ptr_ty doesn't seem to allow specifying the pointee type. I will continue an investigation in this direction.
Implements the intrinsic generating ref.null.
The test ref-null.ll passes but the implementation raised questions on how funcref is represented.
Because (apparently) IITDescriptor didn't include the ability to represent functions, I added a
straightforward way in the current patch but it's definitely a draft version that provides food
for thought.
Also raises questions on how to represent funcrefs that represent different FunctionTypes, i.e.
different parameter count/type and different return type.
llvm/include/llvm/IR/Intrinsics.h | ||
---|---|---|
144 | Both externref and funcref are modeled as pointers, so I'm surprised we have to add a new type here. It looks like there is already a field in the following union for Pointer_AddressSpace, so hopefully that would be sufficient? | |
llvm/lib/CodeGen/ValueTypes.cpp | ||
207 ↗ | (On Diff #380716) | Maybe we should go along with the transition to opaque pointers and do PointerType::get(Context, 20)? If so, we should probably do the same for externref. |
llvm/lib/CodeGen/ValueTypes.cpp | ||
---|---|---|
207 ↗ | (On Diff #380716) | Tbh, I thought I was using that here when doing PointerType::get(0, 20), but you're right. I need to do PointerType::get(Context, 20). When I do so, Iearn that I need to enable opaque pointers mode through Context.enableOpaquePointers(). Curiously, nobody calls that function at the moment, but I have now done so to understand the consequences for us and what kind of pandora's box I am opening. |
llvm/lib/CodeGen/ValueTypes.cpp | ||
---|---|---|
207 ↗ | (On Diff #380716) | Oh hmm, that doesn't sound like something we want to do. Context.enableOpaquePointers() sounds like some sort of global configuration that we don't want to be responsible for here. I think the PointerType::get(0, 20) is interpreted as passing nullptr as the Type* parameter. |
The current patch pushes the implementation forward.
It currently implements intrinsics for table.size and ref.null as well as all the dependencies of this, the largest of which is a new polymorphic type llvm_anyref_ty.
There are a few niggles that need to be fixed and discussed, as well as the final implementation of table.grow intrinsic before a final patch is presented for review.
.gitignore | ||
---|---|---|
63 ↗ | (On Diff #385437) | Ignore this for now... local change committed by mistake. |
llvm/include/llvm/IR/Intrinsics.h | ||
162 | This single change caused me to lose countless hours, until I understood that in IntrinsicsEmitter.cpp, the order of the cases statement for rAny depends on the order of this. I added some comments in IntrinsicsEmitter.cpp to ensure others don't lose the same amount of time. However, this (the comments) could be a separate patch. | |
llvm/include/llvm/IR/Intrinsics.td | ||
238 | Maybe this could be a separate patch - adding the new polymorphic type, what do you think? | |
llvm/include/llvm/IR/Type.h | ||
80 | I would appreciate some suggestions here. The predicate to recognize reference types, which is required by the polymorphic type llvm_anyref_ty is currently defined in WebAssemblyUtils. However, we need it to be in Type.h. What can we do here? It feels like pulling all this target dependent stuff into here is just not ideal. Minimally, we could include the WebAssemblyUtils header here and create a wrapper for these predicates in Type. But that still involves including a target dependent header here. Any suggestions? | |
llvm/lib/CodeGen/ValueTypes.cpp | ||
207 ↗ | (On Diff #380716) | I took the opaque pointer stuff a bit too far and got stuck in a few places where LLVM still looks for the pointee type and crashed, which should be fixed but felt like I was diving into a pandora's box. After discussing with @wingo he suggested using pointers to i8, which works... |
llvm/include/llvm/CodeGen/ValueTypes.td | ||
---|---|---|
240 | Hmm, I wonder what "current pointer size" means here. The contrast between this and iPTRAny makes me think that there should be a way to control the address space of iPTR arguments. Maybe it's the address space attached to the call instruction for the intrinsic? If we can find a target-independent way to pass pointers to particular address spaces to intrinsics, that seems like it would be a better solution that putting specific knowledge of WebAssembly address spaces into IR/Type.h. This also might be worth asking about on llvm-dev. | |
llvm/lib/CodeGen/ValueTypes.cpp | ||
207 ↗ | (On Diff #380716) | Using pointers to i8 seems fine to me. Another option would be using pointers to pointers to opaque structs like we do (IIRC) with externrefs. Or maybe externrefs should also be pointers to i8. It would nice to be consistent between them if funcrefs can't just be function pointers. |
llvm/include/llvm/CodeGen/ValueTypes.td | ||
---|---|---|
240 | Good point. I will take a look at this and if I can't figure it out in a reasonable amount of time, I will ask llvm-dev. | |
llvm/lib/CodeGen/ValueTypes.cpp | ||
207 ↗ | (On Diff #380716) | That's what we used to do but it doesn't work because the intrinsic validator checks that the type defined for funcref matches the type of the argument / return value of the intrinsic, so you cannot define funcref to be a pointer to opaque and then define the intrinsic as getting a pointer to function. |
llvm/include/llvm/CodeGen/ValueTypes.td | ||
---|---|---|
240 | hummm, I think this might not be that useful, a quick glance at the code shows: // An int value the size of the pointer of the current // target to any address space. This must only be used internal to // tblgen. Other than for overloading, we treat iPTRAny the same as iPTR. iPTRAny = 250, and // An int value the size of the pointer of the current // target. This should only be used internal to tblgen! iPTR = 254, I think that this is independent of address space. iPTRAny is barely used in code. I think for most purposes it's identical to iPTR. I will discuss this problem in llvm-dev. |
llvm/include/llvm/CodeGen/ValueTypes.td | ||
---|---|---|
240 | I brought this issue up here: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153640.html |
Update the revision to contain only the changes relevant to the implementation of the new polymorphic type llvm_anyref_ty.
llvm/include/llvm/IR/Intrinsics.td | ||
---|---|---|
171–175 | Instead of having to build Wasm-specific knowledge into so many new parts of LLVM, can we instead use LLVMQualPointerType? I guess this wouldn't allow us to define polymorphic intrinsics after all, though. As long as we have just externref and funcref, that might be fine, but do we know whether we will be adding more reference types in the future? The more we will need to add, the more amenable to these changes I would be. One option is to not land this change now, but keep it in our back pocket for if/when we need to add more reference types in the future. | |
llvm/include/llvm/Support/MachineValueType.h | ||
314 | This should also have the This is only for tblgen's consumption! warning. |
@tlively Thanks for the comments, I agree that it might be too much hassle to define a polymorphic type to just range over the current two available ref types, so I am dropping this revision. I have a revised patch with intrinsics implementation without the need for a polymorphic type that I will push to D113420.
Hmm, I wonder what "current pointer size" means here. The contrast between this and iPTRAny makes me think that there should be a way to control the address space of iPTR arguments. Maybe it's the address space attached to the call instruction for the intrinsic? If we can find a target-independent way to pass pointers to particular address spaces to intrinsics, that seems like it would be a better solution that putting specific knowledge of WebAssembly address spaces into IR/Type.h. This also might be worth asking about on llvm-dev.