This is an archive of the discontinued LLVM Phabricator instance.

Implementation of new polymorphic type llvm_anyref_ty
AbandonedPublic

Authored by pmatos on Oct 6 2021, 6:19 AM.

Details

Reviewers
tlively
Summary

This polymorphic type accepts any WebAssembly reference type, at the
moment externref or funcref.

Diff Detail

Event Timeline

pmatos created this revision.Oct 6 2021, 6:19 AM
pmatos requested review of this revision.Oct 6 2021, 6:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2021, 6:19 AM
pmatos added a comment.Oct 6 2021, 6:28 AM

@tlively This is early days of the intrinsics implementation, so it's not ready for review.
However, I wanted to share to ensure that the progress of the implementation is public and to avoid going into a design dead end.

At the moment, I am fighting tablegen, which doesn't like the table.grow pattern due to:

Type set is empty for each HW mode:
possible type contradiction in the pattern below (use -print-records with llvm-tblgen to see all expanded records).

I am still investigating, but the overall implementation will allow us to use table.grow and table.size at the IR level through the intrinsics. A few notes:

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.
  • An intrinsic for ref.null is required so that we can create a constant to pass to table.grow to initialize the new slots on a table but also for other general uses.
  • In order to specify the types (params and return) in the intrinsics definition, I needed to add a bunch of definitions for llvm_externref_ty and llvm_funcref_ty and plug them the correct types in other enumerations.

Hoping I can make peace with TableGen and sort the current problem. I feel like this should be relatively straightforward once I pass the TableGen hurdle.

P.S.: Pay no attention to the test llvm/test/CodeGen/WebAssembly/table-grow.ll whose contents are not there yet, and was committed by mistake.

pmatos planned changes to this revision.Oct 6 2021, 6:49 AM
sbc100 added a comment.Oct 6 2021, 7:48 AM

@tlively This is early days of the intrinsics implementation, so it's not ready for review.
However, I wanted to share to ensure that the progress of the implementation is public and to avoid going into a design dead end.

At the moment, I am fighting tablegen, which doesn't like the table.grow pattern due to:

Type set is empty for each HW mode:
possible type contradiction in the pattern below (use -print-records with llvm-tblgen to see all expanded records).

I am still investigating, but the overall implementation will allow us to use table.grow and table.size at the IR level through the intrinsics. A few notes:

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.
  • An intrinsic for ref.null is required so that we can create a constant to pass to table.grow to initialize the new slots on a table but also for other general uses.
  • In order to specify the types (params and return) in the intrinsics definition, I needed to add a bunch of definitions for llvm_externref_ty and llvm_funcref_ty and plug them the correct types in other enumerations.

Hoping I can make peace with TableGen and sort the current problem. I feel like this should be relatively straightforward once I pass the TableGen hurdle.

P.S.: Pay no attention to the test llvm/test/CodeGen/WebAssembly/table-grow.ll whose contents are not there yet, and was committed by mistake.

Could we bypass/delay a lot of this complexity by specifying the initial version of intrinsic for table.grow as "grow_table_with_null" .. then we don't need ref.null intrinsic or these new types? (at least not yet).

Could we bypass/delay a lot of this complexity by specifying the initial version of intrinsic for table.grow as "grow_table_with_null" .. then we don't need ref.null intrinsic or these new types? (at least not yet).

If the motivation is to delay the complexity, then just having grow_table_with_null wouldn't suffice because we would still need the new types to define the table.size and table.grow param types. I also worry that doing something half-baked now will come to haunt us later if we then do it "properly" and have to keep old intrinsics for the sake of backwards compatibility.

Could we bypass/delay a lot of this complexity by specifying the initial version of intrinsic for table.grow as "grow_table_with_null" .. then we don't need ref.null intrinsic or these new types? (at least not yet).

If the motivation is to delay the complexity, then just having grow_table_with_null wouldn't suffice because we would still need the new types to define the table.size and table.grow param types. I also worry that doing something half-baked now will come to haunt us later if we then do it "properly" and have to keep old intrinsics for the sake of backwards compatibility.

Isn't "got with null" likely to be by far the most likely thing to be used? I'm tempted to call that __table_grow and go with __table_grow_with_default_elem for the less-used one.

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.

Intrinsics actually can be polymorphic. For an example, see the definition of int_wasm_sub_sat_signed, which is polymorphic over all vector types (although not all vector types are actually supported in the backend). I don't know how much work it would be to support that kind of polymorphism for reference types, though.

  • An intrinsic for ref.null is required so that we can create a constant to pass to table.grow to initialize the new slots on a table but also for other general uses.

Would it be more trouble than it's worth to try to get a constant zero to mean ref.null?

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.

Intrinsics actually can be polymorphic. For an example, see the definition of int_wasm_sub_sat_signed, which is polymorphic over all vector types (although not all vector types are actually supported in the backend). I don't know how much work it would be to support that kind of polymorphism for reference types, though.

You mean the polymorphism achieve through llvm_anyvector_ty? So I guess the way to go about it would be to try to add a wasm_anyreftype_ty... I will take a look.

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.

Intrinsics actually can be polymorphic. For an example, see the definition of int_wasm_sub_sat_signed, which is polymorphic over all vector types (although not all vector types are actually supported in the backend). I don't know how much work it would be to support that kind of polymorphism for reference types, though.

You mean the polymorphism achieve through llvm_anyvector_ty? So I guess the way to go about it would be to try to add a wasm_anyreftype_ty... I will take a look.

Yes, exactly. There's also the even more permissive llvm_any_ty. It looks like this would involve updating llvm/include/llvm/CodeGen/ValueTypes.td and llvm/include/llvm/Support/MachineValueType.h to add a new rAny value type that would be used by tablegen in llvm/utils/TableGen/{CodeGenDAGPatterns,CodeGenTarget,IntrinsicEmitter}.cpp.

pmatos updated this revision to Diff 380340.Oct 18 2021, 4:22 AM

Follow the breadcrumbs to define llvm_extenref_ty and llvm_funcref_ty

  • As far as I understand intrinsics cannot be polymorphic, therefore table.grow and ref.null intrinsics come in two versions: one for externref and one for funcref.

Intrinsics actually can be polymorphic. For an example, see the definition of int_wasm_sub_sat_signed, which is polymorphic over all vector types (although not all vector types are actually supported in the backend). I don't know how much work it would be to support that kind of polymorphism for reference types, though.

You mean the polymorphism achieve through llvm_anyvector_ty? So I guess the way to go about it would be to try to add a wasm_anyreftype_ty... I will take a look.

Yes, exactly. There's also the even more permissive llvm_any_ty. It looks like this would involve updating llvm/include/llvm/CodeGen/ValueTypes.td and llvm/include/llvm/Support/MachineValueType.h to add a new rAny value type that would be used by tablegen in llvm/utils/TableGen/{CodeGenDAGPatterns,CodeGenTarget,IntrinsicEmitter}.cpp.

That makes sense. Thanks for the pointers.
I am sort of going the rabbit hole of trying to understand how far I need to go to have llvm_externref_ty and llvm_funcref_ty in an intrinsic. This seems to require changes in a wide range of files - assuming it would be similar (if not worse) with a polymorphic type. Unfortunately, it is still now compiling as there are further changes required.

That makes sense. Thanks for the pointers.
I am sort of going the rabbit hole of trying to understand how far I need to go to have llvm_externref_ty and llvm_funcref_ty in an intrinsic. This seems to require changes in a wide range of files - assuming it would be similar (if not worse) with a polymorphic type. Unfortunately, it is still now compiling as there are further changes required.

This feels like going into the rabbit hole of adding a new type as mentioned in https://www.llvm.org/docs/ExtendingLLVM.html#adding-a-new-type
Unsure this is necessary, but it feels like it might be if we need to specify a reference type for the intrinsic. On the other hand it's not great as I am starting to touch things related to bc encoding for the new ref types.

That makes sense. Thanks for the pointers.
I am sort of going the rabbit hole of trying to understand how far I need to go to have llvm_externref_ty and llvm_funcref_ty in an intrinsic. This seems to require changes in a wide range of files - assuming it would be similar (if not worse) with a polymorphic type. Unfortunately, it is still now compiling as there are further changes required.

This feels like going into the rabbit hole of adding a new type as mentioned in https://www.llvm.org/docs/ExtendingLLVM.html#adding-a-new-type
Unsure this is necessary, but it feels like it might be if we need to specify a reference type for the intrinsic. On the other hand it's not great as I am starting to touch things related to bc encoding for the new ref types.

I keep thinking that we should be able to model these as pointers to something instead of new types, however the llvm_ptr_ty doesn't seem to allow specifying the pointee type. I will continue an investigation in this direction.

pmatos updated this revision to Diff 380716.Oct 19 2021, 9:52 AM

Implements the intrinsic generating ref.null.

The test ref-null.ll passes but the implementation raised questions on how funcref is represented.
Because (apparently) IITDescriptor didn't include the ability to represent functions, I added a
straightforward way in the current patch but it's definitely a draft version that provides food
for thought.

Also raises questions on how to represent funcrefs that represent different FunctionTypes, i.e.
different parameter count/type and different return type.

tlively added inline comments.Oct 19 2021, 6:24 PM
llvm/include/llvm/IR/Intrinsics.h
144

Both externref and funcref are modeled as pointers, so I'm surprised we have to add a new type here. It looks like there is already a field in the following union for Pointer_AddressSpace, so hopefully that would be sufficient?

llvm/lib/CodeGen/ValueTypes.cpp
207

Maybe we should go along with the transition to opaque pointers and do PointerType::get(Context, 20)? If so, we should probably do the same for externref.

pmatos added inline comments.Oct 20 2021, 7:09 AM
llvm/lib/CodeGen/ValueTypes.cpp
207

Tbh, I thought I was using that here when doing PointerType::get(0, 20), but you're right. I need to do PointerType::get(Context, 20). When I do so, Iearn that I need to enable opaque pointers mode through Context.enableOpaquePointers().

Curiously, nobody calls that function at the moment, but I have now done so to understand the consequences for us and what kind of pandora's box I am opening.

tlively added inline comments.Oct 20 2021, 12:01 PM
llvm/lib/CodeGen/ValueTypes.cpp
207

Oh hmm, that doesn't sound like something we want to do. Context.enableOpaquePointers() sounds like some sort of global configuration that we don't want to be responsible for here.

I think the PointerType::get(0, 20) is interpreted as passing nullptr as the Type* parameter.

pmatos updated this revision to Diff 385437.Nov 8 2021, 3:28 AM

The current patch pushes the implementation forward.

It currently implements intrinsics for table.size and ref.null as well as all the dependencies of this, the largest of which is a new polymorphic type llvm_anyref_ty.

There are a few niggles that need to be fixed and discussed, as well as the final implementation of table.grow intrinsic before a final patch is presented for review.

pmatos added inline comments.Nov 8 2021, 3:43 AM
.gitignore
63

Ignore this for now... local change committed by mistake.

llvm/include/llvm/IR/Intrinsics.h
162

This single change caused me to lose countless hours, until I understood that in IntrinsicsEmitter.cpp, the order of the cases statement for rAny depends on the order of this. I added some comments in IntrinsicsEmitter.cpp to ensure others don't lose the same amount of time. However, this (the comments) could be a separate patch.

llvm/include/llvm/IR/Intrinsics.td
238

Maybe this could be a separate patch - adding the new polymorphic type, what do you think?

llvm/include/llvm/IR/Type.h
80

I would appreciate some suggestions here. The predicate to recognize reference types, which is required by the polymorphic type llvm_anyref_ty is currently defined in WebAssemblyUtils. However, we need it to be in Type.h. What can we do here? It feels like pulling all this target dependent stuff into here is just not ideal. Minimally, we could include the WebAssemblyUtils header here and create a wrapper for these predicates in Type. But that still involves including a target dependent header here. Any suggestions?

llvm/lib/CodeGen/ValueTypes.cpp
207

I took the opaque pointer stuff a bit too far and got stuck in a few places where LLVM still looks for the pointee type and crashed, which should be fixed but felt like I was diving into a pandora's box.

After discussing with @wingo he suggested using pointers to i8, which works...
Although I feel the cleaner solution is opaque pointers, I am unsure we should be the first users of opaque pointers in LLVM at the moment. Therefore funcrefs are now pointers to i8 which are cast to the respective type when used.

pmatos updated this revision to Diff 385498.Nov 8 2021, 7:49 AM

Implement table.grow intrinsic

tlively added inline comments.Nov 8 2021, 8:47 PM
llvm/include/llvm/CodeGen/ValueTypes.td
240

Hmm, I wonder what "current pointer size" means here. The contrast between this and iPTRAny makes me think that there should be a way to control the address space of iPTR arguments. Maybe it's the address space attached to the call instruction for the intrinsic? If we can find a target-independent way to pass pointers to particular address spaces to intrinsics, that seems like it would be a better solution that putting specific knowledge of WebAssembly address spaces into IR/Type.h. This also might be worth asking about on llvm-dev.

llvm/lib/CodeGen/ValueTypes.cpp
207

Using pointers to i8 seems fine to me. Another option would be using pointers to pointers to opaque structs like we do (IIRC) with externrefs. Or maybe externrefs should also be pointers to i8. It would nice to be consistent between them if funcrefs can't just be function pointers.

pmatos added inline comments.Nov 9 2021, 7:51 AM
llvm/include/llvm/CodeGen/ValueTypes.td
240

Good point. I will take a look at this and if I can't figure it out in a reasonable amount of time, I will ask llvm-dev.

llvm/lib/CodeGen/ValueTypes.cpp
207

That's what we used to do but it doesn't work because the intrinsic validator checks that the type defined for funcref matches the type of the argument / return value of the intrinsic, so you cannot define funcref to be a pointer to opaque and then define the intrinsic as getting a pointer to function.

pmatos added inline comments.Nov 9 2021, 7:55 AM
llvm/include/llvm/CodeGen/ValueTypes.td
240

hummm, I think this might not be that useful, a quick glance at the code shows:

// An int value the size of the pointer of the current
// target to any address space. This must only be used internal to
// tblgen. Other than for overloading, we treat iPTRAny the same as iPTR.
iPTRAny        = 250,

and

// An int value the size of the pointer of the current
// target.  This should only be used internal to tblgen!
iPTR           = 254,

I think that this is independent of address space. iPTRAny is barely used in code. I think for most purposes it's identical to iPTR. I will discuss this problem in llvm-dev.

pmatos added inline comments.Nov 10 2021, 2:01 AM
llvm/include/llvm/CodeGen/ValueTypes.td
240
pmatos updated this revision to Diff 386829.Nov 12 2021, 6:13 AM

Update the revision to contain only the changes relevant to the implementation of the new polymorphic type llvm_anyref_ty.

pmatos retitled this revision from [WebAssembly] Implementation of table.grow/size and ref.null intrinsics to Implementation of new polymorphic type llvm_anyref_ty.Nov 12 2021, 6:14 AM
pmatos edited the summary of this revision. (Show Details)
tlively added inline comments.Nov 14 2021, 10:20 PM
llvm/include/llvm/IR/Intrinsics.td
171–175

Instead of having to build Wasm-specific knowledge into so many new parts of LLVM, can we instead use LLVMQualPointerType? I guess this wouldn't allow us to define polymorphic intrinsics after all, though. As long as we have just externref and funcref, that might be fine, but do we know whether we will be adding more reference types in the future? The more we will need to add, the more amenable to these changes I would be. One option is to not land this change now, but keep it in our back pocket for if/when we need to add more reference types in the future.

llvm/include/llvm/Support/MachineValueType.h
314

This should also have the This is only for tblgen's consumption! warning.

pmatos abandoned this revision.Nov 18 2021, 10:05 AM

@tlively Thanks for the comments, I agree that it might be too much hassle to define a polymorphic type to just range over the current two available ref types, so I am dropping this revision. I have a revised patch with intrinsics implementation without the need for a polymorphic type that I will push to D113420.