Page MenuHomePhabricator

Implementation of global.get/set for reftypes in LLVM IR
Needs ReviewPublic

Authored by pmatos on Jan 26 2021, 12:50 AM.

Details

Summary

This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Diff Detail

Unit TestsFailed

TimeTest
300 msx64 debian > LLVM.CodeGen/AArch64::dag-combine-insert-subvector.ll
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/dag-combine-insert-subvector.ll -o /dev/null
410 msx64 debian > LLVM.CodeGen/AArch64::inline-asm-constraints-bad-sve.ll
Script: -- : 'RUN: at line 1'; not /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -mtriple=aarch64-none-linux-gnu -mattr=+sve -o - /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
290 msx64 debian > LLVM.CodeGen/AArch64::named-vector-shuffles-sve.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -verify-machineinstrs < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll
320 msx64 debian > LLVM.CodeGen/AArch64::spillfill-sve.ll
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -mtriple=aarch64-none-linux-gnu -mattr=+sve -mattr=+bf16 < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/spillfill-sve.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/spillfill-sve.ll
400 msx64 debian > LLVM.CodeGen/AArch64::split-vector-insert.ll
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc < /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/split-vector-insert.ll -debug-only=legalize-types 2>&1 | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/CodeGen/AArch64/split-vector-insert.ll --check-prefix=CHECK-LEGALIZATION
View Full Test Results (301 Failed)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
pmatos updated this revision to Diff 337143.Apr 13 2021, 7:26 AM

Clarify many of the issues requested in previous review comments.

pmatos updated this revision to Diff 337379.Apr 14 2021, 2:27 AM

Fix table name for funcref calls

Are there any further comments to this patch before landing?

wingo added inline comments.Mon, Apr 19, 2:24 AM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1449

Given that this node defines one value, the chain, should this be MVT::other?

sbc100 added inline comments.Mon, Apr 19, 9:41 AM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
120 ↗(On Diff #337379)

The other names in this file seem to either use UPPER_SNAKE_CASE or CamelCase. Any reason these new one use lower_snake_case?

llvm/test/CodeGen/WebAssembly/externref-inttoptr.ll
12

Missing newline at EOF here and in externref-ptrtoint.ll

I'm still not comfortable approving this patch with changes to llvm-c/, Bitcode/, or IR/. If those changes can't be backed out, this will require an RFC on llvm-dev, but I would be surprised if that were necessary.

I'm still not comfortable approving this patch with changes to llvm-c/, Bitcode/, or IR/. If those changes can't be backed out, this will require an RFC on llvm-dev, but I would be surprised if that were necessary.

I understand - I will look into a workaround to avoid the call of getTypeForEVT I mentioned earlier on.

Apparently getTypeForEVT in LowerArguments is only necessary to call the getABIAlignmentForCallingConv. I was thinking that since reference types really are pointers to different address spaces we could return their TypeForEVT as being int32. That will then return as the alignment for reference types the default alignment for int32. This should help with avoiding us touching middle end code. I will experiment in this direction.

Apparently getTypeForEVT in LowerArguments is only necessary to call the getABIAlignmentForCallingConv. I was thinking that since reference types really are pointers to different address spaces we could return their TypeForEVT as being int32. That will then return as the alignment for reference types the default alignment for int32. This should help with avoiding us touching middle end code. I will experiment in this direction.

Why not return the original pointer to opaque type, i.e. PointerType::get(StructType::create(Context), 1)?

pmatos updated this revision to Diff 340553.Mon, Apr 26, 9:14 AM

Use PointerType to avoid defining types for externref and funcref.

pmatos added inline comments.Mon, Apr 26, 11:59 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
120 ↗(On Diff #337379)

@sbc100 I think it's quite mixed in this file at least. In the beginning of the file we have def regPlusImm followed by def or_is_add. These are for PatFrags. I modelled these after @tlively 's SIMD SDTypeProfile that uses: def wasm_load_splat_t for example.

pmatos updated this revision to Diff 340734.Tue, Apr 27, 12:03 AM

Take care of some syntax linting/style comments

pmatos updated this revision to Diff 340740.Tue, Apr 27, 12:27 AM
pmatos marked an inline comment as done.

Add missing newlines at the end of test files

There are still two things missing:

  1. Clear the table entry after the call_indirect to avoid hidden GC roots as mentioned by @wingo
  2. Understand how to handle a ret %extern undef - comment by @tlively
wingo added inline comments.Tue, Apr 27, 12:38 AM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
69

should be just "if"

1124

Should this be TABLE_SET_FUNCREF and not just TABLE_SET ? Given that we're producing a DAG node that then needs to be selected, I would think the type-specific stuff could happen in the pattern-matching.

1449

Pretty sure this comment still applies

pmatos added inline comments.Tue, Apr 27, 12:39 AM
llvm/test/CodeGen/WebAssembly/externref-undef.ll
14–15

Returning a ref.null or an uninitialized externref would make more sense if the return type would be %externref. However, in this case this is an %extern value, which really is an opaque type and should never really happen.

pmatos updated this revision to Diff 340750.Tue, Apr 27, 1:12 AM

Fix else if and VT type as suggested by @wingo.

pmatos marked 3 inline comments as done.Tue, Apr 27, 1:15 AM
pmatos added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1124

In this case, I think we cannot generate a TABLE_SET. I think the TABLE_SET you are referring to is the machine instruction defined in the backend which is not matched from here. From here, you can only match the nodes you create and that's WebAssemblyISD::TABLE_SET_FUNCREF. I am still not totally clear why this is, but I never had any luck generating the machine instructions defined in the backend directly from lowering so there might be a way but it's not straightforward. So we generate here the custom machine node we created and then we pattern match that in tablegen to generate the instruction.

1449

Sorry - I missed this. You're right, this should be MVT::Other.

pmatos updated this revision to Diff 340787.Tue, Apr 27, 3:59 AM
pmatos marked an inline comment as done.

Make TABLE_SET node polymorphic. Deal with types through the Pats. One for funcref and one for externref.

pmatos added inline comments.Tue, Apr 27, 7:41 AM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1124

Apologies I misunderstood your comment. We can make the node polymorphic and deal with the types in the Pat Frags. I will do that next.

This is looking really good!

llvm/lib/CodeGen/MachineOperand.cpp
1174

Is there a test that demonstrates this printing? Also, I'm not sure specifically calling out zero-sized operands in the text dump is that useful. Can zero-sized operands still be given an alignment? If so, it might even be good to continue printing the alignment for them.

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
132–134

Is this necessary given that the symbol is set to be undefined? Perhaps it would be better to make it defined here and also set comdat so that the linker doesn't need to do anything special to make sure it exists in the final binary. (Or possibly I'm misunderstanding what these settings mean!) @sbc100 wdyt?

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1349

It would be good to mark these helper functions static.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
59–61

Why are the address spaces for globals integral? It doesn't make sense to do arithmetic on the "address" of a global afaict. Does using getelementptr to index tables in the global address space require the address space to be integral? If so, that might be a good reason to make the address space integral, but it might also be a good reason to put tables and globals in different address spaces instead. Also, IMO it would be better to use just a single address space for all globals; that way we could support globals with other types like i32, i64, etc. without even more address spaces.

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td
1 ↗(On Diff #340787)

Unintentional whitespace change?

llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
129

This check should incorporate the CPU as well using getSubtargetImpl(CPU, FS)->hasReferenceTypes().

llvm/test/CodeGen/WebAssembly/externref-undef.ll
14–15

Ah, that makes sense.

sbc100 added inline comments.Wed, Apr 28, 3:53 PM
llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
132–134

I think don't want this here. Firstly I think this will mark the symbol as being a comdat group symbol .. which is not the same thing as being part of a comdat group ( which I think is that is indented here). Secondly, I don't think it makes sense for an undefined symbol to be part of a comdat group. I think what you want here, if the symbol was defined is setWeak(). Given that is not defined I dont think you even want that.

wingo added inline comments.Thu, Apr 29, 8:24 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4107

Does this path actually fire for you? In my local tests, if you are loading from EXTERNREF_GLOBAL, then the EltVT ends up being i32.

pmatos updated this revision to Diff 342705.Tue, May 4, 5:40 AM

Rebase this on D101608 from @wingo and clear table slot after call

pmatos updated this revision to Diff 342708.Tue, May 4, 5:58 AM

Fix patch submitted to phab...

Herald added a project: Restricted Project. · View Herald TranscriptTue, May 4, 5:58 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
pmatos updated this revision to Diff 342712.Tue, May 4, 6:11 AM

Fix patch set... again!

pmatos updated this revision to Diff 342718.Tue, May 4, 6:55 AM

Fix broken merge of table ins reordering commit

pmatos updated this revision to Diff 342729.Tue, May 4, 7:29 AM

Removed extraneous changes to previous patches

wingo added inline comments.Tue, May 4, 7:46 AM
clang/lib/Basic/Targets/WebAssembly.h
150 ↗(On Diff #342708)

This change is already in D101608; rebase?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
4083

It turns out that all the changes here to visitLoad and visitStore are no longer needed. The issue was that before we made getPointerMemTy virtual, it was hard-coded to return an integer of whatever the pointer bit size was. But for externref and funcref values, where we're using pointers in non-default address spaces to represent opaque values, that's not what we want: an externref "in memory" isn't an i32. Having made getPointerMemTy virtual and returning externref / funcref for those values, now we avoid the zero-extension paths, and even the ISD::ADD node doesn't cause us problems.

4252

Can revert this too, as mentioned above.

llvm/lib/CodeGen/TargetLoweringBase.cpp
1684

This one I would think that we want to return true, that the access is "fast", for access to zero-sized types.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
2346

I just checked and it would seem that the getPointerMemTy change means that this override no longer appears to be needed.

llvm/lib/Target/WebAssembly/WebAssemblyInstrTable.td
15

I think you may need hasSideEffects = 0 for these annotations to have an effect.

19

Not something for this patch, but this is certainly bogus: surely we mean table.get\t$table, $i and we have an i32 index argument?

59

Consider changing to use the approach used for GLOBAL_GET from the parent commit, and folding these into the multiclass so that we don't have to repeat the code for externref and funcref.

pmatos marked an inline comment as done.Wed, May 5, 1:30 AM
tlively added inline comments.Wed, May 5, 2:44 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrTable.td
15

I would be surprised if this were true!

19

Agree about the i32 index argument, but it is correct to have the result as part of the string for the register-based output format.

llvm/test/CodeGen/WebAssembly/externref-globalget.ll
5

I forget if we had or resolved this conversation elsewhere already, but I would expect clang to emit pointers in the correct address spaces, yes. The SVE folks had to go though multiple years of effort to add scalable vectors to LLVM IR because there was no existing way to model their semantics in IR such that they could implement all the vectorization optimizations they wanted. We are lucky that we can get away with using address space pointers directly without having to add new types to the IR.

pmatos marked 7 inline comments as done.Fri, May 7, 7:28 AM
pmatos added inline comments.
llvm/lib/CodeGen/MachineOperand.cpp
1174

The reason I changed this here is due to the call of getSize() crashing for zeroSized arguments. I have not added test for the printing. I don't think zero-sized operands can still be given an alignment. We are actually setting the alignment to

llvm/lib/Target/WebAssembly/Utils/WebAssemblyUtilities.cpp
132–134

It seems I was mistaken - I thought that by setting it as Comdat, the linker would merge symbols from different compilation units if they existed. Maybe what I need is Weak then.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
59–61

Current set of patches should have addressed your concerns here.

llvm/lib/Target/WebAssembly/WebAssemblyInstrTable.td
15

Why would this be the case? If I remember correctly, I added mayLoad and mayStore here so that lowering includes a chain. And this works without the need for hasSideEffects. Unless you think this is required for other reaons, but mayLoad works with it.

19

Not sure I understand why this should be $i instead of $table. Isn't $table represented as an index anyway? A table32_op is defined as Operand<i32> so I don't quite understand what we gain by changing this.

llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
129

What exactly do you mean here? Not sure what is the difference here. You mean adding something like : hasReferenceTypes(FS) || getSubtargetImpl(CPU, FS)->hasReferenceTypes()?

pmatos updated this revision to Diff 343685.Fri, May 7, 8:04 AM
  • Simplifies some of code in the middle-end.
  • Removes setComdat and adds setWeak

Small improvements to address comments in review.

pmatos updated this revision to Diff 343695.Fri, May 7, 8:48 AM

Making the linter happy...