This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Adding 64-bit versions of all load & store ops.
ClosedPublic

Authored by aardappel on May 28 2020, 4:45 PM.

Details

Summary

Context: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md
This is just a first step, adding the new instruction variants while keeping the existing 32-bit functionality working.
Some of the basic load/store tests have new wasm64 versions that show that the basics of the target are working.
Further features need implementation, but these will be added in followups to keep things reviewable.

Diff Detail

Event Timeline

aardappel created this revision.May 28 2020, 4:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2020, 4:45 PM

finished initial tablegen changes

aardappel updated this revision to Diff 267355.May 29 2020, 1:30 PM

fixed all type errors + more missing atomics

aardappel updated this revision to Diff 267412.May 29 2020, 4:43 PM

it builds!

aardappel updated this revision to Diff 267689.Jun 1 2020, 12:02 PM

Fixed test

aardappel updated this revision to Diff 267771.Jun 1 2020, 5:48 PM
aardappel edited the summary of this revision. (Show Details)

Made several tests pass under wasm64!

dschuff added inline comments.Jun 2 2020, 4:53 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
62

side note: since we are about to start depending even more on the asm syntax, and having users try to use it, it's probably getting to now-or-never on this fix.

llvm/test/CodeGen/WebAssembly/cpus.ll
4

probably not to fix in this CL, but triple=wasm64 and -mcpu=mvp probably shouldn't be allowed.

llvm/test/CodeGen/WebAssembly/load-ext.ll
5–6

should the datalayout be removed here too?
I think it does make sense for many of these simple tests to use the target-defined datalayout (i.e. use the same IR for both 32 and 64) but we should be a bit careful with that, given that the frontend is going to generate actually-different IR for some cases.

llvm/test/MC/WebAssembly/wasm64.s
3

Are we able to check the encoding of the instructions yet? See e.g. https://github.com/llvm/llvm-project/blob/master/llvm/test/MC/WebAssembly/atomics-encodings.s

aardappel marked 4 inline comments as done.Jun 4 2020, 8:49 AM
aardappel added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
62

Is this about swapping offset & align? I could do that in a next commit, though personally this doesn't sound important to me. @sbc100 opinion?

llvm/test/CodeGen/WebAssembly/cpus.ll
4

I was actually thinking about that, but looking at the code that deals with the cpu flags (in WebAssembly.cpp) and couldn't see a good place to error out on this. Maybe in setABI ?
Also, what should the default cpu be if someone selects wasm64 ?

llvm/test/CodeGen/WebAssembly/load-ext.ll
5–6

Yes, forgot to delete it here. I assumed it was better to rely on whatever is default for wasm32/64, but not familiar with under what conditions this could produce values other than the ones in datalayout.

llvm/test/MC/WebAssembly/wasm64.s
3

Yes, we should be able to add that to this test. Not sure what the point is though, since all load/store ops have exactly the same encoding as in wasm32.

Frankly, this test doesn't test anything, since the only thing that changes is that load/store now accept 64-bit operands. We need to test that those are being emitted by LLVM (which we don't test here), and that that combination is accepted by verifiers (which we also don't test here, the assembler doesn't verify types, and happily constructs illegal instruction sequences).

What is new here (and is currently commented out pending new reloc types) is i64.const label, but we can't check the encoding of that, since at the .o level it will just be all zeroes.

aardappel updated this revision to Diff 268496.Jun 4 2020, 8:52 AM

Removed one more datalayout

sbc100 added inline comments.Jun 4 2020, 11:50 AM
llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp
95

Might be worth a separate case for fixup_sleb128_i64 since we know they can only be memory addresses.

And we probably want R_WASM_MEMORY_ADDR_SLEB64 and R_WASM_MEMORY_ADDR_I64 as new relocation types.

dschuff added inline comments.Jun 4 2020, 1:45 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
62

Yes, it's about swapping offset and align. I don't have a strong opinion on that, other than that we should decide whether or not to do it, and remove/update the comment as applicable.

One argument for keeping things the same might be that offsets are much more common (at least when hand-writing) than alignments, and a format like <align>:<offset> might require an alignment when there otherwise might not need to be one.
Anyway, all of this dates back from when we just needed something "for now", even before we had the fully-stacked style and binary writer. So now's a good chance to revisit the design if we want.

llvm/test/CodeGen/WebAssembly/cpus.ll
4

yeah I guess the feature flags are a bit entangled in the CPU. If we can it would be good to error on any cpu setting (or feature flag setting) that doesn't support memory64.
my guess would be that the default cpu for wasm64 should just have the memory64 feature and no others. Not sure about the best place for the error (other than probably somewhere in clang). Perhaps @tlively has thought about this more recently than I have?

llvm/test/CodeGen/WebAssembly/load-ext.ll
5–6

In general everything should have an explicit datalayout chosen by the frontend (also the default datalayout for a target is AFAIK the same as what clang uses explicitly when there are no ABI-affecting cflags). But when there is no frontend and it's just tests, it gets a little more interesting, and it sometimes makes sense to have tests to double duty. I think it's fine to leave it out, as long as we are actually intending to run the test 2 different ways and we've verified that we are testing what we intend to test. Otherwise it should have an explicit datalayout just because it's a general default thing to do.

llvm/test/MC/WebAssembly/wasm64.s
3

Oh right I had forgotten that the encodings were all the same.

aardappel marked an inline comment as done.Jun 4 2020, 5:16 PM
aardappel added inline comments.
llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyWasmObjectWriter.cpp
95

Yup, doing that in the follow-up, if that's ok.

Question: Is this feature gonna be handled by -mtriple and not -mattrs like other features?

llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

Patterns and instruction definitions in WebAssemblyInstrMemory.td are defined as multiclasses to group A32 and A64 versions. I think it's better to be consistent here too..? The same for all patterns and defs for atomic.wait, atomic.notify and AStoreOffset.

llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
44

Nit: Can we set do reqs = [] as a default argument so we don't need [] in many cases below?

72

Requires only affects patterns and not instruction definitions. Not sure if it is Tablegen bug... So I think it's better to have Requires<HasAddr64> in patterns, here and others, like

multiclass LoadPatNoOffset<ValueType ty, PatFrag kind, string inst> {
  def : Pat<(ty (kind I32:$addr)), (!cast<NI>(inst # "_A32") 0, 0, I32:$addr)>;
  def : Pat<(ty (kind I64:$addr)), (!cast<NI>(inst # "_A64") 0, 0, I64:$addr)>, Requires<[HasAddr64]>;
}

The same for other patterns in this file, atomics and SIMD.

247

Don't we need reqs parameter and Requires<!listconcat(reqs, [HasAddr64])> here as in WebAssemblyLoad here?

But actually the funny thing is Requires doesn't work in instruction definitions.. Not sure if it's a Tablegen bug. We have them on definitions anyway, and I think it's ok, and and if we have them it's better to be consistent.

llvm/test/MC/WebAssembly/wasm64.s
107

If the only thing this file tests currently is whether LLVM accepts this file without an error, do we need to have these CHECK lines, which are the same as the original program? Roundtripping capability is tested in other files already. I might be mistaken, so please let me know if so.

aardappel marked 5 inline comments as done.Jun 8 2020, 9:30 AM

@aheejin yes, this is part of the triple, much like x86 vs x64. That structure was already in place before I started.

llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

I tried to stay close to the original definitions. I do not feel the amount of duplication here is excessive to warrant trying to factor things out into more common definitions, given that trying to abstract things in tablegen leads to less readable results than in a normal programming language.

llvm/test/MC/WebAssembly/wasm64.s
107

Yes, the value of these CHECKs is low at the moment, but they check a little bit more than just the instruction being accepted, since they convert the instruction back to text. There is nothing in here that likely isn't covered by other tests, but that is hopefully changing soon.

aardappel updated this revision to Diff 269267.Jun 8 2020, 9:31 AM

Added HasAddr64 to patterns, and made it default for instructions.

aheejin added inline comments.Jun 8 2020, 11:17 AM
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

I'm not sure the structure of original definition of these is different from that of InstrMemory.td. For ImmOff patterns, here we have

class WaitPatImmOff<ValueType ty, Intrinsic kind, PatFrag operand, NI inst> :    
  Pat<(i32 (kind (operand I32:$addr, imm:$off), ty:$exp, I64:$timeout)),         
      (inst 0, imm:$off, I32:$addr, ty:$exp, I64:$timeout)>;                     
def : WaitPatImmOff<i32, int_wasm_atomic_wait_i32, regPlusImm, ATOMIC_WAIT_I32>; 
def : WaitPatImmOff<i32, int_wasm_atomic_wait_i32, or_is_add, ATOMIC_WAIT_I32>;  
def : WaitPatImmOff<i64, int_wasm_atomic_wait_i64, regPlusImm, ATOMIC_WAIT_I64>; 
def : WaitPatImmOff<i64, int_wasm_atomic_wait_i64, or_is_add, ATOMIC_WAIT_I64>;

And in WebAssemblyInstrMemory.td we have

class LoadPatImmOff<ValueType ty, PatFrag kind, PatFrag operand, NI inst> :      
  Pat<(ty (kind (operand I32:$addr, imm:$off))), (inst 0, imm:$off, I32:$addr)>;                                                                             
def : LoadPatImmOff<i32, load, regPlusImm, LOAD_I32>;                            
def : LoadPatImmOff<i64, load, regPlusImm, LOAD_I64>;                            
def : LoadPatImmOff<f32, load, regPlusImm, LOAD_F32>;                            
def : LoadPatImmOff<f64, load, regPlusImm, LOAD_F64>;                            
...

And I don't think the amount of duplication is excessive either, but I suggested this more for consistency, because other patterns are grouped with multiclasses. BinRMW and TerRMW related patterns in this file are grouped as well, even though they have the same number of entries with these patterns.

aardappel marked an inline comment as done.Jun 8 2020, 12:03 PM
aardappel added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

Agree it would be more consistent, but to make the change in this particular example requires doing instruction string concatenation (which I had to do in other cases to reduce the amount of duplication), whereas here it would just make things worse. I can make the change if you feel strongly about it, but I don't think it actually makes anything better.

aheejin added inline comments.Jun 8 2020, 6:05 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

I don't know why string concatenation will be specially ugly only here, given that we are doing it everywhere else? Is there something I'm missing that something is different here? (And AStore patterns too)

I'm not sure what would be the reason for grouping multiclass in most of other patterns but not some of them, that's all. If the reason is there's not enough entries, BinRMW and BinTer have the same number of entries with this one too.

aardappel marked an inline comment as done.Jun 12 2020, 11:04 AM
aardappel added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

Just looked if I could make this happen.

Making ATOMIC_I generate 2 variants is not as easy as for WebAssembyLoad, since it takes full dag arguments, which have several components that would have to be replaced with !subst or whatever.. this would get tricky.

Adding a multi-class just for every pair of instructions would actually make the code bigger, and certainly less readable.

For the patterns, you're looking at changing this:

class WaitPatNoOffset<ValueType ty, Intrinsic kind, WebAssemblyRegClass rc,
                      NI inst> :
  Pat<(i32 (kind rc:$addr, ty:$exp, I64:$timeout)),
      (inst 0, 0, rc:$addr, ty:$exp, I64:$timeout)>;
def : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, I32, ATOMIC_WAIT_I32_A32>;
def : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, I32, ATOMIC_WAIT_I64_A32>;
def : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, I32, ATOMIC_WAIT_I32_A32>;
def : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, I32, ATOMIC_WAIT_I64_A32>;
def : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, I64, ATOMIC_WAIT_I32_A64>;
def : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, I64, ATOMIC_WAIT_I64_A64>;
def : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, I64, ATOMIC_WAIT_I32_A64>;
def : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, I64, ATOMIC_WAIT_I64_A64>;

into:

multiclass WaitPatNoOffset<ValueType ty, Intrinsic kind,
                      string inst> {
  def : Pat<(i32 (kind I32:$addr, ty:$exp, I64:$timeout)),
            (!cast<NI>(inst#_A32) 0, 0, I32:$addr, ty:$exp, I64:$timeout)>;
  def : Pat<(i32 (kind I64:$addr, ty:$exp, I64:$timeout)),
            (!cast<NI>(inst#_A64) 0, 0, I64:$addr, ty:$exp, I64:$timeout)>;
}
defm : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, "ATOMIC_WAIT_I32">;
defm : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, "ATOMIC_WAIT_I64">;
defm : WaitPatNoOffset<i32, int_wasm_atomic_wait_i32, "ATOMIC_WAIT_I32">;
defm : WaitPatNoOffset<i64, int_wasm_atomic_wait_i64, "ATOMIC_WAIT_I64">;

If you think that is better, I'll go make similar changes below.

aheejin added inline comments.Jun 12 2020, 11:57 AM
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
129

(Chatted offline and posting the summary here) I meant grouping patterns, and wasn't talking about making two variants of ATOMIC_I with dags and everything. This CL makes grouped patterns in WebAssemblyInstrSIMD.td but doesn't touch SIMD_I itself. That's what I meant in this file too.

Refactored atomic patterns

aheejin accepted this revision.Jun 12 2020, 12:55 PM

Some lines in WebAssemblyInstrAtomics.td and WebAssemblyInstrSIMD.td look they exceed 80 cols, so please fix that too :)

llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
115

These two patterns are still ungrouped..

This revision is now accepted and ready to land.Jun 12 2020, 12:55 PM
aardappel marked an inline comment as done.Jun 12 2020, 1:20 PM
aardappel added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
115

introducing a similar structure to the patterns below for these would create a multiclass with 2 patterns only called once, which is strictly worse than what is here

aardappel updated this revision to Diff 270507.Jun 12 2020, 1:30 PM

80-column wrapping

tlively added inline comments.Jun 12 2020, 1:51 PM
llvm/test/CodeGen/WebAssembly/cpus.ll
4

We have a bunch of errors for incompatible features and options in the driver code in clang already. That would probably be a good place for these checks, too.

As for the default cpu for the wasm64 triple, we could either do a cpu with 64-bit memory and no other features like mvp, or we could do a cpu that has 65-bit memory and all the small features that were standardized and widely implemented before it. Since there aren't many such features right now, though, I guess an mvp-like mvp64 cpu would be best for now.

aardappel marked an inline comment as done.Jun 12 2020, 4:22 PM
aardappel added inline comments.
llvm/test/CodeGen/WebAssembly/cpus.ll
4

Ok, where do I add this? :) Maybe ok as a followup?

aardappel updated this revision to Diff 270539.Jun 12 2020, 4:24 PM

Added HasAddr32|64 to all the patterns.

dschuff accepted this revision.Jun 12 2020, 4:49 PM

I think this is good now. If the Requires I mentioned above should actually go back in, it can be in a followup CL (presumably where we add MEMORY_SIZE_I64 and friends).

llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
360

Question for the tblgen experts maybe, if a def has both an instruction definition and a pattern as this one does, does the Requires field actually apply to the pattern? Or does Requires only work for freestanding patterns?

llvm/test/CodeGen/WebAssembly/cpus.ll
4

oh and yes, I think it's fine for a followup and i mentioned above.

tlively added inline comments.Jun 12 2020, 6:23 PM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
360

I would have thought that Requires should apply to both kinds of patterns. In particular, inheriting it just sets the Predicates field in both Instruction and Pattern. But I read @aheejin's comment above to mean that this doesn't work after all, so perhaps I am wrong.

aheejin added inline comments.Jun 13 2020, 9:07 AM
llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td
360

Yes not sure if it's an intended behavior or a bug of TableGen, but what I observed is predicates only work on patterns and not on definitions. Actually @dschuff confirmed that to me when I was confused and asked him :)

This revision was automatically updated to reflect the committed changes.