This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Replace LOAD_SPLAT with SPLAT_VECTOR
ClosedPublic

Authored by luke on Dec 12 2022, 12:00 PM.

Details

Summary

Splats were selected by matching on uses of build_vector with
identical elements, but a while back a target independent node for
vector splatting was added.
This removes the WebAssembly specific LOAD_SPLAT intrinsic, and instead
makes SPLAT_VECTOR legal and adds patterns for splat loads.

It also has the effect of replacing splatted vector constants with a
scalar constant and a splat, which should save on code size.

Diff Detail

Event Timeline

luke created this revision.Dec 12 2022, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 12:00 PM
luke requested review of this revision.Dec 12 2022, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 12:00 PM
luke updated this revision to Diff 482220.Dec 12 2022, 12:04 PM

Remove extraneous whitespace

Nice, it's good to use the builtin nodes where possible. I think we had intentionally selected splats of constants into constant vectors for performance reasons. Would you be able to preserve that behavior and make this a non-functional change?

llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
64

Should this be lane_load?

luke added inline comments.Dec 12 2022, 12:12 PM
llvm/test/CodeGen/WebAssembly/fpclamptosat_vec.ll
20–21

Constant splats now get selected as a const and a splat, which can save code size for the smaller integer sizes since v128.const is always 18 bytes long.

llvm/test/CodeGen/WebAssembly/simd-load-splat.ll
8–14

This results in one more instruction, but one less load. The resulting binary code size is probably comparable since tee is usually only two bytes.

luke added a comment.Dec 12 2022, 12:14 PM

Nice, it's good to use the builtin nodes where possible. I think we had intentionally selected splats of constants into constant vectors for performance reasons. Would you be able to preserve that behavior and make this a non-functional change?

Of course, should just be a matter of adding some patterns for v128.const

luke updated this revision to Diff 482450.Dec 13 2022, 6:36 AM

Restore existing behaviour of selecting v128.const for splats

luke marked an inline comment as done.Dec 13 2022, 6:42 AM
llvm/test/CodeGen/WebAssembly/fpclamptosat_vec.ll
302

This is as close as I could get to making it an NFC, by adding a pattern to select v128.const for splats on immediates.
The undef fields are no longer 0, but I initially tried to preserve it by only conditionally selecting splat_vector. But this meant changing DAGCombine and adding a target lowering info hook, and caused some other cases to fail.

llvm/test/CodeGen/WebAssembly/simd-load-splat.ll
8–14

This seems to be triggered by some reordering LLVM does with splat_vector

aheejin added inline comments.Dec 14 2022, 12:49 AM
llvm/test/CodeGen/WebAssembly/simd-load-splat.ll
8–14

I think I was told that we prefer load_splat over splat from @tlively (correct me if I'm wrong), but not sure if it was for performance or code size reasons. We have a specific order of preference written as if-else chain in https://github.com/llvm/llvm-project/blob/8a55264be311bf54ddef0430c712ad51a80a5f7f/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp#L1938-L2193; I wonder if this has any impact on that.

tlively added inline comments.Jan 3 2023, 12:35 PM
llvm/test/CodeGen/WebAssembly/simd-load-splat.ll
8–14

I think the rationale for preferring load_splat over splat there was simply because load_splat could do more work in a single instruction, so it's assumed to be faster than doing the same work in two instructions (but we don't have any experimental data on that). I think the change to this output is fine, since it actually does less work (assuming loads are more work than tees).

tlively accepted this revision.Jan 3 2023, 12:39 PM

LGTM

llvm/test/CodeGen/WebAssembly/fpclamptosat_vec.ll
302

non-zero undefs seem fine to me.

This revision is now accepted and ready to land.Jan 3 2023, 12:39 PM
luke updated this revision to Diff 486277.Jan 4 2023, 7:06 AM

Rebase

This revision was landed with ongoing or failed builds.Jan 4 2023, 7:09 AM
This revision was automatically updated to reflect the committed changes.