This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Add passes for GEP lowering
Needs ReviewPublic

Authored by samparker on Jan 17 2023, 6:00 AM.

Details

Summary

At optimization level -O2 and above, add SeparateConstOffsetFromGEP, EarlyCSE, InstCombine and LICM to the addIRPasses phase.

These passes will also enable D139415.

Diff Detail

Event Timeline

samparker created this revision.Jan 17 2023, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2023, 6:00 AM
samparker requested review of this revision.Jan 17 2023, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2023, 6:00 AM
samparker edited the summary of this revision. (Show Details)Jan 17 2023, 6:01 AM
samparker added a reviewer: sbc100.
sbc100 added inline comments.Jan 17 2023, 7:49 AM
llvm/test/CodeGen/WebAssembly/fpclamptosat.ll
184

What happened here?

samparker added inline comments.Jan 17 2023, 7:54 AM
llvm/test/CodeGen/WebAssembly/fpclamptosat.ll
184

Yeah... I need to look at these clamping files. I'm assuming CSE is responsible, but the reasons for the changes are non-obvious to me.

samparker added inline comments.Jan 17 2023, 8:46 AM
llvm/test/CodeGen/WebAssembly/fpclamptosat.ll
184

Looks like DAGCombiner is now failing to produce fp_to_uint_sat, I will investigate more tomorrow.

samparker updated this revision to Diff 494620.Feb 3 2023, 7:01 AM

Hopefully the FP clamping and min/max idiom misses have now been fixed in D142093, D142481, D142535, D143106 and D143256.

With my local benchmark suite, using wasi-sdk and running on node, this patch affects slightly less than half of the benchmarks and gives a geomean improvement of ~5%.

samparker edited the summary of this revision. (Show Details)Feb 3 2023, 7:04 AM

Impressive performance wins! How is code size affected?

How is code size affected?

Using wasi-sdk-19, the total code size of my suite is 7985 KB. With this patch it's reduced slightly to 7982 KB.

Do those figures include using wasm-opt after linking?

Do those figures include using wasm-opt after linking?

I don't know... I just use wasi-sdk and I assumed it just uses clang, llvm and lld.

Though, after reporting the (small) code size changes, the reported performance numbers (large) didn't make sense. I'm still investigating what's going on.

The clang driver will automatically call wasm-opt if it finds it in your PATH, so it's worth checking whether wasm-opt is being run or not.

junparser added inline comments.Feb 23 2023, 3:17 AM
llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
466

have you ever test with createSeparateConstOffsetFromGEPPass(true)

samparker added a comment.EditedMar 6 2023, 8:21 AM

Sorry for the delay in replying.

I found it impossible to reproduce my previously quoted numbers. I'm juggling an evolving benchmark suite, while modifying wasi-sdk and NodeJS, so I'm guessing I got something out-of-sync!

Below are my revised numbers, with the varying configuration options for completeness. I'm running on an Ampere Altra, with a modified version of NodeJS which uses trap handlers for OOB checks (the patch is currently in review for V8), and I'm only using the TurboFan compiler to reduce noise. wasm-opt is not on my PATH. The main takeaways are:

  • LICM is no longer included.
  • Running InstCombine increases code size a little, but I haven't included any of the patches to mitigate the known problems with running it late.
  • Lowering GEPs, in SeparateConstOffsetFromGEP, increases code size and execution time.
  • OptimizeGEPs still gives a useful improvement in size and execution time.
Pass ConfigTotal code size (KB)Speedup (%)
vanilla wasi-sdk-167809.192-
SeparateConstOffsetFromGEP(false), EarlyCSE7805.7710.1
SeparateConstOffsetFromGEP(false), EarlyCSE, InstCombine7810.4460.1
SeparateConstOffsetFromGEP(true), EarlyCSE7819.303-0.22
SeparateConstOffsetFromGEP(true), EarlyCSE, InstCombine7819.988-0.24
SeparateConstOffsetFromGEP(false), EarlyCSE, OptimizeGEPs7798.1391.78
SeparateConstOffsetFromGEP(false), EarlyCSE, OptimizeGEPs, InstCombine7802.8541.7