When possible, lower TRUNCATE to Wasm SIMD narrow instructions (i16x8.narrow_i32x4_u, i8x16.narrow_i16x8_u) instead of generating long sequences of extract_lane and replace_lane.
See Bug 51006 for discussion.
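As a rough illustration of the idea (not the patch's code), the sketch below models the lane semantics of the narrow instructions in Python and truncates 16 x i32 to 16 x i8 using only a mask and three narrows. The helper names `narrow_u` and `trunc_16xi32_to_16xi8` are hypothetical, and the assumption that the low bits are masked before narrowing is taken from the `v128.and` discussion later in this review.

```python
def narrow_u(a, b, out_bits):
    """Model a Wasm narrow_*_u op: concatenate the lanes of a and b,
    read each as a signed integer, and clamp it to [0, 2**out_bits - 1]."""
    max_val = (1 << out_bits) - 1
    return [min(max(lane, 0), max_val) for lane in a + b]


def trunc_16xi32_to_16xi8(lanes):
    """Truncate 16 signed i32 lanes to 16 i8 lanes using only AND and narrow."""
    assert len(lanes) == 16
    # Clear everything above the low 8 bits so narrowing can never saturate.
    masked = [lane & 0xFF for lane in lanes]
    # Four v128 registers, each holding four i32 lanes.
    v0, v1, v2, v3 = (masked[i:i + 4] for i in range(0, 16, 4))
    lo = narrow_u(v0, v1, 16)   # models i16x8.narrow_i32x4_u
    hi = narrow_u(v2, v3, 16)   # models i16x8.narrow_i32x4_u
    return narrow_u(lo, hi, 8)  # models i8x16.narrow_i16x8_u
```

For any input, the result should match plain truncation, i.e. `[lane & 0xFF for lane in lanes]`, while using five vector operations in this model instead of one extract/replace pair per lane.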
Can you add tests for this to llvm/test/CodeGen/WebAssembly?
It would be good to write out the transformation we are doing in terms of selection dag operations and WebAssembly instructions. I don't know what PACK*SDW and PACK*SWB mean.
If you change the run line to this, you can use utils/update_llc_test_checks.py to autogenerate the test output, which will make it easier for me to understand the transformation we are doing. Also, it looks like the code is more general than it needs to be to handle this one case. Is there an opportunity to either simplify the code or test more cases here?
I'm not sure I fully understand what you mean by this. Do you want me to change the function body of @trunc16i32_16i8? Do you have any detailed suggestions on what to change?
For example, extractSubVector and truncateVectorWithNARROW look like they are meant to work with many different vector types. If that's correct, it would be good to add tests for more vector types. Alternatively, if we only care about a smaller set of types, it would be nice if we could simplify the code and make it less general.
Hi Thomas, you're right that the code is meant to handle more types. My initial purpose was to discuss the solution with you and Petr first, so I kept the types simple and left a TODO here. If you're OK with the solution, I can try to cover more types and add more tests.
(Actually, I have little experience with LLVM types, so it may take me some time to cover more. Any advice on the if conditions here is very welcome.)
Nice, these tests give me a good sense of what this optimization is doing. The transformation looks good.
For both this and truncateVectorWithNARROW, it would be good to add comments about what the parameters are for and what will be returned.
@jingbao I was working on landing this, but the CodeGen/WebAssembly/fpclamptosat_vec.ll test had to be updated. Unfortunately Phabricator won't let me attach the changes to this revision, but it looks like this change introduces some new unnecessary masking that was not there before in a couple of the test functions. (It also makes a bunch of good changes!) Would you be able to take a look at that? I'd also be OK with landing this as-is, since the improvements look more significant and more common than the regression.
Hi @tlively, I updated the test file. After some investigation, I still cannot delete the masking operation, if you are talking about the "v128.and" part, because the mask prevents the narrow operations from saturating incorrectly. If you have concerns about other parts, please point them out on the test file and we can discuss again.
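To make the saturation point concrete, here is a minimal single-lane sketch in Python. It assumes the Wasm SIMD semantics of narrow_*_u (the input lane is read as signed and clamped to the unsigned output range); the function names `narrow_lane_u` and `truncate_lane` are hypothetical and do not come from the patch.

```python
def narrow_lane_u(lane, out_bits):
    """Model one lane of a Wasm narrow_*_u op: signed input, unsigned saturation."""
    return min(max(lane, 0), (1 << out_bits) - 1)


def truncate_lane(lane, out_bits):
    """What TRUNCATE must produce: the low out_bits bits of the input."""
    return lane & ((1 << out_bits) - 1)


def compare(lane, out_bits=8):
    """Return (expected truncation, narrow without mask, narrow with mask)."""
    mask = (1 << out_bits) - 1
    return (
        truncate_lane(lane, out_bits),
        narrow_lane_u(lane, out_bits),         # no v128.and: may saturate
        narrow_lane_u(lane & mask, out_bits),  # v128.and first: never saturates
    )
```

Calling `compare(0x1234)` or `compare(-1)` shows the unmasked narrow disagreeing with the truncation while the masked one matches. When the input is already known to fit in the output range, the mask is redundant, which may be what the regression in fpclamptosat_vec.ll reflects.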