Some intrinsics like saturating operations may set flags, so if the scalar arg is inserted into an undef vector, the undef elements may trigger unwanted side effects. Using zero should be safer than undef.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Did you look into a scalar variant of the intrinsic call instead? These instructions have non-vector variants (e.g. sqadd s0, s0, s0), and that's actually why the intrinsics exist in the first place. It'd be a shame to always require this extra work.
This looks quite involved as the scalar intrinsics have illegal types etc, and at the moment I don't have a lot of time to spend on this, it was just intended as a fix for the unstable tests: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/6257
Does anyone object to this? I'd like to get it off my review dashboard one way or the other.
I see two issues here:
- We're still generating the wrong instruction.
- The intrinsic is marked readnone, so any code that depends on whether it sets saturation flags is likely broken anyway.
Given the layers of wrongness here, this seems like a marginal improvement.
Despite all that, I guess this is an improvement... but please at least make the comment in the code reflect this discussion.