When storing the 0th lane of a vector, use a simpler and usually more efficient scalar store instead. In this case, also use the unscaled-offset form of the store where it applies.
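As a rough illustration of the case described above, here is a minimal IR sketch, not taken from the patch. The function names are hypothetical and the IR uses current opaque-pointer syntax. Under the standard AArch64 calling convention the vector arrives in `v0` and the pointer in `x0`, so one would expect a scalar `str s0, [x0]` for the first function and an unscaled `stur s0, [x0, #-4]` for the second, in place of a lane store such as `st1 { v0.s }[0], [x0]`.

```llvm
; Hypothetical example: store lane 0 of a <4 x i32> through a pointer.
; Lane 0 of v0 is the same data as the scalar register s0, so a plain
; scalar store is enough.
define void @store_lane0(<4 x i32> %v, ptr %p) {
  %e = extractelement <4 x i32> %v, i32 0
  store i32 %e, ptr %p
  ret void
}

; Hypothetical example: same store at byte offset -4. A negative offset
; cannot be encoded as a scaled unsigned immediate, so this is where the
; unscaled-offset store form would be the natural choice.
define void @store_lane0_unscaled(<4 x i32> %v, ptr %p) {
  %q = getelementptr i32, ptr %p, i64 -1
  %e = extractelement <4 x i32> %v, i32 0
  store i32 %e, ptr %q
  ret void
}
```

The exact instructions emitted depend on the compiler version and flags, so treat the expected output above as an assumption to verify with `llc`.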
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
LGTM with a minor comment on the test changes.
Inline comment on llvm/test/CodeGen/AArch64/arm64-st1.ll, line 72 (On Diff #146362):
Using patterns rather than the actual instruction doesn't seem helpful here: you know exactly which registers the two inputs will be in, and the exact offset, so the output should be stable. (Patterns are more useful when the registers aren't predictable, like temporary values.)
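To make the reviewer's point concrete, the sketch below contrasts the two checking styles on one hypothetical test function. Nothing here is copied from the patch: the `RUN` line, the target triple, the function name, and the `PATTERN` and `EXACT` check prefixes are all assumptions chosen for illustration, and the instruction text is what one would expect rather than verified output.

```llvm
; RUN: llc < %s -mtriple=aarch64 | FileCheck %s --check-prefix=EXACT

; Pattern style: the base register is matched by a regex, which helps
; only when register allocation could reasonably vary.
; PATTERN-LABEL: store_lane0_unscaled:
; PATTERN: stur s0, [{{x[0-9]+}}, #-4]

; Exact style: both inputs are function arguments, so the vector is in
; v0 (s0 for lane 0) and the pointer is in x0, and the offset is fixed.
; EXACT-LABEL: store_lane0_unscaled:
; EXACT: stur s0, [x0, #-4]

define void @store_lane0_unscaled(<4 x i32> %v, ptr %p) {
  %q = getelementptr i32, ptr %p, i64 -1
  %e = extractelement <4 x i32> %v, i32 0
  store i32 %e, ptr %q
  ret void
}
```

Only the `EXACT` lines are active with the `RUN` line shown; switching `--check-prefix` to `PATTERN` would run the looser variant. The exact form also catches an unintended change of register or offset, which the pattern form would let through.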