This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] support neon_sshl and neon_ushl in performIntrinsicCombine.
ClosedPublic

Authored by fhahn on May 23 2019, 6:11 AM.

Details

Summary

If we have a constant vector mask with the shift values being all equal,
we can simplify aarch64_neon_sshl to VSHL.

This pattern can be generated by code using vshlq_s32(a,vdupq_n_s32(n))
instead of vshlq_n_s32(a, n), because it is used in contexts where n is
not guaranteed to be constant, before inlining.

We can do a similar combine for aarch64_neon_ushl, but we have to be
a bit more careful, because we can only match ushll/ushll2 for vector
shifts with a zero-extended first operand.

Also adds 2 tests marked with FIXME, where we can further increase
codegen.

Event Timeline

fhahn created this revision.May 23 2019, 6:11 AM

I think there are probably other shifts that we can include while we're in the area. Most obviously aarch64_neon_ushl, but maybe others too.

fhahn updated this revision to Diff 215173.Aug 14 2019, 11:04 AM

Add support for neon_ushl.

I think there are probably other shifts that we can include while we're in the area. Most obviously aarch64_neon_ushl, but maybe others too.

Thanks Tim. I've added support for neon_ushl. I think w have to be a bit more careful with that one, as we only have ushll/ushll2 that take immediates

fhahn retitled this revision from [AArch64] support neon_sshl in performIntrinsicCombine. to [AArch64] support neon_sshl and neon_ushl in performIntrinsicCombine..Aug 15 2019, 2:04 AM
fhahn edited the summary of this revision. (Show Details)
anemet added a subscriber: anemet.Sep 6 2019, 5:56 PM

Does this also apply to right shifts?

fhahn added a comment.Sep 9 2019, 10:30 AM

Does this also apply to right shifts?

I think the relevant cases should be handled already. For vector right shifts, it looks like there are only immediate forms and we turn left shifts with negative immediate into the corresponding right shifts.

anemet accepted this revision.Sep 17 2019, 10:21 PM

Some minor test questions/suggestions. Feel free to commit after addressing.

llvm/test/CodeGen/AArch64/arm64-vshift.ll
1204

I don't see any negative tests when we zero-extend not to the next one higher type.

1292

technically this is not sshll (long)

1318

Isn't it used for the extensions?

1333

I think that we should also have other shl tests (.4s non-foldable and perhaps some other sizes).

This revision is now accepted and ready to land.Sep 17 2019, 10:21 PM
fhahn updated this revision to Diff 221021.Sep 20 2019, 6:53 AM
fhahn marked 2 inline comments as done.

Add additional test cases and limit this patch to converting cases with appropriate zext/sext.

I'll submit a separate patch for turning ushl -> shl, if the shift is all constant.

I'll submit a separate patch for turning ushl -> shl, if the shift is all constant.

... and sshl -> shl?

fhahn added a comment.Sep 23 2019, 2:37 AM

I'll submit a separate patch for turning ushl -> shl, if the shift is all constant.

... and sshl -> shl?

Yep, let's look at that separately.

llvm/test/CodeGen/AArch64/arm64-vshift.ll
1318

Yeah, I initially thought there might be a long version of sshr, but looks like there is not

This revision was automatically updated to reflect the committed changes.