[X86] Replace (v)palignr intrinsics with generic shuffles (Clang)
Abandoned, Public

Authored by RKSimon on Mar 12 2015, 10:48 AM.

Details

Summary

The (v)palignr instructions are currently described using builtin intrinsics although the x86 shuffle lowering code now correctly identifies them.

This patch replaces the builtins with generic __builtin_shufflevector calls. I'll be posting an equivalent LLVM patch shortly.
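To make the equivalence concrete, here is a minimal portable C sketch of how a 128-bit byte-align can be expressed as a plain two-input shuffle. It models the instruction semantics only and is not the code from this patch; the helper names (palignr_indices, shuffle_bytes, palignr_model) are hypothetical, and the -1 index is an assumption of this sketch for "byte is zero", not something __builtin_shufflevector itself defines this way.

```c
#include <assert.h>
#include <stdint.h>

/* Shuffle indices equivalent to a 128-bit byte-align by `imm` bytes.
 * Index 0..15 selects lo[index], 16..31 selects hi[index - 16], and
 * -1 marks a byte shifted in from beyond both inputs (always zero). */
static void palignr_indices(uint8_t imm, int idx[16]) {
    for (int i = 0; i < 16; ++i) {
        unsigned src = (unsigned)imm + (unsigned)i;
        idx[i] = (src < 32) ? (int)src : -1;
    }
}

/* Generic two-input byte shuffle driven by an index table. */
static void shuffle_bytes(const uint8_t lo[16], const uint8_t hi[16],
                          const int idx[16], uint8_t out[16]) {
    for (int i = 0; i < 16; ++i) {
        assert(idx[i] >= -1 && idx[i] < 32);
        if (idx[i] < 0)
            out[i] = 0;
        else if (idx[i] < 16)
            out[i] = lo[idx[i]];
        else
            out[i] = hi[idx[i] - 16];
    }
}

/* Reference model: concatenate hi:lo, shift right by `imm` bytes,
 * and keep the low 16 bytes of the result. */
static void palignr_model(const uint8_t hi[16], const uint8_t lo[16],
                          uint8_t imm, uint8_t out[16]) {
    int idx[16];
    palignr_indices(imm, idx);
    shuffle_bytes(lo, hi, idx, out);
}
```

For shift amounts up to 16 every index is in range, so the operation is a pure shuffle of the two inputs; larger shifts need zero bytes as well, which a real lowering would have to supply by shuffling against a zero vector or similar.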

Diff Detail

Repository
rL LLVM
RKSimon retitled this revision to [X86] Replace (v)palignr intrinsics with generic shuffles (Clang). Mar 12 2015, 10:48 AM
RKSimon updated this object.
RKSimon edited the test plan for this revision.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: Unknown Object (MLST).

We've always been sending shuffles to the backend. We just generated the shuffles in CGBuiltin instead of the header.

I'm not sure I like completely losing the type system on the immediate. Theoretically with the code in CGBuiltin we could at least get a truncation warning if the immediate was larger than a byte. Though I'm not sure that warning is on by default. Really I wish we could check the immediates for illegal values on all of these macros and deliver nice messages to the user. I think gcc does check a lot of them.

> We've always been sending shuffles to the backend. We just generated the shuffles in CGBuiltin instead of the header.

Hi Craig, yes. I was hoping this patch would get us to the point where we could get rid of even that CGBuiltin code. Or do you think generating the shuffles at the CGBuiltin stage is good enough?

> I'm not sure I like completely losing the type system on the immediate. Theoretically with the code in CGBuiltin we could at least get a truncation warning if the immediate was larger than a byte. Though I'm not sure that warning is on by default. Really I wish we could check the immediates for illegal values on all of these macros and deliver nice messages to the user. I think gcc does check a lot of them.

Short of adding a static_assert, I'm not sure of the best way to do this. Some of the intrinsics have already been converted to pure __builtin_shufflevector implementations despite having the same problem: the slldq/srldq byte shifts come to mind, and they are pretty similar to alignr.
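As a rough illustration of the static_assert idea, the sketch below wraps an immediate in a compile-time range check using C11 _Static_assert. The macro name CHECKED_IMM8 and the 0..255 range are assumptions made for this example and are not part of any Clang header. The trick relies on declaring a struct inside sizeof, which is valid C11 but does not compile as C++, so it would not be usable as-is in a header shared by both languages.

```c
#include <assert.h>

/* Hypothetical helper: evaluates to `imm`, but refuses to compile unless
 * `imm` is an integer constant expression in the range 0..255.
 * The struct exists only to host the _Static_assert; sizeof is multiplied
 * by zero so it contributes nothing to the value. */
#define CHECKED_IMM8(imm)                                                  \
    ((int)((imm) + 0 * sizeof(struct {                                     \
        _Static_assert((imm) >= 0 && (imm) <= 255,                         \
                       "immediate must be in the range 0..255");           \
        char unused;                                                       \
    })))

/* Small usage demo: both immediates are in range, so this compiles.
 * Changing 255 to 256 would produce a compile-time error instead. */
static int demo_checked_imm(void) {
    int ok = CHECKED_IMM8(4);
    assert(ok == 4);
    return ok + CHECKED_IMM8(255);
}
```

An intrinsic macro could pass its immediate through such a wrapper before building the shuffle indices, so that an out-of-range value yields a readable diagnostic rather than a silently truncated shift.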

RKSimon abandoned this revision. Mar 29 2015, 4:19 AM

Abandoning this ticket - as Craig said, we're already creating shuffles internally, which is good enough.