This is an archive of the discontinued LLVM Phabricator instance.

[X86] replace vextractf128 intrinsics with generic shuffles
ClosedPublic

Authored by spatel on Mar 11 2015, 4:34 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 21787.Mar 11 2015, 4:34 PM
spatel retitled this revision from to [X86] replace vextractf128 intrinsics with generic shuffles.
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added a subscriber: Unknown Object (MLST).
andreadb accepted this revision.Mar 12 2015, 8:50 AM
andreadb edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Mar 12 2015, 8:50 AM
This revision was automatically updated to reflect the committed changes.

Thanks, Andrea.

I checked in the LLVM side of this patch at r232047 without realizing there was a dependency on this patch. Sorry about the build-bot breakage!

I checked in the LLVM side of this patch at r232047 without realizing there was a dependency on this patch. Sorry about the build-bot breakage!

Got my patches mixed up. That should have been r232045:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232045

craig.topper added inline comments.Mar 12 2015, 9:54 AM
lib/Headers/avxintrin.h
1184 ↗(On Diff #21787)

Not sure if its safe to access V twice. It could be a function call that we maybe shouldn't invoke twice. May need a temporary here even though it might fail Wshadow

Or maybe just feed an appropriate zero vector to the second shuffle argument?

spatel added inline comments.Mar 12 2015, 10:29 AM
lib/Headers/avxintrin.h
1184 ↗(On Diff #21787)

Nice catch! I think the zero vector implementation should be safe. Checked in at r232061.

While we're here, do you see a problem with casting the 32-bit / packed single case to use 64-bit values instead? If we do it that way, the masks would be identical for all cases. And then we could just define the 2nd and 3rd macros in terms of the 1st?