This is an archive of the discontinued LLVM Phabricator instance.

[X86] Look for scalar through one bitcast when lowering to VBROADCAST.
ClosedPublic

Authored by ab on Jun 18 2015, 4:53 PM.

Details

Summary

Fixes PR23464: one way to use the broadcast intrinsics is something like:

_mm256_broadcastw_epi16(_mm_cvtsi32_si128(*(int*)src));

We don't currently fold these, but we can look through one bitcast to find the broadcast scalar.

Note that I added a file specifically for these tests. I can add these to the various vector-shuffle-* files; I don't have an argument/preference for either.

Diff Detail

Repository
rL LLVM

Event Timeline

ab updated this revision to Diff 27974.Jun 18 2015, 4:53 PM
ab retitled this revision from to [X86] Look for scalar through one bitcast when lowering to VBROADCAST..
ab updated this object.
ab edited the test plan for this revision. (Show Details)
ab added reviewers: chandlerc, spatel.
ab added a subscriber: Unknown Object (MLST).
chandlerc accepted this revision.Jun 18 2015, 6:42 PM
chandlerc edited edge metadata.

Nice catch for the missing pattern, and yay for being able to pattern match these things rather than needing C++ code.

Given that we have the test here, I don't really have a problem with it.

It might be nice to merge these with the larger vector-shuffle-* tests which do load fold testing, but that doesn't seem critical, and certainly shouldn't block this patch.

This revision is now accepted and ready to land.Jun 18 2015, 6:42 PM
ab added a comment.Jun 18 2015, 6:46 PM

Note that the test file isn't in tree right now: I added it locally, and submitted a patch against it (to make it easier to review).

So I can totally merge it with the vector-shuffle-* if you prefer (seems like you do, I'm neutral; let's do that!).

Ah, I see.

If it's not huge, I'd merge it. If it is huge, I'd try to pull more
load-specific stuff into it and make the others non-load-specific.

Is this going to be committed at any point? I have a feeling that it'll affect work I'm doing to merge shuffle across bitcasts.

This revision was automatically updated to reflect the committed changes.