This is an archive of the discontinued LLVM Phabricator instance.

[X86] WIP Match the IR pattern for movmsk on SSE1 only targets where v4i32 isn't legal
ClosedPublic

Authored by craig.topper on Aug 2 2019, 6:02 PM.

Details

Summary

This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. It only does the recognition for a few cases where it's obvious the input won't be scalarized, which would result in building a vector just to do the movmsk. I've kept it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that code.
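For readers unfamiliar with the pattern, here is a minimal scalar sketch in C of what movmsk computes on four floats. It rests on an assumption not stated in the summary: that the InstCombine form is roughly a bitcast of the v4f32 input to v4i32, a signed compare against zero, a bitcast of the resulting 4 x i1 to i4, and a zero-extend. The function name movmsk_ref is hypothetical and this is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical scalar reference for movmsk on four floats.
 * Bit i of the result is the sign bit of lane i. */
int movmsk_ref(const float v[4]) {
    int mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        int32_t bits;
        memcpy(&bits, &v[lane], sizeof bits); /* bitcast float -> i32 */
        if (bits < 0)                         /* signed "less than zero" tests the sign bit */
            mask |= 1 << lane;                /* pack the per-lane i1 into the low 4 bits */
    }
    return mask;                              /* upper bits stay zero, like the zext */
}
```

The integer compare in the middle is the step that needs v4i32, which is why an SSE1-only target, where only v4f32 is legal, has trouble seeing this as a single movmsk.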

This fixes the case from PR42870. But it's probably still broken in the presence of logic ops feeding the movmsk pattern, which would further hide the v4f32 type. I still need to add test cases, but wanted to go ahead and post it to see what others thought of the direction.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Aug 2 2019, 6:02 PM
Herald added a project: Restricted Project. Aug 2 2019, 6:02 PM
Herald added a subscriber: hiraditya.
spatel added a comment. Aug 4 2019, 5:35 AM

The direction seems fine to me. We have similar existing code to avoid problems for an SSE1-only target. There's a question about why someone would target SSE1-only for perf-critical code at this point, but I assume there's some legacy constraint preventing doing better.

This revision was not accepted when it landed; it landed in state Needs Review. Aug 10 2019, 12:51 AM
This revision was automatically updated to reflect the committed changes.