This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX.
ClosedPublic

Authored by craig.topper on Oct 9 2019, 11:41 PM.

Details

Summary

If we've disabled zmm registers, the v16i32 will need to be split. This split will propagate through the min/max and the truncate, creating two sequences that need to be concatenated back to v16i8. We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once. Then we can use a vpmovuswb to finish off the clamp.
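
To illustrate why the two-instruction sequence is equivalent to the clamp, here is a minimal scalar sketch in C++. It is not code from the patch: the helper names (packusdw_elt, vpmovuswb_elt, clamp_then_trunc, pack_then_vpmov, check_boundaries) are hypothetical, and it models only the per-element saturation, not the element ordering that the 256-bit pack produces across its 128-bit lanes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one PACKUSDW element:
// signed 32-bit input, unsigned-saturated to 16 bits ([0, 65535]).
constexpr std::uint16_t packusdw_elt(std::int32_t x) {
  return x < 0 ? 0 : (x > 65535 ? 65535 : static_cast<std::uint16_t>(x));
}

// Hypothetical scalar model of one VPMOVUSWB element:
// unsigned 16-bit input, unsigned-saturated to 8 bits ([0, 255]).
constexpr std::uint8_t vpmovuswb_elt(std::uint16_t x) {
  return x > 255 ? 255 : static_cast<std::uint8_t>(x);
}

// Reference pattern: clamp the signed input to [0, 255], then truncate.
constexpr std::uint8_t clamp_then_trunc(std::int32_t x) {
  return static_cast<std::uint8_t>(x < 0 ? 0 : (x > 255 ? 255 : x));
}

// Two-step sequence: the pack handles negative inputs and narrows to 16 bits,
// then the saturating truncate applies the upper bound of 255.
constexpr std::uint8_t pack_then_vpmov(std::int32_t x) {
  return vpmovuswb_elt(packusdw_elt(x));
}

// Compare both forms at the saturation boundaries of each step.
inline void check_boundaries() {
  const std::int32_t samples[] = {INT32_MIN, -1, 0, 1, 255, 256,
                                  65535, 65536, INT32_MAX};
  for (std::int32_t x : samples)
    assert(pack_then_vpmov(x) == clamp_then_trunc(x));
}
```

The pack step already clamps the low end to 0, since negative inputs saturate to 0, so the second step only has to enforce the upper bound of 255.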

Event Timeline

craig.topper created this revision. Oct 9 2019, 11:41 PM
Herald added a project: Restricted Project. Oct 9 2019, 11:42 PM
Herald added a subscriber: hiraditya.
This revision is now accepted and ready to land. Oct 10 2019, 11:17 AM
This revision was automatically updated to reflect the committed changes.