This is an archive of the discontinued LLVM Phabricator instance.

[X86] Update cost model for the cost of s/zextending from vXi1 to a vXi8/i16/i32 vector of less than 128 bits
Needs Review · Public

Authored by craig.topper on Apr 29 2020, 10:31 AM.

Details

Reviewers
RKSimon
spatel
Summary

vXi1 vectors are legalized by promoting, but vXi8/i16/i32 vectors are legalized by widening. As a result, these extends become a truncate followed by a sign/zero_extend_inreg. The resulting code is more expensive than the costs we were getting from the default TTI implementation.

We could probably lower these costs by improving the codegen to do the sign/zero_extend_inreg before the truncate. I think that would let packss/packus operations do the truncation, so we wouldn't need to insert an AND to make packus usable.
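To make the reordering idea concrete, here is a scalar C++ sketch of a single promoted 32-bit lane being narrowed to 16 bits. It is only an illustration under stated assumptions: all helper names are hypothetical, and the pack helpers are simplified one-lane models of signed and unsigned saturation in the style of packssdw/packusdw. None of this is code from the patch or from LLVM.

```cpp
#include <cstdint>

// In-register extension from bit 0 of a promoted 32-bit lane.
// (Hypothetical helpers; the upper 31 bits of the lane may hold garbage.)
int32_t sextInReg1(int32_t lane) { return -(lane & 1); }  // yields 0 or -1
int32_t zextInReg1(int32_t lane) { return lane & 1; }     // yields 0 or 1

// Plain truncation i32 -> i16: keep only the low 16 bits.
uint16_t truncLane(int32_t lane) {
  return static_cast<uint16_t>(static_cast<uint32_t>(lane) & 0xFFFFu);
}

// One-lane model of a signed-saturating pack (i32 -> i16).
int16_t packssLane(int32_t lane) {
  if (lane > INT16_MAX) return INT16_MAX;
  if (lane < INT16_MIN) return INT16_MIN;
  return static_cast<int16_t>(lane);
}

// One-lane model of an unsigned-saturating pack (signed i32 -> u16).
uint16_t packusLane(int32_t lane) {
  if (lane > UINT16_MAX) return UINT16_MAX;
  if (lane < 0) return 0;
  return static_cast<uint16_t>(lane);
}

// Order described in the summary: truncate first, then extend in register.
uint16_t truncThenSext(int32_t lane) {
  return static_cast<uint16_t>(-(truncLane(lane) & 1));
}
uint16_t truncThenZext(int32_t lane) {
  return static_cast<uint16_t>(truncLane(lane) & 1);
}

// Suggested order: extend in register first, then let the pack truncate.
// After the in-register extend the lane is 0/-1 (or 0/1), so saturation
// never changes the value and no extra AND mask is needed.
uint16_t sextThenPackss(int32_t lane) {
  return static_cast<uint16_t>(packssLane(sextInReg1(lane)));
}
uint16_t zextThenPackus(int32_t lane) {
  return packusLane(zextInReg1(lane));
}
```

For a lane such as 0x12345679 (low bit set, garbage above it), both orders produce the same narrow value. Applying packusLane directly to that lane saturates instead of truncating, which is the situation where an AND is required first.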

Event Timeline

craig.topper created this revision. · Apr 29 2020, 10:31 AM
Herald added a project: Restricted Project. · Apr 29 2020, 10:31 AM
Herald added a subscriber: hiraditya.

My main concern is that a lot of this depends heavily on what generated the vXi1 in the first place: if it was a vector compare, then we can usually ext/trunc very cheaply.
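A scalar C++ sketch of why a compare-produced vXi1 tends to be cheap to extend, assuming the usual SSE-style behaviour where a vector compare writes all ones or all zeros into each lane. The helper names are hypothetical and this models a single 32-bit lane only.

```cpp
#include <cstdint>

// One-lane model of a signed greater-than vector compare:
// all ones (-1) when true, all zeros when false.
int32_t cmpGtLane(int32_t a, int32_t b) { return a > b ? -1 : 0; }

// Sign-extending the i1 result: the compare mask already is the answer,
// so no further work is needed at the same lane width.
int32_t sextOfCompare(int32_t a, int32_t b) { return cmpGtLane(a, b); }

// Zero-extending the i1 result: a single AND turns -1 into 1.
int32_t zextOfCompare(int32_t a, int32_t b) { return cmpGtLane(a, b) & 1; }
```

In this model the extension costs at most one extra operation per vector, which is why the cost of the extend depends so much on where the vXi1 came from.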