This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Prefer trunc(movd(x)) to pextrb(x,0)
Closed · Public

Authored by RKSimon on Mar 13 2020, 9:05 AM.

Details

Summary

If we're extracting element 0 of a v16i8 vector, we're better off using MOVD than PEXTRB, unless we're storing the value or we require the implicit zero-extension of PEXTRB.

The biggest perf difference is on SLM (Silvermont) targets, where MOVD (uops=1, lat=3, tp=1) is notably faster than PEXTRB (uops=2, lat=5, tp=4).

This matches what we already do for PEXTRW.
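The equivalence the summary relies on can be sketched in portable C. This is only a model of the instruction semantics, not the LLVM lowering code: the `v16i8` struct and the `model_*` function names are hypothetical, introduced here for illustration. It shows that truncating the MOVD result gives the same byte as PEXTRB with index 0, and that the two differ only when the caller needs the zero-extended 32-bit value.

```c
#include <stdint.h>

/* Hypothetical model of a v16i8 vector: 16 byte lanes, lane 0 is the lowest. */
typedef struct { uint8_t lane[16]; } v16i8;

/* Models PEXTRB with index 0 into a 32-bit register:
   lane 0, implicitly zero-extended to 32 bits. */
static uint32_t model_pextrb0(v16i8 x) {
    return (uint32_t)x.lane[0];
}

/* Models MOVD from a vector register to a 32-bit register:
   copies the low 32 bits (lanes 0..3); the upper bytes are NOT cleared. */
static uint32_t model_movd(v16i8 x) {
    return (uint32_t)x.lane[0]
         | ((uint32_t)x.lane[1] << 8)
         | ((uint32_t)x.lane[2] << 16)
         | ((uint32_t)x.lane[3] << 24);
}

/* Models trunc(movd(x)): keep only the low 8 bits of the MOVD result. */
static uint8_t model_trunc_movd(v16i8 x) {
    return (uint8_t)model_movd(x);
}
```

When only the low 8 bits are consumed, `model_trunc_movd` and `model_pextrb0` agree, which is why the cheaper MOVD can be used. When the full 32-bit zero-extended value is required, the raw MOVD result carries lanes 1..3 in its upper bytes, so PEXTRB is still the better fit there.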

Event Timeline

RKSimon created this revision. Mar 13 2020, 9:05 AM

Test changes seem to be omitted from the diff?

I loathe git at times :-)

RKSimon updated this revision to Diff 250249. Mar 13 2020, 10:09 AM

rebase, with tests this time

This revision is now accepted and ready to land. Mar 13 2020, 11:00 AM
RKSimon edited the summary of this revision. Mar 13 2020, 11:30 AM
This revision was automatically updated to reflect the committed changes.