[X86][SSE] Prefer trunc(movd(x)) to pextrb(x,0)
ClosedPublic

Authored by RKSimon on Fri, Mar 13, 9:05 AM.

Details

Summary

If we're extracting element 0 of a v16i8 vector, we're better off using MOVD than PEXTRB, unless we're storing the value or we require the implicit zero extension of PEXTRB.

The biggest perf difference is on SLM (Silvermont) targets, where MOVD (uops=1, lat=3, tp=1) is notably faster than PEXTRB (uops=2, lat=5, tp=4).

This matches what we already do for PEXTRW.
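
For readers less familiar with the two instructions, here is a minimal, portable sketch of why the substitution is safe only when the upper bits are not needed. It models the instruction semantics in plain C++ and is not code from the patch; the names `V16i8`, `model_movd`, `model_pextrb` and `extract0_via_movd` are hypothetical and chosen for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a v16i8 vector register.
using V16i8 = std::array<std::uint8_t, 16>;

// Models MOVD r32, xmm: copies the low 32 bits of the vector (bytes 0..3,
// little-endian) into a general-purpose register.
std::uint32_t model_movd(const V16i8 &v) {
  return static_cast<std::uint32_t>(v[0]) |
         (static_cast<std::uint32_t>(v[1]) << 8) |
         (static_cast<std::uint32_t>(v[2]) << 16) |
         (static_cast<std::uint32_t>(v[3]) << 24);
}

// Models PEXTRB r32, xmm, imm8: selects one byte and zero-extends it,
// so bits 8..31 of the result are always zero.
std::uint32_t model_pextrb(const V16i8 &v, unsigned idx) {
  return v[idx & 15];
}

// trunc(movd(x)): take the low 32 bits, then keep only the low 8.
// Bits 8..31 of the MOVD result hold bytes 1..3 and are discarded here.
std::uint8_t extract0_via_movd(const V16i8 &v) {
  return static_cast<std::uint8_t>(model_movd(v));
}
```

In this model the low byte of `model_movd(v)` always equals `model_pextrb(v, 0)`, but the full 32-bit results differ whenever bytes 1..3 are non-zero. That is the case the summary excludes: a user that relies on the zero-extended value would need an extra mask after MOVD.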

Event Timeline

RKSimon created this revision. Fri, Mar 13, 9:05 AM
Herald added a project: Restricted Project. Fri, Mar 13, 9:05 AM
Herald added a subscriber: hiraditya.

Test changes seem to be omitted from the diff?

I loathe git at times :-)

RKSimon updated this revision to Diff 250249. Fri, Mar 13, 10:09 AM

rebase, with tests this time

This revision is now accepted and ready to land. Fri, Mar 13, 11:00 AM
RKSimon edited the summary of this revision. Fri, Mar 13, 11:30 AM
This revision was automatically updated to reflect the committed changes.