This is an archive of the discontinued LLVM Phabricator instance.

[X86] Adjust some IceLake integer shuffle schedule classes (PR48110)
ClosedPublic

Authored by RKSimon on Dec 10 2021, 12:10 PM.

Details

Summary

The IceLake scheduler model is still mainly a copy of the SkylakeServer model.

This patch adjusts the integer shuffle classes to account for most instructions now working on Port 1 as well as Port 5.

This is based off Agner + uops.info as well as the PR48110 report.

Diff Detail

Event Timeline

RKSimon created this revision.Dec 10 2021, 12:10 PM
RKSimon requested review of this revision.Dec 10 2021, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2021, 12:10 PM
wxiao3 added a subscriber: wxiao3.Dec 11 2021, 6:20 AM

Thanks @RKSimon for working on it. Do we have an easy way to check if the change covers all shuffle instructions and the value is matched with Agner + uops.info?

llvm/lib/Target/X86/X86SchedIceLake.td
391–398

What's WriteShuffle used for? I saw some *Z128 use it as well. Why does it only work on port5?

668

Just for curious, what's the difference between (V)? and (V?)

668

Y|Z|Z128|Z256?

669

Y|Z|Z128|Z256?

1544

Y|Z|Z256

1545

ditto?

llvm/test/tools/llvm-mca/X86/IceLakeServer/resources-avx1.s
1568

Why is it affected?

1584

ditto.

RKSimon added inline comments.Dec 12 2021, 2:34 AM
llvm/lib/Target/X86/X86SchedIceLake.td
391–398

Its mainly for MMX shuffles

668

Just for curious, what's the difference between (V)? and (V?)

I think for single cases inside the () there isn't any - I'll try to be consistent

RKSimon updated this revision to Diff 393741.Dec 12 2021, 4:30 AM

address comments

Thanks @RKSimon for working on it. Do we have an easy way to check if the change covers all shuffle instructions and the value is matched with Agner + uops.info?

No - I've added more instructions to the llvm-mca avx512 tests, but we're probably still missing a lot.

I'm going to see if the latest wsl2 (which enables some perf support) will finally allow us to run llvm-exegesis, as I now have access to a RocketLake windows machine

HaohaiWen added a comment.EditedDec 13 2021, 12:52 AM

Here's diff I found: https://www.textcompare.org/?id=61b7080d8668ef0015be12a9
Left window is before this patch, right is after.
Seems like VPBROADCASTBZ128rm, VPCMOVYrmr, VPPERMrmr ports change from [5, 23] to [15, 23] but we didn't have list test for them.
BTW, uops.info provides lat/tpt for icelake, not icelake-server.

Here's diff I found: https://www.textcompare.org/?id=61b7080d8668ef0015be12a9
Left window is before this patch, right is after.
Seems like VPBROADCASTBZ128rm, VPCMOVYrmr, VPPERMrmr ports change from [5, 23] to [15, 23] but we didn't have list test for them.

Cheers - I'll get these added to the llvm-mca test coverage.

BTW, uops.info provides lat/tpt for icelake, not icelake-server.

Are there actual known sched diffs for ICL/ICX/RKL/TGL? I hadn't noticed any in instlatx64 listings etc. All I know is TGL supports VP2INTERSECT but the others do not - I'll add that at some point (maybe like IvyBridge F16C which we added under SandyBridge llvm-mca tests).

RKSimon updated this revision to Diff 393981.Dec 13 2021, 11:46 AM

rebase + fix vpbroadcastb/w avx512bwvl overrides

Here's diff I found: https://www.textcompare.org/?id=61b7080d8668ef0015be12a9
Left window is before this patch, right is after.
Seems like VPBROADCASTBZ128rm, VPCMOVYrmr, VPPERMrmr ports change from [5, 23] to [15, 23] but we didn't have list test for them.

For reference - VPCMOVYrmr, VPPERMrmr are XOP instructions so aren't legal on Intel

This revision is now accepted and ready to land.Dec 13 2021, 10:56 PM
HaohaiWen added inline comments.Dec 13 2021, 11:03 PM
llvm/lib/Target/X86/X86SchedIceLake.td
669

Y|Z|Z128|Z256?
==
Y|Z|Z128|Z25(6)?

This revision was landed with ongoing or failed builds.Dec 14 2021, 10:57 AM
This revision was automatically updated to reflect the committed changes.