This is an archive of the discontinued LLVM Phabricator instance.

[X86] Improve the gather scheduler models for SkylakeClient and SkylakeServer
ClosedPublic

Authored by craig.topper on Feb 4 2020, 2:54 PM.

Details

Summary

The load ports need a cycle for each potentially loaded element just like Haswell and Skylake. Unlike Haswell and Broadwell, the number of uops does not scale with the number of elements. Instead the load uops run for multiple cycles.

I've taken the latency number from the uops.info. The port binding for the non-load uops is taken from the original IACA data I have.

I've added avx512 gather instructions to llvm-mca resource tests. I wanted to pre-commit them, but since some of them have 0 uops in the existing data, llvm-mca gave an error.

Diff Detail

Event Timeline

craig.topper created this revision.Feb 4 2020, 2:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2020, 2:55 PM

Please can you add the resources-avx512.s/resources-avx512vl.s changes to the generic tests as well ?

I've added avx512 gather instructions to llvm-mca resource tests. I wanted to pre-commit them, but since some of them have 0 uops in the existing data, llvm-mca gave an error.

Field NumMicroOpcodes is currently used by mca to model the number of uOPs dispatched from the uOp-Queue to the out of order backend.
From a 'dispatch' point of view, an instruction with zero opcodes is still valid; it simply doesn't consume any dispatch group slots.

However, mca doesn't expect an instruction with zero uOPs to consume pipeline resources. That is seen as a contradiction in practice; it makes sense only if the instruction is always eliminated and never really executed. Maybe mca is being too conservative here. However I believe that we should check this in the subtarget emitter when we verify the scheduling info from tablegen (I'll raise a bug about it).

What I can do in the short term is to remove the check for MayLoad and MayStore in mca (I have a patch ready for it). That check is too conservative, since we were already test if an instruction consumes processor resources in general. That is enough to fix the issue that you saw with zero uOP gathers.

I've added avx512 gather instructions to llvm-mca resource tests. I wanted to pre-commit them, but since some of them have 0 uops in the existing data, llvm-mca gave an error.

What I can do in the short term is to remove the check for MayLoad and MayStore in mca (I have a patch ready for it). That check is too conservative, since we were already test if an instruction consumes processor resources in general. That is enough to fix the issue that you saw with zero uOP gathers.

FTR: I have committed aaaeac6 to remove those checks.
So, (if you really want to) you should be able now to precommit your new tests.

That being said. The patch looks good to me modulo the request from Simon.

Update Generic tests too

RKSimon accepted this revision.Feb 5 2020, 8:35 AM

No preference whether you pre-commit the tests or not. LGTM - cheers

This revision is now accepted and ready to land.Feb 5 2020, 8:35 AM
andreadb accepted this revision.Feb 5 2020, 10:10 AM

LGTM too.

This revision was automatically updated to reflect the committed changes.