This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use silvermont cost model overrides for goldmont as well.
ClosedPublic

Authored by craig.topper on Mar 19 2018, 12:47 PM.

Details

Summary

Goldmont is similar to Silvermont, so we should probably use the Silvermont cost model as a starting point.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Mar 19 2018, 12:47 PM

I suspect this is wrong for PMULLD. I think that improved to a single uop on goldmont.

> I suspect this is wrong for PMULLD. I think that improved to a single uop on goldmont.

Do you want to fix that here or just add a fixme?

craig.topper planned changes to this revision. Mar 24 2018, 4:49 PM

Turns out almost all of the slow things in the SLM table have been improved in GLM. So this isn't the right table to use. The only thing that didn't change much was floating point division.

> Turns out almost all of the slow things in the SLM table have been improved in GLM. So this isn't the right table to use. The only thing that didn't change much was floating point division.

I'm hoping to start work on PR36550 reasonably soon - a better approach might be to (a) ensure that the SLM model matches what the TTI says it should and (b) decide how best to provide a GLM model. What do you think?

Introduce a GLM-specific table that overrides FDIV. Override FSQRT for both GLM and SLM. For packed operations, it appears that only half of the 128-bit vector is produced at a time on both SLM and GLM, since the packed throughput cost is twice the scalar cost. The default SSE42 throughputs we were getting otherwise don't match that behavior.

Throughput data for GLM was taken from table 16-17 in the latest Intel Optimization Manual.
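
To make the lookup order described above concrete, here is a minimal, self-contained sketch of a per-CPU override table that falls back to a generic one. It is not the code from this revision: the types `Cpu`, `Op`, `Ty`, `CostEntry` and the functions `lookup` and `getCost` are hypothetical stand-ins, and every cost value is a made-up placeholder rather than a number from the Intel manual. The only property the placeholders are meant to show is that the packed FDIV/FSQRT entries are twice the scalar ones, and that an operation missing from the GLM table falls through to the SSE42 defaults.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins; these are not LLVM's real types.
enum class Cpu { Generic, Silvermont, Goldmont };
enum class Op { FDiv, FSqrt, Mul };
enum class Ty { F32, V4F32, V4I32 };

struct CostEntry {
  Op op;
  Ty ty;
  unsigned cost;
};

using CostTable = std::vector<CostEntry>;

// All costs are placeholders chosen for illustration only.
// Packed FDIV/FSQRT entries are 2x the scalar ones, mirroring the
// "half of the 128-bit vector at a time" observation.
const CostTable GLMTable = {
    {Op::FDiv, Ty::F32, 10},  {Op::FDiv, Ty::V4F32, 20},
    {Op::FSqrt, Ty::F32, 12}, {Op::FSqrt, Ty::V4F32, 24},
};
const CostTable SLMTable = {
    {Op::FDiv, Ty::F32, 15},  {Op::FDiv, Ty::V4F32, 30},
    {Op::FSqrt, Ty::F32, 14}, {Op::FSqrt, Ty::V4F32, 28},
    {Op::Mul, Ty::V4I32, 11}, // a "slow on SLM" entry with no GLM counterpart
};
const CostTable SSE42Table = {
    {Op::FDiv, Ty::F32, 5},  {Op::FDiv, Ty::V4F32, 5},
    {Op::FSqrt, Ty::F32, 6}, {Op::FSqrt, Ty::V4F32, 6},
    {Op::Mul, Ty::V4I32, 2},
};

// Linear search; returns nullptr when the table has no matching entry.
const CostEntry *lookup(const CostTable &table, Op op, Ty ty) {
  for (const CostEntry &e : table)
    if (e.op == op && e.ty == ty)
      return &e;
  return nullptr;
}

// Consult the CPU-specific table first, then the generic SSE42 table.
unsigned getCost(Cpu cpu, Op op, Ty ty) {
  if (cpu == Cpu::Goldmont)
    if (const CostEntry *e = lookup(GLMTable, op, ty))
      return e->cost;
  if (cpu == Cpu::Silvermont)
    if (const CostEntry *e = lookup(SLMTable, op, ty))
      return e->cost;
  if (const CostEntry *e = lookup(SSE42Table, op, ty))
    return e->cost;
  return 1; // hypothetical default when no table has an entry
}

// Usage example: checks the fallback order with the placeholder values.
void selfCheck() {
  assert(getCost(Cpu::Goldmont, Op::FDiv, Ty::V4F32) == 20);  // GLM override
  assert(getCost(Cpu::Goldmont, Op::Mul, Ty::V4I32) == 2);    // falls back to SSE42
  assert(getCost(Cpu::Silvermont, Op::Mul, Ty::V4I32) == 11); // SLM override
}
```

Keeping the GLM table separate, instead of reusing the SLM one, means that only operations that are still slow on GLM need an entry there. Everything else picks up the generic costs automatically.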

Should I make a copy of the SLM scheduler and use it for GLM so we can start refining it?

RKSimon accepted this revision. Mar 25 2018, 2:55 AM

> Should I make a copy of the SLM scheduler and use it for GLM so we can start refining it?

Probably - I was looking for tidy ways to override models such as these (architecturally the same but with a few latency tweaks) but couldn't see anything. It's probably easier just to copy it, maybe once you're happy with the accuracy of the SLM model.

These changes LGTM though

This revision is now accepted and ready to land. Mar 25 2018, 2:55 AM
This revision was automatically updated to reflect the committed changes.