This is an archive of the discontinued LLVM Phabricator instance.

[X86][Znver1] Remove InstRWs for BLENDVPS/PD
ClosedPublic

Authored by craig.topper on Mar 23 2018, 11:49 AM.

Details

Summary

This removes the InstRWs for BLENDVPS/PD in favor of WriteFVarBlend. The InstRWs listed a latency of 3 cycles, but WriteFVarBlend is defined with a 1 cycle latency, which matches Agner Fog's data.

The patterns were missing the VEX forms, which is why there are no test changes. We don't test "-mcpu=znver1 -mattr=-avx".
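To illustrate why removing the overrides changes no test output, here is a minimal Python sketch of the lookup, not LLVM's actual data structures. The table layout, the `latency` helper, the full-name matching rule, and the instruction names `BLENDVPSrr0` / `VBLENDVPSrr` are all assumptions made for the example; only the 3 cycle and 1 cycle figures come from the summary above.

```python
import re

# Hypothetical default latencies per scheduling class (1 cycle per the summary).
DEFAULT_LATENCY = {"WriteFVarBlend": 1}

# Hypothetical per-instruction overrides, loosely analogous to InstRW entries.
# The pattern covers only the non-VEX names, mirroring the missing VEX forms.
OLD_OVERRIDES = [(re.compile(r"BLENDVP(S|D)rr0"), 3)]


def latency(instr_name, write_class, overrides):
    """Return the override latency if a pattern matches the whole name,
    otherwise fall back to the scheduling class default."""
    for pattern, cycles in overrides:
        if pattern.fullmatch(instr_name):
            return cycles
    return DEFAULT_LATENCY[write_class]


# Before the patch: only the non-VEX form picks up the 3 cycle override.
sse_before = latency("BLENDVPSrr0", "WriteFVarBlend", OLD_OVERRIDES)
vex_before = latency("VBLENDVPSrr", "WriteFVarBlend", OLD_OVERRIDES)

# After the patch: no overrides, so both forms use the class default.
sse_after = latency("BLENDVPSrr0", "WriteFVarBlend", [])
vex_after = latency("VBLENDVPSrr", "WriteFVarBlend", [])
```

In this model the VEX form reports the same latency before and after, so a test that only exercises VEX encodings would see no difference; only a non-AVX run would observe the change from 3 cycles to 1.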

Diff Detail

Event Timeline

craig.topper created this revision. Mar 23 2018, 11:49 AM

Thanks for reminding me - we need to fix the SSE schedule tests to not use the VEX instructions. I'll look at replacing them with inline asm.

Why can't we just explicitly set the SSE level?

Yes, I can - it does mean that we'll lose VEX coverage for a lot of instructions. We can either add 2 entries for each target, or I can add tests to avx/avx2-schedule.

There are a few cost diffs (e.g. sandybridge cvtss2sirm is different to vcvtss2sirm).

I've added these at rL328420/rL328421/rL328423

@GGanesh Is there any reason why the SSE/AVX versions would differ? It doesn't look like pblendvb/vpblendvb has an equivalent diff.

It shouldn't differ.
The xmm version has 1 cycle latency and ymm version has 2 cycle latency for both AVX and SSE.

RKSimon accepted this revision. Apr 8 2018, 2:34 AM

OK - patch LGTM

This revision is now accepted and ready to land. Apr 8 2018, 2:34 AM
This revision was automatically updated to reflect the committed changes.