This is an archive of the discontinued LLVM Phabricator instance.

[X86][BtVer2] Fix latency and resource cycles of AVX 256-bit zero-idioms.
ClosedPublic

Authored by andreadb on Sep 21 2018, 2:55 AM.

Details

Summary

This patch introduces a SchedWriteVariant to describe zero-idiom VXORP(S|D)Yrr and VANDNP(S|D)Yrr.

This is a follow-up to r342555.

On Jaguar, a VXORPSYrr consists of 2 macro opcodes. Only one of them is eliminated at the register-renaming stage; the other still has to be executed to set the upper half of the destination YMM register.
The same applies to VANDNP(S|D)Yrr.
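
The behaviour described above can be illustrated with a small model. This is only a sketch, not code from the patch: the `Instr` class, the helper names, and the "both sources are the same register" test for a zero-idiom are assumptions made for illustration. The opcode names and the count of 2 macro opcodes come from the summary.

```python
from dataclasses import dataclass

# Hypothetical model of an instruction; not an LLVM data structure.
@dataclass(frozen=True)
class Instr:
    opcode: str  # e.g. "VXORPSYrr"
    dst: str
    src1: str
    src2: str

# 256-bit opcodes that the summary names as zero-idiom candidates.
YMM_ZERO_IDIOM_OPCODES = {"VXORPSYrr", "VXORPDYrr", "VANDNPSYrr", "VANDNPDYrr"}

# From the summary: each of these instructions is 2 macro opcodes on Jaguar.
MACRO_OPS_PER_YMM_INSTR = 2


def is_zero_idiom(instr: Instr) -> bool:
    # Assumption: the idiom is recognised when both sources are the same register.
    return instr.opcode in YMM_ZERO_IDIOM_OPCODES and instr.src1 == instr.src2


def executed_macro_ops(instr: Instr) -> int:
    # Only meaningful for the four opcodes listed above.
    if is_zero_idiom(instr):
        # One macro opcode is eliminated at register renaming; the other
        # still executes to set the upper half of the destination YMM.
        return MACRO_OPS_PER_YMM_INSTR - 1
    return MACRO_OPS_PER_YMM_INSTR
```

In this model, `vxorps %ymm1, %ymm1, %ymm0` still costs one executed macro opcode rather than zero, which is why the 256-bit zero-idioms need their own scheduling description.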

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb created this revision. Sep 21 2018, 2:55 AM
lebedev.ri added inline comments.
test/CodeGen/X86/avx-schedule.ll
5508 ↗(On Diff #166424)

Is there no way to achieve this without inline asm?
When committing, please commit the baseline first.

RKSimon accepted this revision. Sep 21 2018, 3:13 AM

LGTM - as @lebedev.ri suggested, please pre-commit the avx-schedule.ll baseline.

This revision is now accepted and ready to land. Sep 21 2018, 3:13 AM
andreadb added inline comments. Sep 21 2018, 4:34 AM
test/CodeGen/X86/avx-schedule.ll
5508 ↗(On Diff #166424)

Not that I know of. In this case, I am pretty sure that there is no other way.

I will add the test in a separate commit.

Thanks for the review!

This revision was automatically updated to reflect the committed changes.