This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX512] Choose correct registers in vpbroadcastb/w
ClosedPublic

Authored by guyblank on Aug 8 2017, 11:49 AM.

Details

Summary

This patch addresses PR33795.

The flavor of the vpbroadcastb/w instructions that broadcasts a byte/word from the lower part of a GPR to an XMM/YMM/ZMM register should use the full GPR as its source operand.
Currently, however, it uses the GR8/GR16 register classes, attempting to match the subregister that corresponds to the part being broadcast.
In most cases this turns out fine, since AL, BL, CL, and DL have the same encoding as their full registers. But, as seen in the bug report, when CH is chosen, its encoding actually matches EBP.
Moreover, CH shouldn't have been used at all, since it isn't the lower part of a GPR.
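
To make the encoding clash concrete, here is a small illustrative sketch (not part of the patch). It lists the standard 3-bit x86 register numbers for the legacy 8-bit registers and for the 32-bit GPRs, and shows which 32-bit register is read when an 8-bit register's number is placed in an operand field that is decoded as a 32-bit register. The helper name `gr32_read_for` is hypothetical.

```python
# Standard 3-bit x86 register numbers (no REX/EVEX extension bits).
# The list index is the register number used in the encoding.
GR32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
GR8_LEGACY = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def gr32_read_for(gr8_name):
    """Hypothetical helper: the 32-bit register that is actually read when
    the number of `gr8_name` is emitted in a field decoded as a 32-bit GPR."""
    return GR32[GR8_LEGACY.index(gr8_name)]


if __name__ == "__main__":
    for name in GR8_LEGACY:
        print(f"{name:>2} -> {gr32_read_for(name)}")
```

The low-byte registers AL, CL, DL, and BL map to their own full registers, while the high-byte registers AH, CH, DH, and BH map to ESP, EBP, ESI, and EDI, which are unrelated registers.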

The patch adds TableGen (.td) patterns and classes that move the source value from the subregister into the full register and then use the full register in the broadcast.
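
As a rough model of why widening the source is safe, the sketch below (an illustration only, not the patch's TableGen code) treats the broadcast as reading just the low byte of a 32-bit register. Whatever ends up in the upper 24 bits after the value is placed in the full register does not affect the result. The function names and the 16-lane default are assumptions made for the example.

```python
def place_in_full_register(value8, upper_bits):
    """Model of putting an 8-bit value in the low byte of a 32-bit register
    whose upper 24 bits hold arbitrary (unspecified) contents."""
    return ((upper_bits & 0xFFFFFF) << 8) | (value8 & 0xFF)


def broadcast_low_byte(gpr32, lanes=16):
    """Model of a byte broadcast from a GPR: only bits 7:0 are used."""
    return [gpr32 & 0xFF] * lanes


if __name__ == "__main__":
    clean = place_in_full_register(0x7F, 0x000000)
    dirty = place_in_full_register(0x7F, 0xABCDEF)
    # Different register contents, identical broadcast result.
    print(hex(clean), hex(dirty))
    print(broadcast_low_byte(clean) == broadcast_low_byte(dirty))
```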

Event Timeline

guyblank created this revision. Aug 8 2017, 11:49 AM
craig.topper added inline comments. Aug 8 2017, 10:20 PM
test/CodeGen/X86/avx512bw-intrinsics.ll
1967

Any idea why we lost the movzwl?

guyblank added inline comments. Aug 9 2017, 9:03 AM
test/CodeGen/X86/avx512bw-intrinsics.ll
1967

Since EAX is now live after the load, FixupBWInsts is unable to replace the AX load with movzwl.

I'm trying to teach FixupBWInstPass::getSuperRegDestIfDead to recognize that the replacement is safe in this case, but I'm seeing a lot of performance swings so far.
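
The following sketch models why the replacement should be safe here. It is a simplified illustration of the instruction semantics, not a description of how FixupBWInsts decides anything, and all function names are hypothetical. A 16-bit load into AX preserves the upper half of EAX, while movzwl zeroes it, so the two leave different values in EAX. A word broadcast that reads only the low 16 bits produces the same result either way.

```python
def load_into_ax(old_eax, mem16):
    """Model of a 16-bit load into AX: bits 31:16 of EAX are preserved."""
    return (old_eax & 0xFFFF0000) | (mem16 & 0xFFFF)


def movzwl_into_eax(mem16):
    """Model of movzwl: the 16-bit value is zero-extended into EAX."""
    return mem16 & 0xFFFF


def broadcast_low_word(gpr32, lanes=8):
    """Model of a word broadcast from a GPR: only bits 15:0 are used."""
    return [gpr32 & 0xFFFF] * lanes


if __name__ == "__main__":
    old_eax = 0xDEAD0000
    narrow = load_into_ax(old_eax, 0xBEEF)
    wide = movzwl_into_eax(0xBEEF)
    # EAX differs after the two loads, yet the broadcast result is the same.
    print(hex(narrow), hex(wide))
    print(broadcast_low_word(narrow) == broadcast_low_word(wide))
```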

This revision is now accepted and ready to land. Aug 9 2017, 9:12 AM
This revision was automatically updated to reflect the committed changes.