This is an archive of the discontinued LLVM Phabricator instance.

[X86] Recognize AVX2 gather instructions during lowering so we can modify the source input when the mask is all ones
Closed, Public

Authored by craig.topper on Mar 11 2017, 10:06 AM.

Details

Summary

This patch implements AVX2 gather lowering the same way we do for AVX-512. This way we can recognize when the mask is all ones and change the source to undef. I also copied the logic that forces the source to zero if it was undef and the mask is not all ones. Not sure how important that is for either AVX2 or AVX-512.
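To make the reasoning concrete, here is a minimal scalar sketch of a 4-lane masked gather together with the source-selection rule described in the summary. It is not the code from the patch: `Vec4`, `modelGather`, `SourceKind`, and `chooseSource` are hypothetical names introduced only for illustration, and the model ignores details such as the hardware clearing the mask register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<int32_t, 4>;

// Scalar model of a masked gather (illustrative only). A lane is loaded from
// memory when the sign bit of its mask element is set; otherwise the lane
// keeps the value of the pass-through source operand.
Vec4 modelGather(const Vec4 &src, const int32_t *base, std::size_t size,
                 const Vec4 &index, const Vec4 &mask) {
  Vec4 result{};
  for (std::size_t lane = 0; lane < 4; ++lane) {
    if (mask[lane] < 0) { // sign bit set: lane is active
      assert(index[lane] >= 0 &&
             static_cast<std::size_t>(index[lane]) < size);
      result[lane] = base[index[lane]];
    } else {
      result[lane] = src[lane]; // inactive lane: source passes through
    }
  }
  return result;
}

// Hypothetical encoding of what the lowering may do with the source operand.
enum class SourceKind { Undef, Zero, Original };

// Rule as stated in the summary: an all-ones mask makes the source
// irrelevant, so it can become undef; an undef source with any other mask
// is forced to zero; everything else is left alone.
SourceKind chooseSource(bool maskIsAllOnes, bool sourceIsUndef) {
  if (maskIsAllOnes)
    return SourceKind::Undef;
  if (sourceIsUndef)
    return SourceKind::Zero;
  return SourceKind::Original;
}
```

With an all-ones mask, `modelGather` never reads `src`, which is why replacing the source with undef cannot change the result in that case.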

Unfortunately, due to the way fast isel works, we are unable to recognize that the mask is all ones and instead end up forcing the source to zero even though the mask is all ones.

Event Timeline

spatel accepted this revision. Mar 12 2017, 9:34 AM

LGTM. I don't know if the force-to-zero is important either, but since we're generally happy to add xors to avoid partial reg dependencies, that seems fine to me.

With this, there should be no regressions from the clang change for undef in D30834?

This revision is now accepted and ready to land. Mar 12 2017, 9:34 AM

Looks like we only add an xor for partial dependency breaking if the instruction is listed in hasUndefRegUpdate in X86InstrInfo.cpp. Our first response is to use the same register as one of the other input operands, but that would be illegal for gather.

craig.topper closed this revision. Mar 13 2017, 11:48 AM

I've modified this to force the input to zero when the mask is all ones to break the execution dependency. I'll file a bug to look at using ExeDepsFix.
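As a rough sketch of the behavior described in this closing comment, the decision could be modeled as below. This is one reading of the thread, not the committed code: `PassThru`, `choosePassThru`, and `checkPolicy` are hypothetical names, and treating an undef source with a non-all-ones mask as "zero" is an assumption carried over from the summary.

```cpp
#include <cassert>

// Hypothetical encoding of the source operand handed to the gather.
enum class PassThru { Zero, Original };

// Assumed final policy: an all-ones mask gets a zeroed source so the
// instruction has no dependency on an earlier write to that register;
// an undef source is zeroed as well; otherwise the source is kept.
PassThru choosePassThru(bool maskIsAllOnes, bool sourceIsUndef) {
  if (maskIsAllOnes || sourceIsUndef)
    return PassThru::Zero;
  return PassThru::Original;
}

// Small self-check that walks the four input combinations.
void checkPolicy() {
  assert(choosePassThru(true, true) == PassThru::Zero);
  assert(choosePassThru(true, false) == PassThru::Zero);
  assert(choosePassThru(false, true) == PassThru::Zero);
  assert(choosePassThru(false, false) == PassThru::Original);
}
```

Under this reading, the only case that keeps the original source is a real source value combined with a mask that is not known to be all ones.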