This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX2] Enable ZERO_EXTEND_VECTOR_INREG lowering of 256-bit vectors
ClosedPublic

Authored by RKSimon on Oct 7 2018, 6:54 AM.

Details

Summary

Some necessary yak shaving before lowering *_EXTEND_VECTOR_INREG 256-bit vectors on AVX1 targets as suggested by D52964.
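
For readers unfamiliar with the node, the following is a minimal Python model of what ZERO_EXTEND_VECTOR_INREG computes: the result has the same total bit width as the input, so only the low source lanes are read and each is zero-extended into a wider lane. The function names and the 128-bit split helper are hypothetical illustrations, not code from this patch. The split only sketches how a target without 256-bit integer ops might produce the same result in two halves.

```python
def zero_extend_vector_inreg(src, src_bits, dst_bits):
    """Zero-extend the low lanes of `src`; the result has the same total bit width."""
    if dst_bits <= src_bits or dst_bits % src_bits != 0:
        raise ValueError("destination lanes must be a wider multiple of source lanes")
    total_bits = len(src) * src_bits
    dst_lanes = total_bits // dst_bits
    mask = (1 << src_bits) - 1
    # Only the low `dst_lanes` source lanes are read; masking models zero extension.
    return [src[i] & mask for i in range(dst_lanes)]


def zero_extend_256_via_128(src, src_bits, dst_bits):
    """Hypothetical split: build each 128-bit half of the result separately."""
    half_lanes = (len(src) * src_bits // dst_bits) // 2
    lanes_128 = 128 // src_bits

    def widen(low_lanes):
        # Place the needed lanes at the bottom of a 128-bit vector, then extend.
        padded = low_lanes + [0] * (lanes_128 - len(low_lanes))
        return zero_extend_vector_inreg(padded, src_bits, dst_bits)

    return widen(src[:half_lanes]) + widen(src[half_lanes:2 * half_lanes])


# 256-bit example: 32 x i8 -> 8 x i32 (negative inputs show zero, not sign, extension).
src = [-1, -2, 3, 4] + [0] * 28
direct = zero_extend_vector_inreg(src, 8, 32)
split = zero_extend_256_via_128(src, 8, 32)
```

In this model the direct 256-bit form and the two-half form agree lane for lane, which is the property any split lowering has to preserve.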

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Oct 7 2018, 6:54 AM
RKSimon added inline comments.
test/CodeGen/X86/pr35443.ll
18 ↗(On Diff #168587)

@craig.topper Please can you confirm if the pr35443.ll change is acceptable?

An alternative is to set the passthrough value to zeroinitializer, which instead adds a vpmovzxbq op after the vmovd (some kind of demanded-bits failure that could be fixed in a future patch).

craig.topper added inline comments. Oct 8 2018, 10:34 AM
test/CodeGen/X86/pr35443.ll
18 ↗(On Diff #168587)

What if you just change the alignment of @ac to 1? That should prevent the single-byte load from the masked.load from being promoted to a wider size, I think.
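
As background for the suggestion above, a common rationale for widening a narrow load is that a suitably aligned wider access cannot straddle a page boundary, whereas a byte-aligned one can. The sketch below models only that general argument. The 4096-byte page size and both helper names are assumptions for illustration, and it does not claim to reproduce the check LLVM actually performs.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def crosses_page(addr, size, page_size=PAGE_SIZE):
    """True if the access [addr, addr + size) touches more than one page."""
    return addr // page_size != (addr + size - 1) // page_size


def widening_is_page_safe(align, wide_size, page_size=PAGE_SIZE):
    """True if no address that is a multiple of `align` lets a `wide_size`-byte
    access cross a page boundary (assumes `align` divides `page_size`)."""
    return all(
        not crosses_page(addr, wide_size, page_size)
        for addr in range(0, page_size, align)
    )


aligned_ok = widening_is_page_safe(align=4, wide_size=4)
byte_aligned_ok = widening_is_page_safe(align=1, wide_size=4)
```

Under these assumptions a 4-byte access at 4-byte alignment always stays within one page, while at alignment 1 an address such as 4093 would spill into the next page, so a single-byte load could not be safely widened on that basis.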

RKSimon updated this revision to Diff 168688. Oct 8 2018, 11:20 AM

Tweaked the load alignment of the test - the additional vpmovzxbq /should/ be removable with a suitable demandedelts+demandedbits combine (probably D52935 in reverse).

This revision is now accepted and ready to land. Oct 8 2018, 11:26 AM
This revision was automatically updated to reflect the committed changes.