This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Don't vectorize loads of non-packed types (like i1, i2).
ClosedPublic

Authored by mzolotukhin on Sep 29 2015, 6:29 PM.

Details

Summary

Given an array of i2 elements, 4 consecutive scalar loads will be lowered to
i8-sized loads and thus will access 4 consecutive bytes in memory. If we
vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in
memory. Hence, we should prohibit vectorization in such cases.
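
The byte counts above can be checked with a small sketch. It is only an illustration of the arithmetic, not the code from the patch: the helper names are hypothetical, and sizes are modeled as "bits rounded up to whole bytes", ignoring ABI alignment padding. A real check inside LLVM would presumably query the target's data layout instead.

```cpp
#include <cassert>
#include <cstdint>

// Bytes touched by one scalar load of an iN value: N bits rounded up to
// whole bytes. Simplifying assumption: no ABI alignment padding.
inline std::uint64_t scalarLoadBytes(std::uint64_t bits) {
  assert(bits > 0 && "integer width must be positive");
  return (bits + 7) / 8;
}

// Bytes touched by `count` consecutive scalar loads from an array of iN.
inline std::uint64_t scalarLoadsBytes(std::uint64_t bits, std::uint64_t count) {
  return count * scalarLoadBytes(bits);
}

// Bytes touched by a single <count x iN> vector load, whose elements are
// bit-packed.
inline std::uint64_t vectorLoadBytes(std::uint64_t bits, std::uint64_t count) {
  return (bits * count + 7) / 8;
}

// Hypothetical guard: bundling the loads is only sound in this model if the
// vector load covers exactly the bytes the scalar loads would have covered.
inline bool sameMemoryFootprint(std::uint64_t bits, std::uint64_t count) {
  return scalarLoadsBytes(bits, count) == vectorLoadBytes(bits, count);
}
```

For the i2 case from the summary, `scalarLoadsBytes(2, 4)` is 4 and `vectorLoadBytes(2, 4)` is 1, so `sameMemoryFootprint(2, 4)` is false; for i8 or i32 elements the two footprints agree.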

PS: Initial patch was proposed by Arnold.

Diff Detail

Repository
rL LLVM

Event Timeline

mzolotukhin retitled this revision to [SLP] Don't vectorize loads of non-packed types (like i1, i2).
mzolotukhin updated this object.
mzolotukhin added a subscriber: llvm-commits.
mzolotukhin updated this object. Sep 29 2015, 6:31 PM
aschwaighofer edited edge metadata. Sep 29 2015, 7:14 PM

LGTM.

Thank you.

This revision was automatically updated to reflect the committed changes.
nadav edited edge metadata. Oct 1 2015, 10:05 AM
nadav added a subscriber: nadav.

What about SLP-vectorization of stores? I suspect that we have the same bug for stores.

Hi Nadav,

Currently we can't get into such a situation with stores, but that's just by luck: the minimal vector size is 128 bits, and the maximum number of accesses we can bundle together is 16. So the minimal element size that could be used in a vector store is i8, which doesn't have this problem.
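
The arithmetic behind that bound, as a minimal sketch. The two constants are the values quoted above; their names and the helper are hypothetical, not identifiers from the SLP vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the comment above; the names are made up for this sketch.
constexpr std::uint64_t kMinVectorBits = 128; // minimal vector size
constexpr std::uint64_t kMaxBundleSize = 16;  // max accesses bundled together

// Smallest element width (in bits) for which at most `maxBundle` elements
// can still fill a vector of `minVectorBits` bits (rounded up).
inline std::uint64_t minElementBits(std::uint64_t minVectorBits,
                                    std::uint64_t maxBundle) {
  assert(maxBundle > 0 && "bundle size must be positive");
  return (minVectorBits + maxBundle - 1) / maxBundle;
}
```

With the current values, `minElementBits(kMinVectorBits, kMaxBundleSize)` is 128 / 16 = 8, i.e. i8. If either parameter became tunable, for example a 64-bit minimal vector size or a bundle of 32, the bound would drop to 4 bits, and sub-byte element types would become reachable.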

Next, I'm going to expose the incorrect behavior on stores by replacing these hardcoded parameters with command-line (cl) options (see e.g. D13278), then fix it, and finally make sure we honor these constraints when vectorizing phis (currently we don't).

Thanks,
Michael