This is an archive of the discontinued LLVM Phabricator instance.

Fix the default alignment of i1 vectors.
ClosedPublic

Authored by efriedma on Oct 7 2020, 11:38 AM.

Details

Summary

Currently, the default alignment is much larger than the actual size of the vector in memory. Fix this to use a sane default.

For SVE, temporarily remove lowering of load/store operations for predicates with less than 16 elements. The layout the backend was assuming for SVE predicates with less than 16 elements doesn't agree with the datalayout. More work probably needs to be done here.

This change is, strictly speaking, not backwards-compatible at the bitcode level. But probably nobody is actually depending on that; i1 vectors in memory are rare, and the code that does use them probably ends up forcing the alignment to something sane anyway. If we think this is a concern, I can restrict this to scalable vectors for now (where it's actually causing issues for me at the moment).

I did my best to update the regression tests, but I'm not completely sure I did it correctly for amdgpu and nvptx.

Diff Detail

Event Timeline

efriedma created this revision.Oct 7 2020, 11:38 AM
Herald added a project: Restricted Project. · View Herald Transcript
efriedma requested review of this revision.Oct 7 2020, 11:38 AM
arsenm added inline comments.Oct 8 2020, 7:07 AM
llvm/test/Transforms/SROA/vector-promotion-different-size.ll
6

This looks like it now fails to eliminate some of the elements?

efriedma added inline comments.Oct 8 2020, 10:30 AM
llvm/test/Transforms/SROA/vector-promotion-different-size.ll
6

The IR is basically the same either way. Only type of the alloca itself is different, and the difference doesn't have any practical effect here.

(I haven't looked deeply at why SROA is choosing a different type, but it doesn't seem important.)

Thanks for working on this @efriedma. The change seem sensible to me and I appreciate the effect it has on SVE predicates.
I can't really speak for the correctness for other targets though.

llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
131–132

Not sure if this matters, but I can't see any operations (select) going through memory, so don't know why the code on the left was so convoluted or how your change affects this.

efriedma added inline comments.Oct 9 2020, 12:27 PM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
131–132

Type legalization is creating a stack temporary to lower the bitcast. The reason it doesn't show up in the assembly is that we optimize it out later: the resulting <16 x i1> load gets legalized to an i16 load followed by a PREDICATE_CAST, and then the whole store+load sequence gets optimized out. But the alignment requirement of the stack temporary sticks around.

This patch avoids the issue by lowering the alignment of the temporary, so it doesn't trigger stack realignment.

There's probably room for improvement here.

dmgreen added inline comments.Oct 10 2020, 12:21 AM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
131–132

Yeah, I'm a little surprised that it's doing that. There's no reason to go via a stack slot to begin with, and that dead stack slot are not removed...

I was trying to custom lower it but running into legalizer problems. I may keep looking, but it's not an problem with this patch.

sdesmalen added inline comments.Oct 12 2020, 8:05 AM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
131–132

Thanks for confirming, I suspected is was something like that!

sdesmalen accepted this revision.Oct 21 2020, 8:38 AM

I'm happy to LGTM, but maybe you want to wait just another day to land it in case anyone has more feedback for the other targets.

This revision is now accepted and ready to land.Oct 21 2020, 8:38 AM

@RKSimon @craig.topper Any thoughts from the x86 side?

efriedma updated this revision to Diff 360901.Jul 22 2021, 11:30 AM

Just realized I never landed this. Rebased.

I'd still like some feedback on the x86 side of this.

Matt added a subscriber: Matt.Jul 26 2021, 10:26 AM
This revision was landed with ongoing or failed builds.Jul 31 2021, 2:11 PM
This revision was automatically updated to reflect the committed changes.
Allen added a subscriber: Allen.Jun 13 2022, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 13 2022, 7:17 AM
csstormq added inline comments.
llvm/lib/IR/DataLayout.cpp
807

Is there any way to set the alignment of fixed vector type to 1 byte rather than a power of 2 by default?