Page MenuHomePhabricator

Fix the default alignment of i1 vectors.
AcceptedPublic

Authored by efriedma on Oct 7 2020, 11:38 AM.

Details

Summary

Currently, the default alignment is much larger than the actual size of the vector in memory. Fix this to use a sane default.

For SVE, temporarily remove lowering of load/store operations for predicates with less than 16 elements. The layout the backend was assuming for SVE predicates with less than 16 elements doesn't agree with the datalayout. More work probably needs to be done here.

This change is, strictly speaking, not backwards-compatible at the bitcode level. But probably nobody is actually depending on that; i1 vectors in memory are rare, and the code that does use them probably ends up forcing the alignment to something sane anyway. If we think this is a concern, I can restrict this to scalable vectors for now (where it's actually causing issues for me at the moment).

I did my best to update the regression tests, but I'm not completely sure I did it correctly for amdgpu and nvptx.

Diff Detail

Unit TestsFailed

TimeTest
220 mslinux > Clang.CodeGen::attr-arm-sve-vector-bits-bitcast.c
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=128 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c --check-prefix=CHECK-128
290 mslinux > Clang.CodeGen::attr-arm-sve-vector-bits-call.c
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -fno-experimental-new-pass-manager -S -O1 -emit-llvm -o - /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
210 mslinux > Clang.CodeGen::attr-arm-sve-vector-bits-cast.c
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
200 mslinux > Clang.CodeGen::attr-arm-sve-vector-bits-codegen.c
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=512 -fallow-half-arguments-and-returns -S -disable-llvm-passes -emit-llvm -o - /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
220 mslinux > Clang.CodeGen::attr-arm-sve-vector-bits-globals.c
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/clang -cc1 -internal-isystem /mnt/disks/ssd0/agent/llvm-project/build/lib/clang/12.0.0/include -nostdsysteminc -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +bf16 -msve-vector-bits=128 -fallow-half-arguments-and-returns -S -O1 -emit-llvm -o - /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c --check-prefix=CHECK-128

Event Timeline

efriedma created this revision.Oct 7 2020, 11:38 AM
Herald added a project: Restricted Project. · View Herald Transcript
efriedma requested review of this revision.Oct 7 2020, 11:38 AM
arsenm added inline comments.Oct 8 2020, 7:07 AM
llvm/test/Transforms/SROA/vector-promotion-different-size.ll
6

This looks like it now fails to eliminate some of the elements?

efriedma added inline comments.Oct 8 2020, 10:30 AM
llvm/test/Transforms/SROA/vector-promotion-different-size.ll
6

The IR is basically the same either way. Only type of the alloca itself is different, and the difference doesn't have any practical effect here.

(I haven't looked deeply at why SROA is choosing a different type, but it doesn't seem important.)

Thanks for working on this @efriedma. The change seem sensible to me and I appreciate the effect it has on SVE predicates.
I can't really speak for the correctness for other targets though.

llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
133

Not sure if this matters, but I can't see any operations (select) going through memory, so don't know why the code on the left was so convoluted or how your change affects this.

efriedma added inline comments.Oct 9 2020, 12:27 PM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
133

Type legalization is creating a stack temporary to lower the bitcast. The reason it doesn't show up in the assembly is that we optimize it out later: the resulting <16 x i1> load gets legalized to an i16 load followed by a PREDICATE_CAST, and then the whole store+load sequence gets optimized out. But the alignment requirement of the stack temporary sticks around.

This patch avoids the issue by lowering the alignment of the temporary, so it doesn't trigger stack realignment.

There's probably room for improvement here.

dmgreen added inline comments.Oct 10 2020, 12:21 AM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
133

Yeah, I'm a little surprised that it's doing that. There's no reason to go via a stack slot to begin with, and that dead stack slot are not removed...

I was trying to custom lower it but running into legalizer problems. I may keep looking, but it's not an problem with this patch.

sdesmalen added inline comments.Oct 12 2020, 8:05 AM
llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
133

Thanks for confirming, I suspected is was something like that!

sdesmalen accepted this revision.Oct 21 2020, 8:38 AM

I'm happy to LGTM, but maybe you want to wait just another day to land it in case anyone has more feedback for the other targets.

This revision is now accepted and ready to land.Oct 21 2020, 8:38 AM

@RKSimon @craig.topper Any thoughts from the x86 side?