This is an archive of the discontinued LLVM Phabricator instance.

[AARCH64][RegisterCoalescer] clang miscompiles zero-extension to long long
ClosedPublic

Authored by simonwallis2 on Aug 13 2020, 11:17 PM.

Details

Summary

Implement AArch64 variant of shouldCoalesce() to detect a known failing case
and prevent the coalescing of a 32-bit copy into a 64-bit sign-extending load.

Do not coalesce in the following case:
COPY where source is bottom 32 bits of a 64-register,
and destination is a 32-bit subregister of a 64-bit register,
ie it causes the rest of the register to be implicitly set to zero.

A mir test has been added.

In the test case, the 32-bit copy implements a 32 to 64 bit zero extension
and relies on the upper 32 bits being zeroed.

Coalescing to the result of the 64-bit load meant overwriting
the upper 32 bits incorrectly when the loaded byte was negative.

Diff Detail

Event Timeline

simonwallis2 created this revision.Aug 13 2020, 11:17 PM
simonwallis2 requested review of this revision.Aug 13 2020, 11:17 PM
arsenm added inline comments.Aug 27 2020, 7:02 AM
llvm/test/CodeGen/AArch64/zext-reg-coalesce.mir
4

Add a comment explaining the test?

You don't need -O1

5–12

Don't need the IR section

24–37

I think you can trim most of these instructions

simonwallis2 marked 2 inline comments as done.

Thanks for the feedback arsenm.
I've made all of the changes you suggested,
with 1 exception:
I have left in the IR for the declaration of @c.

llvm/test/CodeGen/AArch64/zext-reg-coalesce.mir
5–12

I left the IR section in place because of @c

john.brawn added inline comments.Sep 2 2020, 9:56 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
735

We should have a comment here explaining why coalescing is incorrect in this case. Also I'm not sure if the SrcRC == DstRC is correct here, as the implicit zeroing on subregister write doesn't depend on what the source register is.

simonwallis2 added inline comments.Sep 3 2020, 6:40 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
735

We should have a comment here explaining why coalescing is incorrect in this case.

I will reword and tailor the comment that is in the test case.

735

Also I'm not sure if the SrcRC == DstRC is correct here, as the implicit zeroing on subregister write doesn't depend on what the source register is.

Thanks. I will modify the check. Currently it matches:

undef %4.sub_32:gpr64 = COPY %2.sub_32:gpr64

The 2 register classes do not have to be the same,
But they both have to be a subregister of a 64bit int register type so that this will also match:

undef %40.sub_32:gpr64 = COPY %41.sub_32:gpr64common

So I should include GPR64common.

Added comments to implementation of AArch64RegisterInfo::shouldCoalesce()

Removed test for exact match of both TargetRegisterClass;
Replaced it with check that both classes are a subregister of a 64bit int register class.

This revision is now accepted and ready to land.Sep 7 2020, 9:34 AM