Page MenuHomePhabricator

[llvm][AArch64] Handle arrays of struct properly (from IR)

Authored by DavidSpickett on Jun 11 2021, 8:13 AM.



This only applies to FastIsel. GlobalIsel seems to sidestep
the issue.

This fixes

One of the things we do in llvm is decide if a type needs
consecutive registers. Previously, we just checked if it
was an array or not.
(plus an SVE specific check that is not changing here)

This causes some confusion when you arbitrary IR like:

%T1 = type { double, i1 };
define [ 1 x %T1 ] @foo() {
  ret [ 1 x %T1 ] zeroinitializer

We see it is an array so we call CC_AArch64_Custom_Block
which bails out when it sees the i1, a type we don't want
to put into a block.

This leaves the location of the double in some kind of
intermediate state and leads to odd codegen. Which then crashes
the backend because it doesn't know how to implement
what it's been asked for.

You get this:

renamable $d0 = FMOVD0
$w0 = COPY killed renamable $d0

Rather than this:

$d0 = FMOVD0
$w0 = COPY $wzr

The backend knows how to copy 64 bit to 64 bit registers,
but not 64 to 32. It can certainly be taught how but the real
issue seems to be us even trying to assign a register block
in the first place.

This change makes the logic of
a bit more in depth. If we find an array, also check that all the
nested aggregates in that array have a single member type.

Then CC_AArch64_Custom_Block's assumption of a type that looks
like [ N x type ] will be valid and we get the expected codegen.

New tests have been added to exercise these situations. Note that
some of the output is not ABI compliant. The aim of this change is
to simply handle these situations and not to make our processing
of arbitrary IR ABI compliant.

Diff Detail

Event Timeline

DavidSpickett created this revision.Jun 11 2021, 8:13 AM
DavidSpickett requested review of this revision.Jun 11 2021, 8:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2021, 8:13 AM
DavidSpickett edited the summary of this revision. (Show Details)Jun 11 2021, 8:14 AM

Though globalisel doesn't crash with the reported snippet, it does crash on the added tests when we spill to memory. I still need to look into that but I don't think it should block this. (unless the bug reporter turns out to be using globalisel)

On the subject of not being ABI compliant when going direct from IR, here's an example of that: , this doesn't make the situation any worse.

We started considering more types for register blocks after but there is no fault in that commit's usage of it. It simply uncovered something that already existed.

Momchil this is the patch to fix what we were talking about. Walking the whole type might be a bit much but it should exit early 99% of the time.

chill added a subscriber: chill.

I think this change make sense. I wouldn't worry about the AAPCS64 compliance as long as the (implicit, ill-defined) "LLVM IR PCS" is flexible enough, so as to allow implementing AAPCS64.
Perhaps it's useful to have more tests, on both sides of the calls and returns:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "aarch64-eabi" 

%T = type {double, double, double, double, double, [4 x double]}

@x = dso_local global %T zeroinitializer, align 8

; Check how values are returned
define %T @f0() {
  ret %T zeroinitializer

; Check where the caller expects the values
define void @g() {
  %1 = call %T @f0()
  store %T %1, %T* @x
  ret void

; Check where the callee expects the values
define void @u(%T %a) {
  store %T %a, %T* @x
  ret void

; Check how values are passed
define void @h() {
  %1 = load %T, %T * @x
  call void @u(%T %1)
  ret void

No else after return.




Could be like for (Type *T : Ty->subtypes()) ...

  • Fix clang-tidy warnings
  • Add caller/callee tests. For passing as a block, in memory and in normal registers.
DavidSpickett marked 3 inline comments as done.Jun 14 2021, 6:47 AM

Seems fine to me. An alternative implementation could use ComputeValueVTs, instead of implementing the recursion yourself, but this seems okay as-is.


Unused variable subtypes_end

Matt added a subscriber: Matt.Jun 14 2021, 12:31 PM

Use ComputeValueVTs instead of recursing manually.

Note that is_splat does return false for a container
with 0 items but this is fine. Just means there's
nothing concrete for us to handle anyway.

DavidSpickett marked an inline comment as done.Jun 15 2021, 3:09 AM
This revision is now accepted and ready to land.Jun 15 2021, 12:21 PM
efriedma requested changes to this revision.Jun 15 2021, 12:34 PM
efriedma added inline comments.

Actually, looking at this again, I'm not sure I like using createDataLayout() like this... it's kind of an expensive/unusual operation. Can we pass down the datalayout from the callers?

This revision now requires changes to proceed.Jun 15 2021, 12:34 PM
DavidSpickett added inline comments.Jun 16 2021, 2:20 AM

By expensive do you mean in that it creates a copy of the DataLayout object each time? (I thought maybe it reparses the layout string but it doesn't)

I will see about passing it down. (we have IsVarArg there already and only used by Arm)

Pass the data layout to functionArgumentNeedsConsecutiveRegisters
instead of getting a new copy for every call.

Found out along the way that in fact Arm has a isHomogeneousAggregate
function that is uses. It's following the letter of the ABI, which would
be nice to do here too but I'd rather stick to this for now.

Remove unused Optional include


Turned out it was simple to do anyway.

This revision is now accepted and ready to land.Jun 16 2021, 6:14 AM
This revision was landed with ongoing or failed builds.Jun 16 2021, 6:56 AM
This revision was automatically updated to reflect the committed changes.