Page MenuHomePhabricator

[AArch64] Stackframe accesses to SVE objects.
ClosedPublic

Authored by sdesmalen on Sep 19 2019, 12:31 AM.

Details

Summary

Materialize accesses to SVE frame objects from SP or FP, whichever is
available and beneficial.

This patch still assumes the objects are pre-allocated. The automatic
layout of SVE objects within the stackframe will be added in a separate
patch.

Diff Detail

Event Timeline

sdesmalen created this revision.Sep 19 2019, 12:31 AM
lib/Target/AArch64/AArch64InstrInfo.cpp
3366 ↗(On Diff #220822)

I'm not an LLVM coding standards expert, but does this need an llvm_unreachable()? I think it does...

3453 ↗(On Diff #220822)

Would you shed some light on what this change is doing?

IsMulVL indicates there are scalable objects on the stack, right? What is the reason for the behavior change of the legacy code when !IsMulVL. I.e. the addition of StackOffset(SOffset.getScalableBytes(), MVT::nxv1i8) in the else block.

efriedma added inline comments.Oct 7 2019, 12:21 PM
lib/Target/AArch64/AArch64InstrInfo.cpp
3366 ↗(On Diff #220822)

The reason we add llvm_unreachable() after switches in some cases is related to warnings. Some compilers warn about a missing return after a switch that covers every named enum value, but not every possible enum value. That case doesn't apply here: the switch has a "default".

sdesmalen marked 2 inline comments as done.Oct 7 2019, 12:36 PM
sdesmalen added inline comments.
lib/Target/AArch64/AArch64InstrInfo.cpp
3453 ↗(On Diff #220822)

isAArch64FrameOffsetLegal tries to determine if the immediate field of the instruction can fit the given StackOffset, and will attempt to fold as much of the offset in the immediate.

Although a StackOffset can contain both a scalable and a non-scalable part, it will depend on the instruction whether the immediate is scalable or non-scalable. For example, in

LDR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}]

the immediate is mul vl, so scalable, which means the instruction can only handle the "scalable" part of the StackOffset. The rest of the offset will need to be handled elsewhere.

The variable int64_t Offset uses either the scalable or non-scalable part of the StackOffset, which happens on line 3411. After that, this function does its magic to determine what part of the offset can be folded into the immediate.

On line 3448, the remaining part of the offset that could *not* be folded into the immediate will need to be reflected in the in/out parameter SOffset, which is a StackOffset.
If IsMulVL is true, then variable Offset is scalable and will at this point contain the part of the scalable offset that could not be folded into the immediate.
SOffset.getBytes() just passes through the fixed-size part of the offset that is not handled by the instruction.
Conversely, if IsMulVL is false, then the variable Offset is non-scalable and will contain the part of the fixed-size offset that could not be folded into the immediate. It then has to pass through SOffset.getScalableBytes() that is not handled by the instruction, so it can be handled elsewhere.

For example:

isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {16, MVT::nxv1i8})

would fold {16, MVT::nxv1i8} into the immediate, and the resulting SOffset would be {16, MVT::i8}. It would return AArch64FrameOffsetCanUpdate.

isAArch64FrameOffsetLegal(AArch64::LDR_ZXI, {16, MVT::i8} + {4096, MVT::nxv1i8})

would fold {4080, MVT::nxv1i8} into the immediate (note that it's immediate goes up to #255 MUL VL), and the resulting SOffset would be {16, MVT::i8} + {16, MVT::nxv1i8}.

lib/Target/AArch64/AArch64InstrInfo.cpp
3366 ↗(On Diff #220822)

Ah, ok. I thought that it was protection in case the the switch is changed in the future. The X86 backend has a few switches with default cases that also have unreachable after them.

cameron.mcinally accepted this revision.Oct 8 2019, 8:18 AM
cameron.mcinally marked an inline comment as done.

LGTM with a moderate confidence level. Maybe give other reviewers a day or so to comment before committing.

lib/Target/AArch64/AArch64InstrInfo.cpp
3453 ↗(On Diff #220822)

Ah, okay. So it's determining the addressing mode...

This revision is now accepted and ready to land.Oct 8 2019, 8:18 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2019, 6:15 AM
Herald added a subscriber: hiraditya. · View Herald Transcript

LGTM with a moderate confidence level. Maybe give other reviewers a day or so to comment before committing.

Thanks @cameron.mcinally!