This is an archive of the discontinued LLVM Phabricator instance.

[LiveDebugValues] Handle spill locations with a fixed and scalable component.
ClosedPublic

Authored by sdesmalen on Oct 23 2020, 7:11 AM.

Details

Summary

This patch fixes the two LiveDebugValues implementations
(InstrRef/VarLoc)Based to handle cases where the StackOffset contains
both a fixed and scalable component.

This depends on the TargetRegisterInfo::prependOffsetExpression being
added in D90020. Feel free to leave comments on that patch if you have them.

Diff Detail

Event Timeline

sdesmalen created this revision.Oct 23 2020, 7:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 23 2020, 7:11 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
sdesmalen requested review of this revision.Oct 23 2020, 7:11 AM
sdesmalen edited the summary of this revision. (Show Details)Dec 2 2020, 8:32 AM
sdesmalen set the repository for this revision to rG LLVM Github Monorepo.
djtodoro added inline comments.Dec 3 2020, 7:36 AM
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
564

missing return false?

llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir
49–53

no need for all these attributes

sdesmalen updated this revision to Diff 309878.Dec 7 2020, 4:42 AM
sdesmalen marked an inline comment as done.
  • Refactored operator== to return false by default.
  • Removed unnecessary attributes.
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
564

Yes, good catch, thanks!

Thanks for addressing the comments.

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
231–234

We use tuples here because std::tie() takes l-values only?

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
556

Do we use Loc.Hash for the case of Spill locations?

jmorse added a comment.Dec 7 2020, 7:50 AM

Forgive my ignorance on scalable vectors; but is there definitely a scenario where the "Scalable" portion of the stack offset is nonzero, and if so could the test cover it please.

Right now I believe that for aggregate types:

  • We can refer to portions of structs if they're split by SROA and promoted from there,
  • The same for arrays that are completely promoted,
  • All other aggregates are left on the stack, and we only describe their stack address.

As a result, the stack-location tracking in LiveDebugValues is only geared towards tracking the spills / restores of scalars that the register allocator introduces, and sometimes entire vectors. I have a fear that if this patch is trying to make elements within scalable vectors describable to debug-info, it might be aiming to do something that isn't supported.

This probably stems from not understanding scalable vectors; would you have any source examples of when this patch becomes necessary, just so that I can work through them?

Forgive my ignorance on scalable vectors; but is there definitely a scenario where the "Scalable" portion of the stack offset is nonzero, and if so could the test cover it please.

No worries! In the test-case in this patch the Scalable portion of the offset is actually non-zero:

# CHECK-DAG: ST1W_IMM renamable $z1, killed renamable $p{{[0-9]+}}, $fp, -[[#%u,OFFSET:]]
                                                                            ^^^^^^^^^^^

For this test OFFSET is '1' (so it is stored at FP - 1 * (length of scalable vector)). The 'length of scalable vector' is handled by the addressing mode of ST1W. So for 128bit vectors, it would store to FP - 16bytes, for 256 it would store to FP - 32bytes, etc. The corresponding StackOffset returned by TFI->getFrameIndexReference is {0 bytes, -16 scalable bytes}, and because SVE vectors are always a multiple of 16 bytes, this translates to using -1 for ST1W_IMM.

After the live-debugvalues pass, the debug-location for this value is now described by:

CHECK-DAG: DBG_VALUE $fp, 0, !{{[0-9]+}}, !DIExpression(DW_OP_constu, [[#mul(OFFSET,8)]], DW_OP_bregx, 46, 0, DW_OP_mul, DW_OP_minus),

Which describes the location FP - 8 * VG, where VG is the number of 64-bit 'vector granules'. For an 128bit vector, VG is 2, for 256 it is 4, etc, resulting in the same addresses calculated by the ST1W instruction.

Right now I believe that for aggregate types:

  • We can refer to portions of structs if they're split by SROA and promoted from there,
  • The same for arrays that are completely promoted,
  • All other aggregates are left on the stack, and we only describe their stack address.

As a result, the stack-location tracking in LiveDebugValues is only geared towards tracking the spills / restores of scalars that the register allocator introduces, and sometimes entire vectors. I have a fear that if this patch is trying to make elements within scalable vectors describable to debug-info, it might be aiming to do something that isn't supported.

I don't think this is any different for scalable vectors. Specifically because (from the LangRef):

Scalable vectors cannot be global variables or members of structs or arrays because their size is unknown at compile time.

So I would expect only entire vectors to be tracked. It is also worth noting that the current use for this is with source-level scalable vector types. For SVE the only scalable types are defined in arm_sve.h which can only be used in conjunction with the intrinsics defined in the Arm C/C++ Language Extensions for SVE.

This probably stems from not understanding scalable vectors; would you have any source examples of when this patch becomes necessary, just so that I can work through them?

The following example hits the assert in VarLocBasedLDV::extractSpillBaseRegAndOffset that the StackOffset has no scalable component (which it now has, because the stack contains two SVE locals, at least the address of one of them must have a non-zero scalable offset).

// After applying D90020, build this example with:
//   clang -O1 -g -march=armv8-a+sve2 -S t.c -o -

#include <arm_sve.h>

svint32_t bar(svint32_t &);

svint32_t foo(svint32_t v, int N) {
  svint32_t local_v1 = v;
  svint32_t local_v2 = v;
  if (N > 0) {
    local_v1 = bar(local_v1);
    local_v2 = bar(local_v2);
  }
  return svadd_z(svptrue_b32(), local_v1, local_v2);
}

Hopefully that clarifies a few things, but don't hesitate to ask if you have more questions!

sdesmalen added inline comments.Dec 8 2020, 12:50 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
231–234

Yes, that's right.

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
556

We did before, because uint64_t Hash covered SpillLoc (which contains an unsigned (SpillBase) and int (SpillOffset)). But now that SpillOffset is a StackOffset instead of an int, uint64_t Hash no longer covers all those bits so we need to explicitly compare the two.

jmorse added a comment.Dec 8 2020, 2:32 AM

Ah, I hadn't quite lined up in my head that the offset scales as the stack grows downwards. With that in mind, I think this sounds safe -- the scalable part of the stack offset is opaque as far as LiveDebugValues is concerned, and we can just pass it around.

Are there any scenarios where stack slots can be accessed with different scalable offsets? For example if stack slot colouring merged two slots for vectors of different scalable size (maybe it doesn't do that). If that happened, the StackOffset object would compare differently, and we might miss a spilt value being overwritten (see llvm/test/DebugInfo/MIR/X86/live-debug-values-stack-clobber.mir for a scalar example of overwrites).

The size of a VarLoc object increases from 72 bytes to ~80 bytes on amd64; this isn't ideal, but trying to micro-optimise it at this stage invites errors.

llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir
149–151

I'd recommend also testing restoring the vector to a register, to ensure LiveDebugValues chops the spill-expression off correctly. VarLocBasedLDV should put in a DBG_VALUE after loading from the stack; you should need to clobber the stack location with some other value to force InstrRefBasedLDV to issue another DBG_VALUE.

Are there any scenarios where stack slots can be accessed with different scalable offsets? For example if stack slot colouring merged two slots for vectors of different scalable size (maybe it doesn't do that). If that happened, the StackOffset object would compare differently, and we might miss a spilt value being overwritten (see llvm/test/DebugInfo/MIR/X86/live-debug-values-stack-clobber.mir for a scalar example of overwrites).

I guess that is possible. As long as both frame objects have the same Stack-ID, StackSlotColouring can merge them together. However, in that case, the sizes and offsets of those objects would both be scalable and comparing these sizes would be no different as they would be for non-scalable sizes.

The size of a VarLoc object increases from 72 bytes to ~80 bytes on amd64; this isn't ideal, but trying to micro-optimise it at this stage invites errors.

Yes, that is unfortunate. We'll need to accept some cost for returning structs instead of int/unsigned throughout the code-base to represent the scalable semantics (same for TypeSize and ElementCount).

llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir
149–151

I tried this, but I'm not entirely sure the result is correct because isRestoreInstruction always returns false (TargetInstrInfo::getRestoreSize always returns 0 if this is not overriden by the target, which it is not for AArch64).

So when I do a reload from %stack.0 and then clobber that stack location, like this:

$z3 = IMPLICIT_DEF
renamable $z1 = LD1W_IMM renamable $p0, %stack.0, 0, debug-location !34 :: (load unknown-size from %stack.0, align 16)
ST1W_IMM renamable $z3, killed renamable $p0, %stack.0, 0 :: (store unknown-size into %stack.0, align 16)

LiveDebugValues creates DBG_VALUE $noreg, $noreg, !27, !DIExpression(), debug-location !30. It doesn't seem to recognize LD1W_IMM as a reload, and therefore doesn't see that we've just reloaded debug-value !27 back into $z1. The DIExpression is empty though.

Is that what you meant?

jmorse accepted this revision.Dec 9 2020, 2:53 AM

I guess that is possible. As long as both frame objects have the same Stack-ID, StackSlotColouring can merge them together. However, in that case, the sizes and offsets of those objects would both be scalable and comparing these sizes would be no different as they would be for non-scalable sizes.

I imagine if they're merged, all the load and store instructions will refer to the same %stack.[0-9]+ frame index. So it's probably fine.

Generally LGTM with the extra CHECK mentioned inline.

llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir
149–151

TargetInstrInfo::getRestoreSize always returns 0 if this is not overriden by the target, which it is not for AArch64

Oooofff, I wasn't aware of that. Making sure restores are correct are probably out of scope for this patch then.

LiveDebugValues creates DBG_VALUE $noreg,

That's part of the correct behaviour (terminating locations if they're overwritten on the stack), please do add CHECK lines for that, plus a "TODO" indicating this will change when AArch64 can detect loads.

This revision is now accepted and ready to land.Dec 9 2020, 2:53 AM
sdesmalen updated this revision to Diff 310516.Dec 9 2020, 6:56 AM
  • Added extra CHECK line and TODO to test.

Thanks for the review @jmorse!

djtodoro accepted this revision.Dec 9 2020, 7:11 AM

lgtm, thanks!

StephenTozer added inline comments.
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
576–578

The SpillOffset fixed and scalable arguments here are missing Other., so we're comparing the current VarLoc to itself - can you fix this in a followup patch?

sdesmalen added inline comments.Jan 11 2021, 1:22 PM
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
576–578

Good spot, thanks for pointing out! Do you have any objection for me to just push the fix directly? (I wouldn't really know how to create a test for this, unless you have any suggestions?)

jmorse added inline comments.Jan 11 2021, 1:41 PM
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
576–578

IMO just pushing up the fix is fine -- the developer policy permits it if it's "obvious" that the patch is correct.