This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][RemoveDIs] Implement behind-the-scenes debug-info maintenance in splice / moveBefore / insertBefore APIs
ClosedPublic

Authored by jmorse on Jul 3 2023, 8:02 AM.

Details

Summary

This is the "central" patch to the removing-debug-intrinsics project: it changes the instruction movement APIs (insert, move, splice) to interpret the "Head" bits we're attaching to BasicBlock::iterators, and updates debug-info records in the background to preserve the ordering of debug-info (which is in DPValue objects instead of dbg.values). The cost is the complexity of this patch, plus memory. The benefit is that LLVM developers can cease thinking about whether they're moving debug-info or not, because it'll happen behind the scenes.

To get an idea of what's going on, immediately skip to looking at the comment in BasicBlock::spliceDebugInfo. In dbg.value form, variable location information gets moved around automatically because it's an instruction, and it always stays in the right order. However, once converted into DPValue objects, we have to do this manually. The presence of the "head" bits on iterators indicates whether or not a transfer of instructions is intended to include the leading debug-info or not (in one scenario: the trailing debug-info). BasicBlock::splice now interprets those and moves DPValues around as appropriate. The range of transfers that can be described in splice() is explicitly enumerated in the unit test added, and there's a worked example in each scenario for clarity.

The rest of this patch applies the same reasoning in a variety of scenarios. When moveBefore (and it's siblings) are used to move instructions around, the caller has to indicate whether they intend for debug-info to move too (is it a "Preserving" call or not), and then the "Head" bits used to determine where debug-info moves to. Similar reasoning is needed for insertBefore.

Some utility methods are added to Instruction to allow examining whether an instruction has debug-info attached to it or not.

Diff Detail

Event Timeline

jmorse created this revision.Jul 3 2023, 8:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 3 2023, 8:02 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
jmorse requested review of this revision.Jul 3 2023, 8:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 3 2023, 8:02 AM

SGTM so far, just working my way through the unittests now.

llvm/include/llvm/IR/Instruction.h
74

nit: from_here -> FromHere

86–87

Is it worth specifying in the comment the behaviour if there are multiple / none attached?

190

Perhaps moveBeforeImpl as it feels more idiomatic?

197

IMO worth adding comments describing the difference between moveBefore and moveBeforePreserving.

llvm/lib/IR/BasicBlock.cpp
803–804

up to and including or excluding Last?

809–810

I'm not sure I understand this final option - why would we need to move : them up ahead so far? Ah, is "those" in

or should those go before "+" DPValues?

referring to =?

847

Very helpful comment and diagrams!

llvm/lib/IR/Instruction.cpp
84

I think (It doesn't so far) can be removed.

118–119

I'm a bit confused about what this comment is saying, please can you explain further (here)?

248–249

nit: Any reason to use trailing return type here (same return type in function above doesn't)?

255

Hoist the assert above the if block since Parent is dereferenced there? IMO a vague string would be useful too (or comment above the assert).

llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
211

EXPECT_TRUE(BeginIt.getHeadBit()); here for completeness?

257

nit: comma -> full stop? Otherwise gives the impression of an unfinished comment.

Unittests LGTM. Comprehensive coverage yet easy to read, nice.

llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
1090

ASSERT/CHECK_TRUE(Entry.empty()) for completeness? YMMV on this one.

Orlando accepted this revision.Nov 1 2023, 11:24 AM
This revision is now accepted and ready to land.Nov 1 2023, 11:24 AM
jmorse updated this revision to Diff 558060.Nov 9 2023, 7:03 AM
jmorse marked 14 inline comments as done.

(Addressing comments, ticket boxes etc).

llvm/include/llvm/IR/Instruction.h
86–87

IMO no, as the DPValue argument should fairly obviously be the DPValue that's to be dropped -- it's not "drop one random" or "drop the first DPValue" after all.

190

SGTM

197

(Inserting a comment about the earlier moveBeforePreserving, and referring to it from here.

llvm/lib/IR/BasicBlock.cpp
809–810

Good spot, "those" is ==== here, I'll make that explicit.

llvm/lib/IR/Instruction.cpp
118–119

My recollection is that there are no scenarios where DPValues can change what instruction they're attached to as a result of insertAfter, for example:

call dbg.value
%foo = add
call dbg.value
%bar = sub
call dbg.value

Inserting after %foo or after %bar doesn't cause an instruction be inserted inbetween dbg.values and the next instruction. All the dbg.values continue to "happen" at the next instruction, and that's preserved using DPValues too.

Of course, if the insert position of an instruction is a dbg.value intrinsic then this is no longer true. Not allowing people to do that is one of the design goals though: doing that is always a bug.

248–249

I think there was a reason once, but it's since shifted X_X

jmorse added a comment.Nov 9 2023, 7:09 AM

Some other changes that have filtered in:

  • CodeGenPrepare::placeDbgValues deliberately removes dbg.value intrinsics, so I've removed the assertion in Instruction::eraseFromParent. We're in a much stronger position wrt. binary-identicalness so I'm not worried about this.
  • There are some fixes for TrailingDPValues now being stored in LLVMContextImpl, but they're all mechanical.
  • Some things have been renamed / reworded as suggested.
  • One of the comments in the big-comment in spliceDebugInfo was wrong, "Or if we didn't want to insert at the head of Dest"... but the example had the "head" bit set to true. I've flipped this to the correct way, but figured I'd highlight this as being deliberate rather than an accident. If we don't want to insert before dbg.values on an iterator, the "head" bit should be false.
This revision was landed with ongoing or failed builds.Nov 9 2023, 7:34 AM
This revision was automatically updated to reflect the committed changes.

Still LGTM

llvm/lib/IR/Instruction.cpp
118–119

That makes sense, though I still find the code comment confusing. If there's anything you can do to make it clearer that might help. YMMV

jmorse added inline comments.Nov 9 2023, 8:10 AM
llvm/lib/IR/Instruction.cpp
118–119

Ah yeah, I was going to update it, whoops. How about instead:

"There cannot be any debug-info records in the gap between this instruction, and the one we're inserting right now, therefore no DPValue maintenance is required".

?

Orlando added inline comments.Nov 9 2023, 8:34 AM
llvm/lib/IR/Instruction.cpp
118–119

IMO "There cannot be any debug-info records in the gap between this instruction and the one we're inserting" feels slightly misleading as there might be debug-info records, it's just that everything still ends up in the correct order without intervention IIUC.

How about something like:
"Inserting 'this' after 'InsertPos', 'this' comes before any debug-info records attached to the next instruction, so no debug-info updates are necessary."

I'm not 100% happy with that either. I feel I'm splitting hairs here though, so if you're not a fan of this iteration I'm happy for you to go with your suggestion.