This is an archive of the discontinued LLVM Phabricator instance.

Salvage debug info from instructions about to be deleted
ClosedPublic

Authored by aprantl on Mar 13 2017, 3:55 PM.

Details

Summary

This patch improves debug info quality in InstCombine by looking at values that are about to be deleted, checking whether there are any dbg.value instrinsics referring to them, and potentially encoding the semantics of the deleted instruction into the dbg.value's DIExpression.

In the example in the testcase (which was extracted from XNU) there is a sequence of

%4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
%5 = bitcast %struct.entry* %4 to i8*, !dbg !42
%add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
%6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg 34

When these instructions are eliminated by instcombine one after another, we can still salvage the otherwise dead debug info:

  • bitcasts have no effect, so have the dbg.value point to operand(0)
  • loads can be expressed via a DW_OP_deref
  • constant gep instructions can be replaced by DWARF expression arithmetic

The API introduced by this patch is not specific to instcombine and can be useful in other places, too.

Diff Detail

Repository
rL LLVM

Event Timeline

aprantl created this revision.Mar 13 2017, 3:55 PM
dblaikie added inline comments.Mar 13 2017, 4:15 PM
lib/Transforms/Utils/Local.cpp
1382–1383 ↗(On Diff #91632)

Is this tested?

1386 ↗(On Diff #91632)

Possible refactoring: using an enum rather than a bool parameter, to help legibility.

test/Transforms/InstCombine/debuginfo-dce.ll
2–12 ↗(On Diff #91632)

Any way this could be simplified/changed to make it clearer (possibly separate into individual functions) which salvage operations are being tested (the gep, load, and bitcast)?

majnemer added inline comments.
lib/Transforms/Utils/Local.cpp
1366–1375 ↗(On Diff #91632)

Any idea on the perf impact of this change?

1386 ↗(On Diff #91632)

BuildReplacementDIExpr takes an int for the offset. Won't this be problematic if isIntN returns true but getLimitedValue returns something bigger than INT_MAX?

1386–1387 ↗(On Diff #91632)

Hmm, should this be isSignedIntN and getSExtValue? I could believe it shouldn't be but I'd like to understand the rationale.

1394 ↗(On Diff #91632)

Doesn't this shadow the other DbgValues?

Working on an updated version now...

lib/Transforms/Utils/Local.cpp
1366–1375 ↗(On Diff #91632)

I made a fairly unscientific test by compiling running time ninja llc (building a stage2 llc Release+Assert using a Release+Assert compiler) on my iMac and the time difference between with and without this patch is well within the noise.

I also captured the resulting .dSYM bundles and ran dwarfdump | grep -c AT_location on it and it showed a 20% increase of variables with a location (993456/827467).

aprantl marked 4 inline comments as done.Mar 14 2017, 4:55 PM

New version attached.

lib/Transforms/Utils/Local.cpp
1382–1383 ↗(On Diff #91632)

No and apparently this is sort of redundant. While attempting to create a testcase for this I learned that it is impossible to create GEP offsets with a type other than i32. I should probably just remove it.

1386–1387 ↗(On Diff #91632)

It looks like it doesn't matter. In the testcase the offset is -8 and it comes out correctly. Not sure yet whether this is a happy accident or the documed behavior.

aprantl updated this revision to Diff 91794.Mar 14 2017, 4:56 PM
aprantl updated this revision to Diff 91888.Mar 15 2017, 9:19 AM
aprantl marked an inline comment as done.

Use the functionally equivalent but more appropriate APInt::getSExtValue() and rename some locals for better readability.
I think that should be all outstanding comments.

aprantl marked 5 inline comments as done.Mar 15 2017, 9:19 AM
dblaikie accepted this revision.Mar 15 2017, 4:42 PM

As best as I can tell, this seems OK.

Code readability - it looks like findDbgValues should return a sequence instead of populating it, then it'd be easier to read. (or, alternatively, findDbgValues could be a function template that takes a lambda describing what to do with the dbg values)

include/llvm/Transforms/Utils/Local.h
289 ↗(On Diff #91888)

I'd give this a name & use it for the function parameter. Maybe - I can sort of see how the implementation's more legible as a boolean than an enum... *shrug*

This revision is now accepted and ready to land.Mar 15 2017, 4:42 PM
This revision was automatically updated to reflect the committed changes.