This is an archive of the discontinued LLVM Phabricator instance.

[codeview] omit debug locations for nested exprs unless column info enabled
ClosedPublic

Authored by inglorion on Sep 6 2017, 1:15 PM.

Details

Summary

Microsoft Visual Studio expects debug locations to correspond to
statements. We used to emit locations for expressions nested inside statements.
This would confuse the debugger, causing it to stop multiple times on the
same line and breaking the "step into specific" feature. This change inhibits
the emission of debug locations for nested expressions when emitting CodeView
debug information, unless column information is enabled.

Fixes PR34312.

Event Timeline

inglorion created this revision.Sep 6 2017, 1:15 PM
inglorion updated this revision to Diff 114060.Sep 6 2017, 1:19 PM

removed accidentally left in include and reformatted mangled comment

zturner added inline comments.Sep 6 2017, 1:37 PM
clang/lib/CodeGen/CGDebugInfo.h
65 ↗(On Diff #114060)

Can you move this line up to put it next to another bool? Not a huge deal, but might as well pack the class members.

clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
45

Can you make a function called int foo() and make this int a = bar(foo(), y) + ...

rnk added a subscriber: echristo.Sep 6 2017, 2:15 PM
rnk added inline comments.
clang/lib/CodeGen/CGStmt.cpp
38 ↗(On Diff #114060)

"Stop point" is a hold-over from the llvm.dbg.stoppoint, which doesn't exist anymore:
http://releases.llvm.org/2.6/docs/SourceLevelDebugging.html#format_common_stoppoint

@echristo renamed CGDebugInfo::EmitStopPoint to EmitLocation long ago, and we should do the same for CodeGenFunction::EmitStopPoint. I'd like to have something like EmitStmtLocation and EmitExprLocation.

45 ↗(On Diff #114060)

Does MSVC accept this? I think it will emit the copy ctor call in an -O0 build.

145 ↗(On Diff #114060)

Doesn't this end up recursing? Won't InhibitDebugLocation prevent us from applying the inner statement locations?

clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
13

This is pretty painful to test. :(

If we let ourselves do an asm test, .cv_loc is printed with some nice assembly comments that make this testing easy.

49

I'd like to see some compound statement tests: for, while, if, regular braces, etc.

inglorion added inline comments.Sep 6 2017, 2:33 PM
clang/lib/CodeGen/CGStmt.cpp
45 ↗(On Diff #114060)

I wrote this thinking that the right thing would happen under copy elision (there is only one object, move constructor isn't called, and the destructor only runs once) and without copy elision (there are two objects, move constructor is called, destructor is run for both objects but is a no-op for the moved-from object). If that's not the case, how would you rewrite this to do the right thing?

145 ↗(On Diff #114060)

Yeah, this looks wrong. Let me get back to you with fixed code or an explanation of why it actually does the right thing.

clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
13

Would you like me to write an asm test in addition to / instead of this one?

45

Yes. Why? To test an additional level of nesting?

rnk added inline comments.Sep 6 2017, 2:39 PM
clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
13

Oops, I thought I wrote this comment and deleted it before hitting send, but apparently not. No, let's stick with the IR test.

zturner added inline comments.Sep 6 2017, 2:48 PM
clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
45

Yes. for that matter, even better would be if this call to foo() spans multiple lines. Right here you've got a single statement which spans multiple lines, but no individual sub-expression spans multiple lines. And FWICT there is no nesting at all, since + is just a builtin operator.

inglorion added inline comments.Sep 6 2017, 3:09 PM
clang/lib/CodeGen/CGStmt.cpp
45 ↗(On Diff #114060)

FWIW, the test passes with MSVC in a Debug build, too.

inglorion added inline comments.Sep 6 2017, 3:20 PM
clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
45

The nesting here is the calls to bar, baz, and qux inside the declaration of a. The old behavior would emit separate locations for each of the calls, the new behavior annotates each of the calls with the same location as the declaration. This causes the debugger to stop only once for the entire statement when using step over, and makes step into specific work.

zturner added inline comments.Sep 6 2017, 3:26 PM
clang/test/CodeGenCXX/debug-info-nested-exprs.cpp
45

Sure, but does that execute the same codepath as:

int a = bar(
                  baz());

a = bar(baz());

for (const auto &X : foo(bar()))

for (int x = foo(); x < bar(); baz(x))

if (foo() && bar())

if (createObject().func())

etc? And what if one or more of these functions get inlined? For example, what if you have:

int a = foo(bar(baz()), bar(buzz()));
inglorion updated this revision to Diff 114097.Sep 6 2017, 4:38 PM
inglorion marked 5 inline comments as done.

I limited the change to only calls, returns, and declarations. I also
updated the test case to include a multi-variable declaration, a while
loop, a for loop, and an if statement (after verifying the behavior in
the debugger, compared to MSVC). I discovered that there is a
difference between the generated info for DWARF with or without
-dwarf-column-info, so I included that in the test, too. I also made a
couple of minor changes that were suggested.

inglorion planned changes to this revision.Sep 6 2017, 5:34 PM

rnk and I talked about a different approach. The idea is to explicitly emit locations in some cases (e.g. inside compound statements, the braces of for loops, ...), and otherwise emit locations only when emitting column info or emitting non-codeview debug info. That may lead to more elegant code. I'll give it a try later.

inglorion updated this revision to Diff 114292.Sep 7 2017, 5:27 PM

Following conversation with @rnk, I managed to whittle this down to a very small change that seems to do what we need. Step into specific works and single stepping through the program behaves similarly whether compiled with clang-cl or cl.

@zturner: I am still thinking about your comment about other cases to test. My concern is that there are very many possible combinations.

I'm actually not too concerned about not exactly matching cl's behavior in every single case. The difference in behavior here is us emitting a debug location for an expression that doesn't get its own debug location from cl. In general, I think having more fine-grained information is good, so I don't think differing in this way is a problem. That is, unless we end up breaking functionality in the debugger (Visual Studio). The behavior I know of we can end up breaking this way is step into specific, which appears to require multiple calls that are associated with a single debug location.

I think, at a minimum, the test case should cover a scenario where we would normally like to emit some debug locations, but need to elide them if we want to be compatible with Visual Studio. I also think it makes sense to include some cases where we want the behavior to be the same whether or not we're targeting MS compatibility. That way, we can verify that we aren't throwing away too much information. Beyond that, I feel there are diminishing returns. To avoid going too far down that path, I would like to start with a relatively small test case (as I've done), fix the bug that prompted me to write this code, and then add additional tests if we find out there are other cases where people care strongly about the granularity of the debug locations we emit. Does that sound reasonable?

inglorion updated this revision to Diff 114299.Sep 7 2017, 6:01 PM

removed compound statement; that was only in there to double check that the debugger stops on the first child statement, but that's true with or without this change

zturner edited edge metadata.Sep 7 2017, 6:02 PM

@inglorion : Since we're already in this code anyway, should we at least do due diligence and check the basics, even if only manually? As in, at least we should check in MSVC if any of those other types of examples that I came up with present a problem. Otherwise we're going to be back here fixing this again in a few weeks. And it doesn't seem like it would be that hard to whip up some sample code with 10 or 12 cases and just do a sanity check that they all work. If they don't, we can then decide whether it should be fixed in this patch, or a bug can be filed to track it.

inglorion updated this revision to Diff 114463.Sep 8 2017, 4:42 PM

added examples suggested by @zturner, verified step over and step into specific behavior matches MSVC, and added tests for them

inglorion updated this revision to Diff 114465.Sep 8 2017, 5:19 PM

Of course, ApplyDebugLocation is also a perfectly legitimate way to add a debug location to nodes that are not nested inside nodes that already have a location. I updated the diff so that we do end up applying the location in such cases.

rnk accepted this revision.Sep 11 2017, 12:25 PM

lgtm

clang/lib/CodeGen/CodeGenModule.h
517

I'd drop the "nested" part of this name. Statements are also nested, in a sense. A for loop is a statement, and it has child statements. This is really about expression locations vs. statement expressions.

This revision is now accepted and ready to land.Sep 11 2017, 12:25 PM
majnemer added inline comments.
clang/lib/CodeGen/CodeGenModule.h
32

Any reason to do this? I'd just keep getNestedExpressionLocationsEnabled in the .cpp file.

rnk added inline comments.Sep 11 2017, 2:03 PM
clang/lib/CodeGen/CodeGenModule.h
32

+1, I think we can continue to expect that ApplyDebugLocation will be used for expression emission and CodeGenFunction::EmitStopPoint (or whatever we rename it to) will be called for statements in CGStmt.

inglorion updated this revision to Diff 114715.Sep 11 2017, 3:09 PM

renamed get{Nested,}ExpressionLocationsEnabled and moved it into CodeGetModule.cpp

This revision was automatically updated to reflect the committed changes.