This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Supporting all entities declared in lexical scope in LLVM debug info
ClosedPublic

Authored by aaboud on Jan 7 2016, 3:33 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

aaboud updated this revision to Diff 44273.Jan 7 2016, 3:33 PM
aaboud retitled this revision from to [Clang] Supporting all entities declared in lexical scope in LLVM debug info.
aaboud updated this object.
aaboud added a subscriber: cfe-commits.
probinson edited edge metadata.Jan 19 2016, 3:10 PM

I don't feel comfortable enough with Clang to review the code part.
The tests are generally good (one inline comment) but it seems like they could be done all in one test file, which is more efficient in terms of test run time.

test/CodeGenCXX/debug-info-lb-static2.cpp
13 ↗(On Diff #44273)

I think the second check wants to end with scope: [[FuncScope]], (as written I think it's just redefining FuncScope).

aaboud updated this revision to Diff 45400.Jan 20 2016, 8:59 AM
aaboud edited edge metadata.

Merged all 4 Lit tests into one.

dblaikie added inline comments.Jan 20 2016, 3:26 PM
lib/CodeGen/CGDebugInfo.cpp
1479 ↗(On Diff #45400)

Dose this assert not fire if there are two decls in the same lexical scope?

1481 ↗(On Diff #45400)

Should we assert that LexicalBlockStack isn't empty?

1488 ↗(On Diff #45400)

Perhaps we should assert here that if D's scope is a lexical block, that it should have a corresponding entry in the LexicalBlockMap?

1489 ↗(On Diff #45400)

Why does the type need to be added to the retained types list?

1491 ↗(On Diff #45400)

else-after-return (just drop the else and let the following code remain unconditionally)

2066 ↗(On Diff #45400)

Would it be reasonable/possible to do these changes one at a time & demonstrate the different situations in which types were not correctly nested in lexical scopes? Or are they all necessary for some correctness/invariant that must hold?

aaboud marked an inline comment as done.Jan 25 2016, 7:05 AM

Thanks David for the feedback.
I have some answers below, please let me know if they are not sufficient or you are still have concerns.

lib/CodeGen/CGDebugInfo.cpp
1479 ↗(On Diff #45400)

Why two decls would have the same object instance "Decl"?
I am not sure even that you can declare two decls with same name at same lexical-scope, but even if they are defined in the same lexical scope, they will have different line numbers, right?

1481 ↗(On Diff #45400)

It might be empty, if the declaration is in the program scope, right?
So, we do not want to assert, just to check.

1488 ↗(On Diff #45400)

We will get to this function also for D's that are not in lexical scope.
We should not assert, that is also why we have the else handling (which is how we used to get context before).

I can agree with you that the function name is misleading, and I am open to other suggestions.

1489 ↗(On Diff #45400)

Excellent question :)
This was your suggestion few months ago.
The reason is that backend will need to know what types are declared in lexical blocks.
Imported entities are collected in the import entity list, while types are collected in the retained types.

Check line 518 at DwarfDebug.cpp (http://reviews.llvm.org/D15976#51cfb106)

2066 ↗(On Diff #45400)

Clang changes cannot be committed before the LLVM changes, otherwise we would crash in backend.
Regarding splitting Clang changes into several comments it is possible with the following splits:

  1. Static variable (lines 2508-2517)
  2. Types (all the others)

Now types can be split into several comments as well: Records and Typedef
But the infrastructure of recording lexical-scope declarations and getting the lexical-scope context should be committed with the first patch (nevertheless what one it would be, Records or Typedef).

Regarding the test, I used to have 3 (even 4) separate tests, but I was asked to merged them.
Now, if we do commit patches in steps, I suggest that I still end up with one final test, but I will need to modify it with each patch (I will be building it step by step).

Now, what is your call? do you want me to commit the this patch in 3 steps? Or just go with it as one piece?

dblaikie accepted this revision.Jan 29 2016, 1:21 PM
dblaikie edited edge metadata.

Looks good to me. Test case could be simplified a bit further, but feel free to commit after that.

lib/CodeGen/CGDebugInfo.cpp
1479 ↗(On Diff #45400)

Right, I think I'm following now. I had some other ideas in my head that were wrong.

1481 ↗(On Diff #45400)

Yep, it's starting to make more sense to me now...

1489 ↗(On Diff #45400)

Not quite following - could we discover the types lazily as we're generating DWARF in the backend - if we happen to emit a variable of that type & generate the DIE* for the type, etc? (so that if a type becomes unused it's able to be dropped - rather than being preserved only when the type is local)

I suppose that would make it difficult to choose which scopes we had to preserve when emitting debug info in the backend (if we didn't see a type until after the scope was created - we wouldn't've known whether to emit/create that scope or collapse it into its parent, etc) - so this probably isn't possible/practical.

2066 ↗(On Diff #45400)

Right, incremental patches that update the test case (test case could be committed as-is with CHECK-NOTs in places where the expected output didn't match the desired result, or with them removed & the types that are incorrect omitted until the functionality was added)

But I think I'm understanding this mostly well enough for it go in as-is/in one go.

test/CodeGenCXX/debug-info-lb.cpp
7 ↗(On Diff #45400)

Could just make this a struct.

It probably doesn't need any members either. (simply "struct X { }")

14 ↗(On Diff #45400)

This block seems unnecessary, and its contents are probably more complicated than needed. This seems to suffice:

X a;
Y b;

& the global variable is emitted without referencing it anyway.

(you could drop the int return and x parameter, too - no need to return anything from the function, and the if (x) could be removed - the scope can remain & still produce the same debug info I think)

This revision is now accepted and ready to land.Jan 29 2016, 1:21 PM
aaboud marked an inline comment as done.Feb 2 2016, 2:35 PM

Thanks David for the comments.
See answer below.

lib/CodeGen/CGDebugInfo.cpp
1489 ↗(On Diff #45400)

Your analysis is correct.
I just tried to re-check how we can discover types lazily, but I could not figure out how to make sure the lexical scope would be created for these types.

I guess we can revisit this issue later and re-think a better implementation that make sure optimized types will not be emitted in the debug info. However, I think we can do that in a separate (following) patch.

test/CodeGenCXX/debug-info-lb.cpp
14 ↗(On Diff #45400)

OK, I simplify the test, and it is still does what I needed.

The only thing that I need to clarify is the extra block scope that I added to wrap the "a" and "b" variable definition.
The reason I did that is to test the case where the types "X" and "Y" are defined in a different scope than they are used at.
Without handling this case correctly, the LLVM IR could mistakenly associate "X" and "Y" to the scope of the usage and not the definition, i.e. the scope of "a" and "b", to test that I had to have different scope as in the example.
I added a comment in the test explaining that.

aaboud updated this revision to Diff 46708.Feb 2 2016, 2:36 PM
aaboud edited edge metadata.

Reduced the LIT tests and added comment explaining part of it.

Looks good - though you could do that one further simplification to the test case, I think. (removing x/if x)

test/CodeGenCXX/debug-info-lb.cpp
5 ↗(On Diff #46708)

Looks like you could remove this "if (x)" and the "int x" parameter & still keep the test otherwise in tact.

aaboud updated this revision to Diff 47988.Feb 15 2016, 5:55 AM

Simplified LIT tests according to David comment.

Thanks - please commit when ready

This revision was automatically updated to reflect the committed changes.