This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix handling of `DW_AT_decl_file` according to D91014
ClosedPublic

Authored by werat on Feb 16 2021, 6:18 AM.

Details

Summary

Apply changes from https://reviews.llvm.org/D91014 to other places where DWARF entries are being processed.

Diff Detail

Event Timeline

werat created this revision.Feb 16 2021, 6:18 AM
werat requested review of this revision.Feb 16 2021, 6:18 AM
Herald added a project: Restricted Project. · View Herald Transcript
werat added a comment.Feb 16 2021, 6:20 AM

Hi @jankratochvil, can you take a look at this, please?

I can't claim I fully understand what's the difference here, but this aligns with your comment at https://reviews.llvm.org/D92643#inline-900717 :)

jankratochvil accepted this revision.Feb 16 2021, 7:08 AM
jankratochvil added a reviewer: labath.

Thanks for catching those.

This revision is now accepted and ready to land.Feb 16 2021, 7:09 AM

I can't claim I fully understand what's the difference here, but this aligns with your comment at https://reviews.llvm.org/D92643#inline-900717 :)

If interested the problem was DWARFAttributes can contain attributes collected from multiple DIEs (linked by DW_AT_specification or DW_AT_abstract_origin). And with (future) DWZ patchset applied for LLDB such DIEs can come from multiple CUs. Therefore it is not enough to assume each attribute comes from CU of the original DIE.

Without DWZ it is sure not a bug. And DWZ is not yet in LLDB.

In general, I believe such a situation can occur even now, if DW_AT_specification/DW_AT_abstract_origin uses DW_FORM_ref_addr. And I think dsymutil actually produces such references. However, I don't think one can meaningfully use DW_AT_specification/DW_AT_abstract_origin for enumerators. It might be possible for variable DIEs in inlined functions (with the function's abstract origin residing in another CU), but as the only effect of that will be a change in the file the variable was supposedly declared in, I don't think anybody would notice, even if we ran into debug info like this.

I can't claim I fully understand what's the difference here, but this aligns with your comment at https://reviews.llvm.org/D92643#inline-900717 :)

If interested the problem was DWARFAttributes can contain attributes collected from multiple DIEs (linked by DW_AT_specification or DW_AT_abstract_origin). And with (future) DWZ patchset applied for LLDB such DIEs can come from multiple CUs. Therefore it is not enough to assume each attribute comes from CU of the original DIE.

Without DWZ it is sure not a bug. And DWZ is not yet in LLDB.

As @labath mentioned, certainly there's some support in LLDB for cross-CU references, as LLVM produces these when performing LTO.

I expect it'd be good to have a test case showing the sort of DWARF that DWZ produces for cross-CU references of enumerators.

I expect it'd be good to have a test case showing the sort of DWARF that DWZ produces for cross-CU references of enumerators.

There is a testcase in D91014: lldb/test/Shell/SymbolFile/DWARF/DW_AT_decl_file-DW_AT_specification-crosscu.s
There is no testcase in this patch.

I expect it'd be good to have a test case showing the sort of DWARF that DWZ produces for cross-CU references of enumerators.

There is a testcase in D91014: lldb/test/Shell/SymbolFile/DWARF/DW_AT_decl_file-DW_AT_specification-crosscu.s
There is no testcase in this patch.

That tests some other cross-cu references, but if it was testing the changes in this patch, wouldn't the test be failing (& I guess it isn't)?

So seems like more test coverage might be in order? (Either in another test file, or as an addition to the existing one)

That tests some other cross-cu references, but if it was testing the changes in this patch, wouldn't the test be failing (& I guess it isn't)?

It isn't testing the specific cross-CU case of this patch.

So seems like more test coverage might be in order? (Either in another test file, or as an addition to the existing one)

Yes, I will try to extend/write one (or @werat can :-) ).

werat added a comment.EditedFeb 17 2021, 9:05 AM

To be honest, I'm not sure how to reproduce this kind of debug info. I've tried a few examples (e.g. force inline the function from another CU) , but it didn't work.

Should we postpone this patch until someone can figure out the test case then?

To be honest, I'm not sure how to reproduce this kind of debug info. I've tried a few examples (e.g. force inline the function from another CU) , but it didn't work.

Should we postpone this patch until someone can figure out the test case then?

How'd you come across the issue? Presumably it requires some dwz usage to get, but that's OK - if there's a small example using dwz we could probably recreate that with some hand crafted assembly without going to great lengcths.

The D91014 patch has been found by a testsuite regression when running in DWZ mode. This patch has only the same pattern in the code and currently it is unclear whether this change is really reproducible or it is in fact just NFC.
OTOH I would find this change even as a code cleanup as (1) the code has less letters and (2) the previous code may work or not but the new code definitely should work.
I can check whether I can make a testcase for this patch as I agree new testcases are always good (but today it is 11pm here already).

I expect it'd be good to have a test case showing the sort of DWARF that DWZ produces for cross-CU references of enumerators.

Here is a new testcase for the second (DW_TAG_variable) fix: https://people.redhat.com/jkratoch/D96778-test.patch
I cannot update the patch here.

DWZ can produce such case from GCC output (not used for the testcase): g++ -o inlinevar inlinevar{1,2}.C -Wall -g -O2;dwz ./inlinevar
Such case is never produced clang output as the DW_TAG_variable declaration is not merged by clang -flto and DWZ for unknown reason does not merge the clang variant. I did merge it for the testcase from clang output by hand.

I did not try a testcase for the enumerators if possible at all.

I expect it'd be good to have a test case showing the sort of DWARF that DWZ produces for cross-CU references of enumerators.

Here is a new testcase for the second (DW_TAG_variable) fix: https://people.redhat.com/jkratoch/D96778-test.patch
I cannot update the patch here.

Ah, cool - for the final version of this, I'd suggest the author/whomever rolls this into one file with two CUs - bit easier to deal with. You could remove a bunch of extraneous DWARF too - since it's hardcoded DWARF, the subprogram/inlined subroutine/etc could be removed from CU1, leaving only the abstract subprogram and variable, both to be referenced from CU2.

DWZ can produce such case from GCC output (not used for the testcase): g++ -o inlinevar inlinevar{1,2}.C -Wall -g -O2;dwz ./inlinevar
Such case is never produced clang output as the DW_TAG_variable declaration is not merged by clang -flto and DWZ for unknown reason does not merge the clang variant. I did merge it for the testcase from clang output by hand.

I did not try a testcase for the enumerators if possible at all.

Could you provide the source code for this - I wouldn't mind trying it out and seeing what might be different/why DWZ doesn't understand this.

The LTO point you make is an interesting one too... yeah, for non-inlined (but inline linkage) functions LLVM during LTO deduplicates correctly, but this doesn't work out once inlining happens.

rolls this into one file with two CUs - bit easier to deal with.

Then one could not use the .file directives and one would need to code also .debug_line by hand.

You could remove a bunch of extraneous DWARF too - since it's hardcoded DWARF, the subprogram/inlined subroutine/etc could be removed from CU1, leaving only the abstract subprogram and variable, both to be referenced from CU2.

True, I have removed it now.

Could you provide the source code for this - I wouldn't mind trying it out and seeing what might be different/why DWZ doesn't understand this.

dwz will merge it without -flto but not with -flto: http://people.redhat.com/jkratoch/inlinevar.d.tar.xz
Personally I am not interested in DWZ, I am implementing it only as a compatibility with existing file format, IMNSHO DWZ should be dropped.

rolls this into one file with two CUs - bit easier to deal with.

Then one could not use the .file directives and one would need to code also .debug_line by hand.

Ah, right, makes sense.

Could you provide the source code for this - I wouldn't mind trying it out and seeing what might be different/why DWZ doesn't understand this.

dwz will merge it without -flto but not with -flto: http://people.redhat.com/jkratoch/inlinevar.d.tar.xz

Ah, yeah, so I see - thanks for the repro!

If I had to guess, it'd be because with LTO clang emits the abstract origin lexically after the inlined instance of "var", whereas with non-LTO clang emits the abstract origin before the inlined instance.

(& yeah, clang should get this right with lto without the need for dwz - LLVM correctly deduplicates types when merging for LTO, but doesn't do so for already-inlined functions, could be done with the same sort of logic, though)

Personally I am not interested in DWZ, I am implementing it only as a compatibility with existing file format, IMNSHO DWZ should be dropped.

Not sure I follow. dwz does use the existing DWARF file format. you mean you're only interested in compatibility with the DWARF directly produced by compilers, not post-processed by dwz? Fair enough. I guess @werat has some interest in supporting dwz for their use cases.

Personally I am not interested in DWZ, I am implementing it only as a compatibility with existing file format, IMNSHO DWZ should be dropped.

Not sure I follow. dwz does use the existing DWARF file format. you mean you're only interested in compatibility with the DWARF directly produced by compilers, not post-processed by dwz? Fair enough. I guess @werat has some interest in supporting dwz for their use cases.

No, I need compatibility of LLDB with DWZ (or rather LLDB needs it if it should get used on Red Hat systems). DWZ is specified by DWARF-5 but there are various DWARF tools still not supporting DWZ. Also the only DWZ format currently in use is a non-standard GNU extension of DWARF-4. The DWZ format has been designed for GDB making it difficult to parse it by LLDB. LLDB does per-DIE DWARF->IR conversion compared to GDB doing per-CU DWARF->IR conversion, that means that LLDB needs to keep context of the DWZ parent DIE (as DWZ recursively imports DW_TAG_partial_unit) for each its reference of DIE in its various indices which is a PITA. One cannot parse a PU (partial unit) without knowing which CU did import it. -fdebug-types-section brings most of the size savings of DWZ but unfortunately not all of it. I tried to drop DWZ for Fedora in exchange of -fdebug-types-section but I failed, there is too strong push for DWZ. From the mail thread one can see they try hard to find any little benefits of DWZ ignoring any its disadvantages: Fedora change proposal + its fedora-devel mail thread
DWZ format could be fixed for better consumption by intelligent consumers like LLDB but then Red Hat wants compatibility with existing flawed DWZ format being already present in RHEL-7/8/9 anyway. That means I would have to implement two different DWZ file formats LLDB compatibility so I rather implemented only the only existing DWZ format (D96236 D96237 D96238 D96239 D96240 D96241 D96242 D96243) and hopefully Fedora+RHEL will rather drop DWZ in the future.
(Currently I am blocked by some clang assertion for testsuite running in DWZ mode with libc++ - the assertion does not happen with libc++ without DWZ mode nor with libstdc++ in DWZ mode.)

Personally I am not interested in DWZ, I am implementing it only as a compatibility with existing file format, IMNSHO DWZ should be dropped.

Not sure I follow. dwz does use the existing DWARF file format. you mean you're only interested in compatibility with the DWARF directly produced by compilers, not post-processed by dwz? Fair enough. I guess @werat has some interest in supporting dwz for their use cases.

No, I need compatibility of LLDB with DWZ (or rather LLDB needs it if it should get used on Red Hat systems).

Ah, OK. You want RH compatibility. RH uses DWZ. You'd rather they didn't, but working with it, given they still do use it.

DWZ is specified by DWARF-5

What part of DWZ is specified by DWARFv5? My understanding is that DWZ uses existing standard features/didn't need anything in particular standardized in any DWARF version (perhaps the original creation of partial units was motivated by DWZ? But partial units have been around for a few versions - seems like it was introduced in DWARFv3)

but there are various DWARF tools still not supporting DWZ.

DWARF is a very broad (as they like to say, "permissive") standard. Lots of valid DWARF won't work in lots of places, because the full spectrum of interesting ways one could use DWARF features is too broad to be uniformly implemented - generally consumers and produces implement enough to work with each other and that's about it.

(for instance I recently started dabbling with a feature that would use DW_AT_ranges on DW_TAG_subprograms - totally valid DWARF, but unsurprisingly, DWARF consumers that'd never had to deal with DWARF like this before didn't have support for it (lldb, in particular - had partial support, hopefully it's got full support for at least single-length DW_AT_ranges, but I think last I checked there were still issues with multi-element ranges))

Also the only DWZ format currently in use is a non-standard GNU extension of DWARF-4.

Which extensions would those be? At a glance when looking at the small dwz example earlier I didn't notice any DWARF extensions. (though, in any case, we deal with extensions on a pretty regular basis - see all the DWARFv4 + Split DWARF work, for instance (& then the call_site/call_site_parameter stuff, etc))

The DWZ format has been designed for GDB making it difficult to parse it by LLDB. LLDB does per-DIE DWARF->IR conversion compared to GDB doing per-CU DWARF->IR conversion, that means that LLDB needs to keep context of the DWZ parent DIE (as DWZ recursively imports DW_TAG_partial_unit) for each its reference of DIE in its various indices which is a PITA.

(probably best to leave out the 'PITA'/swear/colourful language)

I'm not sure I follow the description here - when generating Clang ASTs from some DIE in LLDB, one would still have to potentially pass through multiple CUs to retrieve all the relevant types, especially under any form of LTO - are there particular features of DWZ that make this more/differently problematic?

jankratochvil added a comment.EditedFeb 20 2021, 2:00 PM

What part of DWZ is specified by DWARFv5?

Only DW_*_sup - being used by DWZ with DWARF-4 as DW_*_GNU_*_alt (and dwarfstd.org proposal having it as DW_*_alt) for DWZ common files located in /usr/lib/debug/.dwz/ where each file is for one rpm package sharing DIEs to multiple *.debug files of the same rpm package.

(perhaps the original creation of partial units was motivated by DWZ?

Rather DWZ was motivated by partial units IMO.

Which extensions would those be?

Those DW_*_alt/DW_*_sup ones.

At a glance when looking at the small dwz example earlier I didn't notice any DWARF extensions.

This example is not using the DWZ common file (that makes sense only for >=2 *.debug files). Still DW_TAG_partial_unit, DW_AT_import etc. is used only by DWZ and AFAIK no other tool. So these features need an extra implementation only for DWZ in all DWARF tools incl. LLDB.

I'm not sure I follow the description here - when generating Clang ASTs from some DIE in LLDB, one would still have to potentially pass through multiple CUs to retrieve all the relevant types, especially under any form of LTO - are there particular features of DWZ that make this more/differently problematic?

If you have a reference to DIE then LLDB tries to ask for DW_AT_language of its CU. But if the DIE is in DW_TAG_partial_unit (PU) there is no DW_AT_language and you do not know which CU did use DW_TAG_imported_unit->DW_AT_import for that PU. Moreover the DW_TAG_partial_unit can be imported by two different CUs one having DW_AT_language as C and the other as C++. That is IMO a bug, each PU should have its DW_AT_language and then PU with different language would be separate.

Still such format change would not help much as DWARFASTParser, CompilerDeclContext and similar stuff needs to come from the parent CU as otherwise LLDB all breaks down when some DIEs (from DW_AT_partial_unit possibly even from DWZ common file) come from a different AST.

What part of DWZ is specified by DWARFv5?

Only DW_*_sup - being used by DWZ with DWARF-4 as DW_*_GNU_*_alt (and dwarfstd.org proposal having it as DW_*_alt) for DWZ common files located in /usr/lib/debug/.dwz/ where each file is for one rpm package sharing DIEs to multiple *.debug files of the same rpm package.

Ah, thanks for the pointers! I didn't know about the cross-file feature of dwz.

rolls this into one file with two CUs - bit easier to deal with.

Then one could not use the .file directives and one would need to code also .debug_line by hand.

Ah, right, makes sense.

Not keeping up with the entire debate, just chiming in to say that sometimes one can good results through conditional compilation (.ifdef and friends) -- put the common stuff (abbreviations or whatever) in an unconditional block, and the different things into conditional blocks...

werat updated this revision to Diff 325415.Feb 22 2021, 4:04 AM

Added the test cases provided by jankratochvil@

This revision was landed with ongoing or failed builds.Mar 1 2021, 7:06 AM
This revision was automatically updated to reflect the committed changes.

@stella.stamenova It failed on a Windows box:
https://lab.llvm.org/buildbot/#/builders/83/builds/4202

$ "c:\buildbot\lldb-x64-windows-ninja\build\bin\clang.exe" "--target=specify-a-target-or-use-a-_host-substitution" "-o" "C:\buildbot\lldb-x64-windows-ninja\build\tools\lldb\test\SymbolFile\DWARF\Output\DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s.tmp" "--target=x86_64-pc-linux" "C:\buildbot\lldb-x64-windows-ninja\llvm-project\lldb\test\Shell\SymbolFile\DWARF\DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s" "C:\buildbot\lldb-x64-windows-ninja\llvm-project\lldb\test\Shell\SymbolFile\DWARF/Inputs/DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu2.s"
# command stderr:
clang: error: unable to execute command: program not executable

Do you think the following change would fix it?

-// RUN: %clang -o %t --target=x86_64-pc-linux %s \
+// RUN: %clang_host -o %t %s \
werat added a comment.Mar 1 2021, 8:49 AM

The test was also executed on lldb-aarch64-ubuntu -- https://lab.llvm.org/buildbot/#/changes/13868
But we have REQUIRES: x86, shouldn't it exclude ARM?

@stella.stamenova It failed on a Windows box:
https://lab.llvm.org/buildbot/#/builders/83/builds/4202

$ "c:\buildbot\lldb-x64-windows-ninja\build\bin\clang.exe" "--target=specify-a-target-or-use-a-_host-substitution" "-o" "C:\buildbot\lldb-x64-windows-ninja\build\tools\lldb\test\SymbolFile\DWARF\Output\DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s.tmp" "--target=x86_64-pc-linux" "C:\buildbot\lldb-x64-windows-ninja\llvm-project\lldb\test\Shell\SymbolFile\DWARF\DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s" "C:\buildbot\lldb-x64-windows-ninja\llvm-project\lldb\test\Shell\SymbolFile\DWARF/Inputs/DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu2.s"
# command stderr:
clang: error: unable to execute command: program not executable

Do you think the following change would fix it?

-// RUN: %clang -o %t --target=x86_64-pc-linux %s \
+// RUN: %clang_host -o %t %s \

Yes, I would expect it to.

The test was also executed on lldb-aarch64-ubuntu -- https://lab.llvm.org/buildbot/#/changes/13868
But we have REQUIRES: x86, shouldn't it exclude ARM?

My understanding it that REQUIRES indicates a target that should be included in the build. This particular buildbot includes the X86 target, so the test ran. What is the actual test requirement?

Considering this broke two bots, you might want to commit a fix or revert soon.

What is the actual test requirement?

It requires x86_64 (it is incompatible with x86_32) and hopefully it is OS-agnostic (after fixing the %clang_host).

What is the actual test requirement?

It requires x86_64 (it is incompatible with x86_32) and hopefully it is OS-agnostic (after fixing the %clang_host).

It sounds like the statement should be either:

REQUIRES: x86_64-linux

or:

REQUIRES: target-x86_64

What is the actual test requirement?

It requires x86_64 (it is incompatible with x86_32) and hopefully it is OS-agnostic (after fixing the %clang_host).

This assembly is definitely not os-agnostic. I tried writing one once -- after writing a bunch of macros to abstract the differences (different labels, different ways to do section cross-references, ...) I eventually gave up. I think it should run on elf-based systems (linux + BSD variants). I believe the only way to express that now is to write UNSUPPORTED: system-darwin, system-windows, though it may be interesting to add an system-elf feature or something.