This is an archive of the discontinued LLVM Phabricator instance.

Be lazier about loading .dwo files
ClosedPublic

Authored by Eric on Apr 12 2021, 4:22 AM.

Details

Summary

This change makes sure that DwarfUnit does not load a .dwo file until necessary. I also take advantage of DWARF 5's guarantee that the first support file is also the primary file to make it possible to create a compile unit without loading the .dwo file.

Diff Detail

Event Timeline

Eric created this revision.Apr 12 2021, 4:22 AM
Eric requested review of this revision.Apr 12 2021, 4:22 AM
Eric updated this revision to Diff 336820.Apr 12 2021, 6:43 AM
Eric updated this revision to Diff 336829.Apr 12 2021, 8:35 AM
Eric updated this revision to Diff 337090.Apr 13 2021, 3:41 AM

Removing header files to make clang-tidy happy

Eric updated this revision to Diff 337161.Apr 13 2021, 8:33 AM

Not sure what's going on with clang-tidy, but those headers still exist and in any case I didn't add them.

Change is ready for review.

Eric updated this revision to Diff 337165.Apr 13 2021, 8:35 AM
Eric added a comment.Apr 20 2021, 8:38 AM

Let me know if I should request this review from someone else. It is important for scalability, as this change, in combination with my follow-up change (https://reviews.llvm.org/D100771) eliminate the need to load all .dwo files in the most common debugging scenarios.

Thanks,

  • Eric
Eric added a reviewer: kimanh.Apr 22 2021, 3:26 AM
Eric added a comment.May 17 2021, 9:37 AM

Can I get a review on this please?

Eric added a comment.Jul 6 2021, 2:55 AM

Ping on this

yeah, might need to track down folks who have some context here (I'm not sure who added the initial support for Split DWARF to lldb, for instance - but if they're still around/involved in the project, they'd probably be a reasonable reviewer) - maybe reaching out to them via email off-list (sometimes peoples email filters mean reviews/list mail can be hard to spot) or via chat mediums, etc.

Jan or Tamas, can either of you take a look?

I do not see any functionality flaw in this patch, thanks. I just wrote down many coding style improvements.

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
74

Despite LLDB contains this style I got confirmed it is preferred to reduce the indentation.

353

No longer needed.

364–365
699

GetIsOptimized should now call GetLazyIsOptimized.

700

Just to reduce the indentation.

702

Just reduce the indentation.

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
327
343

(I am not a native English speaker but) could not it use some more obvious name such as s/ensured/loaded/ or s/ensured/searched/?

344
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
729

But that is not a topic of this patch so not required, it would need new rvalue SetSupportFiles implementation.

733

The two blocks of code:

+          if (ParseSupportFiles(dwarf_cu, module_sp, support_files) &&
+               support_files.GetSize() > 0) { ... }
+         if (need_non_skeleton) { ... }

could be put into a function (or lambda).

1010

I do not see how this is related to this patch. Isn't it a separate bugfix? I haven't tried it on OSX and this function is Apple-specific. I understand it is probably correct+needed but it should be at least moved to a different patch/review.

This whole optimization should also have a testcase, it should be easy (tried it using log enable lldb object for a file built with -gdwarf-5 -gsplit-dwarf -gpubnames).

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1010

I do not think it is needed here because it gets called by SymbolFileDWARF::ParseCompileUnit:

-> 745 	            bool is_optimized = dwarf_cu.GetNonSkeletonUnit().GetIsOptimized();

It also works fine for a file built with: clang -glldb -gsplit-dwarf -O3
For such change there should be a testcase.
Maybe there could rather be:

lldbassert(!m_dwo);
Eric updated this revision to Diff 358643.Jul 14 2021, 9:30 AM
Eric marked 10 inline comments as done.
Eric added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
699

Done. I believe this preserves existing behavior regarding changes to m_is_optimized, though I'm not sure if that's desired.

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
343

Hope this is clearer!

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
729

It's a good idea, though -- copying the SupportFiles involves revalidating all the paths.

733

Why would this help?

1010

With my change, we may call GetLazyIsOptimized(), which may result in creating a CompileUnit where is_optimized is eLazyBoolCalculate, resulting in it being parsed on demand. Previously we always eagerly evaluated is_optimized when constructing the CompileUnit, meaning that this function was effectively dead code and also incorrect as far as I could tell.

This whole optimization should also have a testcase, it should be easy (tried it using log enable lldb object for a file built with -gdwarf-5 -gsplit-dwarf -gpubnames).

I haven't found you would write one. Wrote one as split-lazy-load.s: https://people.redhat.com/jkratoch/D100299-tests.patch

The testcases would be nice to review, maybe @dblaikie?

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
695

clang-format is right, please use it: clang/tools/clang-format/git-clang-format

699

GetIsOptimized is in SB API so the behavior should be preserved.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
733

Less code duplication? Less lines of code? You don't like such change (the diff is after git-clang-format)?

1010

OK, I get it now. Thanks for the explanation, I made a test mistake before myself.
But that is definitely worth a testcase. Created one as split-optimized.s: https://people.redhat.com/jkratoch/D100299-tests.patch

lldb/source/Symbol/CompileUnit.cpp
178–179 ↗(On Diff #358643)

A named rvalue reference is an lvalue. It would have no effect this way.
It should be a separate [nfc] patch.

jankratochvil requested changes to this revision.Jul 16 2021, 12:30 PM
This revision now requires changes to proceed.Jul 16 2021, 12:30 PM

This whole optimization should also have a testcase, it should be easy (tried it using log enable lldb object for a file built with -gdwarf-5 -gsplit-dwarf -gpubnames).

I haven't found you would write one. Wrote one as split-lazy-load.s: https://people.redhat.com/jkratoch/D100299-tests.patch

The testcases would be nice to review, maybe @dblaikie?

I'm not especilaly familiar with lldb testing, but happy to take a look - any particular aspects you'd like me to consider?

I'm not especilaly familiar with lldb testing, but happy to take a look - any particular aspects you'd like me to consider?

  • split-lazy-load.test is arch-independent but when I use %clang I have to use some specific --target:
error: unknown target triple 'specify-a-target-or-use-a-_host-substitution', please use -triple or -arch

So I used Linux-X86 but that is sure not perfect.

  • For split-optimized.s I would not need the assembly as one can build such file just with clang -gdwarf-5 -glldb -gsplit-dwarf=single -O3. But then I am concerned the clang output will change in the future possibly no longer really testing what it should (such as future clang may include DW_AT_APPLE_optimized even into the skeleton for example). So whether the is an agreement an assembly snapshot is appropriate in such case.

I'm not especilaly familiar with lldb testing, but happy to take a look - any particular aspects you'd like me to consider?

  • split-lazy-load.test is arch-independent but when I use %clang I have to use some specific --target:
error: unknown target triple 'specify-a-target-or-use-a-_host-substitution', please use -triple or -arch

How are other lldb tests written in this regard, and especially other tests for Split DWARF? I'm not sure how portable Split DWARF emission is with different object formats (like COFF and MachO) - so it may be that the test isn't portable? (checking how other tests for Split DWARF work might give some sense of how to write them to be as portable as is appropriate, etc)

So I used Linux-X86 but that is sure not perfect.

  • For split-optimized.s I would not need the assembly as one can build such file just with clang -gdwarf-5 -glldb -gsplit-dwarf=single -O3. But then I am concerned the clang output will change in the future possibly no longer really testing what it should (such as future clang may include DW_AT_APPLE_optimized even into the skeleton for example). So whether the is an agreement an assembly snapshot is appropriate in such case.

Since the lldb tests sort of act as a convenient place for end-to-end debugging testing, I think (but I don't know, I'm not an lldb developer) the preference is towards source based tests - if there's properties that could throw things off significantly, you could add some validation that the properties are interesting. Such as using llvm-dwarfdump to confirm that DW_AT_APPLE_optimized is not present on the skeleton. (though, side question: if it'd be reasonable/useful to have that on the skeleton & that'd help address some of the issues this review is trying to address, we could do that? It wouldn't be expensive to put that on the skeleton if it really helps).

How are other lldb tests written in this regard, and especially other tests for Split DWARF?

Sorry,there is %clangxx_host, I even remember it now.

Since the lldb tests sort of act as a convenient place for end-to-end debugging testing, I think (but I don't know, I'm not an lldb developer) the preference is towards source based tests - if there's properties that could throw things off significantly, you could add some validation that the properties are interesting. Such as using llvm-dwarfdump to confirm that DW_AT_APPLE_optimized is not present on the skeleton.

OK, done in: D106194

(though, side question: if it'd be reasonable/useful to have that on the skeleton & that'd help address some of the issues this review is trying to address, we could do that? It wouldn't be expensive to put that on the skeleton if it really helps).

That would violate DWARF-5 3.1.2 Skeleton Compilation Unit Entries (page 67 line 30). I do not think it is needed just that it could happen accidentally or not disabling this testcase. But it is protected now by that llvm-dwarfdump you suggested.

How are other lldb tests written in this regard, and especially other tests for Split DWARF?

Sorry,there is %clangxx_host, I even remember it now.

Since the lldb tests sort of act as a convenient place for end-to-end debugging testing, I think (but I don't know, I'm not an lldb developer) the preference is towards source based tests - if there's properties that could throw things off significantly, you could add some validation that the properties are interesting. Such as using llvm-dwarfdump to confirm that DW_AT_APPLE_optimized is not present on the skeleton.

OK, done in: D106194

Looks plausible, with the previous caveat that I'm not especially familiar with lldb testing.

(though, side question: if it'd be reasonable/useful to have that on the skeleton & that'd help address some of the issues this review is trying to address, we could do that? It wouldn't be expensive to put that on the skeleton if it really helps).

That would violate DWARF-5 3.1.2 Skeleton Compilation Unit Entries (page 67 line 30). I do not think it is needed just that it could happen accidentally or not disabling this testcase. But it is protected now by that llvm-dwarfdump you suggested.

Yeah... but if there's good reason to put it there I wouldn't mind violating the spec/filing a DWARF feature request, etc.

What's the optimized property used for? This patch currently looks like it "lies" if the query is made too early (before something else has forced the DWO to be loaded), right? That seems a bit concerning to me (not giving a consistent view of the property - subtly changing the answer to the query based on other dwarf loading (that's like when gdb has index issues - and will fail to lookup a name until you happen to do something related to teh CU that name is in, then the lookup works... ). Should the query instead return something else indicating that the answer isn't known? (or assert that the query shouldn't be made until the DWO has been loaded?)

What's the optimized property used for?

Just to print a warning.

This patch currently looks like it "lies" if the query is made too early (before something else has forced the DWO to be loaded), right?

Not in practice as the DWARFUnit::GetIsOptimized() will be called only for a function in CompileUnit which means GetNonSkeletonUnit() had to be loaded already. For the case there happens some other new caller of DWARFUnit::GetIsOptimized() it could be safer to add to DWARFUnit::GetIsOptimized() :

lldbassert(!m_dwo);
lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
101

We know that m_addr_base is not unset therefore use the address (I have also suggested to preserve the original SetAddrBase prototype).

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
160

Setting empty addr_base is not useful, no need to complicate it.

What's the optimized property used for?

Just to print a warning.

This patch currently looks like it "lies" if the query is made too early (before something else has forced the DWO to be loaded), right?

Not in practice as the DWARFUnit::GetIsOptimized() will be called only for a function in CompileUnit which means GetNonSkeletonUnit() had to be loaded already. For the case there happens some other new caller of DWARFUnit::GetIsOptimized() it could be safer to add to DWARFUnit::GetIsOptimized() :

lldbassert(!m_dwo);

Not sure I follow that second sentence - but if the first sentence is true, then could GetIsOptimized be changed so that this test (GetLazyIsOptimized() == eLazyBoolCalculate && GetUnitDIEPtrOnly()) is asserted (well, the inverse? Not sure) rather than conditional? (ie: There's an expectation that the value is always calculated by the time this is called, so if it's called without it being calculated, that means there's a bug in lldb, right?)

could GetIsOptimized be changed so that this test (GetLazyIsOptimized() == eLazyBoolCalculate && GetUnitDIEPtrOnly()) is asserted (well, the inverse? Not sure) rather than conditional?

Not really as GetIsOptimized call with (GetLazyIsOptimized() == eLazyBoolCalculate && GetUnitDIEPtrOnly()) happens in the case of

  • either non-split DWARF with missing DW_AT_APPLE_optimized
  • or DWO parsing with missing DW_AT_APPLE_optimized

GetLazyIsOptimized() has to return eLazyBoolCalculate so that it can be called from SymbolFileDWARF::ParseCompileUnit also for skeleton parsing where it needs to delay the decision until DWO is loaded.

(ie: There's an expectation that the value is always calculated by the time this is called, so if it's called without it being calculated, that means there's a bug in lldb, right?)

There are two reasons why it may not be possible to calculate it:

  • We are parsing skeleton but DW_AT_APPLE_optimized is in DWO which is not loaded yet.
  • DW_AT_APPLE_optimized is missing (either in the non-split unit or in DWO itself).
  • (ignoring here a third reason of corrupted unparseable DWARF)

In the former case we have to defer the decision until DWO is loaded. In the latter case we have to recognize eLazyBoolCalculate as it is eLazyBoolNo (the default assumption for the case missing DW_AT_APPLE_optimized).

For the case there happens some other new caller of DWARFUnit::GetIsOptimized() it could be safer to add to DWARFUnit::GetIsOptimized() :

lldbassert(!m_dwo);

Not sure I follow that second sentence

I was proposing to make it safe against future changes of LLDB where someone could start to call DWARFUnit::GetIsOptimized() for a skeleton DWARFUnit before its CompileUnit gets created (and therefore its DWO would still not be loaded).

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
695–696
699

I'm confused. I thought in https://reviews.llvm.org/D100299#2885712 it was said that GetIsOptimized would not be called until the value had been loaded (if present in the DWARF). But then in the last comment it suggests it may be?

But also I'm confused why if the attribute isn't present in DWARF that the query would observe the "uninitialized/calculate" state - shouldn't the attribute not being present, once the attribute has been looked for, result in a known value rather than the uninitialized state? Or do you mean there's really 4 states: optimized, unoptimized, attribute not present, attribute not checked for yet/don't know if the attribute is present?

Perhaps we could use those 4 states then, and assert that the value isn't in the 4th state whenever GetIsOptimized is called? That way we can be sure that the unloaded state is never observed?

Leaving the rewrite of the optimization flag getter up to @Eric, I could do it otherwise.

Eric added a comment.Jul 21 2021, 10:46 PM

Thanks for your feedback. I was off for a long weekend but will finish fixing up this change tomorrow.

On the question of the property accessor, the difference between the two is that GetIsOptimized is always called on the non-skeleton unit, so it gives a final answer, whereas GetLazyIsOptimized is called by code intended not to trigger loading the non-skeleton unit, so it may not have the answer, and then the property accessor on the CompileUnit will trigger the code path to get it from the non-skeleton unit.

A side effect of this change as currently written is that if some compiler doesn't follow the spec and puts the DW_AT_APPLE_optimized tag (or the language for that matter) in the skeleton unit, not only will we use it but it will override the non-skeleton unit. I wanted to provide this information eagerly when possible but now I'm thinking it would be better to not do this and instead leave the accessors on DwarfUnit unchanged and initialize the compile unit without these, only computing them on demand.

The downside of making the language computed on demand is that the CompileUnit::Dump() method does not trigger on-demand computation and so lldb will actually print language = "unknown" if you run an image lookup command. I discovered early that switching these to be always computed on demand triggered test failures for this reason, so I made the change to get both of these properties eagerly when possible. However this is entirely unnecessary for the optimized flag as Dump doesn't print that. The remaining question is what we want the behavior to be for the language. I'm only changing DWARF5 behavior, and no tests currently do a dump on dwarf 5 symbols. The question is should I always compute on demand, resulting in the language being unknown until the user does something that causes a call to CompileUnit::GetLanguage() (like evaluating an expression), or do I try to make it eager when possible as in the current iteration of this change?

pfaffe added a subscriber: pfaffe.Jul 22 2021, 7:10 AM

Sorry, I'm really not following well - this might just be too far out of my depth. I'm trying to distill my general concern to make it comprehensible/clear...

If this patch introduces a case where doing some relatively unrelated debugging activity (such as stepping into a function, etc) to cause some other operation to behave differently (going from language "unknown" to the correct language - to go from missing the "this thing might've been optimized, so will be harder to debug" to having that message) that would be a bad thing from a usability perspective. Whatever operations need to be done non-lazily to ensure that answers are consistent is probably worth doing.

(at the code level: a function that returns the same value for "this hasn't been loaded yet" as "this has been loaded, but the value wasn't present in the loaded data" seems really likely to cause this ^ problem - an early call gets the first answer, thinks it means "the value wasn't present" (& renders that to the user) and then later something causes the data to be loaded and it turns otu the value was present and now the user gets a different message when executing the same operation)

Eric updated this revision to Diff 361230.Jul 23 2021, 8:33 AM
Eric marked 6 inline comments as done.
Eric added a comment.Jul 23 2021, 8:39 AM

Updated and hopefully a little simpler now.

The optimized flag is always parsed on demand so there's no concern there and no need for the previous complication. Any call to CompileUnit::GetLanguage() will parse this on demand, but calls to ::Dump() cannot do this as it is a constant method. In my own testing I find that simply hitting a breakpoint is sufficient to get the language to be evaluated, so I don't think users will be seeing much language = "unknown" in practice.

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
695–696

Change undone

lldb/source/Symbol/CompileUnit.cpp
178–179 ↗(On Diff #358643)

Removing, to be added in separate patch

Updated and hopefully a little simpler now.

The optimized flag is always parsed on demand so there's no concern there and no need for the previous complication. Any call to CompileUnit::GetLanguage() will parse this on demand, but calls to ::Dump() cannot do this as it is a constant method.

So dump always passes a the skeleton unit to "GetLanguage" or it already makes a dynamic choice based on whether or not the dwo has already been loaded?

I'd still be a bit concerned if the way "language" is rendered is the dump is indistinguishable between "dwo hasn't been loaded" and "there is no language specified in the full (split or non-split) unit" - while dump probably isn't the most important feature for end users, it may be important for lldb developers - if they get confused/mislead that the "language" is not present/can't be parsed, rather than the dwo hasn't been loaded, that might result in some frustration/waste of time. It'd be good/important to render those different states differently.

In my own testing I find that simply hitting a breakpoint is sufficient to get the language to be evaluated, so I don't think users will be seeing much language = "unknown" in practice.

So dump always passes a the skeleton unit to "GetLanguage" or it already makes a dynamic choice based on whether or not the dwo has already been loaded?

GetLanguage has now a bug for -gsplit-dwarf as it will return eLanguageTypeUnknown if called on the skeleton - which could affect ManualDWARFIndex::IndexUnitImpl but then it is using LanguageType cu_language only for cu_language == eLanguageTypeObjC. Which is needed only on OSX and OSX does not have -gsplit-dwarf. GetLanguage should IMO force loading of the DWO.

I'd still be a bit concerned if the way "language" is rendered is the dump is indistinguishable between "dwo hasn't been loaded" and "there is no language specified in the full (split or non-split) unit"

Do you talk about CompileUnit::Dump? CompileUnit always has DWO already loaded.

(lldb) target modules dump symfile
...
Compile units:
0x9ef030: CompileUnit{0x00000000}, language = "c99", file = '/home/jkratoch/t/main.c'
                                               ^^^

There is also DWARFCompileUnit::Dump but that does not print the language.

lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
6 ↗(On Diff #361230)

Why have you removed %clangxx_host I was using in D106194? (I could use probably %clang_host instead.)
If it is x86_64-linux dependent then you need also:

// REQUIRES: target-x86_64, system-linux, native

But the x86 dependency is hopefully not needed.

lldb/test/Shell/SymbolFile/DWARF/x86/split-optimized.cpp
1 ↗(On Diff #361230)

%clang_host?

So dump always passes a the skeleton unit to "GetLanguage" or it already makes a dynamic choice based on whether or not the dwo has already been loaded?

GetLanguage has now a bug for -gsplit-dwarf as it will return eLanguageTypeUnknown if called on the skeleton - which could affect ManualDWARFIndex::IndexUnitImpl but then it is using LanguageType cu_language only for cu_language == eLanguageTypeObjC. Which is needed only on OSX and OSX does not have -gsplit-dwarf. GetLanguage should IMO force loading of the DWO.

Ah - yeah, I could see that particular thing either way (it gives a correct local answer, but probably not the answer someone wants). Yeah - maybe just assert(false) on getLanguage on a skeleton unit (or it should always load the DWO and delegate to it, as you say) - I'd be happy with either way.

I'd still be a bit concerned if the way "language" is rendered is the dump is indistinguishable between "dwo hasn't been loaded" and "there is no language specified in the full (split or non-split) unit"

Do you talk about CompileUnit::Dump? CompileUnit always has DWO already loaded.

(lldb) target modules dump symfile
...
Compile units:
0x9ef030: CompileUnit{0x00000000}, language = "c99", file = '/home/jkratoch/t/main.c'
                                               ^^^

There is also DWARFCompileUnit::Dump but that does not print the language.

I'm not sure which thing was being discussed. It was mentioned here: https://reviews.llvm.org/D100299#2895555 - so maybe that was confused/incorrect (or was correct at that point in the patch) and now it's addressed and as you say, loads the DWO and prints the language name correctly. That's fine by me.

Eric added a comment.Jul 26 2021, 9:09 AM

So dump always passes a the skeleton unit to "GetLanguage" or it already makes a dynamic choice based on whether or not the dwo has already been loaded?

No, Dump always returns a cached value either provided to the CompileUnit constructor or calculated on a previous call to GetLanguage(). It will not trigger calculation even if the necessary information has already been loaded because then it would have to update the cached copy which is a non-const operation.

I'd still be a bit concerned if the way "language" is rendered is the dump is indistinguishable between "dwo hasn't been loaded" and "there is no language specified in the full (split or non-split) unit" - while dump probably isn't the most important feature for end users, it may be important for lldb developers - if they get confused/mislead that the "language" is not present/can't be parsed, rather than the dwo hasn't been loaded, that might result in some frustration/waste of time. It'd be good/important to render those different states differently.

I'm testing a fix for this; will upload shortly.

jankratochvil added inline comments.Jul 26 2021, 9:19 AM
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
702–703

This will not work for the need_non_skeleton case for DWO as one has to use dwarf_cu.GetNonSkeletonUnit().GetDWARFLanguageType() in such case.
Original code was using: dwarf_cu.GetNonSkeletonUnit().GetUnitDIEOnly().GetAttributeValueAsUnsigned(DW_AT_language, 0)

Eric added a comment.Jul 26 2021, 9:19 AM

GetLanguage has now a bug for -gsplit-dwarf as it will return eLanguageTypeUnknown if called on the skeleton - which could affect ManualDWARFIndex::IndexUnitImpl but then it is using LanguageType cu_language only for cu_language == eLanguageTypeObjC. Which is needed only on OSX and OSX does not have -gsplit-dwarf. GetLanguage should IMO force loading of the DWO.

The behavior of GetLanguage should be unchanged. Could you explain what you think is different? Also what do you mean by "if called on the skeleton" -- this is a method on the CompileUnit, which can't be called on just the skeleton unit or just the non-skeleton unit.

Do you talk about CompileUnit::Dump? CompileUnit always has DWO already loaded.

This was true prior to this change but is no longer the case with this change.

lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
6 ↗(On Diff #361230)

I'll try changing it back, but I'm not sure what the difference is? Other tests in DWARF/x86 seem to just use %clang and I'm not clear on why this test would need to be different?

jankratochvil added inline comments.Jul 26 2021, 10:03 AM
lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
6 ↗(On Diff #361230)
  • Those tests in DWARF/x86 do not do run.
  • Those tests in DWARF/x86 are impossible to be used on different arch than x86_64 (as they are typically coded in x86_64 asm). Tests from this D100299 can run on any arch. When you develop LLDB on non-x86_64 arch (yes, some people incl. me do that) you want to have as rich testsuite as possible to catch regressions easier.
Eric updated this revision to Diff 361765.Jul 26 2021, 12:47 PM

Now distinguishing between unknown language and not loaded.

Could you split that <not loaded> change to an extra patch, please?

Eric added a comment.Jul 26 2021, 12:55 PM

The "<not loaded>" message would never be seen without the changes in this patch, so I don't think it makes sense to do independently.

kimanh added inline comments.Jul 27 2021, 8:02 AM
lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
54

nit: ExtractUnitDIEIfNeeded -> ExtractUnitDIENoDwoIfNeeded

I tried this patch showing the language in more cases when DWO is already in memory: https://people.redhat.com/jkratoch/dwouseifloaded.patch
It would need some cleanup. But I find now questionable whether it is worth it as it affects only the debug dumps.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
702–703

Variables are lowercase in LLDB, this is not LLVM.

727

This should be done inside InitializeCU - that is it needs to apply also for the optimized case.

Eric updated this revision to Diff 362121.Jul 27 2021, 11:45 AM
Eric marked 2 inline comments as done.
Eric added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
702–703

Okay. I figured function style made more sense but I can see either way.

727

ParseSupportFiles takes care of path remapping already. (See line 244)

I am fine with the patch except for the nitpicks.
@dblaikie not sure if you like more availability of the language? https://people.redhat.com/jkratoch/dwouseifloaded.patch

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
727

OK, thanks. There could be a comment for it as I find it far from obvious.

lldb/source/Symbol/CompileUnit.cpp
104 ↗(On Diff #362121)

Missing newline after closing }.

lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
1–2 ↗(On Diff #362121)

This description was copy-pasted and not updated.

4 ↗(On Diff #362121)

Due to the %clangxx_host this test is now arch-independent. Also the lld should not be needed. So this whole line can be deleted.

6 ↗(On Diff #362121)

Is there a reason this testcase is .cpp and not .c and it is using %clangxx_host and not %clang?
One should remove -target x86_64-pc-linux as the test is now arch-independent.

10 ↗(On Diff #362121)

You can use %clangxx_host %t1.o %t2.o -o %t then you need no lld dependency as lld is not available on some archs.
And it is generally dangerous/incompatible to run the linker directly, one should always use the driver (clang/clang++) even just for linking. It matters for this testcase compared to other testcases as it is using run.

lldb/test/Shell/SymbolFile/DWARF/x86/split-optimized.cpp
1 ↗(On Diff #362121)

When there is the _host variant there should no longer be the -target x86_64-pc-linux as the test is arch-independent now.
Also the first line should be a description of the test.

Eric updated this revision to Diff 362201.Jul 27 2021, 3:15 PM
Eric marked 4 inline comments as done.
Eric updated this revision to Diff 362202.Jul 27 2021, 3:21 PM
Eric marked an inline comment as done.
Eric updated this revision to Diff 362516.Jul 28 2021, 2:17 PM

Now eagerly evaluates the language if there is no dwo. Lazy evaluation only happens for split dwarf case.

Eric added a comment.Jul 28 2021, 2:34 PM

Jan, I don't think there's any point in checking for the case when the dwo is already loaded as that wouldn't be triggered if the CompileUnit hasn't already been created. However I added a check for the case where there is no dwo because it wasn't compiled with -gsplit-dwarf. Note that the debug-types-address-ranges.s test case no longer requires modification in this version. Let me know if you have any further nitpicks for SymbolFileDWARF::ParseCompileUnit. Nothing else required changes since your last review. I'm also happy to bring back the previous version with your nitpicks fixed if you'd prefer.

I am fine with the patch except for the nitpicks.
@dblaikie not sure if you like more availability of the language? https://people.redhat.com/jkratoch/dwouseifloaded.patch

Mostly I'm just concerned that there isn't a "trap" left where someone might call the API, get an answer that looks reasonable, but it is wrong or out of date.

so I guess order of concern:

If the same call can produce different results depending on whether some other operation has caused more parsing to happen - that's probably a problematic API that should be avoided.
-> If it can't be avoided, then it'd be good if that API returns a "indeterminate" Result to distinguish it from "the parsing happened and nothing was found"/"the parsing happened and the thing was found".

If the API is available on the split and full unit, and it always returns the same value in each case (ie: always returns the parsed result in the split unit, and it always returns "we looked and found nothing" on the skeleton unit) - that's /probably/ OK, at least it doesn't have the temporal inconsistency risk of the previous point - but maybe if the attempt to call on the skeleton unit were invalid/assert-failed, that might be better, since it's sort of not expected/meaningful to query the skeleton unit for the language.

Does that make sense?

(but this is all pretty out of my wheelhouse, so I'm happy to leave the details to you folks - if my concerns are understood, you can weigh/factor them in as you see fit)

Eric added a comment.Jul 28 2021, 5:37 PM

The only changed behavior is CompileUnit::Dump(), which I've changed to distinguish between an uncalculated languange ("<not loaded>") and an unspecified one ("unknown"). No other API should behave differently, as GetLanguage() will trigger parsing of the non-skeleton unit, and no API would be parsing language information out of Dump() output. With my latest update there is also no effect if not using -gsplit-dwarf, and this change substantially benefits those using -gsplit-dwarf. In practice I don't see a compelling user story where this would be a regression.

The only changed behavior is CompileUnit::Dump(), which I've changed to distinguish between an uncalculated languange ("<not loaded>") and an unspecified one ("unknown"). No other API should behave differently, as GetLanguage() will trigger parsing of the non-skeleton unit, and no API would be parsing language information out of Dump() output. With my latest update there is also no effect if not using -gsplit-dwarf, and this change substantially benefits those using -gsplit-dwarf. In practice I don't see a compelling user story where this would be a regression.

That sounds all good to me - so consider my concerns addressed. (I'll leave it to folks more familiar with lldb for the final review/approval)

Eric added a comment.Jul 29 2021, 9:17 AM

Jan, let me know if there's anything else I should change. Also, I believe I will need someone else to actually submit this for me.

lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
4 ↗(On Diff #362121)

Should I also move these tests up a folder since they are no longer x86 specific?

jankratochvil added a comment.EditedJul 29 2021, 11:11 AM

Mostly I'm just concerned that there isn't a "trap" left where someone might call the API, get an answer that looks reasonable, but it is wrong or out of date.

With the <not loaded> it should not happen anywhere.

but maybe if the attempt to call on the skeleton unit were invalid/assert-failed, that might be better, since it's sort of not expected/meaningful to query the skeleton unit for the language.

I agree but that would be (another) hijack of this patch, than can be implemented separately/orthogonally.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
696

IIUC you do not check GetDebugMapSymfile() as I did in my https://people.redhat.com/jkratoch/dwouseifloaded.patch (and which I did copy from SymbolFileDWARF::GetDwoSymbolFileForCompileUnit) as it is already checked here? I hope dwarf_cu.GetOffset() == 0 is always satisfied on OSX. Unfortunately I do not know much OSX and the testsuite recently fails a lot on my OSX.

744

Empty line before the comment.

lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.c
10 ↗(On Diff #362516)

As there could be otherwise a false PASS.

21 ↗(On Diff #362516)

As there could be otherwise a false PASS.
But then it needs also settings set stop-line-count-before 0 otherwise it is a false FAIL.

jankratochvil accepted this revision.Jul 29 2021, 11:12 AM
jankratochvil added inline comments.
lldb/test/Shell/SymbolFile/DWARF/x86/dwarf5-lazy-dwo.cpp
4 ↗(On Diff #362121)

Yes, you are right, it should not be in x86/.

This revision is now accepted and ready to land.Jul 29 2021, 11:12 AM
Eric updated this revision to Diff 362876.Jul 29 2021, 1:57 PM
Eric marked 4 inline comments as done.
Eric added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
696

I'm not sure about that. Added the check.

This revision was landed with ongoing or failed builds.Jul 30 2021, 4:35 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2021, 4:35 AM

I have reverted it as it takes some time to build on arm32 to investigate it.

Eric added a comment.Jul 30 2021, 11:37 AM

Would it make sense to turn the split-optimized test back into an x86 only test, or just leave it out of the change as it's not actually testing a code path that this changed?

Would it make sense to turn the split-optimized test back into an x86 only test, or just leave it out of the change as it's not actually testing a code path that this changed?

The question is why it fails. I have no idea, do you? Unfortunately I haven't found a ready to use arm32 box, I even bricked armv7-test01.fedorainfracloud.org for it. I have started a local arm32 VM and I will try to build LLDB there over weekend and we can decide afterwards. Sometimes the exotic arches surprisingly discover an arch-unspecific bug (I do not think it is this case but who knows).

Eric added a comment.Jul 30 2021, 1:11 PM

Is arm hardware necessary to test this, or can the test be modified to cross-compile to arm to see what is going on? Is there a way to determine what build target the test bot is using?

In any case the broken test doesn't exercise lldb at all so it could be separated from the patch.

Is arm hardware necessary to test this,

In this case it is not as it does not require linking. Usually I find easier to run it natively than to setup all the cross-compilation libraries and include files. I agree it was my mistake as the cross-compilation is easier in this case.

or can the test be modified to cross-compile to arm to see what is going on?

In fact it is visible already from the build log and it is then obvious from source code. I did not see it first from the default shortened dump.

Is there a way to determine what build target the test bot is using?

Yes, armv8l-unknown-linux-gnueabihf.

In any case the broken test doesn't exercise lldb at all so it could be separated from the patch.

If the test did not exercise LLDB it should have been removed. But it does exercise LLDB.

I have added to lldb/test/Shell/SymbolFile/DWARF/split-optimized.c:

+// ObjectFileELF::ApplyRelocations does not implement arm32.
+// XFAIL: target-arm && linux-gnu

Hopefully it will satisfy the buildbot now.

Is arm hardware necessary to test this,

In this case it is not as it does not require linking. Usually I find easier to run it natively than to setup all the cross-compilation libraries and include files. I agree it was my mistake as the cross-compilation is easier in this case.

or can the test be modified to cross-compile to arm to see what is going on?

In fact it is visible already from the build log and it is then obvious from source code. I did not see it first from the default shortened dump.

Is there a way to determine what build target the test bot is using?

Yes, armv8l-unknown-linux-gnueabihf.

In any case the broken test doesn't exercise lldb at all so it could be separated from the patch.

If the test did not exercise LLDB it should have been removed. But it does exercise LLDB.

I have added to lldb/test/Shell/SymbolFile/DWARF/split-optimized.c:

+// ObjectFileELF::ApplyRelocations does not implement arm32.
+// XFAIL: target-arm && linux-gnu

Hopefully it will satisfy the buildbot now.

This also fails on the Windows lldb bot:

https://lab.llvm.org/buildbot/#/builders/83/builds/8842

Perhaps something more than several xfails is needed? More specifically, I think it makese more sense to use something like requires or unsupported

This also fails on the Windows lldb bot:

Sorry I did not check (probably it sends messages to Author and not Commiter). Added there:

// -gsplit-dwarf is supported only on Linux.
// REQUIRES: system-linux

I did not know -gsplit-dwarf is implemented only in clang/lib/Driver/ToolChains/Gnu.cpp. It is implemented also in clang/lib/Driver/ToolChains/MinGW.cpp but I could not get it working there (-target i686-w64-mingw32).