This is an archive of the discontinued LLVM Phabricator instance.

Fix DWO breakage in r264909
ClosedPublic

Authored by labath on Mar 31 2016, 6:30 AM.

Details

Summary

In case of Dwo, DIERef stores a compile unit offset in the main object file, and not in the dwo.
The implementation of SymbolFileDWARFDwo::GetDIE inherited from SymbolFileDWARF tried to lookup
the compilation unit in the DWO based on the main object file offset (and failed). I change the
implementation to verify the DIERef indeed references compile unit belonging to this dwo and then
lookup the die based on the die offset alone.

Includes a couple of fixes for mismatched struct/class tags.

Diff Detail

Repository
rL LLVM

Event Timeline

labath updated this revision to Diff 52196.Mar 31 2016, 6:30 AM
labath retitled this revision from to Fix DWO breakage in r264909.
labath updated this object.
labath added reviewers: tberghammer, clayborg.
labath added a subscriber: lldb-commits.
tberghammer accepted this revision.Mar 31 2016, 6:33 AM
tberghammer edited edge metadata.

Looks good

This revision is now accepted and ready to land.Mar 31 2016, 6:33 AM
This revision was automatically updated to reflect the committed changes.
clayborg edited edge metadata.Mar 31 2016, 9:33 AM

See inlined comment for a follow up fix.

lldb/trunk/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
136

I would prefer to not use an assertion here, we can't crash if things go south for some reason. Can we change this to something like:

lldbassert(m_base_dwarf_cu->GetOffset() == die_ref.cu_offset);
if (m_base_dwarf_cu->GetOffset() == die_ref.cu_offset)
    return DebugInfo()->GetDIEForDIEOffset(die_ref.die_offset);
else
    return DWARFDIE();

The lldbassert will be in debug builds but not release builds so it can fire during testing, but won't crash a release build. assert() is dangerous as it is us to the builders to ensure DEBUG or NDEBUG is defined and if the assert is left in it can crash your lldb, IDE, or any program directly loading LLDB which isn't acceptable.

tberghammer added inline comments.Mar 31 2016, 9:46 AM
lldb/trunk/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
136

I asked Pavel to add the assert in as if the condition isn't met that means a pretty serious problem and will cause incorrect behavior or crash at a later stage. I think it is better to crash at the location where we detect the error instead of crashing at a random point or displaying incorrect data to the user.

If you don't want LLDB to take down you full IDE then I think you have 2 option:

  • Compile with assert turned off and hope that we don't crash
  • Move LLDB to a separate process and handle the case from the IDE when LLDB crashes

We have a lot of assert in different parts of LLDB and in clang/llvm as well and usually they help a lot in finding strange bugs. I would prefer to continue adding more assert to the code so if somebody start miss-using an API we detect it immediately. Even if we remove every assert from LLDB (I strongly against it) clang and llvm will contain a large number of asserts what can take down your IDE the same way.

I don't mind asserts when they shouldn't happen if they are lldbassert calls since these are removed for release builds, but we can't just call assert and crash because something bad happened. LLDB is a shared library and a framework and it shouldn't crash no matter what happens if we can help it. We need to get rid of any of these kinds of asserts that can actually bring down LLDB and code it so it can deal with things. I am fine with adding logging that is always enabled for these cases, but this is just a bad design that is hitting LLDB is crashing all over the place recently. So we need to code things correctly to deal with bad things and add logging so we know when we see these cases. So please fix SymbolFileDWARFDwo::GetDIE() to not crash as suggested.

I don't agree that asserts are good in released code unless you have no way of backing out of the situation you find yourself in. After all, you are saying to some unlucky user out there that they can't use the debugger on their app and in general there's nothing they can do about it. Greg's suggestion is for this low-level API to say "I couldn't find this DIE" and then if that's something higher layers can work around - by saying "Yeah I couldn't find that type" then you've allowed the user to continue their debug session instead of stopping them cold.

Not asserting prematurely is particularly important for handling debug information; since we don't control the compiler we need to handle as much junk information as gracefully as possible.

Also, asserts, especially for debug information, don't tend to be very helpful in the field. You get a crash trace which really doesn't tell you the important stuff - what debug file was this, what DIE was bad, etc... And given the nature of life, this error is going to occur for a user who can't give you their project to repro the bug and can't reduce it to a smaller test case. Logs are pretty much all you have to go on. So an un-annotated assert like this is not a good idea.

So orthogonal to the assert issue, if you find something not copacetic in the debug information, you should log out as much local information as you can regardless of what you are going to do with the error.

Jim

I think it depends on how bad the thing that happens is. If you are out of
memory, sure you can assert. If a syscall fails that is supposed to never
fail, you can assert. But if one piece of debug info appears corrupt, I
don't think it's worth bringing down the whole process, which could be an
entire IDE. (FWIW yes I agree that having LLDB out of process would be an
even better solution to this, but that's quite a large undertaking). You
could be debugging other processes using other debug info which is not
corrupt, but you terminate all those sessions because one piece of
unrelated debug info in an unrelated process is bad. Doesn't seem right to
me.

debug info is program input, and you would never assert on program input,
you have to be able to handle unreasonable values gracefully.

This assert is NOT verifying the debug info itself. It is verifying that LLDB asked the right SymbolFileDWARFDwo to find a DIE based on a DIERef. Independently of the debug info (what can be garbage) passed in to LLDB this assert should never fire if LLDB has no major bug in the DWO handling logic.

Considering that we have ~10k assert in LLVM, ~6k assert in clang and ~1k assert in LLDB you should compile your release binary without asserts if you want to avoid some crash what is caused by them.

In this concrete case if we return an empty DWARFDIE from this function then the caller of the function will crash (with a SEGV) as it is not prepared to handle the case when no DWARFDIE can be found based on a DIERef as it should never happen if LLDB works correctly (independently of the debug info).

I think the very high level question we are discussing is what should LLDB do when it notices that it has a bug in itself. In terms of a compiler displaying an "internal compiler error" and the exiting/crashing is definitely the good solution (to avoid miss compilation) but what is the expected behavior for an interactive application. If you want to be protected against every possible issue then we should remove all assert from LLDB, convince the LLVM and clang guys to do the same and protect every pointer deference with a null check just in case some of the previous function calls returned NULL. I don't think this is the approach we are wanting to go down as it will add a lot of overhead to LLDB and make a lot of bug silent what is very problematic in case of a debugger where displaying incorrect information is almost as bad as crashing the IDE.

PS: Pavel changed the assert to an lldb_assert in the meantime.