This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix a crash when the ASTImporter is giving us two Imported callbacks for the same target decl
ClosedPublic

Authored by teemperor on Aug 10 2020, 6:46 AM.

Details

Summary

The ASTImporter has an Imported(From, To) callback that notifies subclasses that a declaration
has been imported in some way. LLDB uses this in the CompleteTagDeclsScope to see which
records have been imported into the scratch context. If the record was declared inside the expression,
then the CompleteTagDeclsScope will forcibly import the full definition of that record to the scratch
context so that the expression AST can safely be disposed later (otherwise we might end up going
back to the deleted AST to complete the minimally imported record). The way this is implemented
is that there is a list of decls that need to be imported (m_decls_to_complete) and we keep completing
the declarations inside that list until the list is empty. Every To Decl we get via the Imported
callback will be added to the list of Decls to be completed.

There are some situations where the ASTImporter will actually give us two Imported calls with the
same To Decl. One way where this happens is if the ASTImporter decides to merge an imported
definition into an already imported one. Another way is that the ASTImporter just happens to get
two calls to ASTImporter::Import for the same Decl. This for example happens when importing
the DeclContext of a Decl requires importing the Decl itself, such as when importing a RecordDecl
that was declared inside a function.

The bug addressed in this patch is that when we end up getting two Imported calls for the same
To Decl, then we would crash in the CompleteTagDeclsScope. That's because the first time
we complete the Decl we remove the Origin tracking information (that maps the Decl back to
from where it came from). The next time we try to complete the same To Decl the Origin
tracking information is gone and we hit the to_context_md->getOrigin(decl).ctx == m_src_ctx
assert (getOrigin(decl).ctx is a nullptr the second time as the Origin was deleted).

This is actually a regression coming from D72495. Before D72495 m_decls_to_complete
was actually a set so every declaration in there could only be queued once to be completed.
The set was changed to a vector to make the iteration over it deterministic, but that also causes
that we now potentially end up trying to complete a Decl twice.

This patch essentially just reverts D72495 and makes the CompleteTagDeclsScope use a SetVector
again for the list of declarations to be completed. The SetVector should filter out the duplicates
and also ensure that the completion order is deterministic. I actually couldn't find any way to cause LLDB
to reproduce this bug by merging declarations (this would require that we for example declare
two namespaces in a non-top-level expression which isn't possible). But the bug reproduces
very easily by just declaring a class in an expression, so that's what the test is doing.

Diff Detail

Event Timeline

teemperor created this revision.Aug 10 2020, 6:46 AM
teemperor requested review of this revision.Aug 10 2020, 6:46 AM

FWIW, the case where we get two Imported calls when importing a RecordDecl inside a FunctionDecl seems a bit like a bug. But the only way I see how this could be fixed is by making the ASTImporter keep track when it actually created a new decl in an ASTImporter::Import(Decl *) call. I'll make a follow up patch for that.

shafik accepted this revision.Aug 13 2020, 11:42 AM

LGTM, it is sad we lose the deterministic ordering though. Would it be efficient enough to use a vector but keep the set to prevent duplicates?

This revision is now accepted and ready to land.Aug 13 2020, 11:42 AM

LGTM, it is sad we lose the deterministic ordering though. Would it be efficient enough to use a vector but keep the set to prevent duplicates?

I think that's just a mistake in the review description. Actually the SetVector is deterministic and a set, so we won't reintroduce the non-determinism. (Originally this was a set, then we changed it to a vector to fix the non-determinism and now it's a SetVector which has both wanted properties).

teemperor edited the summary of this revision. (Show Details)Sep 9 2020, 1:15 AM
  • Clarify that this isn't just a direct revert, but reverts by establishing a set again instead of just a plain vector (Thanks Shafik!)
Herald added a project: Restricted Project. · View Herald TranscriptSep 9 2020, 1:32 AM