This is an archive of the discontinued LLVM Phabricator instance.

[ASTImporter] Mark erroneous nodes in from ctx
ClosedPublic

Authored by martong on May 24 2019, 2:56 AM.

Details

Summary

During import of a specific Decl D, it may happen that some AST nodes
had already been created before we recognize an error. In this case we
signal back the error to the caller, but the "to" context remains
polluted with those nodes which had been created. Ideally, those nodes
should not had been created, but that time we did not know about the
error, the error happened later. Since the AST is immutable (most of
the cases we can't remove existing nodes) we choose to mark these nodes
as erroneous.
Here are the steps of the algorithm:

  1. We keep track of the nodes which we visit during the import of D: See

ImportPathTy.

  1. If a Decl is already imported and it is already on the import path

(we have a cycle) then we copy/store the relevant part of the import
path. We store these cycles for each Decl.

  1. When we recognize an error during the import of D then we set up this

error to all Decls in the stored cycles for D and we clear the stored
cycles.

Diff Detail

Repository
rL LLVM

Event Timeline

martong created this revision.May 24 2019, 2:56 AM
Herald added a reviewer: shafik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

Hi Gabor,
Could you please provide any test for the import itself?

clang/lib/AST/ASTImporter.cpp
8668 ↗(On Diff #201173)

const?

martong updated this revision to Diff 206022.Jun 21 2019, 9:24 AM
  • Add test ErrorPropagatesThroughImportCycles
  • Change existing test to new behaviour

Alexei,

Thank you for the review, I have added a test which demonstrates the changes.
By tracking the import paths and cycles we implement a very strict error handling mechanism, but this seems to be the way to avoid reaching any faulty AST nodes for the ASTImporter clients.

martong marked an inline comment as done.
a_sidorin accepted this revision.Jun 23 2019, 2:11 PM

Thanks!

This revision is now accepted and ready to land.Jun 23 2019, 2:11 PM

Alexei,

I think I still owe you some explanation about this patch.
I do consider this patch as one of the most intricate patches regarding ASTImporter.
I'd like to answer the following questions in this comment:
What is an ImportPath and why do we need to track it?
What does it mean if we have a cycle in the import path?
How do we use these cycles during the error handling?

An ImportPath is the list of the AST nodes which we visit during an Import call.
If node A depends on node B then the path contains an A->B edge.
From the call stack of the import functions we can read the very same path.

Now imagine the following AST, where the -> represents dependency in therms of the import.

A->B->C->D
   `->E

We would like to import A.
The import behaves like a DFS, so we will visit the nodes in this order: ABCDE.
During the visitation we will have the following ImportPaths:

A
AB
ABC
ABCD
ABC
AB
ABE
AB
A

If during the visit of E there is an error then we set an error for E, then as the call stack shrinks for B, then for A:

A
AB
ABC
ABCD
ABC
AB
ABE // Error! Set an error to E
AB  // Set an error to B
A   // Set an error to A

However, during the import we could import C and D without any error and they are independent from A,B and E.
We must not set up an error for C and D.
So, at the end of the import we have an entry in ImportDeclErrors for A,B,E but not for C,D.

Now what happens if there is a cycle in the import path?
Let's consider this AST:

A->B->C->A
   `->E

During the visitation we will have the below ImportPaths and if during the visit of E there is an error then we will set up an error for E,B,A.
But what's up with C?

A
AB
ABC
ABCA
ABC
AB
ABE // Error! Set an error to E
AB  // Set an error to B
A   // Set an error to A

This time we know that both B and C are dependent on A.
This means we must set up an error for C too.
As the call stack reverses back we get to A and we must set up an error to all nodes which depend on A (this includes C).
But C is no longer on the import path, it just had been previously.
Such situation can happen only if during the visitation we had a cycle.
If we didn't have any cycle, then the normal way of passing an Error object through the call stack could handle the situation.
This is why we must track cycles during the import process for each visited declaration.

I hope this explanation makes it even cleaner than the tests and thanks again for the review!

balazske added inline comments.Jun 24 2019, 3:33 AM
clang/lib/AST/ASTImporter.cpp
7840 ↗(On Diff #206022)

It is possible to use the make_scope_exit instead, this results in less written code.

martong updated this revision to Diff 206228.Jun 24 2019, 8:45 AM
  • Remove the macro and use std::string.op+
  • We may set up an error twice in case of a cycle, thus remove the assert
balazske added inline comments.Jun 24 2019, 9:00 AM
clang/lib/AST/ASTImporter.cpp
8647 ↗(On Diff #206228)

We should not remove this assert? (Or check if the error is the same, otherwise keep the first or last error? Normally if we set an error multiple times it should be the same error otherwise something is wrong with the import.)

martong updated this revision to Diff 206244.Jun 24 2019, 9:42 AM
  • Add back an assertion in setImportDeclError(), remove the condition in Import()
martong marked 2 inline comments as done.Jun 24 2019, 9:44 AM
martong added inline comments.
clang/lib/AST/ASTImporter.cpp
8647 ↗(On Diff #206228)

Well, yes that is a good point. I was thinking about a theoretical scenario where we could set two different Errors for the same node, but failed to fabricate one. So I added an assert that they have the same kind,

martong updated this revision to Diff 206254.Jun 24 2019, 10:17 AM
martong marked 5 inline comments as done.
  • Use make_scope_exit
  • Make hasCycleAtBack const
clang/lib/AST/ASTImporter.cpp
7840 ↗(On Diff #206022)

Thanks! Changed to use that.

8668 ↗(On Diff #201173)

Thanks! I made to be const.

martong marked an inline comment as done.Jun 24 2019, 10:19 AM
martong added inline comments.
clang/lib/AST/ASTImporter.cpp
7892 ↗(On Diff #206254)

We may set up an error multiple times now, but the error should be the same, this is handled in setImportDeclError.

balazske added inline comments.Jun 25 2019, 12:39 AM
clang/unittests/AST/ASTImporterTest.cpp
5000 ↗(On Diff #206254)

I see now that this test is really not required, the previous test checks the almost same thing in case of class B and NS (but more test is always better).

martong marked 2 inline comments as done.Jun 25 2019, 2:32 AM
martong added inline comments.
clang/unittests/AST/ASTImporterTest.cpp
5000 ↗(On Diff #206254)

Actually, this test belongs to the previous patch , not to this one.

Hello Gabor,
Thank you for the explanation. I got the idea of this patch anyway, but it will be definitely useful for anyone digging into the code. Should we make it a comment for ImportPathTy?

clang/lib/AST/ASTImporter.cpp
8682 ↗(On Diff #206254)

I think these definitions are small enough to move them into the class definitions.

martong marked 3 inline comments as done.Jul 1 2019, 6:29 AM

Thank you for the explanation. I got the idea of this patch anyway, but it will be definitely useful for anyone digging into the code. Should we make it a comment for ImportPathTy?

Ok, I have added the explanation to ImportPathTy as a class comment.

clang/lib/AST/ASTImporter.cpp
8682 ↗(On Diff #206254)

Ok, I have made them in-class defined.

martong updated this revision to Diff 207288.Jul 1 2019, 6:29 AM
martong marked an inline comment as done.
  • Move ImportPathTy's funcitons in-class, add class comment
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2019, 7:19 AM

@shafik I've been looking for any lldb regression in our Mac machine, could not find any. Now I am looking at http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/ . I don't expect regression because here we changed logic about the error handling only, so I'd expect to have regression only if we had testcases in lldb for erroneous cases, but apparently there are no such tests.