This is an archive of the discontinued LLVM Phabricator instance.

Avoid many std::tie/tuple instantiations in ASTImporter
ClosedPublic

Authored by rnk on Jan 29 2020, 5:05 PM.

Details

Summary

To factor the error checking, use importChecked instead of importSeq. This
avoids repeating the names of all of the imported child nodes once, and allows
errors to be checked with a single conditional as it is with importSeq.

After:

peak memory: 601.63MB
real: 0m19.172s
obj size: 8,352kb

Before:

peak memory: 954.11MB
real: 0m26.188s
obj size: 10,000kb

The speed is not as impressive as I hoped, but the memory use reduction
is impressive, and seems worth it.

Diff Detail

Event Timeline

rnk created this revision.Jan 29 2020, 5:05 PM
Herald added a reviewer: shafik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: teemperor. · View Herald Transcript
Quuxplusone added inline comments.
clang/lib/AST/ASTImporter.cpp
1146

As the author of P1155 "More Implicit Move", I would expect that you don't need return std::move(E)return E should just as well perform "implicit move" in C++11 and later, assuming that llvm::Expected<QualType> has a valid constructor from llvm::Error&&.

You're not seeing any compiler diagnostic that suggests you use std::move here, are you?

Unit tests: pass. 62315 tests passed, 0 failed and 838 were skipped.

clang-tidy: fail. clang-tidy found 0 errors and 114 warnings. 0 of them are added as review comments below (why?).

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Thanks for this patch too!
However, I have a minor concern with this new approach:
If importSeq fails then a ToSomething still holds a value from the "From" context. This could cause problems if it is used later in creating the new AST node. Normally, this should not happen because we do check the return value of importSeq. But as the AST increases, new properties are added to a node, let's say ToSomething2 and when we forget (by mistake) to add this to importSeq then we end up having a node that has a reference to something in the "From" context:

auto ToSomething = From->getToSomething();
auto ToSomething2 = From->getToSomething2();
if (Error E = ImportSeq(ToSomething)) 
  return std::move(E);
createNewNodeInToCtx(ToSomething, ToSomething2);

In contrast to the original behavior where we would end up with an uninitialized value (pointer) which seems to be easier to recognize when someone uses the merged AST.
(I am not sure how could we force ToSomething2 into importSeq, perhaps a clang-tidy checker could do that.)

clang/lib/AST/ASTImporter.cpp
1266

My concern here is that if importSeq fails here then ToEPI.ExceptionSpec.NoexceptExpr still holds a value from the "From" context and that can be a (fatal) problem later. The import should happen right after the initialization of these variables.
So, we should have something like this:

ToEPI.ExceptionSpec.NoexceptExpr = FromEPI.ExceptionSpec.NoexceptExpr;
ToEPI.ExceptionSpec.SourceDecl = FromEPI.ExceptionSpec.SourceDecl;
ToEPI.ExceptionSpec.SourceTemplate = FromEPI.ExceptionSpec.SourceTemplate;
if (Error E = importSeq(ToEPI.ExceptionSpec.NoexceptExpr,ToEPI.ExceptionSpec.SourceTemplate))
  return std::move(E);
ToEPI.ExtInfo = FromEPI.ExtInfo;
ToEPI.Variadic = FromEPI.Variadic;
ToEPI.HasTrailingReturn = FromEPI.HasTrailingReturn;
ToEPI.TypeQuals = FromEPI.TypeQuals;
ToEPI.RefQualifier = FromEPI.RefQualifier;
ToEPI.ExceptionSpec.Type = FromEPI.ExceptionSpec.Type;
ToEPI.ExceptionSpec.Exceptions = ExceptionTypes;
martong added inline comments.
clang/lib/AST/ASTImporter.cpp
1146

I have some vague and obscure memory about that GCC 4.8 (or 5.2) required explicitly the && cast, otherwise we had a diagnostic (@gamesh411 maybe you help me to recall that?)

clang/lib/AST/ASTImporter.cpp
1146

Ah, you are absolutely correct: GCC 4.9.4 does not do implicit move in this situation, whereas GCC 5.1 and later do do it.
https://godbolt.org/z/YDHtTH
I didn't realize that portability back to GCC 4.9 was still a concern for LLVM/Clang. If it is, then please feel free to keep the std::move here and throughout (but maybe mention it in the commit message for the benefit of others who might think as I did). If portability back to pre-GCC-5 is not a concern, then I would continue to recommend removing the std::move.

rnk planned changes to this revision.Jan 30 2020, 11:09 AM
rnk marked an inline comment as done.

Thanks for this patch too!
However, I have a minor concern with this new approach:
If importSeq fails then a ToSomething still holds a value from the "From" context. This could cause problems if it is used later in creating the new AST node. Normally, this should not happen because we do check the return value of importSeq. But as the AST increases, new properties are added to a node, let's say ToSomething2 and when we forget (by mistake) to add this to importSeq then we end up having a node that has a reference to something in the "From" context:

auto ToSomething = From->getToSomething();
auto ToSomething2 = From->getToSomething2();
if (Error E = ImportSeq(ToSomething)) 
  return std::move(E);
createNewNodeInToCtx(ToSomething, ToSomething2);

In contrast to the original behavior where we would end up with an uninitialized value (pointer) which seems to be easier to recognize when someone uses the merged AST.
(I am not sure how could we force ToSomething2 into importSeq, perhaps a clang-tidy checker could do that.)

It occurs to me that we don't really need importSeq. The only thing it's doing for us is to help tidy up the error checking. All this code really needs is an accurate list of all the AST nodes to import from one AST to another. We could use a code pattern like this instead:

Error E = Error::success();
auto ToType = checkedImport(E, D->getType());
auto ToLoc = checkedImport(E, D->getLoc());
auto ToTSI = checkedImport(E, D->getTypeSourceInfo());
...
if (E)
  return std::move(E);

checkedImport would have similar behavior to the current helpers:

  • if there was already an error, do nothing, return a default constructed node (null, sloc 0, etc)
  • if importing produces an error, store it in the error, return a default constructed node
  • else, return the node

I think this would address your concern that the To nodes are always in the new ASTContext.

I'll see if I can redo this in that direction. I should be able to rewrite all of the Expr imports, which are very uniform, automatically...

clang/lib/AST/ASTImporter.cpp
204–205

I should add that I'm a novice with variadic templates. If anyone else has suggestions for a better way to iteratively call importInPlace on each argument, left to right, I'm open to it. I found this initializer list trick on stack overflow.

rnk updated this revision to Diff 241552.Jan 30 2020, 12:50 PM
  • Rewrite with importChecked, no importSeq
rnk edited the summary of this revision. (Show Details)Jan 30 2020, 12:55 PM

Unit tests: fail. 62352 tests passed, 1 failed and 839 were skipped.

failed: libc++.std/containers/sequences/array/array_creation/to_array.fail.cpp

clang-tidy: fail. clang-tidy found 0 errors and 147 warnings. 0 of them are added as review comments below (why?).

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

martong accepted this revision.Jan 31 2020, 2:44 AM

Ok, this looks good to me now. And I really appreciate the work you have done here, thanks!

This revision is now accepted and ready to land.Jan 31 2020, 2:44 AM

Thank you for this work, it is awesome!

I really like the removal of importSeq I feel like the resulting code is more clear.

shafik added inline comments.Jan 31 2020, 6:24 AM
clang/lib/AST/ASTImporter.cpp
1168

Should this group be using importChecked as well?

shafik added inline comments.Jan 31 2020, 6:25 AM
clang/lib/AST/ASTImporter.cpp
1181

importChecked?

martong added inline comments.Jan 31 2020, 7:01 AM
clang/lib/AST/ASTImporter.cpp
1168

Yes, this is a good catch!

rnk marked an inline comment as done.Feb 4 2020, 3:29 PM

Regarding the Error Err; ... return std::move(Err); pattern, I think it needs to be written like this for now. At the very least, NRVO is not possible because of the Expected conversion, so std::move is not a pessimization.

Thanks for the review!

clang/lib/AST/ASTImporter.cpp
1168

Thanks, I fixed this and the one below. I did some regexes to try to find other instances based on this pattern, but I didn't find any more. Hopefully I haven't introduced new bugs. :)

This revision was automatically updated to reflect the committed changes.

I forgot to mention this earlier but LLDB is a major user of the ASTImporter and so in general it is a good idea to run check-lldb as well.

rnk added a comment.Feb 5 2020, 5:04 PM

I forgot to mention this earlier but LLDB is a major user of the ASTImporter and so in general it is a good idea to run check-lldb as well.

I ran the LLDB tests on Linux, but I get 42 failures:
https://reviews.llvm.org/P8192
If I revert my change, I get the same 42 failures.

I debugged the first test, and it doesn't run because it's trying to use libc++.so, but it's not on the library path:

$ ./a.out
./a.out: error while loading shared libraries: libc++.so.1: cannot open shared object file: No such file or directory

I can add the lib directory inside my checkout to LD_LIBRARY_PATH and it will run:

$ LD_LIBRARY_PATH="$HOME/llvm-project/build/lib" ./a.out
Test

So, the fix is straightforward, but I think it might take an expert to figure out exactly where to put that logic. :)

I ran the patch on macOS and Linux through check-lldb and there were no regressions, so this is LGTM. The libc++ failures should go away when you add libcxx;libcxxabi to LLVM_ENABLE_PROJECTS (the tests are using libc++).