This is an archive of the discontinued LLVM Phabricator instance.

[clang] ASTImporter: Fix importing of va_list types and declarations
AbandonedPublic

Authored by vabridgers on Jan 28 2023, 4:56 PM.

Details

Summary

This patch was originally submitted by @mizvekov, then reverted because
it caused crashes on Arm and AArch64. This has since been debugged as a
problem in the DontModifyStdNamespaceCheck.cpp Tidy check exposed by
Arm and AArch64 architectures defining va_list in the std namespace. The
Tidy check was fixed by excluding the implicit cases. See D136886 for
original patch and notes. This patch takes changes from D136886, adds
the fix and LIT cases to cover the fix.

Original description from @mizvekov for the base portion of this fix.

This fixes a problem where __va_list_tag was not correctly imported,
possibly leading to multiple definitions with different types.

This adds __va_list_tag to it's proper scope, so that the ASTImporter
can find it.

Crash seen in original fix, addressed by improvements.

$ clang-tidy crash.cpp -checks="cert-dcl58-cpp" --  -target arm
clang-tidy: <root>/clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp:178:
  clang::DiagnosticBuilder clang::tidy::ClangTidyContext::diag(llvm::StringRef,
  clang::SourceLocation, llvm::StringRef, clang::DiagnosticIDs::Level):
  Assertion `Loc.isValid()' failed.

Stack dump:
0.    Program arguments: clang-tidy crash.cpp -checks=cert-dcl58-cpp -- -target arm
1.    <eof> parser at end of file
2.    ASTMatcher: Processing 'cert-dcl58-cpp' against:
      CXXRecordDecl std::__va_list : <<invalid sloc>>
--- Bound Nodes Begin ---
    decl - { CXXRecordDecl std::__va_list : <<invalid sloc>> }
    nmspc - { NamespaceDecl std : <<invalid sloc>> }
--- Bound Nodes End ---
 #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
      <root>/llvm/lib/Support/Unix/Signals.inc:567:22
 #1 PrintStackTraceSignalHandler(void*)
      <root>/llvm/lib/Support/Unix/Signals.inc:641:1
 ...
 #9 clang::tidy::ClangTidyContext::diag(llvm::StringRef,
      clang::SourceLocation, llvm::StringRef, clang::DiagnosticIDs::Level)
      <root>/clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp:179:17
      clang::DiagnosticIDs::Level)
      <root>/clang-tools-extra/clang-tidy/ClangTidyCheck.cpp:27:54
      clang::ast_matchers::MatchFinder::MatchResultconst&)
      <root>/clang-tools-extra/clang-tidy/cert/DontModifyStdNamespaceCheck.cpp:121:10

Diff Detail

Event Timeline

vabridgers created this revision.Jan 28 2023, 4:56 PM
vabridgers requested review of this revision.Jan 28 2023, 4:56 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
vabridgers edited the summary of this revision. (Show Details)Jan 28 2023, 5:00 PM
vabridgers added a subscriber: balazske.

Updates per suggestion from @balazske

@DavidSpickett and/or @jrtc27 , I started with @mizvekov 's patch ( D136886) and attempted to address the problems with that patch on arm and aarch64. Is it possible for you to try testing this patch and/or commenting? Thank you. - Vince

rebase, bump review.

On AArch64 we have the following failures:

Failed Tests (15):
  Clang :: CXX/basic/basic.stc/basic.stc.dynamic/p2-nodef.cpp
  Clang :: CodeCompletion/ordinary-name-cxx11.cpp
  Clang :: CodeCompletion/ordinary-name.cpp
  Clang :: CodeGenCXX/cxx20-module-std-subst-2b.cpp
  Clang :: CodeGenCXX/cxx20-module-std-subst-2c.cpp
  Clang :: Index/complete-preamble.cpp
  Clang :: Index/load-namespaces.cpp
  Clang :: Modules/implicit-declared-allocation-functions.cppm
  Clang :: PCH/chain-friend-instantiation.cpp
  Clang :: PCH/cxx1z-aligned-alloc.cpp
  Clang :: SemaCXX/constructor-initializer.cpp
  Clang :: SemaCXX/using-directive.cpp
  Clang-Unit :: AST/./ASTTests/ParameterizedTests/DeclContextTest/removeDeclOfClassTemplateSpecialization/0
  Clang-Unit :: StaticAnalyzer/./StaticAnalysisTests/CallDescription/RejectOverQualifiedNames
  Clang-Unit :: StaticAnalyzer/./StaticAnalysisTests/CallDescription/SkipAnonimousNamespaces

None of which are caused by assertions but instead errors of the form:

input.cc:12:7: error: reference to 'std' is ambiguous
      std::container v;
      ^
note: candidate found by name lookup is 'std'
input.cc:3:15: note: candidate found by name lookup is 'my::std'
    namespace std {
              ^
1 error generated.
/home/david.spickett/llvm-project/clang/test/CodeGenCXX/cxx20-module-std-subst-2b.cpp:14:18: error: reference to 'std' is ambiguous
void f(str<char, std::char_traits<char>> &s) {
                 ^
/home/david.spickett/llvm-project/clang/test/CodeGenCXX/cxx20-module-std-subst-2b.cpp:7:11: note: candidate found by name lookup is 'std'
namespace std {
          ^

On Arm it's the exact same set of failures.

What would help you debug those? I don't know the background for the change but I can check specifics on the machine itself if needed.

Test log from AArch64:

donat.nagy added inline comments.
clang/test/AST/ast-dump-overloaded-operators.cpp
27

Why did a backtick disappear here and in many other locations? Is this somehow related to va_list handling?

I looked at some of the failing tests but can not decide how to fix the problems. The problem seems to be with the extra namespace std that was not there before. I do not know what the tests are exactly testing. A part of the tests can be fixed by adding new expect lines but these must be platform specific.

clang-tools-extra/test/clang-tidy/checkers/cert/dcl58-cpp.cpp
3

I am not sure if these platform specific should be added. A problem was discovered on this platform but this could be true for all other tests.

I looked at some of the failing tests but can not decide how to fix the problems. The problem seems to be with the extra namespace std that was not there before. I do not know what the tests are exactly testing. A part of the tests can be fixed by adding new expect lines but these must be platform specific.

It seems that your change in Sema.cpp creates the std namespace (or makes it visible?) in some situations where it's not declared by the code and should not be visible. I suspect that this behavior of your commit is not compliant with the standard, and there is precedent that Clang code pays attention to avoiding this sort of thing. For example commit 87f54060 introduces the simple testcase

int *use_new(int N) {
  return new int [N];
}

int std = 17;

which ensures that the std namespace is not visible even if a reference to operator new (whose full type references std:bad_alloc) is present.

I think the commit needs to be updated to avoid "leaking" the std namespace into code that do not declare it.

/home/david.spickett/llvm-project/clang/test/CodeGenCXX/cxx20-module-std-subst-2b.cpp:14:18: error: reference to 'std' is ambiguous
void f(str<char, std::char_traits<char>> &s) {
                 ^
/home/david.spickett/llvm-project/clang/test/CodeGenCXX/cxx20-module-std-subst-2b.cpp:7:11: note: candidate found by name lookup is 'std'
namespace std {
          ^

This looks like an ouchie of the highest order. I've been looking into Modules... 5-ish years ago, when it was still under design. There are two major concerns here, one is that we've been having a feature called "Modules" which was very Clang-specific (but a good proof-of-concept spiritual predecessor of what became the Standard Modules), and there are the standard modules. Now the file name explicitly mentions cxx20 so this is likely about Standard Modules stuff. But to be fair, I'd take this implementation with a pinch of salt. Due to the lack of build-system support for Modules (there are some murmurs these days on the CMake GitLab about very soon adding meaningful support for this...) we do not really seem to have large-scale real-world experience as to how the standard really could be made working, especially in such a weird case like touching namespace std which might not be standards-compliant in the first place. (But yeah, we are test code here in a compiler; we can cut ourselves some slack.)

The code itself in these test files is really weird; it seems the test is centred around the idea that there are:

// TU "A"
module; // Code in the *global module fragment*
namespace std { /* ... */ class basic_string; }

export module Whatever; // Code in the public module interface unit of module Whatever.
export /* ... */ using str = std::basic_string<...>;

So we have a symbol called str, which is publicly visible (to TUs importing Whatever), and the namespace std preceding Whatever in TU "A" is not visible but reachable? (The standard's way of expressing reachability is a kind of a mess, so I'd definitely try finding the people who worked in developing the actual solution that makes Standard Modules work in Clang... call the help of someone who's really expert on the Standard...) Perhaps @rsmith could help us clarify this situation.

You can't name std::basic_string<...> in the client code (because it is not visible), but you can depend on the symbol (e.g., decltype it to an alias!) because it is reachable!

And then you have

// TU "B"
import Whatever;

namespace std { /* ... */ struct char_traits {}; }

// use Sb mangling, not Ss, as this is not global-module std::char_traits
/* ... */
void f(str<..., std::char_traits<...>>)

In which there is "another" (?) namespace std, which should(??) also be part of the global module:

The global module is the collection of all global-module-fragments and all translation units that are not module units. Declarations appearing in such a context are said to be in the purview of the global module.

I believe that TU "B" is not a module unit... But the comment (which is somehow misformatted?) in it would like to say that this std is, in fact, not part of the global module.

So somehow, the changes in this patch are making both stds part of the same purview (whatever that means, really...), thus resulting in ambiguity.

  • Does this only happen on non-X86? @vabridgers, you could try adding a --target= flag locally in the test file to the compiler invocation and re-running the tests to verify. (Might need an LLVM build that is capable of generating code of other architectures.)
  • My other suggestion would be putting some ->dump() calls in your change, running the code and observing how the ASTs are looking in the states when execution is within the functions you're trying to change.
clang/test/AST/ast-dump-traits.cpp
55

How is this change legit? Is this FunctionDecl node just "floating"? If so, why is there a - preceding in the textual dump? Usually top-level nodes do not have such a prefix.

clang/test/AST/fixed_point.c
405

Ditto.

But there should be a way to make the namespace "invisible" like it is done with std::bad_alloc, the linked commit seems to contain the code to "hide" the first std definition and this may not work any more. Another question is why this architecture has a probably not standard requirement. This may be a question for the discourse forum to reach the developers that can make a working fix.

With this fix the std::__va_list is created at initialization of Sema, previously it was not there. Function CreateAArch64ABIBuiltinVaListDecl (ASTContext.cpp) makes the namespace std and __va_list record. This change fixes the problem with ASTImporter probably because the __va_list record exists before start of AST import, this way it is found and added to ASTImporterLookupTable at initialization (the original problem was caused because std::__va_list is missing from ASTImporterLookupTable). But I was thinking of other ways to fix the problem. My old fix for the problem may still work without any test failures:

ASTImporterLookupTable do not contain an entry for __va_list_tag, I do not know why it is missing. If it is added "manually" the crash disappears (without fix in VisitTypedefType). Following code was used to add VaListTagDecl:

ASTImporterLookupTable::ASTImporterLookupTable(TranslationUnitDecl &TU) {
  Builder B(*this);
  B.TraverseDecl(&TU);
  // Add __va_list_tag to the table, it is not visited by the builder.
  if (NamedDecl *D = dyn_cast_or_null<NamedDecl>(TU.getASTContext().getVaListTagDecl()))
    add(&TU, D);
}

The problem probably existed before but did not have visible effect until the new assertion was added.

It can happen (without any of the new fixes) that std::__va_list is created during the AST import process, it is created at first access. This can be the reason why it is missing from the lookup table: It is created implicitly by ASTContext code in the "to" context during import. To fix this problem, I think the above quoted code is a good solution: The va_list declaration is created at start of AST import (if it did not exist yet) (by the get function) and added to the lookup table. Probably it can be a problem that the va_list declaration and std namespace appears in a TU by just importing any code into it, but probably no tests will fail for this case. Another way to fix the problem is to add special handling of std::__va_list at least on the affected platforms to ASTImporterLookupTable but I did not check of this can be done. I do not like to change function ASTImporter::VisitTypedefType like in the very first fix of @vabridgers because it affects how every typedef is imported, and that can be incorrect. Probably some special handling of a record called __va_list at AST import can be a good fix too.

@donat.nagy , regarding the namespace leaking, there was a change -> https://reviews.llvm.org/D116774 that modified the behavior for aarch64 and arm. While not incorrect, it may offer some historical view or examples of how to address the current cases.

@whisperity , I'll try your advice and post the results.

clang/test/AST/ast-dump-overloaded-operators.cpp
27

this backspace appeared in the original patch submitted by @mizvekov in review https://reviews.llvm.org/D136886, discussed for the test case clang/test/AST/ast-dump-overloaded-operators.cpp between @aaron.ballman and @mizvekov.

Comments from that review from @mizvekov

<q>
What is happening here (and in all the other such instances) is that on the import case, this declaration stops being the last one on the TU. So the beginning of the line would match on |- instead of `-, but the non-import case this remains the last one.

So I simply relaxed the match.
</q>

jrtc27 added inline comments.Feb 16 2023, 4:52 PM
clang/test/AST/ast-dump-overloaded-operators.cpp
27

Then make this abundantly clear with

{{\||`}}

or whatever the right escaping is (which, incidentally, I can't figure out how to do as inline code in Phabricator...)

whisperity edited the summary of this revision. (Show Details)Feb 17 2023, 5:14 AM
whisperity added inline comments.Feb 17 2023, 5:17 AM
clang-tools-extra/test/clang-tidy/checkers/cert/dcl58-cpp.cpp
3

I agree because there might be a case that some buildbot running the tests won't have these targets. If we really need to explicitly test something related to these targets, perhaps they should go into a separate file, and have a check-prefix. Or the expectation is that things work in the same way across all three platforms?

clang/docs/ReleaseNotes.rst
63
whisperity added inline comments.Feb 17 2023, 5:29 AM
clang-tools-extra/test/clang-tidy/checkers/cert/dcl58-cpp.cpp
5–6

(Style nits.)

vabridgers planned changes to this revision.Feb 17 2023, 5:51 AM

Changes are planned. Please do not waste any more time on this for now. This will probably be abandoned.

I have created the new patch D144273 that should fix the same AST import problems as this patch, without change of Sema.

vabridgers abandoned this revision.Feb 24 2023, 10:10 AM