Page MenuHomePhabricator

[C++20][Modules] Implementation of GMF decl elision.
Needs ReviewPublic

Authored by iains on May 31 2022, 5:41 AM.

Details

Summary

This implements the provisions of [module.global.frag] p3/4. This provides for discarding
declarations in the global module fragment if they are not reachable from any use within
the named module purview.

Assuming that the current TU has a global module fragment, we start out by creating decls
in that fragment marked as "ModuleDiscardable".

Then when in the module purview of the named module, whenever we encounter a use
or reference of a declaration, we check to see if that declaration (or any other declaration that
becomes indirectly reachable via it) is on the GMF. If so, then we mark that declaration as
reachable. We also check for cases that types in such declarations cause type decls in the
GMF to be reachable.

In the most general case, this can represent a significant amount of work - walking through
the bodies of functions to check for indirect reachability. To counter that as far as possible
we use two sets (one for decls, one for types) to ensure that each is only visited once.

Diff Detail

Unit TestsFailed

TimeTest
70 msx64 debian > Clang.CXX/basic/basic_lookup/basic_lookup_argdep::p4-friend-in-reachable-class.cpp
Script: -- : 'RUN: at line 6'; rm -fr /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/CXX/basic/basic.lookup/basic.lookup.argdep/Output/p4-friend-in-reachable-class.cpp.tmp
100 msx64 debian > Clang.Modules::derived_class.cpp
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Modules/Output/derived_class.cpp.tmp
70 msx64 debian > Clang.Modules::explicitly-specialized-template.cpp
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Modules/Output/explicitly-specialized-template.cpp.tmp
70 msx64 debian > Clang.Modules::template-function-specialization.cpp
Script: -- : 'RUN: at line 1'; rm -fr /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Modules/Output/template-function-specialization.cpp.tmp
70 msx64 debian > Clang.Modules::template_default_argument.cpp
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Modules/Output/template_default_argument.cpp.tmp

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ChuanqiXu added inline comments.Jun 7 2022, 2:19 AM
clang/lib/Sema/SemaModule.cpp
982–983

clang prefer shorter indentation.

iains updated this revision to Diff 434834.Jun 7 2022, 8:42 AM
iains marked 8 inline comments as done.

rebased and removed dependency on p1874 initializer patch.

Some tidying - added 104. ex2 testcase.

iains added a comment.Jun 7 2022, 8:42 AM

again, thanks for review - but please do not spend any effort on style points yet - the debug and dump stuff is intentionally present this is "for comment on the approach"

i.e. what is important is to establish that this is a reasonable approach.

As of now we have the following fails - which have to be analysed (of course, markReachableGMFDecls is expected to be incomplete at this point.

FAIL: Clang :: CXX/basic/basic.lookup/basic.lookup.argdep/p4-friend-in-reachable-class.cpp (3992 of 15859)
FAIL: Clang :: Modules/derived_class.cpp (6764 of 15859)
FAIL: Clang :: Modules/explicitly-specialized-template.cpp (6831 of 15859)
FAIL: Clang :: Modules/template-function-specialization.cpp (7060 of 15859)
FAIL: Clang :: Modules/template_default_argument.cpp (7207 of 15859)
clang/include/clang/AST/DeclBase.h
647

The only place that the ownership is ModuleUnreachable is in the GMF - so that we do not need to do an additional test.

clang/include/clang/Sema/Sema.h
2273

If the consensus is to add an extra test, OK.

However, as above, specifically to avoid making more and more tests in code that is executed very frequently - as the design currently stands, the only place that ModuleUnreachable is set is in the GMF.

clang/lib/AST/TextNodeDumper.cpp
1622

I don't think that is a matter of style __module_private__ is a keyword used elsewhere?

If you look though the file you will see mostly that the printed output does not prepend or append underscores.

BTW, similar changes are probably needed in other node printers, this was done early to add debug.

clang/lib/Sema/SemaModule.cpp
266

this code has been removed (it belongs in the D126959 patch).

(not relevant to the patch, but for the record), for the functionality to be correct - we must ensure that the interface module is registered first - so that the module names table contains the pointer to that

346–347

this has been reorganised

957

what is D here?
markReachableGMFDecls is only called in the case that Orig is not discarded (we are marking decls as reachable because they are used in the interface..

clang/lib/Serialization/ASTReaderDecl.cpp
635–639

yes, perhaps we could do that - I am wondering if that code can all be factored into the switch.

clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
32–33

I do not think it is a matter of a diagnostic message.

If we omit the decl from the BMI (which we are permitted to do if it is ModuleUnreachable), then it cannot be found and therefore the information on which header it came from would not be available.

iains added a comment.EditedJun 7 2022, 1:16 PM

the first failure is like this:

x-h.h:
struct A {
  friend int operator+(const A &lhs, const A &rhs) {
    return 0;
  }
};

X.cpp:
module;
#include "x-h.h"
export module X;

export using ::A;

This does not mark *anything* in the GMF (x-h.h) as 'used', and so everything there is unreachable (and hence the fails).
I.e export using ::A; neither marks A as used or referenced.
I am not sure if we are supposed to special-case this - since https://eel.is/c++draft/module#global.frag-3.6 explicitly says "In this determination, it is unspecified .. whether using-declaration, ... is replaced by the declarations they name prior to this determination,
so .. not about how to proceed with this one at present;
edit: but it seems most reasonable to make it behave as if A was itself exported.

the first failure is like this:

x-h.h:
struct A {
  friend int operator+(const A &lhs, const A &rhs) {
    return 0;
  }
};

X.cpp:
module;
#include "x-h.h"
export module X;

export using ::A;

This does not mark *anything* in the GMF (x-h.h) as 'used', and so everything there is unreachable (and hence the fails).
I.e export using ::A; neither marks A as used or referenced.
I am not sure if we are supposed to special-case this - since https://eel.is/c++draft/module#global.frag-3.6 explicitly says "In this determination, it is unspecified .. whether using-declaration, ... is replaced by the declarations they name prior to this determination,
so .. not about how to proceed with this one at present;
edit: but it seems most reasonable to make it behave as if A was itself exported.

I highly recommend we should mark A here. Maybe we need other interfaces than markDeclUsed and setDeclReferenced. If we don't support this, we couldn't use modules like https://github.com/alibaba/async_simple/blob/CXX20Modules/third_party_module/asio/asio.cppm. This manner is important to use C++20 modules before the build system is ready. Also, I think it is an important tool to implement C++23's std modules. So we should support it.

clang/include/clang/AST/DeclBase.h
240

Would you like to use the term 'ModuleDiscarded'? My point is that not all unreachable declaration in modules are in GMF. And the term discard is used in standard to describe it. So it looks better.

647

It is more robust and clear to add the additional check to me. Since the constraint now lives in the comment only. If anyone goes to change it, the codes wouldn't complain.

clang/include/clang/Sema/Sema.h
2273

Yeah, my opinion is the same as above. Although it is the design, it is more semantically clear and robust to add a additional check. I am just afraid it would confuse and block other readers or contributors.

clang/lib/AST/TextNodeDumper.cpp
1622

Oh, I found __module_private__ is a keyword in clang modules. I didn't recognize it. Even in this case, I still prefer to keep the style consistently. I think users would be more comfortable to read consistent symbols. Also I think it is acceptable to keep ModuleUnreachable since it doesn't matter a lot to me.

clang/lib/Sema/SemaModule.cpp
957

Oh, D should be Orig, my bad. Yeah, I found markReachableGMFDecls is only called when Orig is not discarded. So I suggest to add the assertion to make it explicit and clear and it could avoid misuse in the future.

clang/lib/Serialization/ASTReaderDecl.cpp
635–639

It looks possible.

clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
32–33

Your word makes sense.

clang/test/Modules/cxx20-10-4-ex2.cpp
44

There is invisible symbol.

iains added a comment.Jun 8 2022, 12:55 AM

the first failure is like this:

x-h.h:
struct A {
  friend int operator+(const A &lhs, const A &rhs) {
    return 0;
  }
};

X.cpp:
module;
#include "x-h.h"
export module X;

export using ::A;

This does not mark *anything* in the GMF (x-h.h) as 'used', and so everything there is unreachable (and hence the fails).
I.e export using ::A; neither marks A as used or referenced.
I am not sure if we are supposed to special-case this - since https://eel.is/c++draft/module#global.frag-3.6 explicitly says "In this determination, it is unspecified .. whether using-declaration, ... is replaced by the declarations they name prior to this determination,
so .. not about how to proceed with this one at present;
edit: but it seems most reasonable to make it behave as if A was itself exported.

I highly recommend we should mark A here. Maybe we need other interfaces than markDeclUsed and setDeclReferenced. If we don't support this, we couldn't use modules like https://github.com/alibaba/async_simple/blob/CXX20Modules/third_party_module/asio/asio.cppm. This manner is important to use C++20 modules before the build system is ready. Also, I think it is an important tool to implement C++23's std modules. So we should support it.

Actually, after thinking some more, what seems to be wrong here is that we should be making the exported item "VisibleIfImported" .. which is not being done - I guess this was a bug already and it has been exposed by the recent changes in the module ownership processing. I will next take a look at that (and the other comments).

iains retitled this revision from [C++20][Modules] Initial implementation of GMF decl elision. to [C++20][Modules] Implementation of GMF decl elision..Jun 12 2022, 12:54 PM
iains edited the summary of this revision. (Show Details)
iains updated this revision to Diff 436245.Jun 12 2022, 12:58 PM
iains edited the summary of this revision. (Show Details)

this is a rework of the implementation.

it now passes all test-cases except one - which is one of the cases added
by D113545; that test-case now seems to produce multiple error messages for
each of the cases that it is required to (i.e. the lines that should error
do produce an error - but they also produce a number of similar or duplicated
errors).

iains marked 3 inline comments as done.Jun 12 2022, 1:16 PM

@ChuanqiXu - I changed the module ownership name to "ModuleDiscardable" - because, while we are permitted to discard these, we might choose not to (to give your better diagnostics) - but we still need to treat them as non-reachable and non-visible. Please could you examine what is happening with Modules/explicitly-specialized-template.cpp where there a multiple error messages for each (intentionally) failing line .. add the test from the std.

@rsmith
this seems like an awful lot of work that has to be done for every decl we mark used in the module purview - we cannot even short-cut ones that are not in the GMF, since the provisions of [module.global.frag] p3.5 seem to allow this to happen for indirect reaching. I wonder if I misread the std.
I am also wondering what is supposed to happen when an interface makes a type reachable, but that type is not visible to the implementation .. it seems to mean that interfaces would have to add using declarations for each such type.

clang/include/clang/AST/DeclBase.h
240

ModuleDiscardable Is better, it says we have permission to discard it (so that it cannot be used or reached) but allows for the case we elect to keep them around for better diagnostics. You might want to consider renaming the wrapper function similarly?

647

I am very concerned about the amount of work this (GMF decl elision) adds, so would like to minimise this - what I have done is to add an assert at the point we might make a change to the ownership that tests to see the decl is in the fragment we expect.

clang/lib/Sema/SemaModule.cpp
988

of course, as noted, the patch here is for comment on approach.

@ChuanqiXu - I changed the module ownership name to "ModuleDiscardable" - because, while we are permitted to discard these, we might choose not to (to give your better diagnostics) - but we still need to treat them as non-reachable and non-visible. Please could you examine what is happening with Modules/explicitly-specialized-template.cpp where there a multiple error messages for each (intentionally) failing line .. add the test from the std.

I've looked into this. The multiple duplicated diagnostic message is a legacy problem. I tried to fix it but it is hard (change it would break many other tests). To filter out the redundant duplicated diagnostic message, you could use '+1' suffix after expected-error. For example: https://github.com/llvm/llvm-project/blob/16ca490f450ea3ceaeda92addd8546967af4b2e1/clang/test/Modules/cxx-templates.cpp#L215-L232 BTW, after I filtered out the duplicated diagnostic message, I find the complains the use of foo<int> in the module purview, which is not right and should be a bug of this revision.

@rsmith
this seems like an awful lot of work that has to be done for every decl we mark used in the module purview - we cannot even short-cut ones that are not in the GMF, since the provisions of [module.global.frag] p3.5 seem to allow this to happen for indirect reaching. I wonder if I misread the std.

Yeah, and I am not sure what is better idea here. I tried to implement GMF decl elision before. My first idea is similar to your first revision. But I tried to implement indirect discarding that time. So I chose to traverse the decls in GMF until a fixed point when we handle the end of the TU. It looks bad clearly (the time complexity is not satisfying). So I didn't post it up.

I am also wondering what is supposed to happen when an interface makes a type reachable, but that type is not visible to the implementation .. it seems to mean that interfaces would have to add using declarations for each such type.

@rsmith should be the best to answer here. But I am trying to answer it. If 'the implementation' means the implementation unit, I think the type is naturally visible to the implementation unit. We could find the example in: module.interface-example5 .

clang/include/clang/AST/DeclBase.h
240

From my reading, it looks like the std says "we should discard things in following cases... although there are some cases we could make decision by ourself". So it looks like ModuleDiscarded is better than ModuleDiscardable to me.

647

How about something like:

assert (getModuleOwnershipKind() != ModuleOwnershipKind::ModuleDiscardable || getOwningModule()->isGlobalModule);
return getModuleOwnershipKind() == ModuleOwnershipKind::ModuleDiscardable;

So that we could get the clear semantics and the performance.

iains updated this revision to Diff 436378.Jun 13 2022, 7:07 AM
iains marked 2 inline comments as done.

rebased, fixed the fail for explicitly-specialized-template.cpp.

iains marked 3 inline comments as done.Jun 13 2022, 7:26 AM

@ChuanqiXu - I changed the module ownership name to "ModuleDiscardable" - because, while we are permitted to discard these, we might choose not to (to give your better diagnostics) - but we still need to treat them as non-reachable and non-visible. Please could you examine what is happening with Modules/explicitly-specialized-template.cpp where there a multiple error messages for each (intentionally) failing line .. add the test from the std.

I've looked into this. The multiple duplicated diagnostic message is a legacy problem. I tried to fix it but it is hard (change it would break many other tests). To filter out the redundant duplicated diagnostic message, you could use '+1' suffix after expected-error. For example: https://github.com/llvm/llvm-project/blob/16ca490f450ea3ceaeda92addd8546967af4b2e1/clang/test/Modules/cxx-templates.cpp#L215-L232 BTW, after I filtered out the duplicated diagnostic message,

This was not the problem in this particular case - but (for reference) is there an issue for the duplicated diagnostics? - that seems not very user-friendly, especially since C++ emits a lot of long diagnostics already.

I find the complains the use of foo<int> in the module purview, which is not right and should be a bug of this revision.

Yes, fixed with the latest upload; the provisions of [module.global.frag] p3 are quite complex, it's easy to miss one (and now the explicitly-specialized-template.cpp tase passes too)

@rsmith
this seems like an awful lot of work that has to be done for every decl we mark used in the module purview - we cannot even short-cut ones that are not in the GMF, since the provisions of [module.global.frag] p3.5 seem to allow this to happen for indirect reaching. I wonder if I misread the std.

Yeah, and I am not sure what is better idea here. I tried to implement GMF decl elision before. My first idea is similar to your first revision. But I tried to implement indirect discarding that time. So I chose to traverse the decls in GMF until a fixed point when we handle the end of the TU. It looks bad clearly (the time complexity is not satisfying). So I didn't post it up.

I am also wondering what is supposed to happen when an interface makes a type reachable, but that type is not visible to the implementation .. it seems to mean that interfaces would have to add using declarations for each such type.

@rsmith should be the best to answer here. But I am trying to answer it. If 'the implementation' means the implementation unit, I think the type is naturally visible to the implementation unit. We could find the example in: module.interface-example5 .

That is not the problem I am describing - which more like:

type.h:
typedef int MyInt;

mod-interface.cpp:
module;
#include "type.h"
export module A;

MyInt foo (); // makes MyInt reachable, but not visible.

mod-imp.cpp:
module A;

MyInt foo () { return 42; } // will not compile because the type declaration is not visible.

I am told that this is the intention .. it just seems a wee bit odd.
clang/include/clang/AST/DeclBase.h
240

I think we should observe the "as if" rule here, and say that the implementation would behave "as if" the decls were removed (but we choose to leave them, at least for now, because that improves the diagnostic you mentioned before).

The name should be helpful to programmers - if we make it "Discarded" then it will surely be confusing to a programmer to find that the decls are still present - "Discardable" says the right thing (that you should not use them because at some time they could be removed).

647

let's finalise this once we have the rest of the patch in reasonable shape and the approach is agreed to be an acceptable way forward.

@ChuanqiXu - I changed the module ownership name to "ModuleDiscardable" - because, while we are permitted to discard these, we might choose not to (to give your better diagnostics) - but we still need to treat them as non-reachable and non-visible. Please could you examine what is happening with Modules/explicitly-specialized-template.cpp where there a multiple error messages for each (intentionally) failing line .. add the test from the std.

I've looked into this. The multiple duplicated diagnostic message is a legacy problem. I tried to fix it but it is hard (change it would break many other tests). To filter out the redundant duplicated diagnostic message, you could use '+1' suffix after expected-error. For example: https://github.com/llvm/llvm-project/blob/16ca490f450ea3ceaeda92addd8546967af4b2e1/clang/test/Modules/cxx-templates.cpp#L215-L232 BTW, after I filtered out the duplicated diagnostic message,

This was not the problem in this particular case - but (for reference) is there an issue for the duplicated diagnostics? - that seems not very user-friendly, especially since C++ emits a lot of long diagnostics already.

I failed to find one and I filed one at: https://github.com/llvm/llvm-project/issues/56024. Hope this helps.

@rsmith
this seems like an awful lot of work that has to be done for every decl we mark used in the module purview - we cannot even short-cut ones that are not in the GMF, since the provisions of [module.global.frag] p3.5 seem to allow this to happen for indirect reaching. I wonder if I misread the std.

Yeah, and I am not sure what is better idea here. I tried to implement GMF decl elision before. My first idea is similar to your first revision. But I tried to implement indirect discarding that time. So I chose to traverse the decls in GMF until a fixed point when we handle the end of the TU. It looks bad clearly (the time complexity is not satisfying). So I didn't post it up.

I am also wondering what is supposed to happen when an interface makes a type reachable, but that type is not visible to the implementation .. it seems to mean that interfaces would have to add using declarations for each such type.

@rsmith should be the best to answer here. But I am trying to answer it. If 'the implementation' means the implementation unit, I think the type is naturally visible to the implementation unit. We could find the example in: module.interface-example5 .

That is not the problem I am describing - which more like:

type.h:
typedef int MyInt;

mod-interface.cpp:
module;
#include "type.h"
export module A;

MyInt foo (); // makes MyInt reachable, but not visible.

mod-imp.cpp:
module A;

MyInt foo () { return 42; } // will not compile because the type declaration is not visible.

I am told that this is the intention .. it just seems a wee bit odd.

Oh, it makes sense now. We could include "type.h" in the implementation unit so it wouldn't matter I think.

iains updated this revision to Diff 438968.Jun 22 2022, 4:12 AM
iains marked 2 inline comments as done.

reabsed onto latest version of D113545.

ChuanqiXu added inline comments.Jun 28 2022, 10:08 PM
clang/include/clang/AST/DeclBase.h
624

According to the opinion from @rsmith, the discarded declaration is private too.

iains updated this revision to Diff 440915.Jun 29 2022, 2:58 AM
iains marked an inline comment as done.

rebased after D113545 was landed and removed that as a parent.

clang/include/clang/AST/DeclBase.h
624

I guess you mean >= ... however Discardable is a stronger constraint than Private

If a decl remains marked Discardable (after the processing to determine reachable ones) that means it is both unreachable and invisible.
So it must not participate in any processing (with the one exception of diagnostic output). I would be concerned that the change you suggest above could cause a Discardable decl to be considered in merging or lookup and we would then need (maybe a lot) of logic like:

if (D->isModulePrivate() && !D->isModuleDiscardable())
   ...

I will take a look on the next iteration.

rsmith added inline comments.Jul 1 2022, 6:56 PM
clang/include/clang/Sema/Sema.h
2268–2269

These names seem too general to live directly in Sema.

2271
clang/lib/Sema/SemaModule.cpp
912–916

Do we need the special MakeVisible handling here? I would have thought that it would be sufficient for Child itself to be visible. Making the target of a using declaration visible seems like it would do the wrong thing for a case like:

module;
int f();
export module M;
namespace N { export using ::f; }
import M;
int a = f(); // should be an error, ::f should not be visible
int b = N::f(); // should be OK, UsingShadowDecl is visible

I wonder if we need any special handling here at all -- if the GMF decls are referenced by something in an export block, then I'd have expected they'll already be marked as used and that marking would have marked them as reachable.

1006–1018

Instead of storing a SeenDecls set and checking it here, is it feasible to check whether we're transitioning this declaration from discardable to retained, and only doing the work below if so?

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules. I would try to add more tests about C++20 Modules.

iains added a comment.Jul 5 2022, 12:02 AM

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

I would try to add more tests about C++20 Modules.

of course, more tests can be useful, but it would be better to try to be specific - it does not crash for any of the tests in the clang test suite.

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

Here is the crash log:

unhandled type: 0x7cda9e0 LValueReferenceType 0x7cda9e0 'const struct std::_PairT &'
`-QualType 0x7cd85f1 'const struct std::_PairT' const
  `-RecordType 0x7cd85f0 'struct std::_PairT'
    `-CXXRecord 0x7cd8540 '_PairT'
unhandled type: 0x7cdabb0 RValueReferenceType 0x7cdabb0 'struct std::_PairT &&'
`-RecordType 0x7cd85f0 'struct std::_PairT'
  `-CXXRecord 0x7cd8540 '_PairT'
unhandled type: 0x80e0400 LValueReferenceType 0x80e0400 'struct std::_PairT &'
`-RecordType 0x7cd85f0 'struct std::_PairT'
  `-CXXRecord 0x7cd8540 '_PairT'
unhandled type: 0x7c0e1a0 TemplateTypeParmType 0x7c0e1a0 '_T1' dependent depth 0 index 0
`-TemplateTypeParm 0x7c0e148 '_T1'
unhandled type: 0x7c0e220 TemplateTypeParmType 0x7c0e220 '_T2' dependent depth 0 index 1
`-TemplateTypeParm 0x7c0e1d0 '_T2'
unhandled type: 0x7c0e6a0 LValueReferenceType 0x7c0e6a0 'const pair<_T1, _T2> &' dependent
`-QualType 0x7ace031 'const pair<_T1, _T2>' const
  `-InjectedClassNameType 0x7ace030 'pair<_T1, _T2>' dependent
    `-CXXRecord 0x7c0e290 'pair'
unhandled type: 0x7c0e8e0 RValueReferenceType 0x7c0e8e0 'pair<_T1, _T2> &&' dependent
`-InjectedClassNameType 0x7ace030 'pair<_T1, _T2>' dependent
  `-CXXRecord 0x7c0e290 'pair'
clang++: /home/chuanqi.xcq/workspace.xuchuanqi/llvm-project-for-work/clang/lib/Sema/SemaModule.cpp:1084: void clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool): Assertion `T->isRecordType() && "base not a record?"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang++ -std=c++20 --precompile -fprebuilt-module-path=. coroutine.cppm -o std-coroutine.pcm -isystem ../build_libcxx/include/c++/v1 -I../build_libcxx/include/x86_64-unknown-linux-gnu/c++/v1 -nostdinc++
1.	<eof> parser at end of file
2.	../build_libcxx/include/c++/v1/__functional/hash.h:308:12: instantiating function definition 'std::__scalar_hash<std::_PairT, 2>::operator()'
3.	../build_libcxx/include/c++/v1/__functional/hash.h:93:18: instantiating function definition 'std::__murmur2_or_cityhash<unsigned long, 64>::operator()'
 #0 0x0000000001f60f90 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x0000000001f5ede4 llvm::sys::CleanupOnSignal(unsigned long) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x1f5ede4)
 #2 0x0000000001eaa468 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007ffff7fb29d0 __restore_rt sigaction.c:0:0
 #4 0x00007ffff7675f35 raise (/lib64/libc.so.6+0x3bf35)
 #5 0x00007ffff765f8d7 abort (/lib64/libc.so.6+0x258d7)
 #6 0x00007ffff765f7a7 _nl_load_domain.cold loadmsgcat.c:0:0
 #7 0x00007ffff766e536 (/lib64/libc.so.6+0x34536)
 #8 0x0000000004696390 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4696390)
 #9 0x0000000004695d87 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695d87)
#10 0x0000000004696559 clang::Sema::findGMFReachableDeclsForType(clang::Type const*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4696559)
#11 0x000000000469689b clang::Sema::findGMFReachableDeclExprs(clang::Stmt*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x469689b)
#12 0x00000000046967bd clang::Sema::findGMFReachableDeclExprs(clang::Stmt*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x46967bd)
#13 0x00000000046967bd clang::Sema::findGMFReachableDeclExprs(clang::Stmt*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x46967bd)
#14 0x00000000046967bd clang::Sema::findGMFReachableDeclExprs(clang::Stmt*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x46967bd)
#15 0x00000000046967bd clang::Sema::findGMFReachableDeclExprs(clang::Stmt*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x46967bd)
#16 0x00000000046958c2 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x46958c2)
#17 0x0000000004695749 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695749)
#18 0x0000000004695e75 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695e75)
#19 0x0000000004695e75 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695e75)
#20 0x0000000004695d87 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695d87)
#21 0x0000000004695b71 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4695b71)
#22 0x000000000469653b clang::Sema::findGMFReachableDeclsForType(clang::Type const*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x469653b)
#23 0x0000000004696114 clang::Sema::markGMFDeclsReachableFrom(clang::Decl*, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4696114)
#24 0x0000000004a15993 clang::Sema::BuildVariableInstantiation(clang::VarDecl*, clang::VarDecl*, clang::MultiLevelTemplateArgumentList const&, llvm::SmallVector<clang::Sema::LateInstantiatedAttribute, 16u>*, clang::DeclContext*, clang::LocalInstantiationScope*, bool, clang::VarTemplateSpecializationDecl*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a15993)
#25 0x0000000004a1f3be clang::TemplateDeclInstantiator::VisitVarDecl(clang::VarDecl*, bool, llvm::ArrayRef<clang::BindingDecl*>*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a1f3be)
#26 0x0000000004a241f4 void llvm::function_ref<void ()>::callback_fn<clang::Sema::SubstDecl(clang::Decl*, clang::DeclContext*, clang::MultiLevelTemplateArgumentList const&)::'lambda'()>(long) SemaTemplateInstantiateDecl.cpp:0:0
#27 0x000000000411d7a1 clang::Sema::runWithSufficientStackSpace(clang::SourceLocation, llvm::function_ref<void ()>) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x411d7a1)
#28 0x00000000049d7042 clang::Sema::SubstDecl(clang::Decl*, clang::DeclContext*, clang::MultiLevelTemplateArgumentList const&) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x49d7042)
#29 0x000000000498fb15 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformDeclStmt(clang::DeclStmt*) SemaTemplateInstantiate.cpp:0:0
#30 0x00000000049c9479 clang::TreeTransform<(anonymous namespace)::TemplateInstantiator>::TransformCompoundStmt(clang::CompoundStmt*, bool) SemaTemplateInstantiate.cpp:0:0
#31 0x00000000049cdfa5 clang::Sema::SubstStmt(clang::Stmt*, clang::MultiLevelTemplateArgumentList const&) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x49cdfa5)
#32 0x0000000004a197af clang::Sema::InstantiateFunctionDefinition(clang::SourceLocation, clang::FunctionDecl*, bool, bool, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a197af)
#33 0x0000000004a17c0f clang::Sema::PerformPendingInstantiations(bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a17c0f)
#34 0x0000000004a19a3c clang::Sema::InstantiateFunctionDefinition(clang::SourceLocation, clang::FunctionDecl*, bool, bool, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a19a3c)
#35 0x0000000004a17c0f clang::Sema::PerformPendingInstantiations(bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x4a17c0f)
#36 0x000000000413b45a clang::Sema::ActOnEndOfTranslationUnitFragment(clang::Sema::TUFragmentKind) (.part.0) Sema.cpp:0:0
#37 0x000000000413bba6 clang::Sema::ActOnEndOfTranslationUnit() (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x413bba6)
#38 0x0000000003ff6448 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x3ff6448)
#39 0x0000000003feae7a clang::ParseAST(clang::Sema&, bool, bool) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x3feae7a)
#40 0x0000000002a66409 clang::FrontendAction::Execute() (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x2a66409)
#41 0x00000000029f2ec6 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x29f2ec6)
#42 0x0000000002b40433 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x2b40433)
#43 0x0000000000a0edca cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0xa0edca)
#44 0x0000000000a08298 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#45 0x000000000286cd75 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#46 0x0000000001eaa5d4 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x1eaa5d4)
#47 0x000000000286d46f clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#48 0x0000000002838d52 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x2838d52)
#49 0x000000000283983d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x283983d)
#50 0x00000000028430dc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0x28430dc)
#51 0x0000000000a0d343 clang_main(int, char**) (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0xa0d343)
#52 0x00007ffff7661193 __libc_start_main (/lib64/libc.so.6+0x27193)
#53 0x0000000000a07ebe _start (/disk2/workspace.xuchuanqi/llvm-project-for-work/build/bin/clang-14+0xa07ebe)
clang-14: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 15.0.0 (git@github.com:llvm/llvm-project.git 487d8ba3f33b127a7996ab9d3ba9c056e5e5b8c2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/chuanqi.xcq/workspace.xuchuanqi/llvm-project-for-work/build/bin
clang-14: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-14: note: diagnostic msg: /tmp/coroutine-73dc76.cppm
clang-14: note: diagnostic msg: /tmp/coroutine-73dc76.sh
clang-14: note: diagnostic msg: 

********************

I would try to add more tests about C++20 Modules.

of course, more tests can be useful, but it would be better to try to be specific - it does not crash for any of the tests in the clang test suite.

Yeah, small test is helpful for debugging and developing. But large examples are an important part for testing. Although there are many tests in clang test suite, we often find breaking changes in new patches. So it would be better to have both.

iains updated this revision to Diff 442849.Jul 7 2022, 3:48 AM
iains marked 25 inline comments as done.

rebased, addressed review comments, added another test.

iains added a comment.Jul 7 2022, 3:49 AM

@rsmith, @ChuanqiXu apologies for the multiple revisions, this has turned out to be much more involved than I imagined from the standard's text.

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

Here is the crash log:

this code now compiles without error,

clang/include/clang/AST/DeclBase.h
624

I did try this and there are a number of regressions - when I looked into these there is some interaction with the changes made in D113545, so I think we should make these changes in a follow-one patch to avoid having two purposes in this one.

clang/include/clang/Sema/Sema.h
2268–2269

(revised we only need to track the type pointers)

On the basis that the name is poor, for its intended purpose, I revised.

2273

this has now been revised and a check is applied before any change is made to discardable decls.

clang/lib/AST/TextNodeDumper.cpp
1622

OK we now have more fine-grained output for the module ownership in the dumps which allows specific tests to be constructed. At present, the naming is as per decl.h (with the single exception of module_private which has a user-facing representation).

clang/lib/Sema/Sema.cpp
1130 ↗(On Diff #433061)

I think the comment refers to an older version of the changes.

1131–1133 ↗(On Diff #433061)

this is now implemented differently,

1138–1141 ↗(On Diff #433061)

this is now implemented differently

1146 ↗(On Diff #433061)

this is now implemented differently

1150 ↗(On Diff #433061)

this is now implemented differently

clang/lib/Sema/SemaModule.cpp
912–916

Do we need the special MakeVisible handling here? I would have thought that it would be sufficient for Child itself to be visible. Making the target of a using declaration visible seems like it would do the wrong thing for a case like:

module;
int f();
export module M;
namespace N { export using ::f; }
import M;
int a = f(); // should be an error, ::f should not be visible
int b = N::f(); // should be OK, UsingShadowDecl is visible

fixed, and added a test case to cover this.

I wonder if we need any special handling here at all -- if the GMF decls are referenced by something in an export block, then I'd have expected they'll already be marked as used and that marking would have marked them as reachable.

Indeed. this was over cautious, However, the one case that we have to cover is the target of a using decl - those are neither marked used nor referenced.

957

in the revised code, we have an assert.

1006–1018

I had that as my original implementation, which I revised to do the same kind of thing as the type pointers. However, the check for isModuleDiscardable is cheaper, indeed.

(JFTR, I also tried an implementation with a decl visitor, but it did not seem to be any less complex or really any shorter code-wise)

I cannot see how we can avoid tracking the types.

clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp
32–33

in the current implementation, we mark the decls but do not yet actually remove them - so that the better diagnostic should still be available.

@rsmith, @ChuanqiXu apologies for the multiple revisions, this has turned out to be much more involved than I imagined from the standard's text.

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

Here is the crash log:

this code now compiles without error,

Thanks for looking into it!

My personal plan for this revision is to review the details after we add more large tests (at least we have a more complete std modules implementation and I am trying for it. But I find another bug now). So we might need to wait for a while for this patch. How do you think about it?

iains added a comment.Jul 8 2022, 12:51 AM

@rsmith, @ChuanqiXu apologies for the multiple revisions, this has turned out to be much more involved than I imagined from the standard's text.

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

Here is the crash log:

this code now compiles without error,

Thanks for looking into it!

My personal plan for this revision is to review the details after we add more large tests (at least we have a more complete std modules implementation and I am trying for it. But I find another bug now). So we might need to wait for a while for this patch. How do you think about it?

Well, the difficulty there is that "add more large tests" is not a very specific objective.
I will be first to say that we can tell that the implementation here is necessary, but we cannot tell if it is sufficient - however, IMO we need to find a more definite way to make progress.

@rsmith, @ChuanqiXu apologies for the multiple revisions, this has turned out to be much more involved than I imagined from the standard's text.

BTW, after I applied the patch, the compiler crashes at https://github.com/ChuanqiXu9/stdmodules.

That link points to a project - is there (say) a gist of the crash information?

Here is the crash log:

this code now compiles without error,

Thanks for looking into it!

My personal plan for this revision is to review the details after we add more large tests (at least we have a more complete std modules implementation and I am trying for it. But I find another bug now). So we might need to wait for a while for this patch. How do you think about it?

Well, the difficulty there is that "add more large tests" is not a very specific objective.
I will be first to say that we can tell that the implementation here is necessary, but we cannot tell if it is sufficient - however, IMO we need to find a more definite way to make progress.

Yeah, clearness matters. Let me try to state my idea clearly:
(1) Make std modules sufficient to be able to cover at least <thread>, <vector>, <string>, <algorithm>, <memory> and <atomic>. So that we could run the tests from cppreference at least.
(2) Make https://github.com/alibaba/async_simple/tree/CXX20Modules compilable by clang trunk and std modules

Then we could use (1) or (2) to check some relatively large patches (including this one but not limited). In this way, I feel like we have more confident to say the C++20 Modules in Clang is workable. (Tests is never enough. I know). I would like to see how far I can go in the next month. So I would like to give a more specific answer what test ((1) or (2) or others) we need to use. I think one month might not be super long time in this area. How do you think about it?


BTW, I've sent two patches to enhance the usability: D128974 and D129068. And I am fighting with the other problems.

iains updated this revision to Diff 447094.Jul 23 2022, 11:46 AM

rebase, update testcases for upstream changes.

urnathan resigned from this revision.Aug 16 2022, 12:20 PM
iains updated this revision to Diff 462657.Sat, Sep 24, 4:31 AM

rebased and reworked.

The version here has now been tested to consume all of the libc++ headers including
those in experimental and ext.

Herald added a project: Restricted Project. · View Herald TranscriptSat, Sep 24, 4:31 AM