The best semantics I can come up with (based on discussions with
Richard) for inline asm and namespace-scope internal-linkage variables
in modular headers is to emit them into all users and never into modular
objects.
The reason for this is that such entities might create important symbols
(like the iostreams global initializer) that can only be produced if the
submodule is used. (This doesn't strictly apply to inline asm as such,
which I think is currently emitted if the module is used at all,
regardless of submodules.) This does mean that if any modular code does
use these entities it will be ill-formed, no diagnostic required (a link
error, generally) for now. It could be diagnosed at parse/module-creation
time.
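
To make the two kinds of entity concrete, here is a minimal sketch of what such a modular header might contain. The names Init, init_guard and initialized_copies are hypothetical stand-ins (Init plays the role of something like the iostreams initializer), and none of this is taken from a real header.

```cpp
#include <cassert>

// Hypothetical stand-in for a type like the iostreams initializer: its
// constructor has a side effect that users of the header rely on.
struct Init {
  Init() { ++instances(); }
  static int &instances() {
    static int n = 0; // function-local static avoids init-order problems
    return n;
  }
};

// --- what a modular header might contain (illustrative only) ---

// Namespace-scope variable with internal linkage: every translation unit
// that includes the header gets its own copy and runs its own initializer.
static Init init_guard;

// File-scope inline asm. The empty string emits no instructions; it only
// marks where such a declaration would sit.
#if defined(__GNUC__)
asm("");
#endif

// Helper so the effect of the initializer can be observed.
inline int initialized_copies() { return Init::instances(); }
```

With the proposed semantics, both init_guard and the asm declaration would be emitted into each user of the header and not into the modular object, so the initializer's side effect still happens in every user.
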
This change has a bit of refactoring to allow this to work for inline
asm (and to work more efficiently for internal linkage namespace-scope
variables): it makes MODULAR_CODEGEN_DECLS and
EAGERLY_DESERIALIZED_DECLS distinct. Not only does this allow the
modular codegen not to needlessly deserialize and then ignore the
eagerly deserialized decls, it also allows there to be eagerly
deserialized decls that modular codegen never sees, such as inline asm.
(If this isn't the right design, i.e. if the absence of an entry in
the eagerly deserialized or modular codegen decls shouldn't produce
different behavior and should only be an optimization, I'm happy to make
this work differently/harder by making sure inline asm can be filtered
even if it were loaded.)
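
As a rough illustration of the split, the sketch below models the two records as two separate lists and sorts decls into them. It is a simplified toy, not Clang's actual writer: DeclKind, Decl, ModuleFileModel and classify are hypothetical names, and the rule that other definitions go only into the modular list is an assumption made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model; not Clang's actual data structures.
enum class DeclKind { InlineAsm, InternalLinkageVar, ExternalDefinition };

struct Decl {
  int id;
  DeclKind kind;
};

struct ModuleFileModel {
  std::vector<int> eagerly_deserialized; // stands in for EAGERLY_DESERIALIZED_DECLS
  std::vector<int> modular_codegen;      // stands in for MODULAR_CODEGEN_DECLS
};

ModuleFileModel classify(const std::vector<Decl> &decls) {
  ModuleFileModel file;
  for (const Decl &d : decls) {
    switch (d.kind) {
    case DeclKind::InlineAsm:
    case DeclKind::InternalLinkageVar:
      // Emitted into every user, never into the modular object.
      file.eagerly_deserialized.push_back(d.id);
      break;
    case DeclKind::ExternalDefinition:
      // Assumption for this sketch: owned by the modular object only.
      file.modular_codegen.push_back(d.id);
      break;
    }
  }
  return file;
}
```

Because the lists are disjoint here, a consumer that walks only modular_codegen never has to load and then discard the inline asm or the internal linkage variables.
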
This change does a little work to preserve a scenario that it probably
doesn't need to: 'classic' AST codegen. There are a few tests
(test/Frontend/ast-*) that verify that the codegen of a serialized AST
file (without -fmodules-codegen) produces basically the same code as if
the AST hadn't been serialized. Is this important? Should we kill off
those tests and not worry about it? (This behavior is preserved by still
respecting EAGERLY_DESERIALIZED_DECLS when deserializing an AST file
that doesn't contain MODULAR_CODEGEN_DECLS.)
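
The fallback for 'classic' AST codegen can be sketched as a single choice made when generating code directly from a serialized file. This is a toy model under stated assumptions, not the reader's real logic: AstFileModel, has_modular_codegen and decls_for_codegen are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a serialized AST file; not Clang's actual layout.
struct AstFileModel {
  std::vector<int> eagerly_deserialized; // stands in for EAGERLY_DESERIALIZED_DECLS
  bool has_modular_codegen;              // false for a 'classic' AST file
  std::vector<int> modular_codegen;      // stands in for MODULAR_CODEGEN_DECLS
};

// Decls to hand to codegen when generating code directly from the file.
std::vector<int> decls_for_codegen(const AstFileModel &file) {
  if (file.has_modular_codegen) {
    // Modular codegen: only the decls the modular object owns.
    return file.modular_codegen;
  }
  // Classic AST codegen: keep respecting the eagerly deserialized decls so
  // the output matches what a non-serialized compile would have produced.
  return file.eagerly_deserialized;
}
```
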