This is an archive of the discontinued LLVM Phabricator instance.

PR60985: Fix merging of lambda closure types across modules.
ClosedPublic

Authored by rsmith on Mar 9 2023, 4:06 PM.

Details

Summary

Previously, distinct lambdas would get merged, and multiple definitions
of the same lambda would not get merged, because we attempted to
identify lambdas by their ordinal position within their lexical
DeclContext. This failed for lambdas within namespace-scope variables
and within variable templates, where the lexical position in the context
containing the variable didn't uniquely identify the lambda.

In this patch, we instead identify lambda closure types by index within
their context declaration, which does uniquely identify them in a way
that's consistent across modules.

This change causes a deserialization cycle between the type of a
variable with deduced type and a lambda appearing as the initializer of
the variable -- reading the variable's type requires reading and merging
the lambda, and reading the lambda requires reading and merging the
variable. This is addressed by deferring loading the deduced type of a
variable until after we finish recursive deserialization.

This also exposes a pre-existing subtle issue where loading a
variable declaration would trigger immediate loading of its initializer,
which could recursively refer back to properties of the variable. This
particularly causes problems if the initializer contains a
lambda-expression, but can be problematic in general. That is addressed
by switching to lazily loading the initializers of variables rather than
always loading them with the variable declaration. As well as fixing a
deserialization cycle, that should improve laziness of deserialization
in general.

LambdaDefinitionData had 63 spare bits in it, presumably caused by an
off-by-one-error in some previous change. This change claims 32 of those bits
as a counter for the lambda within its context. We could probably move the
numbering to separate storage, like we do for the device-side mangling number,
to optimize the likely-common case where all three numbers (host-side mangling
number, device-side mangling number, and index within the context declaration)
are zero, but that's not done in this change.

Fixes #60985.

Diff Detail

Event Timeline

rsmith created this revision.Mar 9 2023, 4:06 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: martong. · View Herald Transcript
rsmith requested review of this revision.Mar 9 2023, 4:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2023, 4:06 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
shafik added a reviewer: Restricted Project.Mar 9 2023, 10:37 PM
rsmith updated this revision to Diff 504296.Mar 10 2023, 3:47 PM
  • Add some missing recursive deserialization guards.

Thank you for working on this Richard, that is a pretty subtle issue! I'm still wrapping my head around the changes (which mostly look reasonable to me), but I did have a few questions I ran into.

clang/include/clang/AST/DeclCXX.h
1787–1789

My kingdom for a strong typedef so we can differentiate between the various indices here within the type system. (It probably isn't worth the extra effort, but I do have some slight concerns about how easy it is to accidentally pass the wrong number.)

clang/include/clang/AST/ExternalASTSource.h
387

I think this helps make it clear that the function is only being called for its side effects.

clang/include/clang/AST/MangleNumberingContext.h
65

Does the index 0 mean "perhaps not-yet-assigned" when deserializing? Assuming I have that correct, perhaps we want to return ++LambdaIndex instead?

clang/lib/AST/ASTContext.cpp
6766–6767

Do we have to worry about the case where this function is not called during deserialization, but some other time when the variable had an error regarding its type? Or do we recover in that situation (usually with an int type) sufficiently that this shouldn't result in surprises?

clang/lib/Serialization/ASTReaderDecl.cpp
574–576

Do we need to do anything for structured bindings?

873–876
1604–1608
1966–1970
clang/lib/Serialization/ASTWriter.cpp
5330
clang/lib/Serialization/ASTWriterDecl.cpp
299–301
clang/test/Modules/lambda-in-variable.cpp
95

Should we add a test for structured bindings as well?

rsmith updated this revision to Diff 505240.Mar 14 2023, 1:20 PM
rsmith marked 6 inline comments as done.
  • Address review comments.
rsmith added inline comments.Mar 14 2023, 1:25 PM
clang/include/clang/AST/DeclCXX.h
1787–1789

I've moved Sema::LambdaNumbering here and used it more broadly, which should help us avoid errors from getting the argument order wrong.

clang/include/clang/AST/MangleNumberingContext.h
65

0 has no special meaning. Every lambda that we parse locally gets an index assigned when we give it a context declaration, and every lambda that we import has this field filled in when the struct containing the field is loaded. There's no window where the lambda has a context declaration but doesn't have a numbering within that context.

clang/lib/AST/ASTContext.cpp
6766–6767

The type being null is a violation of AST invariants, so that should never be observed here except while deserializing.

[It's unfortunate that this function got moved out of the AST reader. That's inviting bugs, because deserialization can't rely on AST invariants holding, and in particular can't call arbitrary AST functions. I've added some words of caution at the start of this function at least, but we should look at whether we can move this back into the AST reader. It looks like there aren't many dependencies on it and they're only using a very small part of the functionality here.]

clang/lib/Serialization/ASTReaderDecl.cpp
574–576

Hmm. Great question ;-)

So, the type of a BindingDecl can involve a lambda closure type, and the BindingDecl will be deserialized with the DecompositionDecl, before we had a chance to merge the lambda. Eg:

template<typename T> struct wrap {
  T v;
};
inline auto [x] = wrap{[]{}};

... won't merge properly across modules.

This is probably not hard to fix -- I think we can defer initializing the mapping from a DecompositionDecl to its BindingDecls until we've otherwise finished deserialization, and maybe we could make the BindingDecl pointers in a DecompositionDecl be lazy pointers for good measure. But this patch is already pretty big; mind if I do that as a follow-up change?

shafik added inline comments.Mar 14 2023, 9:04 PM
clang/lib/Serialization/ASTReader.cpp
7370

Curious, why do we need this here and below.

clang/lib/Serialization/ASTReaderDecl.cpp
2192–2193
aaron.ballman added inline comments.Mar 15 2023, 6:36 AM
clang/include/clang/AST/DeclCXX.h
1787–1789

Great idea, thank you!

clang/include/clang/AST/MangleNumberingContext.h
65

Thank you for the clarification!

clang/lib/AST/ASTContext.cpp
6766–6767

The type being null is a violation of AST invariants, so that should never be observed here except while deserializing.

Okay, I thought that was the case, but I think there's one more time when we could observe it, and that's when calling this function from a debugger, but, that's YOLO mode and nothing we need to worry about here.

Good point about moving this back into the AST reader, I had not realized how fragile this API was.

clang/lib/Serialization/ASTReaderDecl.cpp
574–576

But this patch is already pretty big; mind if I do that as a follow-up change?

Totally fine to do in a follow-up, thank you!

rsmith updated this revision to Diff 505631.Mar 15 2023, 2:44 PM
rsmith marked an inline comment as done.
  • Add = to parameter name comment.
clang/lib/Serialization/ASTReader.cpp
7370

I'm actually not sure how we got away with not having this before. This was probably always broken but just rarely caused problems?

Without this, the following can happen:

  1. Deserializing a list of CXXCtorInitializers recursively triggers some other deserialization step, that does create a Deserializing object (for example, we deserialize a new type).
  2. That other deserialization step finishes, and the ~Deserializing destructor notices it's the last one, so performs pending actions.
  3. Those pending actions perform an ODR checking step (or something else, it doesn't matter what -- the point is that during the post-deserialization cleanup we do things that assume we're not deserializing any more). Maybe there's a static data member for which we loaded multiple definitions from different modules.
  4. That step triggers loading a variable's initializer, which calls GetExternalDeclStmt, which moves around the DeclsCursor and doesn't put it back when it's done. (That should be fine, because we're supposed to be done with recursive deserialization at this point -- and GetExternalDeclStmt asserts that. But the assert doesn't fire because there's no Deserializing object here.)
  5. ~Deserializing finishes and we continue loading CXXCtorInitializers, but we crash because the DeclsCursor has been moved to some random other place.

Broadly, we should always have a Deserializing object whenever we're doing any kind of recursive deserialization, in order to delay any pending cleanup actions (that assume we're not in the middle of deserialization) until we're done.

Checking: are the libc++ precommit CI failures related to these changes? It's a modules-specific build config, so I figured it's worth double-checking.

Checking: are the libc++ precommit CI failures related to these changes? It's a modules-specific build config, so I figured it's worth double-checking.

I took a quick glance when I sent this out and thought it was just a change in Clang's diagnostic spelling that libc++ hadn't caught up with, but looking again it seems like there's something more serious happening there; I'll investigate.

Checking: are the libc++ precommit CI failures related to these changes? It's a modules-specific build config, so I figured it's worth double-checking.

I took a quick glance when I sent this out and thought it was just a change in Clang's diagnostic spelling that libc++ hadn't caught up with, but looking again it seems like there's something more serious happening there; I'll investigate.

I tracked this down, and it's an unrelated modules bug. I'll be sending out a fix for that shortly.

aaron.ballman accepted this revision.Mar 27 2023, 8:51 AM

The changes here LGTM, but I'd appreciate it if a second set of eyes double-checked whether I missed anything, given the complexity of the changes. (So wait a day or so before landing in case other reviewers want to chime in.)

This revision is now accepted and ready to land.Mar 27 2023, 8:51 AM
This revision was landed with ongoing or failed builds.Mar 30 2023, 2:23 PM
This revision was automatically updated to reflect the committed changes.