This is an archive of the discontinued LLVM Phabricator instance.

[WIP][modules] Avoid deserializing Decls from hidden (sub)modules.
AbandonedPublic

Authored by vsapsai on Nov 22 2021, 8:40 PM.

Details

Reviewers
None
Summary

The test case is for Objective-C but it is likely other languages like
C++ are affected as well, just haven't created appropriate test cases.

The test case fails with the following error

clang/test/Modules/Output/ambiguous-enum-lookup.m.tmp/test.m:7:11: error: reference to 'kEnumValue' is ambiguous
    s.x = kEnumValue;
          ^
clang/test/Modules/Output/ambiguous-enum-lookup.m.tmp/Frameworks/NonModular.framework/Headers/NonModular.h:8:5: note: candidate found by name lookup is 'kEnumValue'
    kEnumValue = 1,
    ^
clang/test/Modules/Output/ambiguous-enum-lookup.m.tmp/Frameworks/NonModular.framework/Headers/NonModular.h:8:5: note: candidate found by name lookup is 'kEnumValue'
    kEnumValue = 1,
    ^

It happens because we have 2 EnumConstantDecl for 'kEnumValue' - one from
PiecewiseVisible module, another non-modular. And the the name lookup is
ambiguous because these decls have different canonical decls.
Non-modular decl believes it is the canonical decl because that's the
behavior for Mergeable and in the source code non-modular decl comes
before other EnumConstantDecl. Modular EnumConstantDecl has no intention
of being canonical decl but it is deserialized *before* non-modular decl
is created, so as the only decl it becomes the canonical decl.

The idea for the fix is to skip deserialization of decls from hidden
(sub)modules, thus allowing to establish correct canonical decl. This is
a proof of concept, the implementation isn't optimal, and there are
failing tests. I mostly wanted to share the approach to discuss it.

Another high-level approach I've considered is to enhance handling decls
from hidden modules and to delay connecting them with redecl chains and
canonical decls till the moment they become visible. I've decided to go
with "avoid deserialization" approach because this way we should be
doing less work and it seems less error-prone compared to juggling with
existing decls.

rdar://82908206

Diff Detail

Event Timeline

vsapsai created this revision.Nov 22 2021, 8:40 PM
vsapsai requested review of this revision.Nov 22 2021, 8:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2021, 8:40 PM
vsapsai added a subscriber: Restricted Project.Nov 22 2021, 8:41 PM
vsapsai updated this revision to Diff 389374.Nov 23 2021, 7:25 PM

Fix some failing tests to demonstrate the magnitued of changes better.

In some places check if a decl is deserialized successfully as it's not guaranteed.
Also tried to make sure a module is marked as visible *before* we try to deserialize its decls.

vsapsai updated this revision to Diff 389379.Nov 23 2021, 8:00 PM

Mark identifiers as out-of-date not only on loading a new module but on making a (sub)module visible too.

vsapsai updated this revision to Diff 389630.Nov 24 2021, 5:01 PM

Switch from checking module visibility to checking if a module is imported.

Required to handle not re-exported modules - allow to deserialize decls from
them even if the modules aren't visible.

vsapsai abandoned this revision.Nov 25 2021, 5:12 PM

After more investigation and comparing different entities (like structs, enums) this approach seems to be too involved and probably wrong. A better choice seems to be following the same direction as in ⦗Modules⦘ Implement ODR-like semantics for tag types in C/ObjC than inventing something new. Specifically, I plan to look into discrepancies between EnumDecl and EnumConstantDecl as we don't have problems with enums themselves, only with the constants.