Page MenuHomePhabricator

[Modules] Handle tag types and complain about bad merges in C/Objective-C mode
Needs ReviewPublic

Authored by bruno on Dec 19 2019, 4:29 PM.

Details

Summary

Take struct Z {...} defined differently and imported from both modules
X and Y. While in C/ObjC mode, clang used to pick one of the definitions
and ignore the other even though they might not be structurally
equivalent.

  • Instead of using the structural equivalence code for checking compatibility, use the ODR hash mechanism. This should significantly speed up the checks.
  • Hook the necessary name lookup bits to allow this logic in C.
  • Teach diagnoseOdrViolations to check the differences.

    Instead of silently compiling, clang now emits:
In module 'Y' imported from t.m:2:
./y.h:3:10: error: 'Z::m' from module 'Y' is not present in definition
of 'struct Z' in module 'X'
  double m;
         ^
./x.h:3:7: note: declaration of 'm' does not match
  int m;
      ^

rdar://problem/56764293

Event Timeline

bruno created this revision.Dec 19 2019, 4:29 PM
bruno edited the summary of this revision. (Show Details)Dec 19 2019, 4:30 PM
jdoerfert resigned from this revision.Dec 19 2019, 4:53 PM
martong added inline comments.Dec 20 2019, 2:28 AM
clang/lib/Serialization/ASTReader.cpp
9296

Would it be possible to check the structural non-equivalency with ODRHash?
I wonder if the ODRHash of the two Decls are different in this case or not.
ASTStructuralEquivalency and ODRHash seems to serve a very similar use case, however, AFAIK in the rest of the ASTReader code ODRHash is used exclusively.

(On a long term, I am thinking about that perhaps ASTImporter could use ODRHash instead of ASTStructuralEquivalency.)

bruno updated this revision to Diff 236964.Jan 8 2020, 9:50 PM

Change the approach: handle type merging for tag types using the ODRHash mechanism

bruno marked an inline comment as done.Jan 8 2020, 9:53 PM
clang/lib/Serialization/ASTReader.cpp
9296

Great idea. Updated the patch with another approach!

bruno edited the summary of this revision. (Show Details)Jan 8 2020, 9:54 PM
bruno added a reviewer: rtrieu.
bruno updated this revision to Diff 236965.Jan 8 2020, 9:58 PM

Remove some FIXMEs that are now done.

rtrieu added inline comments.Jan 9 2020, 6:49 PM
clang/lib/Serialization/ASTReader.cpp
11110

Is this just a copy of the other loop? That's a lot of code duplication.

bruno marked an inline comment as done.Jan 10 2020, 9:50 AM
bruno added inline comments.
clang/lib/Serialization/ASTReader.cpp
11110

There's a lot of copy but a lot of CXXRecord specific logic got trimmed out and there are small differences in the approach, so a direct refactor of it won't work, but there's certainly more small pieces that can be factored out, will follow up with that.

bruno updated this revision to Diff 237447.Jan 10 2020, 3:34 PM

Updated the patch to refactor mode code.

  • Move off everything that can be reused for RecordDecl.
  • Early exist when not C++.

Once the patch gets approved the idea is to upstream some of the refactoring
first so its easier to spot the actual non NFC part.

rtrieu added inline comments.Jan 10 2020, 6:16 PM
clang/lib/Serialization/ASTReader.cpp
11110

I like this refactoring much better. Probably more refactoring could be done to this function since it's so long, but that's an issue for another time.

Think I'll need to make another review pass but here are my comments so far:

  • Why are you adding ODR hash support for RecordDecl and not TagDecl? We already have support for EnumDecl, so TagDecl seems like a good candidate to cover both. Honestly, I don't know if it is possible or a good idea but it looks plausible enough to consider.
  • Are anonymous structs working? Worth adding test cases.
  • Are unions working? Didn't notice any code specifically for them but RecordDecl covers both structs and unions, so they should be working and we need to test that.
  • Few testing additions. These cases might be already covered or might be low value, so take these suggestions with a grain of salt:
    • test self-referential structs like struct Node { struct Node *next; };
    • test comparing structs and forward declarations, e.g., struct S; and struct S { ... };, and another couple struct S { ... }; and struct S; struct S { ... }; The motivation is to make sure we aren't stumped when we cannot find struct definition or when the definition is in unexpected place.
clang/lib/AST/Decl.cpp
4511

To be consistent with the existing code, it is better to call it as setHasODRHash(true)

Heads up in case it affects you refactoring work:
While looking through this code, I found a bug I previously made. I fixed it with a small change, but that lies in the middle of your refactoring during FieldDecl handling. The change is here:

https://reviews.llvm.org/rGa60e8927297005898b10a46300d929ba5cf7833c

bruno added a comment.Fri, Jan 31, 5:46 PM
  • Why are you adding ODR hash support for RecordDecl and not TagDecl? We already have support for EnumDecl, so TagDecl seems like a good candidate to cover both. Honestly, I don't know if it is possible or a good idea but it looks plausible enough to consider.

The reason is that in C++ ODR diagnostics for TagDecl are handled as part of CXXRecordDecl, I didn't want add a path in TagDecl for C/ObjC only.

  • Are anonymous structs working? Worth adding test cases.

They are at the top level but not nested ones, I fixed that and should upload a new patch with that next.

  • Are unions working? Didn't notice any code specifically for them but RecordDecl covers both structs and unions, so they should be working and we need to test that.

Done in upcoming patch.

  • Few testing additions. These cases might be already covered or might be low value, so take these suggestions with a grain of salt:
    • test self-referential structs like struct Node { struct Node *next; };
    • test comparing structs and forward declarations, e.g., struct S; and struct S { ... };, and another couple struct S { ... }; and struct S; struct S { ... }; The motivation is to make sure we aren't stumped when we cannot find struct definition or when the definition is in unexpected place.

Ditto.

Heads up in case it affects you refactoring work:
While looking through this code, I found a bug I previously made. I fixed it with a small change, but that lies in the middle of your refactoring during FieldDecl handling. The change is here:

https://reviews.llvm.org/rGa60e8927297005898b10a46300d929ba5cf7833c

Thanks for the heads up @rtrieu, I end up including that as part of rG90f58eaeff5f

bruno updated this revision to Diff 241855.Fri, Jan 31, 5:49 PM
  • Update patch after Volodymyr's review.
  • Refactor bits done as part of rG90f58eaeff5f, update it to remove that part.
rsmith added inline comments.Tue, Feb 4, 6:20 PM
clang/include/clang/AST/Decl.h
4009–4010

We shouldn't store this here; this will make all CXXRecordDecls larger to store a field they will never look at.

We have 27 spare trailing bits in RecordDeclBitfields (28 if you don't add the HasODRHash bit and use 0 to mean "hash not computed"). Can we store this there instead of here?

clang/lib/AST/ODRHash.cpp
480–481

This is, at least in principle, not reliable. After definition merging, we could have picked a different definition of this record type as "the" definition. (It's probably not straightforward to get this to happen in practice, and maybe not even possible, but I'm not certain of that.)

Maybe we could add to TagDecl something like

bool isThisDeclarationADemotedDefinition() const {
  return !isThisDeclarationADefinition() && BraceRange.isValid();
}

and then check isThisDeclarationADefinition() || isThisDeclarationADemotedDefinition() here? (I'm not sure we always update BraceRange properly for demoted definitions either, so maybe that's not quite right.)

clang/lib/Serialization/ASTReader.cpp
9579–9581

I don't understand why this assertion would be correct if the above branch can ever be taken. It's possible for two different types to have the same hash, and it looks like we'll assert here if that happens. Should we be branching on hasSameType above instead of comparing hashes?

bruno marked 3 inline comments as done.Thu, Feb 13, 8:00 AM

Thanks for taking a look Richard, comments inline.

clang/include/clang/AST/Decl.h
4009–4010

Makes sense!

clang/lib/AST/ODRHash.cpp
480–481

There are two things I realized while looking at this:

  • What I really intended here was to hash underlying anonymous structs/unions, not just any underlying RecordDecl. Will change the logic in ODRHash::AddRecordDecl do reflect that properly.
  • This patch was skipping RecordDecls without definitions during ODR diagnostics, I've changed to instead decide at merge time if we want to register those Decls for later diagnose, and the new rule is clang skips RecordDecls that disagree on whether they have a definition. This should still allow clang to diagnose declarations without defintions that have different attributes (which I intend to add for specific attributes in future patches).

Will update the patch to reflect that. WDYT?

clang/lib/Serialization/ASTReader.cpp
9579–9581

This is trying to cover two things. The first one is to handle nested anonymous unions, which might have the same type, but underlying mismatching fields:

#if defined(FIRST)
struct AU {
  union {
    int a;
  };
};
#elif defined(SECOND)
struct AU {
  union {
    char a;
  };
};
#else
struct AU au;

The second is to allow for @interfaces (downstream patches at the moment) to reuse logic to diagnose fields in ODRDiagField without having to add its own check for the underlying types. Does that makes sense?

bruno updated this revision to Diff 244435.Thu, Feb 13, 8:02 AM

Address Richard's review.

rsmith added inline comments.Tue, Feb 25, 9:47 AM
clang/lib/AST/DeclCXX.cpp
487

I think this change is no longer necessary.

clang/lib/Serialization/ASTReader.cpp
9579–9581

I still don't understand.

Either the types are always the same before the previous if and can only differ in type sugar (in which case this change is unnecessary), or the types can be different in more than just sugar (in which case this assert is wrong because there's no guarantee that different types will have different hashes).

What am I missing?

clang/lib/Serialization/ASTWriterDecl.cpp
489

It would be better to avoid writing this at all for a CXXRecordDecl, to avoid adding 6 unused bits per class to the bitcode. (Please also look at isa<CXXRecordDecl>(D) not at the LangOpts here.)