This is an archive of the discontinued LLVM Phabricator instance.

[modules] Merge equivalent extensions and diagnose ivar redeclarations for extensions loaded from different modules.
ClosedPublic

Authored by vsapsai on Mar 7 2022, 5:23 PM.

Details

Summary

Emitting metadata for the same ivar multiple times can lead to
miscompilations. Objective-C runtime adds offsets to calculate ivar
position in memory and presence of duplicate offsets causes wrong final
position thus overwriting unrelated memory.

Such a situation is impossible with modules disabled as clang diagnoses
ivar redeclarations during sema checks after parsing
(Sema::ActOnFields). Fix the case with modules enabled by checking
during deserialization if ivar is already declared. We also support
a use case where the same category ends up in multiple modules. We
don't want to treat this case as ivar redeclaration and instead merge
corresponding ivars.

rdar://83468070

Diff Detail

Event Timeline

vsapsai created this revision.Mar 7 2022, 5:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 5:23 PM
Herald added a subscriber: ributzka. · View Herald Transcript
vsapsai requested review of this revision.Mar 7 2022, 5:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 5:23 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The main alternative to the current implementation was to change getRedeclContext for ObjCCategoryDecl, so we handle ivar redeclarations and merges in ObjCInterfaceDecl with ASTDeclReader::findExisting. Decided against it because of noticeably different merging behavior:

  • should merge not any same-name ivars but only those located in equivalent extensions;
  • should error for incompatible same-name ivars;
  • should handle ObjCCategoryDecl too but we should check them not during their deserialization but after their ivars are deserialized.

Mostly for these reasons decided not to piggy-back on ASTDeclReader::findExisting but to have a different implementation.

vsapsai added a subscriber: Restricted Project.Mar 7 2022, 5:35 PM

I don't know about ObjC. Comments on style/name only.

clang/include/clang/Serialization/ASTReader.h
1110

How about change the name to:

template <typename DeclTy>
using DuplicateObjDecls = std::pair<DeclTy *, DeclTy *>;

to clarify it is only used for ObjC. So the developer of other languages would skip this when reading.

1112–1115

Similarly, I suggest to add words or change the name to clarify it is used in ObjC only.

1117

It looks a little bit odd for me to use Vector for duplicate vars. Especially, the structure is Vector<pair<>>. I feel it is better with llvm::SmallSet. If the order is important here, I think llvm::SmallSetVector might be an option.

clang/lib/Serialization/ASTReaderDecl.cpp
1250–1252

nit:

auto *ParentExt = dyn_cast<ObjCCategoryDecl>(IVD->getDeclContext());
auto *PrevParentExt = dyn_cast<ObjCCategoryDecl>(PrevIvar->getDeclContext());
if (ParentExt && PrevParentExt) {

}

We could reduce one identation in this way. Codes with less identation looks better.

Thanks for the review, Chuanqi! I believe you were touching recently isSameEntity. Do you have any concerns about using StructuralEquivalenceContext instead of isSameEntity? I've decided to go with StructuralEquivalenceContext because at this point we are still dealing with deserialization and deduplication while ASTContext::isSameEntity kinda assumes that all deduplication is already done, at least based on the comment

// Objective-C classes and protocols with the same name always match.
if (isa<ObjCInterfaceDecl>(X) || isa<ObjCProtocolDecl>(X))
  return true;
clang/include/clang/Serialization/ASTReader.h
1110

Would DuplicateObjCDecls work? ObjC part is common enough and I want to avoid just Obj because it can imply "Object" which is not the goal. Also we have getObjCObjectType which doesn't help with disambiguating the meaning of "Obj".

1112–1115

Thanks for sharing your perspective, I'll make it more obvious it is ObjC-related.

1117

I'm not sure the use of llvm::SmallSet is warranted here. We deserialize each ObjCIvarDecl from each module (and corresponding DeclContext) only once, so we don't have to ensure there are no repeated pairs of same ObjCIvarDecl. And for stable diagnostic we need to use some kind of ordered container. Also a bunch of other "Pending..." fields don't use sets as uniqueness is enforced in different ways.

clang/lib/Serialization/ASTReaderDecl.cpp
1250–1252

Thanks, good point, will change it.

Thanks for the review, Chuanqi! I believe you were touching recently isSameEntity. Do you have any concerns about using StructuralEquivalenceContext instead of isSameEntity? I've decided to go with StructuralEquivalenceContext because at this point we are still dealing with deserialization and deduplication while ASTContext::isSameEntity kinda assumes that all deduplication is already done, at least based on the comment

// Objective-C classes and protocols with the same name always match.
if (isa<ObjCInterfaceDecl>(X) || isa<ObjCProtocolDecl>(X))
  return true;

Yeah, I was touching isSameEntity. But I don't know about StructuralEquivalenceContext. I just skipped over the implementation of StructuralEquivalenceContext. I feel like it would be OK if we could assume the input is duplicated indeed.
I see it is guaranteed by lookupInstanceVariable(Identifier) now. I don't know much about ObjC. But I feel a little bit strange. Is it possible that two distinct variable with the same identifier in ObjC? If it is impossible, I think it might be OK.

clang/include/clang/Serialization/ASTReader.h
1110

Yeah, it should be ObjC. It's my typo originally.

1117

Oh, I got it. llvm::SmallVector might not be wrong in this case. And for the mapping relationship, llvm::SmallMapVector is considerable too.

vsapsai updated this revision to Diff 414780.Mar 11 2022, 5:35 PM

Address comments and make the tests use non-fragile ABI (it's not default everywhere).

vsapsai marked 6 inline comments as done.Mar 11 2022, 5:57 PM

I see it is guaranteed by lookupInstanceVariable(Identifier) now. I don't know much about ObjC. But I feel a little bit strange. Is it possible that two distinct variable with the same identifier in ObjC? If it is impossible, I think it might be OK.

Two distinct ivars with the same name in the same class are prohibited. But there are a bunch of situations when ivars don't end up in the same class. For example, the most common case (and a huge reason for non-fragile ABI) is you have @interface MyView: NSView that has ivar color. If NSView decides to add its own ivar color in @implementation, it won't break your MyView and your binary will work without recompilation. That is roughly achieved by treating those ivars as MyView.color and NSView.color (it is very hand-wavy over-simplified explanation). But within the same class ivar redeclarations are prohibited. Without splitting everything into modules, we don't allow redeclarations in

@interface TestSubject {
  int a;
}
@end

@interface TestSubject() {
  int a;  // <- error as it is a redeclaration
  int b;
  int c;
}
@end

@interface TestSubject() {
  int b;  // <- error as it is a redeclaration
}
@end

@implementation TestSubject {
  int c;  // <- error as it is a redeclaration
}
@end
ChuanqiXu accepted this revision.Mar 13 2022, 8:18 PM

OK, if it is prohibited. It looks good to me. Please wait for a few days before landing in case there are other comments.

This revision is now accepted and ready to land.Mar 13 2022, 8:18 PM
vsapsai updated this revision to Diff 425093.Apr 25 2022, 6:32 PM

Rebase and update test to use ptr instead of i64*. Also now the added test is failing for implementationIvar, investigating that.

vsapsai added inline comments.Apr 25 2022, 9:34 PM
clang/lib/Serialization/ASTReaderDecl.cpp
1261–1263

We emit diagnostic for

@implementation TestClass {
@public
  struct { int z; } implementationIvar;
}
@end

from merge-anon-record-definition-in-objc.m because we handle only ObjCCategoryDecl and not ObjCImplementationDecl. It is very uncommon to put implementation in a header file but I still need to check we don't emit duplicate ivar metadata in this case.

vsapsai updated this revision to Diff 425303.Apr 26 2022, 2:22 PM

Handle in VisitObjCIvarDecl only redeclarations in extensions.

Interface+interface redeclarations should be handled in VisitObjCInterfaceDecl
and implementation+implementation in VisitObjCImplementationDecl.

Checked that we don't emit duplicate ivar metadata because unlike extensions,
there can be only a single [non-category] implementation.

This revision was landed with ongoing or failed builds.Apr 27 2022, 3:53 PM
This revision was automatically updated to reflect the committed changes.