This is an archive of the discontinued LLVM Phabricator instance.

Differential D135472

[ODRHash] Hash attributes on declarations.
Accepted · Public

Authored by vsapsai on Oct 7 2022, 11:24 AM.

Details

Reviewers

rtrieu
aaron.ballman
ChuanqiXu
Bigcheese
erichkeane

Summary

Not all attribute arguments are covered yet. More can be added later as
needed.

rdar://100482330

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

vsapsai created this revision. Oct 7 2022, 11:24 AM

Herald added a project: Restricted Project. Oct 7 2022, 11:24 AM

Herald added a subscriber: ributzka.

vsapsai requested review of this revision. Oct 7 2022, 11:24 AM

Herald added a project: Restricted Project. Oct 7 2022, 11:24 AM

Herald added a subscriber: cfe-commits.

vsapsai added inline comments. Oct 7 2022, 11:32 AM

clang/lib/AST/ODRHash.cpp
479–480	I'm not sure `isImplicit` is the best indicator of attributes to check, so suggestions in this area are welcome. I think we can start strict and relax some of the checks if needed. If people have strong opinions that some attributes shouldn't be ignored, we can add them to the tests to avoid regressions. Personally, I believe that alignment and packed attributes should never be silently ignored.
clang/test/Modules/odr_hash.cpp
3640	As we land hashing for C and Objective-C, we can move these tests to their own file. But for now I think it makes sense to keep everything in odr_hash.cpp. Though I don't have a strong preference.

Harbormaster completed remote builds in B190981: Diff 466133. Oct 7 2022, 12:16 PM

I guess the reason why we didn't check ODR violations for attributes is that attributes were not a part of declarations in C++ before. But it is odd to skip the check for attributes from the perspective of users. So I think this one should be good. The only concern is that it misses too many checks for attributes. Do you plan to add them in the near future or in the longer term? And I guess we need to mention this in the ReleaseNotes, in the Potential Breaking Changes section.

clang/lib/AST/ODRDiagsEmitter.cpp
341–358	I feel like we can merge these two statements.
clang/lib/AST/ODRHash.cpp
479–480	Agreed. I feel `isImplicit` is enough for now.

aaron.ballman added a reviewer: erichkeane. Oct 11 2022, 10:29 AM

aaron.ballman added inline comments.

clang/include/clang/AST/ODRDiagsEmitter.h
119 ↗	(On Diff #466133)
clang/lib/AST/ODRDiagsEmitter.cpp
159	Typedefs can have attributes, so should this be updated as well? (alignment, ext vector types, mode attributes, etc all come to mind)
306	Do we want to have more control over the printing policy here? e.g., do we really want to claim an ODR difference if one module's printing policy specifies indentation of 8 and another specifies indentation of 4? Or if one module prints `restrict` while another prints `__restrict`, etc?
315	You can probably drop the `getIdentifier()` here as the diagnostics engine knows how to print named declarations properly already.
322	Same here.
clang/lib/AST/ODRHash.cpp
479–480	The tricky part is -- sometimes certain attributes add additional implicit attributes and those implicit attributes matter (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380). And some attributes seem to just do the wrong thing entirely: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344 So I think `isImplicit()` is a good approximation, but I'm more wondering what the principle is for whether an attribute should or should not be considered part of the ODR hash. Type attributes, attributes that impact object layout, etc all seem like they should almost certainly be part of ODR hashing. Others are a bit more questionable though. I think this is something that may need a per-attribute flag in Attr.td so attributes can opt in or out of it because I'm not certain what ODR issues could stem from `[[maybe_unused]]` or `[[deprecated]]` disagreements across module boundaries.
494	100% agreed.

Most of my concerns change based on the feedback others have given, so after those are dealt with, I'll do another deeper review.

clang/include/clang/Basic/DiagnosticASTKinds.td
852 ↗	(On Diff #466133)	Wowzer these are tough to read... can you provide a magic-decoder ring for me of some sort?
clang/lib/AST/ODRHash.cpp
479–480	I don't think 'isImplicit' is particularly good. I think the idea of auto-adding 'type' attributes and having the 'rest' be analyzed to figure out which are important is better. Alternatively, I wonder if we're better off just adding ALL attributes and seeing what the fallout is. We can later decide when we don't want them to be ODR significant (which might be OTHERWISE meaningful later!).

In D135472#3844688, @ChuanqiXu wrote:

I guess the reason why we didn't check ODR violations for attributes is that attributes were not a part of declarations in C++ before. But it is odd to skip the check for attributes from the perspective of users. So I think this one should be good.

I think we haven't seen many problems with attribute mismatches because in existing code attributes are often hidden behind macros. And we have sugar MacroQualifiedType that causes the definitions

#define NODEREF __attribute__((noderef))
struct StructWithField {
  int NODEREF *i_ptr;
};

struct StructWithField {
  int __attribute__((noderef)) *i_ptr;
};

to be treated as incompatible.

But the keyword alignas is used without macros and more attributes can be used in multiple compilers without macros. That's why I think it is useful to detect and to diagnose mismatches in attributes.

In D135472#3844688, @ChuanqiXu wrote:

The only concern is that it misses too many checks for attributes. Do you plan to add them in the near future or in the longer term?

In the near future I was planning to add various Objective-C and Swift attributes. For other attributes I don't know which are high-value. I definitely want to check attributes affecting the memory layout (alignment, packing) and believe I've addressed them (totally could have missed something).

In D135472#3844688, @ChuanqiXu wrote:

And I guess we need to mention this in the ReleaseNotes, in the Potential Breaking Changes section.

Good idea, will do.

clang/lib/AST/ODRDiagsEmitter.cpp
341–358	Sorry, I don't really get what two statements you are talking about. Is it `!LHS || !RHS || LHS->getKind() != RHS->getKind()` and `ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)`?

vsapsai added inline comments. Oct 11 2022, 7:47 PM

clang/include/clang/Basic/DiagnosticASTKinds.td
852 ↗	(On Diff #466133)	Guess I went too far with avoiding repetition (was feeling soo smart doing it). I'll go back to the messages with a little bit more repetition but which should be easier to read and understand.
clang/lib/AST/ODRDiagsEmitter.cpp
159	It should be tested for sure, thanks for pointing it out.
306	`FullAttributeText` is used for diagnostics and not for the mismatch detection, so we shouldn't complain about `restrict`/`__restrict` mismatches (`DifferentAlignmentKeywords` tests that `__attribute__((aligned(8)))` and `alignas(8)` are not a mismatch). We use the same `ASTContext` to print both attributes, so that shouldn't be confusing. Diagnostics can be unstable across clang versions and probably across language modes. But I think that can happen to other diagnostic messages too, so I think that's acceptable.
315	Thanks for the hint, will do.
clang/lib/AST/ODRHash.cpp
479–480	One criterion to decide which attributes should be hashed is whether they affect IRGen. But that's not exhaustive and I'm not sure how practical it is. The rule I'm trying to follow right now is whether declarations with different attributes can be substituted for each other. For example, structs with different memory layouts cannot be interchanged, so it is reasonable to reject them. But maybe we should review attributes on a case-by-case basis. For example, for `[[deprecated]]` I think the best for developers is not to complain about it but to merge the attributes and then have the regular diagnostic about using a deprecated entity.
479–480	One option was to hash `isa<InheritedAttr>` attributes as they are "sticky" and should be more significant, so shouldn't be ignored. But I don't think this argument is particularly convincing. At this stage I think we can still afford to add all attributes and exclude some as-needed because modules adoption is still limited. Though I don't have strong feelings about this approach, just getting more restrictive later can be hard.
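
The substitutability rule and the `DifferentAlignmentKeywords` case discussed above can be illustrated with a small stand-alone sketch. It is not taken from the patch or its tests: the struct names and the `sameLayout` helper are made up for illustration, and it assumes a GCC- or Clang-compatible compiler on a target where `int` is aligned to more than one byte.

```cpp
#include <cstddef>

// Two spellings of the same 8-byte alignment request.
struct __attribute__((aligned(8))) GnuAligned {
  char c;
};

struct alignas(8) StdAligned {
  char c;
};

// Same members, different layout: `packed` removes the padding before `i`.
struct Plain {
  char c;
  int i;
};

struct __attribute__((packed)) Packed {
  char c;
  int i;
};

// Hypothetical helper: two types are treated as layout-compatible here
// only if both their size and their alignment agree.
template <typename A, typename B>
constexpr bool sameLayout() {
  return sizeof(A) == sizeof(B) && alignof(A) == alignof(B);
}

// Different spelling, same layout: nothing to complain about.
static_assert(sameLayout<GnuAligned, StdAligned>(),
              "aligned(8) and alignas(8) should agree");

// Same members, different layout: not interchangeable.
static_assert(!sameLayout<Plain, Packed>(),
              "packed changes the layout of the struct");
```

Under those assumptions the two alignment spellings are interchangeable, while the packed and non-packed structs are not, which matches the intuition that layout-affecting attributes should not be silently ignored.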

In the near future I was planning to add various Objective-C and Swift attributes. For other attributes I don't know which are high-value. I definitely want to check attributes affecting the memory layout (alignment, packing) and believe I've addressed them (totally could have missed something).

It looks like you're planning to add them one by one, or do I misunderstand? Adding them one by one doesn't look ideal. Maybe it would be a better idea to add them in Attr.td?

clang/lib/AST/ODRDiagsEmitter.cpp
341–358	Sorry for the ambiguity. Since `LHS->getKind() != RHS->getKind()` is covered by `ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)`, I feel like it is reasonable to write:

if (!LHS || !RHS || ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)) {
  DiagError();
  DiagNote();
  DiagNote();
  DiagNoteAttrLoc();
  return true;
}
clang/lib/AST/ODRHash.cpp
479–480	It doesn't look like a bad idea to add all attributes as an experiment. @vsapsai, how do you feel about this?

aaron.ballman added inline comments. Oct 12 2022, 6:31 AM

clang/lib/AST/ODRDiagsEmitter.cpp
306	Oh, that's great, thank you!
clang/lib/AST/ODRHash.cpp
479–480	My intuition is that we're going to want this to be controlled from Attr.td on a case-by-case basis and to automatically generate the ODR hash code for attribute arguments. We can either force every attribute to decide explicitly (this seems pretty disruptive as a final design but would be a really good way to ensure we audit all attributes if done as an experiment), or we can pick a default behavior to ODR hash/not. I suspect we're going to want to pick a default behavior based on whatever the audit tells us the most common situation is. I think we're going to need something a bit more nuanced than "this attribute matters for ODR" because there are attribute arguments that don't always contribute to the entity (for example, we have "fake" arguments that are used to give the semantic attribute more information; there may also be cases where one argument matters and another does not, such as `enable_if`, where the condition matters greatly but the message doesn't matter much). So we might need a marking on the attribute level and on the parameter level to determine what all factors into attribute identity for generating the ODR hashing code. Hopefully we can avoid needing more granularity than that.
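
The two-level marking described above (one flag per attribute, one flag per argument) can be sketched with a toy model. Everything in it is hypothetical: `ToyArg`, `ToyAttr`, `mixString`, and `toyODRHash` are illustrative names that exist neither in Clang nor in Attr.td, and the FNV-1a style mixing is only a stand-in for whatever `ODRHash` really does.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one attribute argument.
struct ToyArg {
  std::string Value;
  bool ODRSignificant; // argument-level marking
};

// Hypothetical model of one attribute on a declaration.
struct ToyAttr {
  std::string Kind;
  bool ODRSignificant; // attribute-level marking
  std::vector<ToyArg> Args;
};

// FNV-1a style mixing of a string into a running 64-bit hash.
inline void mixString(std::uint64_t &H, const std::string &S) {
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  // Separator so that ("ab", "c") and ("a", "bc") hash differently.
  H ^= 0xffU;
  H *= 1099511628211ULL;
}

// Hash only the attributes and arguments that are marked as significant.
inline std::uint64_t toyODRHash(const std::vector<ToyAttr> &Attrs) {
  std::uint64_t H = 14695981039346656037ULL;
  for (const ToyAttr &A : Attrs) {
    if (!A.ODRSignificant)
      continue; // e.g. a diagnostic-only attribute
    mixString(H, A.Kind);
    for (const ToyArg &Arg : A.Args)
      if (Arg.ODRSignificant)
        mixString(H, Arg.Value);
  }
  return H;
}
```

In this model, two `enable_if`-like attributes with the same condition but different messages hash identically as long as the message argument is marked insignificant, while a different condition changes the hash.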

ChuanqiXu added inline comments. Oct 13 2022, 1:14 AM

clang/lib/AST/ODRHash.cpp
479–480	I feel the default behavior would be to do ODR hashes. I just took a quick look at Attr.td, and I feel most of them affect code generation. For example, there are a lot of attributes which are hardware related.

> because there are attribute arguments that don't always contribute to the entity

When you're talking about "entities", I'm not sure if you're talking about the "entities" in the language spec or general abstract ones. I mean, for example, the `always_inline` attribute doesn't contribute to the declaration from the language perspective. But it'll be super odd if we lost it. So I feel it may be better to change the requirement to be "affecting code generation".

Also, to be honest, I am not even sure if we should take "affecting code generation" as the requirement. For example, the `preferred_name` attribute is used for better printing of names. It affects neither code generation nor the entity you mentioned. But if we drop it, then the printed name may not be what users want. I'm not sure if this is desired.

aaron.ballman added inline comments. Oct 13 2022, 8:17 AM

clang/lib/AST/ODRHash.cpp
479–480	> I feel the default behavior would be to do ODR hashes. I just took a quick look at Attr.td, and I feel most of them affect code generation. For example, there are a lot of attributes which are hardware related.

Hmmm, I'm seeing a reasonably even split. We definitely have a ton of attributes like multiversioning, interrupts, calling conventions, availability, etc that I think all should be part of an ODR hash. We also have a ton of other ones that don't seem like they should matter for ODR hashing like hot/cold, consumed/returns retained, constructor/destructor, deprecated, nodiscard, maybe_unused, etc. Then we have the really fun ones where I think we'll want them to be in the ODR hash but maybe we don't, like `[[clang::annotate]]`, restrict, etc. So I think we really do need an actual audit of the attributes to decide what the default should be (if there should even be one).

> When you're talking about "entities", I'm not sure if you're talking about the "entities" in the language spec or general abstract ones. I mean, for example, the `always_inline` attribute doesn't contribute to the declaration from the language perspective. But it'll be super odd if we lost it. So I feel it may be better to change the requirement to be "affecting code generation".

Not in a language spec meaning -- I meant mostly that there's some notion of identity for the one definition rule and not all attribute arguments contribute to that identity the same way. I don't know that "affecting code gen" really captures the whole of it. For example, whether a function is/is not marked hot or cold affects codegen, but really has nothing to do with the identity of the function (it's the same function whether the hint is there or not). `preferred_name` is another example -- do we really think a structure with that attribute is fundamentally a different thing than one without that attribute? I don't think so.
To me, it's more about what cross-TU behaviors you'll get if the attribute is only present on one side. Things like ABI tags, calling conventions, attributes that behave like qualifiers, etc all contribute to the identity of something where a mismatch will impact the correctness of the program. But things like optimization hints and diagnostic hints don't seem like they contribute to the identity of something because they don't impact correctness.

Small fixes to address review comments.
Try to make diagnostics more understandable.
Check attributes on typedefs.

Harbormaster completed remote builds in B192027: Diff 467564. Oct 13 2022, 12:15 PM

Given all the discussion about which attributes should be added to ODR hash, I think it is useful at this point to have TableGen infrastructure to get this information from Attr.td. So I'll work on that.

clang/lib/AST/ODRDiagsEmitter.cpp
159	Typedefs are done. But while adding support for them I've realized we don't check method parameters. So it requires a little bit more work.
341–358	There are 2 separate cases to improve the diagnostic. In the first case we'd have

... first difference is definition in module 'FirstModule' found attribute 'enum_extensibility'
attribute specified here
but in 'SecondModule' found no attribute

And if we reach the second one, it implies the kind of the attributes is the same and the only difference is the attribute arguments, so the diagnostics are like

first difference is definition in module 'FirstModule' found attribute ' __attribute__((enum_extensibility("open")))'
attribute specified here
but in 'SecondModule' found different attribute argument ' __attribute__((enum_extensibility("closed")))'
attribute specified here

From my limited experience, it is actually useful to have more details than trying to figure out the difference in attributes.
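
The two cases described above (kind mismatch first, argument mismatch second) can be modelled without any Clang types. The sketch below is a simplified, hypothetical model: `AttrModel`, `AttrDiff`, and `firstAttrDifference` are illustrative names, and attribute arguments are compared as plain strings rather than through `ODRHash`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an attribute: its kind plus printed arguments.
struct AttrModel {
  std::string Kind;
  std::string Args;
};

enum class AttrDiff { None, Kind, Arguments };

// Walk both attribute lists pairwise and report the first difference.
inline AttrDiff firstAttrDifference(const std::vector<AttrModel> &First,
                                    const std::vector<AttrModel> &Second) {
  const std::size_t Max = std::max(First.size(), Second.size());
  for (std::size_t I = 0; I < Max; ++I) {
    const AttrModel *LHS = I < First.size() ? &First[I] : nullptr;
    const AttrModel *RHS = I < Second.size() ? &Second[I] : nullptr;

    // Case 1: an attribute is missing on one side or the kinds differ.
    if (!LHS || !RHS || LHS->Kind != RHS->Kind)
      return AttrDiff::Kind;

    // Case 2: same kind, so only the arguments can differ.
    if (LHS->Args != RHS->Args)
      return AttrDiff::Arguments;
  }
  return AttrDiff::None;
}
```

Keeping the two results separate is what lets the first diagnostic say "found no attribute" and the second one print both full attribute texts.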

Fix the starting point of the diff.

vsapsai marked an inline comment as done. Oct 13 2022, 12:34 PM

vsapsai added inline comments.

clang/include/clang/Basic/DiagnosticASTKinds.td
852 ↗	(On Diff #466133)	It should be easier to understand now (easier doesn't mean easy).

Harbormaster completed remote builds in B192031: Diff 467570. Oct 13 2022, 1:17 PM

LGTM basically. Our discussion is mainly about the future work in Attr.td and not this patch. So I think they are not blocking issues.

clang/lib/AST/ODRDiagsEmitter.cpp
341–358	Agreed. I just wanted to say the code can be shortened further by making the diagnostic kinds contain more cases: %select { | | ... } We have some such examples already. But this is not required, since it actually moves the complexity from the source code to TableGen. And the implementation doesn't look bad.
clang/lib/AST/ODRHash.cpp
479–480	I got your point. But from my development experience, I want to say that optimization hints matter too. And I am wondering about checking the ODR for all attributes. It would emit an error if attributes that affect correctness mismatch. And if the other attributes mismatch, a warning is enough. (If we worry about compile-time performance, I feel it won't take too long. And we can have an option to control it.)

This revision is now accepted and ready to land. Oct 13 2022, 7:26 PM

To fix pre-merge errors like

error: 'error' diagnostics seen but not expected:
  File .../clang/test/Modules/Output/odr_hash.cpp.tmp/Inputs/first.h Line 3797: ... found 'AlignedDoublePtr' with attribute ' __attribute__((align_value(0xb7dc9b8)))'

posted D135931.

vsapsai added a parent revision: D135931: [Attributes] Improve writing `ExprArgument` value. Oct 13 2022, 7:28 PM

In D135472#3856596, @vsapsai wrote:

Given all the discussion about which attributes should be added to ODR hash, I think it is useful at this point to have TableGen infrastructure to get this information from Attr.td. So I'll work on that.

Thank you, I think that's a good approach for the moment. Once you have the basic framework in place, we can opt in the uncontroversial attributes (like, I think anything inheriting from TypeAttr or DeclOrTypeAttr should be automatically opted in and any Argument whose Fake flag is set should not contribute to the ODR hash) and then we can do a follow-up to figure out how to distinguish "definitely an ODR violation" attributes from "not an ODR violation but still something we want to find a way to alert users about".

clang/lib/AST/ODRHash.cpp
479–480	> I got your point. But from my development experience, I want to say that optimization hints matter too.

That's why I'd like to have an idea about what our policy is. To me, ODR hashing should be restricted to finding ODR violations. It sounds like you'd like ODR hashing to also optionally find other kinds of differences that aren't necessarily ODR violations but are still surprises nonetheless. I think that's reasonable, but I think we should ensure the diagnostics distinguish between "this is definitely UB" and "this might not have the optimization or diagnostic properties you expect but is still correct".

> And I am wondering about checking the ODR for all attributes. It would emit an error if attributes that affect correctness mismatch. And if the other attributes mismatch, a warning is enough. (If we worry about compile-time performance, I feel it won't take too long. And we can have an option to control it.)

I don't think we should ODR-check all attributes blindly. There's no ODR violation for one function having `[[nodiscard]]` in a TU and another function not having it. In fact, attributes are sometimes used additively so that you can put extra constraints locally that aren't there globally. e.g., I've seen real code doing:

// Header.h
int some_func();

// Source.cpp
#include "Header.h"

// We've had too many bugs from people ignoring the results,
// which really matters in this particular file. Redeclare the API
// with extra diagnostic checking for this TU.
[[nodiscard]] int some_func();
...

It shouldn't cause ODR violation diagnostics if `some_func()` is defined in another source file without the attribute.

ChuanqiXu added inline comments. Oct 14 2022, 9:26 AM

clang/lib/AST/ODRHash.cpp
479–480	Oh, yeah, you're right. Let's try to make the policy clear when we start working on it further.

Rebase and diagnose attribute mismatch for function parameters.

Harbormaster completed remote builds in B194481: Diff 470915. Oct 26 2022, 2:45 PM

vsapsai mentioned this in D138859: [ODRHash] Drive attribute hashing through TableGen. NFC intended. Nov 28 2022, 1:22 PM

Revision Contents

Path	Size
clang/lib/AST/ODRDiagsEmitter.cpp	4 lines
clang/lib/AST/ODRHash.cpp	8 lines
clang/test/Modules/odr_hash.cpp	26 lines

Diff 467564

clang/lib/AST/ODRDiagsEmitter.cpp

[150 lines not shown] bool ODRDiagsEmitter::diagnoseSubMismatchField(

   if (diagnoseSubMismatchAttr(FirstRecord, FirstModule, SecondModule,
                               FirstField, SecondField))
     return true;

   return false;
 }

 bool ODRDiagsEmitter::diagnoseSubMismatchTypedef(
     const NamedDecl *FirstRecord, StringRef FirstModule, StringRef SecondModule,
     const TypedefNameDecl *FirstTD, const TypedefNameDecl *SecondTD,
     bool IsTypeAlias) const {
   enum ODRTypedefDifference {
     TypedefName,
     TypedefType,
   };

[20 lines not shown] bool ODRDiagsEmitter::diagnoseSubMismatchTypedef(

   QualType FirstType = FirstTD->getUnderlyingType();
   QualType SecondType = SecondTD->getUnderlyingType();
   if (computeODRHash(FirstType) != computeODRHash(SecondType)) {
     DiagError(TypedefType) << IsTypeAlias << FirstName << FirstType;
     DiagNote(TypedefType) << IsTypeAlias << SecondName << SecondType;
     return true;
   }

+  if (diagnoseSubMismatchAttr(FirstRecord, FirstModule, SecondModule, FirstTD,
+                              SecondTD))
+    return true;
   return false;
 }

 bool ODRDiagsEmitter::diagnoseSubMismatchVar(const NamedDecl *FirstRecord,
                                              StringRef FirstModule,
                                              StringRef SecondModule,
                                              const VarDecl *FirstVD,
                                              const VarDecl *SecondVD) const {
▲ Show 20 Lines • Show All 90 Lines • ▼ Show 20 Lines	bool ODRDiagsEmitter::diagnoseSubMismatchAttr(
auto ComputeAttrODRHash = [](const Attr *A) {		auto ComputeAttrODRHash = [](const Attr *A) {
ODRHash Hasher;		ODRHash Hasher;
Hasher.AddAttr(A);		Hasher.AddAttr(A);
return Hasher.CalculateHash();		return Hasher.CalculateHash();
};		};
auto FullAttributeText = [](const Attr *A, const ASTContext &Ctx) {		auto FullAttributeText = [](const Attr *A, const ASTContext &Ctx) {
std::string FullText;		std::string FullText;
llvm::raw_string_ostream OutputStream(FullText);		llvm::raw_string_ostream OutputStream(FullText);
A->printPretty(OutputStream, Ctx.getPrintingPolicy());		A->printPretty(OutputStream, Ctx.getPrintingPolicy());
		aaron.ballmanUnsubmitted Not Done Reply Inline Actions Do we want to have more control over the printing policy here? e.g., do we really want to claim an ODR difference if one module's printing policy specifies indentation of 8 and another specifies indentation of 4? Or if one module prints `restrict` while another prints `__restrict`, etc? aaron.ballman: Do we want to have more control over the printing policy here? e.g., do we really want to claim…
		vsapsaiAuthorUnsubmitted Done Reply Inline Actions `FullAttributeText` is used for diagnostics and not for the mismatch detection, so we shouldn't complain about `restrict/__ restrict` mismatches (`DifferentAlignmentKeywords` tests that `__attribute__((aligned(8)))` and `alignas(8)` are not a mismatch). We use the same `ASTContext` to print both attributes, so that shouldn't be confusing. Diagnostic can be unstable across clang versions and probably across language modes. But I think that can happen to other diagnostic messages too, so think that's acceptable. vsapsai: `FullAttributeText` is used for diagnostics and not for the mismatch detection, so we shouldn't…
		aaron.ballmanUnsubmitted Done Reply Inline Actions Oh, that's great, thank you! aaron.ballman: Oh, that's great, thank you!
return OutputStream.str();		return OutputStream.str();
};		};
auto DiagError = [FirstContainer, FirstDecl, FirstModule,		auto DiagError = [FirstContainer, FirstDecl, FirstModule,
this](ODRAttributeDifference DiffType) {		this](ODRAttributeDifference DiffType) {
return Diag(FirstDecl->getLocation(),		return Diag(FirstDecl->getLocation(),
diag::err_module_odr_violation_attribute)		diag::err_module_odr_violation_attribute)
<< (FirstContainer ? FirstContainer : FirstDecl)		<< (FirstContainer ? FirstContainer : FirstDecl)
<< FirstModule.empty() << FirstModule << DiffType		<< FirstModule.empty() << FirstModule << DiffType
<< (FirstContainer != nullptr) << FirstDecl;		<< (FirstContainer != nullptr) << FirstDecl;
		aaron.ballmanUnsubmitted Done Reply Inline Actions You can probably drop the `getIdentifier()` here as the diagnostics engine knows how to print named declarations properly already. aaron.ballman: You can probably drop the `getIdentifier()` here as the diagnostics engine knows how to print…
		vsapsaiAuthorUnsubmitted Done Reply Inline Actions Thanks for the hint, will do. vsapsai: Thanks for the hint, will do.
};		};
auto DiagNote = [FirstContainer, SecondDecl, SecondModule,		auto DiagNote = [FirstContainer, SecondDecl, SecondModule,
this](ODRAttributeDifference DiffType) {		this](ODRAttributeDifference DiffType) {
return Diag(SecondDecl->getLocation(),		return Diag(SecondDecl->getLocation(),
diag::note_module_odr_violation_attribute)		diag::note_module_odr_violation_attribute)
<< SecondModule << DiffType << (FirstContainer != nullptr)		<< SecondModule << DiffType << (FirstContainer != nullptr)
<< SecondDecl;		<< SecondDecl;
		aaron.ballmanUnsubmitted Done Reply Inline Actions Same here. aaron.ballman: Same here.
};		};
auto DiagNoteAttrLoc = [this](const Attr *A) {		auto DiagNoteAttrLoc = [this](const Attr *A) {
if (A)		if (A)
Diag(A->getLocation(), diag::note_attribute_specified_here)		Diag(A->getLocation(), diag::note_attribute_specified_here)
<< A->getRange();		<< A->getRange();
};		};

const Attr *LHS = nullptr;		const Attr *LHS = nullptr;
const Attr *RHS = nullptr;		const Attr *RHS = nullptr;
unsigned NumFirstAttrs = FirstAttrs.size();		unsigned NumFirstAttrs = FirstAttrs.size();
unsigned NumSecondAttrs = SecondAttrs.size();		unsigned NumSecondAttrs = SecondAttrs.size();
unsigned MaxNumAttrs = std::max(NumFirstAttrs, NumSecondAttrs);		unsigned MaxNumAttrs = std::max(NumFirstAttrs, NumSecondAttrs);
for (unsigned I = 0; I < MaxNumAttrs; ++I) {		for (unsigned I = 0; I < MaxNumAttrs; ++I) {
if (I < NumFirstAttrs)		if (I < NumFirstAttrs)
LHS = FirstAttrs[I];		LHS = FirstAttrs[I];
if (I < NumSecondAttrs)		if (I < NumSecondAttrs)
RHS = SecondAttrs[I];		RHS = SecondAttrs[I];

if (!LHS \|\| !RHS \|\| LHS->getKind() != RHS->getKind()) {		if (!LHS \|\| !RHS \|\| LHS->getKind() != RHS->getKind()) {
DiagError(AttributeKind)		DiagError(AttributeKind)
<< (LHS != nullptr) << LHS << FirstDecl->getSourceRange();		<< (LHS != nullptr) << LHS << FirstDecl->getSourceRange();
DiagNoteAttrLoc(LHS);		DiagNoteAttrLoc(LHS);
DiagNote(AttributeKind)		DiagNote(AttributeKind)
<< (RHS != nullptr) << RHS << SecondDecl->getSourceRange();		<< (RHS != nullptr) << RHS << SecondDecl->getSourceRange();
DiagNoteAttrLoc(RHS);		DiagNoteAttrLoc(RHS);
return true;		return true;
}		}
if (ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)) {		if (ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)) {
DiagError(AttributeArguments)		DiagError(AttributeArguments)
<< FullAttributeText(LHS, Context) << LHS->getRange();		<< FullAttributeText(LHS, Context) << LHS->getRange();
DiagNoteAttrLoc(LHS);		DiagNoteAttrLoc(LHS);
DiagNote(AttributeArguments)		DiagNote(AttributeArguments)
<< FullAttributeText(RHS, Context) << RHS->getRange();		<< FullAttributeText(RHS, Context) << RHS->getRange();
DiagNoteAttrLoc(RHS);		DiagNoteAttrLoc(RHS);
return true;		return true;
}		}
		ChuanqiXuUnsubmitted Not Done Reply Inline Actions I feel like we can merge these two statements. ChuanqiXu: I feel like we can merge these two statements.
		vsapsaiAuthorUnsubmitted Done Reply Inline Actions Sorry, I don't really get what two statements you are talking about. Is it `!LHS \|\| !RHS \|\| LHS->getKind() != RHS->getKind()` `ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)` ? vsapsai: Sorry, I don't really get what two statements you are talking about. Is it * `!LHS \|\| !RHS \|\|…
		ChuanqiXuUnsubmitted Not Done Reply Inline Actions Sorry for the ambiguity. Since `LHS->getKind() != RHS->getKind()` is covered by `ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)`. I feel like it is reasonable to: if (!LHS \|\| !RHS \|\| ComputeAttrODRHash(LHS) != ComputeAttrODRHash(RHS)) { DiagError(); DiagNote(); DiagNote(); DiagNoteAttrLoc(); return true; } ChuanqiXu: Sorry for the ambiguity. Since `LHS->getKind() != RHS->getKind()` is covered by…
		vsapsaiAuthorUnsubmitted Done Reply Inline Actions There are 2 separate cases to improve the diagnostic. In the first case we'd have ... first difference is definition in module 'FirstModule' found attribute 'enum_extensibility' attribute specified here but in 'SecondModule' found no attribute And if we reach the second ones, it implies the kind of attributes is the same and the only difference is attribute arguments, so the diagnostics are like first difference is definition in module 'FirstModule' found attribute ' __attribute__((enum_extensibility("open")))' attribute specified here but in 'SecondModule' found different attribute argument ' __attribute__((enum_extensibility("closed")))' attribute specified here From my limited experience, it is actually useful to have more details than trying to figure out the difference in attributes. vsapsai: There are 2 separate cases to improve the diagnostic. In the first case we'd have ``` ... first…
		ChuanqiXuUnsubmitted Not Done Reply Inline Actions Agreed. I just wanted to say the codes can be further be shorten by making the diagnostic kinds contains more cases: %select { \| \| ... } We have some such examples before. But this is not required. Since it actually moves the complexity from the source codes to the table gen. And the implementation doesn't look bad. ChuanqiXu: Agreed. I just wanted to say the codes can be further be shorten by making the diagnostic kinds…
}		}

return false;		return false;
}		}

ODRDiagsEmitter::DiffResult		ODRDiagsEmitter::DiffResult
ODRDiagsEmitter::FindTypeDiffs(DeclHashes &FirstHashes,		ODRDiagsEmitter::FindTypeDiffs(DeclHashes &FirstHashes,
DeclHashes &SecondHashes) {		DeclHashes &SecondHashes) {
[... 1,280 lines not shown ...]

clang/lib/AST/ODRHash.cpp

	[... 470 lines not shown ...]
	}			}

	void ODRHash::getHashableAttrs(const Decl *D,			void ODRHash::getHashableAttrs(const Decl *D,
	SmallVectorImpl<const Attr *> &HashableAttrs) {			SmallVectorImpl<const Attr *> &HashableAttrs) {
	HashableAttrs.clear();			HashableAttrs.clear();
	if (!D->hasAttrs())			if (!D->hasAttrs())
	return;			return;

	llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs),			llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs),
	[](const Attr *A) { return !A->isImplicit(); });			[](const Attr *A) { return !A->isImplicit(); });
				vsapsai: I'm not sure `isImplicit` is the best indicator of attributes to check, so suggestions in this area are welcome. I think we can start strict and relax some of the checks if needed. If people have strong opinions that some attributes shouldn't be ignored, we can add them to the tests to avoid regressions. Personally, I believe that alignment and packed attributes should never be silently ignored.
				ChuanqiXu: Agreed. I feel `isImplicit` is enough for now.
				aaron.ballman: The tricky part is that certain attributes sometimes add additional implicit attributes, and those implicit attributes matter (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380). And some attributes seem to just do the wrong thing entirely: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344

				So I think `isImplicit()` is a good approximation, but I'm more wondering what the principle is for whether an attribute should or should not be considered part of the ODR hash. Type attributes, attributes that impact object layout, etc. all seem like they should almost certainly be part of ODR hashing. Others are a bit more questionable, though. I think this may need a per-attribute flag in Attr.td so attributes can opt in or out, because I'm not certain what ODR issues could stem from `[[maybe_unused]]` or `[[deprecated]]` disagreements across module boundaries.
				erichkeane: I don't think 'isImplicit' is particularly good. I prefer the idea of auto-adding 'type' attributes and analyzing the 'rest' to figure out which are important. Alternatively, I wonder if we're better off just adding ALL attributes and seeing what the fallout is. We can later decide when we don't want them to be ODR significant (which might be OTHERWISE meaningful later!).
				vsapsai: One option was to hash `isa<InheritedAttr>` attributes, as they are "sticky" and should be more significant, so they shouldn't be ignored. But I don't think this argument is particularly convincing. At this stage I think we can still afford to add all attributes and exclude some as needed, because modules adoption is still limited. I don't have strong feelings about this approach; it's just that getting more restrictive later can be hard.
				ChuanqiXu: Adding all attributes as an experiment doesn't look like a bad idea. @vsapsai, how do you feel about this?
				vsapsai: One criterion for deciding which attributes should be hashed is whether they affect IRGen. But that's not exhaustive, and I'm not sure how practical it is. The rule I'm trying to follow right now is whether declarations with different attributes can be substituted for each other. For example, structs with different memory layouts cannot be interchanged, so it is reasonable to reject them. But maybe we should review attributes on a case-by-case basis. For example, for `[[deprecated]]` I think the best outcome for developers is not to complain about it but to merge the attributes and then emit the regular diagnostic about using a deprecated entity.
				aaron.ballman: My intuition is that we're going to want this to be controlled from Attr.td on a case-by-case basis and to automatically generate the ODR hash code for attribute arguments. We can either force every attribute to decide explicitly (this seems pretty disruptive as a final design but would be a really good way to ensure we audit all attributes if done as an experiment), or we can pick a default behavior to ODR hash/not. I suspect we're going to want to pick a default behavior based on whatever the audit tells us the most common situation is.

				I think we're going to need something a bit more nuanced than "this attribute matters for ODR" because there are attribute arguments that don't always contribute to the entity. For example, we have "fake" arguments that are used to give the semantic attribute more information, and there may also be cases where one argument matters and another does not, such as `enable_if`, where the condition matters greatly but the message doesn't matter much. So we might need a marking on the attribute level and on the parameter level to determine what factors into attribute identity for generating the ODR hashing code. Hopefully we can avoid needing more granularity than that.
				ChuanqiXu: I feel the default behavior should be to do ODR hashing. I took a quick look at Attr.td, and I feel most of the attributes affect code generation. For example, a lot of them are hardware related.

				> because there are attribute arguments that don't always contribute to the entity

				When you talk about "entities", I'm not sure whether you mean "entities" in the language-spec sense or in a more general, abstract sense. For example, the `always_inline` attribute doesn't contribute to the declaration from the language perspective, but it would be super odd if we lost it. So I feel it may be better to change the requirement to "affects code generation".

				Also, to be honest, I'm not even sure we should take "affects code generation" as the requirement. For example, the `preferred_name` attribute is used for printing better names. It affects neither code generation nor the entity you mentioned. But if we drop it, the printed name may not be what users want. I'm not sure whether that is desired.
				aaron.ballman:
				> I feel the default behavior would be to do ODR hashes. I just took a quick look in Attr.td. And I feel most of them affecting the code generation. For example, there're a lot of attributes which is hardware related.

				Hmmm, I'm seeing a reasonably even split. We definitely have a ton of attributes like multiversioning, interrupts, calling conventions, availability, etc. that I think all should be part of an ODR hash. We also have a ton of other ones that don't seem like they should matter for ODR hashing, like hot/cold, consumed/returns retained, constructor/destructor, deprecated, nodiscard, maybe_unused, etc. Then we have the really fun ones where I think we'll want them to be in the ODR hash but maybe we don't, like `[[clang::annotate]]`, restrict, etc. So I think we really do need an actual audit of the attributes to decide what the default should be (if there should even be one).

				> When you're talking about "entities", I'm not sure if you're talking the "entities" in the language spec or a general abstract ones. I mean, for example, the always_inline attribute doesn't contribute to the declaration from the language perspective. But it'll be super odd if we lost it. So I feel it may be better to change the requirement to be "affecting code generation"

				Not in a language spec meaning -- I meant mostly that there's some notion of identity for the one definition rule, and not all attribute arguments contribute to that identity the same way. I don't know that "affecting code gen" really captures the whole of it. For example, whether a function is/is not marked hot or cold affects codegen, but really has nothing to do with the identity of the function (it's the same function whether the hint is there or not). `preferred_name` is another example -- do we really think a structure with that attribute is fundamentally a different thing than one without that attribute? I don't think so.

				To me, it's more about what cross-TU behaviors you'll get if the attribute is only present on one side. Things like ABI tags, calling conventions, attributes that behave like qualifiers, etc. all contribute to the identity of something where a mismatch will impact the correctness of the program. But things like optimization hints and diagnostic hints don't seem like they contribute to the identity of something, because they don't impact correctness.
				ChuanqiXu: I get your point. But from my development experience, I'd say optimization hints matter too. How about checking all attributes: emit an error when attributes that affect correctness mismatch, and for mismatches in other attributes a warning would be enough. (If we're worried about compile-time performance, I feel it won't take too long, and we can add an option to control it.)
				aaron.ballman:
				> I got your point. But from my developing experience, I want to say the optimization hints matters too.

				That's why I'd like to have an idea about what our policy is. To me, ODR hashing should be restricted to finding ODR violations. It sounds like you'd like ODR hashing to also optionally find other kinds of differences that aren't necessarily ODR violations but are surprising nonetheless. I think that's reasonable, but I think we should ensure the diagnostics distinguish between "this is definitely UB" and "this might not have the optimization or diagnostic properties you expect but is still correct".

				> And I am wondering how about checking the ODR for all attributes. It would emit error if the attributes that affecting the correctness mismatches. And if the other attributes mismatches, a warning enough. (If we worry about the compile time performance, I feel it won't take too long. And we can have an option to control it actually.)

				I don't think we should ODR-check all attributes blindly. There's no ODR violation for one function having `[[nodiscard]]` in a TU and another function not having it. In fact, attributes are sometimes used additively so that you can put extra constraints locally that aren't there globally. E.g., I've seen real code doing:

				    // Header.h
				    int some_func();

				    // Source.cpp
				    #include "Header.h"

				    // We've had too many bugs from people ignoring the results,
				    // which really matters in this particular file. Redeclare the API
				    // with extra diagnostic checking for this TU.
				    [[nodiscard]] int some_func();
				    ...

				It shouldn't cause ODR violation diagnostics if `some_func()` is defined in another source file without the attribute.
				ChuanqiXu: Oh, yeah, you're right. Let's try to make the policy clear when we take this further.
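
One way to picture the per-attribute policy discussed in this thread is a small lookup table. This is a purely hypothetical sketch: the review did not settle on any policy, the real mechanism was proposed to live in Attr.td rather than in hand-written C++, and `MismatchAction`, `AttrPolicy`, and `actionFor` are invented names. The sample rows follow the examples the commenters gave (layout and ABI attributes versus optimization and diagnostic hints), and the default for unlisted attributes is left to the caller because the thread calls for an audit before choosing one.

```cpp
#include <string_view>

// What to do when an attribute is present or different on only one side.
enum class MismatchAction { Error, Warning, Ignore };

struct AttrPolicy {
  std::string_view Name;
  MismatchAction Action;
};

// Illustrative entries only; a real audit of Attr.td would decide these.
constexpr AttrPolicy Policies[] = {
    {"packed", MismatchAction::Error},        // changes object layout
    {"aligned", MismatchAction::Error},       // changes object layout
    {"abi_tag", MismatchAction::Error},       // changes ABI identity
    {"hot", MismatchAction::Warning},         // optimization hint
    {"cold", MismatchAction::Warning},        // optimization hint
    {"nodiscard", MismatchAction::Ignore},    // diagnostic hint, often additive
    {"deprecated", MismatchAction::Ignore},   // diagnostic hint
    {"maybe_unused", MismatchAction::Ignore}, // diagnostic hint
};

// Returns the action for a known attribute, or Default for unlisted ones.
constexpr MismatchAction actionFor(std::string_view Name,
                                   MismatchAction Default) {
  for (const AttrPolicy &P : Policies)
    if (P.Name == Name)
      return P.Action;
  return Default;
}
```

The `Warning` tier reflects ChuanqiXu's suggestion; under aaron.ballman's stricter reading, those rows would be `Ignore` as well, since only genuine ODR violations would be diagnosed.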
	}			}

	void ODRHash::AddAttrs(const Decl *D) {			void ODRHash::AddAttrs(const Decl *D) {
	llvm::SmallVector<const Attr *, 2> Attrs;			llvm::SmallVector<const Attr *, 2> Attrs;
	getHashableAttrs(D, Attrs);			getHashableAttrs(D, Attrs);
	ID.AddInteger(Attrs.size());			ID.AddInteger(Attrs.size());
	for (const Attr *A : Attrs)			for (const Attr *A : Attrs)
	AddAttr(A);			AddAttr(A);
	}			}

	void ODRHash::AddAttr(const Attr *A) {			void ODRHash::AddAttr(const Attr *A) {
	ID.AddInteger(A->getKind());			ID.AddInteger(A->getKind());

	// FIXME: This should be auto-generated as part of Attr.td			// FIXME: This should be auto-generated as part of Attr.td
				aaron.ballman: 100% agreed.
	switch (A->getKind()) {			switch (A->getKind()) {
	case attr::Aligned: {			case attr::Aligned: {
	auto *M = cast<AlignedAttr>(A);			auto *M = cast<AlignedAttr>(A);
	ID.AddBoolean(M->isAlignmentExpr());			ID.AddBoolean(M->isAlignmentExpr());
	if (M->isAlignmentExpr()) {			if (M->isAlignmentExpr()) {
	Expr *AlignmentExpr = M->getAlignmentExpr();			Expr *AlignmentExpr = M->getAlignmentExpr();
	ID.AddBoolean(AlignmentExpr);			ID.AddBoolean(AlignmentExpr);
	if (AlignmentExpr)			if (AlignmentExpr)
	AddStmt(AlignmentExpr);			AddStmt(AlignmentExpr);
	} else {			} else {
	ID.AddString(M->getAlignmentType()->getType().getAsString());			ID.AddString(M->getAlignmentType()->getType().getAsString());
	}			}
	break;			break;
	}			}
				case attr::AlignValue: {
				auto *M = cast<AlignValueAttr>(A);
				Expr *AlignmentExpr = M->getAlignment();
				ID.AddBoolean(AlignmentExpr);
				if (AlignmentExpr)
				AddStmt(AlignmentExpr);
				break;
				}
	case attr::EnumExtensibility: {			case attr::EnumExtensibility: {
	auto *M = cast<EnumExtensibilityAttr>(A);			auto *M = cast<EnumExtensibilityAttr>(A);
	ID.AddInteger(M->getExtensibility());			ID.AddInteger(M->getExtensibility());
	break;			break;
	}			}
	default:			default:
	break;			break;
	}			}
	[... 674 lines not shown ...]
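
The hashing scheme in `AddAttrs`/`AddAttr` above (skip implicit attributes, hash the count, then each kind followed by its kind-specific arguments) can be modelled without Clang. This is a simplified sketch under stated assumptions: `SketchKind`, `SketchAttr`, and `hashAttrs` are hypothetical, a vector of integers stands in for the `FoldingSetNodeID` that the real code feeds, and every attribute carries at most one integer argument.

```cpp
#include <cstdint>
#include <vector>

enum class SketchKind : std::uint64_t { Packed, AlignValue, EnumExtensibility };

// Hypothetical simplified attribute: a kind, one integer argument, and a flag
// saying whether the compiler added it implicitly.
struct SketchAttr {
  SketchKind Kind;
  std::uint64_t Arg;
  bool Implicit;
};

// The recorded sequence of integers plays the role of the hash ID.
using SketchID = std::vector<std::uint64_t>;

SketchID hashAttrs(const std::vector<SketchAttr> &Attrs) {
  // Mirror getHashableAttrs: only explicitly written attributes take part.
  std::vector<const SketchAttr *> Hashable;
  for (const SketchAttr &A : Attrs)
    if (!A.Implicit)
      Hashable.push_back(&A);

  SketchID ID;
  ID.push_back(Hashable.size()); // Count first, as in AddAttrs.
  for (const SketchAttr *A : Hashable) {
    ID.push_back(static_cast<std::uint64_t>(A->Kind));
    switch (A->Kind) {
    case SketchKind::AlignValue:
    case SketchKind::EnumExtensibility:
      ID.push_back(A->Arg); // Kinds whose argument is covered.
      break;
    default:
      break; // Other kinds contribute only their kind.
    }
  }
  return ID;
}
```

With this model, `align_value(64)` and `align_value(32)` produce different IDs, while an attribute kind that falls into the `default` case is distinguished only by its presence, which matches the summary's note that arguments are not covered for all attributes.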

clang/test/Modules/odr_hash.cpp

[... 3,631 lines not shown ...]
class S {
static auto Function2 = invalid2<Num>;		static auto Function2 = invalid2<Num>;
// expected-error@first.h:* {{'Types::DependentSizedExtVector::invalid2' has different definitions in different modules; definition in module 'FirstModule' first difference is function body}}		// expected-error@first.h:* {{'Types::DependentSizedExtVector::invalid2' has different definitions in different modules; definition in module 'FirstModule' first difference is function body}}
// expected-note@second.h:* {{but in 'SecondModule' found a different body}}		// expected-note@second.h:* {{but in 'SecondModule' found a different body}}
static auto Function3 = valid<Num>;		static auto Function3 = valid<Num>;
};		};
#endif		#endif
} // namespace DependentSizedExtVector		} // namespace DependentSizedExtVector

namespace Attributes {		namespace Attributes {
		vsapsai: As we land hashing for C and Objective-C, we can move these tests to their own file. But for now I think it makes sense to keep everything in odr_hash.cpp, though I don't have a strong preference.
#if defined(FIRST)		#if defined(FIRST)
struct __attribute__((packed)) PackingPresence {		struct __attribute__((packed)) PackingPresence {
char x;		char x;
long y;		long y;
};		};
#elif defined(SECOND)		#elif defined(SECOND)
struct PackingPresence {		struct PackingPresence {
char x;		char x;
[... 135 lines not shown ...]
EnumExtensibilityValues testEnumAttrValue;		EnumExtensibilityValues testEnumAttrValue;
// expected-error@first.h:* {{'Types::Attributes::EnumExtensibilityValues' has different definitions in different modules; first difference is definition in module 'FirstModule' found attribute ' __attribute__((enum_extensibility("open")))'}}		// expected-error@first.h:* {{'Types::Attributes::EnumExtensibilityValues' has different definitions in different modules; first difference is definition in module 'FirstModule' found attribute ' __attribute__((enum_extensibility("open")))'}}
// expected-note@first.h:* {{attribute specified here}}		// expected-note@first.h:* {{attribute specified here}}
// expected-note@second.h:* {{but in 'SecondModule' found different attribute argument ' __attribute__((enum_extensibility("closed")))'}}		// expected-note@second.h:* {{but in 'SecondModule' found different attribute argument ' __attribute__((enum_extensibility("closed")))'}}
// expected-note@second.h:* {{attribute specified here}}		// expected-note@second.h:* {{attribute specified here}}
#endif		#endif

#if defined(FIRST)		#if defined(FIRST)
		struct TypedefAttributePresence {
		typedef double *AlignedDoublePtr __attribute__((align_value(64)));
		};
		struct TypedefDifferentAttributeValue {
		typedef double *AlignedDoublePtr __attribute__((align_value(64)));
		};
		#elif defined(SECOND)
		struct TypedefAttributePresence {
		typedef double *AlignedDoublePtr;
		};
		struct TypedefDifferentAttributeValue {
		typedef double *AlignedDoublePtr __attribute__((align_value(32)));
		};
		#else
		TypedefAttributePresence testTypedefAttrPresence;
		// expected-error@first.h:* {{'Types::Attributes::TypedefAttributePresence' has different definitions in different modules; first difference is definition in module 'FirstModule' found 'AlignedDoublePtr' with attribute 'align_value'}}
		// expected-note@first.h:* {{attribute specified here}}
		// expected-note@second.h:* {{but in 'SecondModule' found 'AlignedDoublePtr' with no attribute}}
		TypedefDifferentAttributeValue testDifferentArgumentsInTypedefAttribute;
		// expected-error@first.h:* {{'Types::Attributes::TypedefDifferentAttributeValue' has different definitions in different modules; first difference is definition in module 'FirstModule' found 'AlignedDoublePtr' with attribute ' __attribute__((align_value(64)))'}}
		// expected-note@first.h:* {{attribute specified here}}
		// expected-note@second.h:* {{but in 'SecondModule' found 'AlignedDoublePtr' with different attribute argument ' __attribute__((align_value(32)))'}}
		// expected-note@second.h:* {{attribute specified here}}
		#endif

		#if defined(FIRST)
#define PACKED __attribute__((packed))		#define PACKED __attribute__((packed))
struct PACKED AttributeInMacro {		struct PACKED AttributeInMacro {
char x;		char x;
long y;		long y;
};		};

#pragma clang attribute push (__attribute__((abi_tag("a"))), apply_to = any(record(unless(is_union))))		#pragma clang attribute push (__attribute__((abi_tag("a"))), apply_to = any(record(unless(is_union))))
struct AttributeInPragma { char x; };		struct AttributeInPragma { char x; };
[... 1,201 lines not shown ...]