This is an archive of the discontinued LLVM Phabricator instance.

[modules][ODRHash] Compare ODR hashes to detect mismatches in duplicate ObjCInterfaceDecl.
AbandonedPublic

Authored by vsapsai on Apr 22 2022, 11:39 AM.

Details

Summary

The purpose of the change is to start using ODR hash for comparison
instead of ASTStructuralEquivalence and ODR hash calculation is,
unfortunately, incomplete. The decision to use ODR hash is made because
we can use the same mechanism while checking duplicates encountered
during deserialization. For that case ASTStructuralEquivalence is not a
good fit because it can lead to excessive deserialization and
deserialized decls would be used only for ASTStructuralEquivalence.

rdar://82908223

Diff Detail

Event Timeline

vsapsai created this revision.Apr 22 2022, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 11:39 AM
Herald added a subscriber: ributzka. · View Herald Transcript
vsapsai requested review of this revision.Apr 22 2022, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 11:39 AM

As a proof of concept how ODR hash comparison can be used during deserialization, please see D124289. My point is to highlight the amount of reuse we gain from using ODR hash (I particularly enjoy the test case reuse). I agree the error messages aren't particularly useful yet and it will take some work to make code in ASTReader::diagnoseOdrViolations usable from parser. But I believe that should be doable.

This sounds reasonable to me. What use-cases does ASTStructuralEquivalence fit better than ODR hashes?

clang/include/clang/Sema/Sema.h
3300

I'm not sure I'm a fan of using the exact same function name for checking ODR hashes and structural equivalence.

This sounds reasonable to me. What use-cases does ASTStructuralEquivalence fit better than ODR hashes?

I think the biggest advantage of ASTStructuralEquivalence is that it doesn't affect memory consumption by Decls. But it makes repeated comparisons more expensive.

Another ASTStructuralEquivalence advantage is that it's easier to add new checks. For example, comparing ivars is pretty trivial with ASTStructuralEquivalence but with ODR hash we need to handle that ivar types can be structs/unions, including anonymous and nested structs.

Cannot say that for sure but I think ASTStructuralEquivalence is easier to extend and to adopt for different purposes, while ODR hash seems to be more invasive. But this argument is more hand-wavy.

Most likely ASTStructuralEquivalence has other positive qualities but that's what I was able to come up off the top of my head. And you can see that in my decision-making ASTStructuralEquivalence doesn't look like a compelling option.

clang/include/clang/Sema/Sema.h
3300

That's my failure of imagination, will come with something different. And after you've pointed it out, I've realized it's not a good use of overloading too.

jansvoboda11 added inline comments.Apr 28 2022, 2:15 AM
clang/lib/AST/ODRHash.cpp
528

Is this important to implement now, or are you fine leaving this be for the time being?

This sounds reasonable to me. What use-cases does ASTStructuralEquivalence fit better than ODR hashes?

I think the biggest advantage of ASTStructuralEquivalence is that it doesn't affect memory consumption by Decls. But it makes repeated comparisons more expensive.

Another ASTStructuralEquivalence advantage is that it's easier to add new checks. For example, comparing ivars is pretty trivial with ASTStructuralEquivalence but with ODR hash we need to handle that ivar types can be structs/unions, including anonymous and nested structs.

Cannot say that for sure but I think ASTStructuralEquivalence is easier to extend and to adopt for different purposes, while ODR hash seems to be more invasive. But this argument is more hand-wavy.

Most likely ASTStructuralEquivalence has other positive qualities but that's what I was able to come up off the top of my head. And you can see that in my decision-making ASTStructuralEquivalence doesn't look like a compelling option.

Makes sense. If most Decl classes had the member variables that enable ODR hashing, would it make sense to keep structural equivalence logic around? That sounds a bit redundant to me.

iains added a subscriber: iains.Apr 28 2022, 2:22 AM

This sounds reasonable to me. What use-cases does ASTStructuralEquivalence fit better than ODR hashes?

I think the biggest advantage of ASTStructuralEquivalence is that it doesn't affect memory consumption by Decls. But it makes repeated comparisons more expensive.

Another ASTStructuralEquivalence advantage is that it's easier to add new checks. For example, comparing ivars is pretty trivial with ASTStructuralEquivalence but with ODR hash we need to handle that ivar types can be structs/unions, including anonymous and nested structs.

Cannot say that for sure but I think ASTStructuralEquivalence is easier to extend and to adopt for different purposes, while ODR hash seems to be more invasive. But this argument is more hand-wavy.

Most likely ASTStructuralEquivalence has other positive qualities but that's what I was able to come up off the top of my head. And you can see that in my decision-making ASTStructuralEquivalence doesn't look like a compelling option.

Makes sense. If most Decl classes had the member variables that enable ODR hashing, would it make sense to keep structural equivalence logic around? That sounds a bit redundant to me.

Presumably ASTStructuralEquivalence can also be used as a tie-breaker in the case of a hash collision?

Presumably ASTStructuralEquivalence can also be used as a tie-breaker in the case of a hash collision?

Ah, that's a good point.

Recollected that ASTStructuralEquivalence by default tends to be order-dependent while ODR hash order-independent. Different approaches can be achieved with different mechanisms but some behavior is easier to achieve.

Another peculiarity of ASTStructuralEquivalence is that it allows more customizability. Initializing StructuralEquivalenceContext is more involved than getODRHash() but adding knobs for tweaking comparison behavior is easier.

clang/lib/AST/ODRHash.cpp
528

When I was posting this patch, I thought it wasn't critical to cover everything immediately. Now I'm thinking about the ways to minimize the uncovered elements. Specifically, I'm thinking about adding support for protocols first and then for interfaces. Protocols aren't as useful as interfaces but they are simpler and can be easier to add.

And AddSubDecl should cover some of the nested decls already, need to add more cases to the test.

vsapsai planned changes to this revision.May 6 2022, 7:27 PM
vsapsai added inline comments.
clang/lib/AST/ODRHash.cpp
528

I've made good progress at handling anonymous RecordDecl. So think I should add handling ivars instead of the lame FIXME.

vsapsai abandoned this revision.Dec 14 2022, 5:22 PM

Abandoning this change as now it is split between D124286 and D140073.