This is an archive of the discontinued LLVM Phabricator instance.

[ODRHash] Hash type-as-written
ClosedPublic

Authored by ChuanqiXu on Jul 24 2023, 11:41 PM.

Details

Summary

This is suggested by @rsmith in https://reviews.llvm.org/D154324#inline-1508868.

The intention of the review page is to get a feeling about the suggestion. And this may be helpful to discuss the direction.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Jul 24 2023, 11:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 24 2023, 11:41 PM
ChuanqiXu requested review of this revision.Jul 24 2023, 11:41 PM

There is two kinds of regression in the current patch. One is false negative and one is false positive.

The first one is as:

#if defined(FIRST)
typedef int T;
struct S3 {
  typedef T a;
};
#elif defined(SECOND)
typedef double T;
struct S3 {
  typedef T a;
};
#else
S3 s3;
#endif

Previously we are able to diagnose that the type T in the translation units are different. But now we don't.

The second one is:

#if defined(FIRST)
struct T1;
class S1 {
  T1* t;
};
#elif defined(SECOND)
typedef struct T1 {} T1;
class S1 {
  T1* t;
};
#else
S1 s1;
#endif

Now we think S1 have different definitions in different TU due to we don't strip the TypeDef decls too.

So we can think these two problems come from the same underlying issue: how to get the underlying type from type-def-types. This is also the motivation issue for the series of patches.

From what @rsmith said:

If we want to include the underlying type of the typedef here, we'll need a new kind of hashing to capture only the value of the typedef and not how it was written. I don't think it's worth it; let's just not include the definition of a referenced typedef in the hash for now.

So now it looks like the only path we can forward? Or can we accept the regressions described in the patch?

clang/test/Modules/odr_hash.cpp
1049

The regression should come from we don't try to find the underlying type of a typedef type anymore. So we failed to find the real type of T is different (one is int and another is double)

4897–4898

The regression should come from that we don't strip typedefs in this patch. (See ODRTypeVisitor::Visit)

From what @rsmith said:

If we want to include the underlying type of the typedef here, we'll need a new kind of hashing to capture only the value of the typedef and not how it was written. I don't think it's worth it; let's just not include the definition of a referenced typedef in the hash for now.

So now it looks like the only path we can forward? Or can we accept the regressions described in the patch?

The cases where we used to diagnose and don't diagnose any more seem acceptable to me, even though ideally we'd still want to catch those problems. It's much more important that we don't reject valid programs here than that we catch all the ODR violations that we possibly can.

However, the new diagnostic for valid code doesn't seem OK -- that's adding a false positive. I think there are a few ways we could handle that; maybe the easiest would be to add back in the typedef stripping logic, but strip typedefs *only* in situation where a typedef resolves to a tag type with the same name.

There's also another really awkward but valid case:

typedef int T;
namespace A {
  #ifdef INNER_T
  typedef int T;
  #endif
  struct X { T n; };
}

According to the ODR, it's OK to have this header twice in the same program with different definitions for INNER_T, because the name T resolves to the same entity either way (the type int). I think we will not reject that case, because ODRHash::AddDecl only considers the name of the declaration, not the scope that it was declared in, but it'd be good to add that as a testcase too.

If we want to avoid the regressions here, I think we could add an additional mechanism to hash the resolved value of a type, in a way that's careful to only hash things that are part of the value of the type, not part of the way the type was written. But I'd consider that to be a lower priority than getting to the point where we don't have false positives.

ChuanqiXu updated this revision to Diff 545007.Jul 27 2023, 8:34 PM

Address comments:

  • add back in the typedef stripping logic, but strip typedefs *only* in situation where a typedef resolves to a tag type with the same name. Note that this is exactly what it does.
  • Add a test:
C++
typedef int T;
namespace A {
  #ifdef INNER_T
  typedef int T;
  #endif
  struct X { T n; };
}
ChuanqiXu updated this revision to Diff 545008.Jul 27 2023, 8:35 PM

Update the correct patch.

@rsmith thanks for the suggestion. I prefer to handle things simply.

clang/lib/AST/ODRHash.cpp
901

So it looks like we've originally done this. I added a test for it too.

Remove redundant FIXME in the test.

rsmith accepted this revision.Jul 28 2023, 2:13 PM

Looks good, thanks!

Please can you add TODO comments to the invalid cases that we are now accepting, to say ideally we should reject them? I've left comments on a few of them where I think the original diagnostics weren't actually what we wanted.

clang/test/Modules/odr_hash-gnu.cpp
54

We should issue a diagnostic here, not for Invalid2, because we can't merge the two declarations of global. (We should also diagnose that global has two definitions...)

clang/test/Modules/odr_hash.cpp
3602

We should diagnose the mismatched declaration of global, not the mismatched definitions of invalid2.

clang/test/Modules/odr_hash.mm
287–295

I think it's better that we're not diagnosing these ones :) These were follow-on diagnostics from the mismatch in Interface4 / Interface5 / Interface6.

This revision is now accepted and ready to land.Jul 28 2023, 2:13 PM
ChuanqiXu updated this revision to Diff 545501.Jul 30 2023, 8:02 PM
ChuanqiXu marked 3 inline comments as done.

Address comments.

Thanks for the reviewing!

This revision was landed with ongoing or failed builds.Jul 30 2023, 8:08 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2023, 8:08 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript