When initializing FunctionNoProtoType types, zero out the type
qualifiers. This will ensure the ODR hash remains stable as it
hashes the values for these qualifiers for all function types.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Was struggling to think of a good one. How about a test that repeatedly generates a pcm for a func decl with no params and checks if the DECL_FUNCTION record is the same?
Have you looked at clang/test/Modules/odr_hash.cpp? It's where most of the ODR hash testing takes place by testing that Decls can be merged properly instead of checking the contents of pcm files.. Using #if define, it creates multiple modules from the same file. I would suggest creating two functions in each of the modules, then in the main file, using the function to force it to be loaded from the modules and merged together. The test should fail with the current Clang, but pass with your patch. You may need to create your test file if you need different compiler options.
I took a look at writing a test to cover this, but hit the following problem: function qualifiers are only valid on c++ members, and there we cannot create FunctionNoProtoTypes. I couldn't think of a way of testing this by comparing module output that would fail deterministically, as it relies on reusing uninitialized memory that is non-zero.
This is why I'm wondering how we're hitting this problem in the first place. C++ shouldn't be able to create a function without a prototype so why does the ODR hash matter (do we use that in C and I just wasn't aware of it)?
clang/include/clang/AST/Type.h | ||
---|---|---|
3931 | It seems a bit odd to me that we only want to initialize one member of the bits and none of the rest. | |
clang/unittests/AST/DeclTest.cpp | ||
365 ↗ | (On Diff #462248) | You should also pass -std=c89 explicitly so that the test continues to work when we switch the default language mode to C23 someday in the future. |
The ODR hash matters because it is serialized in PCM output, regardless of which function type it is:
https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/ODRHash.cpp#L889-L908
Leaving the FastTypeQuals uninitialized can lead to non-deterministic values of the ODR hash, and so the serialized module output. With enough leading 0s this can lead to a size change in the PCM file, which will start to fail builds due to a size mismatch during PCM validation.
Some alternative approaches could be:
- don't call these methods in VisitFunctionNoProtoType (and audit other callsites)
- remove these methods from FunctionNoProtoType
clang/include/clang/AST/Type.h | ||
---|---|---|
3931 | The reason I only set the FastTypeQuals is because the rest of the FunctionTypeBits are not accessed in the base class or this class, except for ExtInfo which is initialized in the base class. |
clang/include/clang/AST/Type.h | ||
---|---|---|
3931 | This feels error-prone since any new classes derived from FunctionType will need to also have this. I think the safer change is to modify FunctionType::getFastTypeQuals(). If this is a FunctionProtoType, use FastTypeQuals to create a Qualifiers. If it is not, return a default created Qualifiers. This way, we won't need to increase the access to data members. |
LGTM aside from a minor coding style nit, thank you!
clang/include/clang/AST/Type.h | ||
---|---|---|
3882–3883 |
How correct is it to access isConst, isVolatile, isRestrict for FunctionNoProtoType? Yes, we can provide some default value but I'm curious if accessing that default value is correct.
For the record, I've tried to fix the same problem in https://reviews.llvm.org/D104963 in a different way.
That was my initial solution as well, but it seemed safer to ensure these methods always returned a consistent value without auditing all the possible call sites. I agree that it doesn't seem right that these methods are on the base class at all if they are only valid in one of the subclasses.
Thanks for the explanation, it is good to know the reasons for the current direction. Though it still feels like fixing a symptom and not the root cause.
The trouble is -- FunctionNoProtoType is pretty rare to encounter and is getting more rare by the year now that C has removed the feature entirely, but you can't build a prototyped function from an unprototyped one or vice versa (their only relationship is that they're both function types with a known return type). So if we pushed the methods from FunctionType down into FunctionProtoType, we'd have to run through the casting machinery a lot more than we currently do which could potentially regress compile times for very little benefit.
Personally, I think the next step is to add a local assert() to this function to try to find out why we're calling this on functions without a prototype and fix up the call sites. I think the intent is that you should not be calling this function on an unprototyped function and it'd be good for debug builds to yell if we're doing that, but returning a default constructed object is a safe recovery for release builds.
The problem with that approach is it won't work with the unit test we have. When we have more ODR hashing in C, we can replace the unit test with the one in D104963 but I wasn't able to make that one fail reliably (no MSAN on macOS), so not sure it is a sufficient testing.