This is an archive of the discontinued LLVM Phabricator instance.

[clang] Add serialization for loop hint annotation tokens
ClosedPublic

Authored by mikerice on Nov 21 2022, 12:03 PM.

Details

Summary

When late parsed templates are used with PCH, tokens are serialized. The
existing code does not handle annotation tokens which can occur due to
various pragmas.

This patch implements the serialization for annot_pragma_loop_hint.

This also enables use of OpenMP pragmas and #pragma unused which do not
need special serialization of the PtrData field.

There are other pragmas that can be used with late template parsing which
we can implement in separately patches as needed.

Fixes https://github.com/llvm/llvm-project/issues/39504

Diff Detail

Event Timeline

mikerice created this revision.Nov 21 2022, 12:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 21 2022, 12:03 PM
mikerice requested review of this revision.Nov 21 2022, 12:03 PM

I think you also need to update ASTBitCodes.h to bump VERSION_MAJOR, and the changes should have a release note.

clang/include/clang/Lex/Token.h
338

ArrayRef does not own its underlying data, so this looks like a potential use-after-free waiting to happen despite being existing code.

clang/lib/Serialization/ASTReader.cpp
1687

Ah, so you make an allocated copy of the tokens, so this isn't a use-after-free but it's still pretty fragile. I wonder if we should switch to using SmallVector in a follow-up NFC change?

clang/lib/Serialization/ASTWriter.cpp
4397–4398

There's a whole lot of annotation tokens that are not loop hints; is something preventing them from getting here?

mikerice updated this revision to Diff 477281.Nov 22 2022, 1:17 PM
mikerice marked 2 inline comments as done.

Bump the AST version.
Add an entry to the release notes.

clang/lib/Serialization/ASTReader.cpp
1687

I can look into that and follow-up. Not sure why that choice was made originally.

clang/lib/Serialization/ASTWriter.cpp
4397–4398

I believe the non-pragma annotation tokens are not created at this point since Sema hasn't been run over the late-parsed template. Annotation tokens for pragmas will hit this assertion until someone implements serialization for that pragma. This is a little better than the assertion we had before. Another option would be to give an error here if we see a token we haven't implemented yet.

rnk added a comment.Nov 22 2022, 1:32 PM

It's unfortunate that nobody has carried forward an RFC to disable fdelayed-template-parsing:
https://discourse.llvm.org/t/how-do-you-access-the-body-of-a-template-function-in-the-ast/61829/7

This feature is a source of bugs and ongoing maintenance costs like this, and I don't believe it is creating enough value to justify itself anymore.

Feel free to do what you have to, though.

ping, anything else needed for this?

aaron.ballman accepted this revision.Nov 29 2022, 8:22 AM

It's unfortunate that nobody has carried forward an RFC to disable fdelayed-template-parsing:
https://discourse.llvm.org/t/how-do-you-access-the-body-of-a-template-function-in-the-ast/61829/7

This feature is a source of bugs and ongoing maintenance costs like this, and I don't believe it is creating enough value to justify itself anymore.

I think the compatibility it gives with MSVC is sufficient value to justify its existence (until MSVC stops supporting delayed template parsing entirely), but we should be exploring whether we have the correct default these days in terms of enabling/disabling it for targets.

ping, anything else needed for this?

LGTM now, thanks!

This revision is now accepted and ready to land.Nov 29 2022, 8:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 29 2022, 10:56 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
shafik added a subscriber: shafik.Feb 5 2023, 2:10 PM

It looks like https://github.com/llvm/llvm-project/issues/60543 is hitting your added llvm_unreachable("missing serialization code for annotation token"); is this expected.

It looks like https://github.com/llvm/llvm-project/issues/60543 is hitting your added llvm_unreachable("missing serialization code for annotation token"); is this expected.

I believe it's expected that we hit this assertion - it's clearer than the original message, and more directly indicates that #pragma pack cannot be serialized.

It looks like https://github.com/llvm/llvm-project/issues/60543 is hitting your added llvm_unreachable("missing serialization code for annotation token"); is this expected.

I believe it's expected that we hit this assertion - it's clearer than the original message, and more directly indicates that #pragma pack cannot be serialized.

Failed assertions or reaching code marked as unreachable is always a bug -- we should be failing gracefully with diagnostics and not crashing.

I believe it's expected that we hit this assertion - it's clearer than the original message, and more directly indicates that #pragma pack cannot be serialized.

Failed assertions or reaching code marked as unreachable is always a bug -- we should be failing gracefully with diagnostics and not crashing.

My intention is to go through all the pragmas and implement serialization for any that need it. If there are some that we don't want to do or if this takes a long time, I will do ahead and add a diagnostic.