This is an archive of the discontinued LLVM Phabricator instance.

[Serialization] Add support for (de)serializing #pragma pack
ClosedPublic

Authored by DHowett-MSFT on Feb 6 2023, 8:31 AM.

Details

Summary

Without this, GCH serialization of late template expansions will
encounter an assertion failure.

Changes were required to ASTReader::ReadString to add support for
reading a string from a RecordDataImpl (which is the type consumed by
ReadToken) rather than a RecordData.

#pragma pack slot names are read into preprocessor storage.

This requires incidental changes [Parse,Sema] to move PragmaPackInfo to Sema

There is precedent for ASTReader/ASTWriter to poke into Sema.

Fixes https://github.com/llvm/llvm-project/issues/60543

Diff Detail

Event Timeline

DHowett-MSFT created this revision.Feb 6 2023, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2023, 8:31 AM
  • Add a test
DHowett-MSFT published this revision for review.Feb 6 2023, 10:09 AM

This is a fix for https://github.com/llvm/llvm-project/issues/60543.

I originally approached this fix by moving PragmaPackOptions up to Lex/Token.h to live near PragmaLoopHintInfo; however, that required moving much more of the PragmaMsStack machinery than I was comfortable with.

There is precedent for some pragma infos living in Sema as well as ASTReader/ASTWriter poking into Sema to do their jobs.

I've included a test that both validates that the frontend no longer crashes _and_ that the resultant structures are packed as expected.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2023, 10:09 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
mikerice added inline comments.Feb 6 2023, 10:48 AM
clang/test/PCH/delayed-template-with-pragma-pack.cpp
15

Your test should include testing of the slot string too, so we are sure all fields are serialized and restored correctly. Otherwise this looks reasonable to me.

While this fixes the assertion failure and the immediate issue of whether packing _works_ inside delay-parsed templates in a PCH, it does reveal a follow-on issue that I can't quite trace out.

c++
template <typename T>
void foo() {
#pragma pack(push, 1)
#pragma pack(show)
#pragma pack(pop)
}

results in...

unterminated '#pragma pack (push, ...)' at end of file

I've stepped through ASTReader and verified that we do get all three expected pack pragma nodes

{PSK_Push, ..., 1}
{PSK_Show}
{PSK_Pop, ...}

suggesting that there's something later down the line that messes up the pack stack.

clang/test/PCH/delayed-template-with-pragma-pack.cpp
15

So, I've identified an additional wrinkle...

Unlike #pragma omp, #pragma pack and ... align don't show up in the AST dump/reprint and I cannot verify their continued existence; this is true _even without delayed template expansion or PCH generation._

alignment/packing seem to not be stored as Attrs, so they don't get printed via attr print on the AST nodes or (prettily) via the tablegen-generated pretty printers.

Fixing this is somewhat beyond me at the moment. :)

While this fixes the assertion failure and the immediate issue of whether packing _works_ inside delay-parsed templates in a PCH, it does reveal a follow-on issue that I can't quite trace out.

c++
template <typename T>
void foo() {
#pragma pack(push, 1)
#pragma pack(show)
#pragma pack(pop)
}

results in...

unterminated '#pragma pack (push, ...)' at end of file

Tried your patch, but am not seeing this. Can you share the exact test case where you see this?

Also please add a -triple to your test so it is stable across platforms (maybe -triple x86_64-windows-msvc)

clang/test/PCH/delayed-template-with-pragma-pack.cpp
15

So, I've identified an additional wrinkle...

Unlike #pragma omp, #pragma pack and ... align don't show up in the AST dump/reprint and I cannot verify their continued existence; this is true _even without delayed template expansion or PCH generation._

alignment/packing seem to not be stored as Attrs, so they don't get printed via attr print on the AST nodes or (prettily) via the tablegen-generated pretty printers.

True, so you have to verify it with the generated code instead.

@mikerice Alright, it turns out that PragmaPackAlignStorage isn't suitable for this (actually, I'm not sure what it _is_
suitable for) because it's deleted by the time the template gets parsed again.

The push, slot case failed nondeterministically because the slot name is being UAF'd.

For now (and I am not sure this is correct), I've chosen to store the slot name in the preprocessor storage.

I suspect that other serialized pack info (not from the pragma, but the attr version that is stored in PragmaPackAlignStorage) is subject to the same limitation...

DHowett-MSFT marked 2 inline comments as done.Feb 6 2023, 3:15 PM

The new test checks how tightly things were packed across a stack of packs, pushed and popped. :)

DHowett-MSFT edited the summary of this revision. (Show Details)Feb 6 2023, 3:37 PM
  • Use .copy() like the other nearby code

Test failures seem related to recent driver changes. There's a commit 6a8a423c1864ced060a7041bf6ada7574f35ad4d that purports to fix them.

mikerice accepted this revision.Feb 7 2023, 9:34 AM

LGTM. Thanks for taking care of this.

This revision is now accepted and ready to land.Feb 7 2023, 9:34 AM

Thanks for the review!

I’ll need somebody to help land this, as I don’t have write access to the project. :)

This revision was landed with ongoing or failed builds.Feb 7 2023, 11:41 AM
This revision was automatically updated to reflect the committed changes.