This is an archive of the discontinued LLVM Phabricator instance.

[clang] Diagnose overly complex Record in __builtin_dump_struct
Needs ReviewPublic

Authored by yronglin on Aug 18 2023, 10:40 AM.

Details

Diff Detail

Event Timeline

yronglin created this revision.Aug 18 2023, 10:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2023, 10:40 AM
Herald added a subscriber: arphaman. · View Herald Transcript
yronglin requested review of this revision.Aug 18 2023, 10:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2023, 10:40 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
yronglin retitled this revision from [Clang] Add assertion to check the value of NumSubExprs/ResultIndex does not overflow to [NFC][Clang] Add assertion to check the value of NumSubExprs/ResultIndex does not overflow.Aug 18 2023, 10:44 AM

Hopefully my clarification on https://reviews.llvm.org/D154784#inline-1531176 makes it clear that assertions aren't necessarily the right tool here - as this path is reachable with actual code & I don't think we should allow that code to reach UB in the compiler in a non-assertionsn build. If we think this is a sufficient size to support, then we should error clearly even in a non-assertions build.

yronglin updated this revision to Diff 554770.Aug 30 2023, 10:25 AM

Addres the comments that we talked in D154784.

yronglin retitled this revision from [NFC][Clang] Add assertion to check the value of NumSubExprs/ResultIndex does not overflow to [clang] Diagnose overly complex Record in __builtin_dump_struct.Aug 30 2023, 10:29 AM
yronglin edited the summary of this revision. (Show Details)

The changes generally LGTM, thank you!

clang/lib/Sema/SemaChecking.cpp
732–733

This will correctly handle diagnosing a gigantic anonymous struct.

rsmith added a subscriber: rsmith.Aug 30 2023, 2:23 PM
rsmith added inline comments.
clang/lib/Sema/SemaChecking.cpp
732–733

Producing an error here seems likely to eventually cause problems in practice for some users: people are using __builtin_dump_struct in generic code for reflection purposes, not just for debugging, and this will cause us to start rejecting complex generic code.

Instead of rejecting, can we produce a tree of PseudoObjectExprs if we have too many steps to store in a single expression?

aaron.ballman added inline comments.Aug 31 2023, 5:58 AM
clang/lib/Sema/SemaChecking.cpp
732–733

Producing an error here seems likely to eventually cause problems in practice for some users: people are using __builtin_dump_struct in generic code for reflection purposes, not just for debugging, and this will cause us to start rejecting complex generic code.

Instead of rejecting, can we produce a tree of PseudoObjectExprs if we have too many steps to store in a single expression?

I think that requires wider discussion -- I don't think __builtin_dump_struct is a reasonable interface we want to support for reflection (in fact, I'd argue it's an explicit non-goal, the same as reflection via -ast-dump). Compile-time reflection is something we're likely to need to support more intentionally and I don't think we're going to want to use this as an interface for it or have to maintain it as a reflection tool long-term. As such, I think producing a tree of PseudoObjectExprs is overkill; you can quote me on this a few years from now when we're laughing at its quaintness, but "16k fields of debug output is enough for anyone" for a debugging interface.

(That said, I think we should be considering what support we want to add to the compiler for reflection in service of the work being done in WG21 on the topic -- if __builtin_dump_struct is being used for reflection in practice, it would be nice to give people a supported, more ergonomic interface for it that we can use for a future version of C++.)

rsmith added inline comments.Aug 31 2023, 3:03 PM
clang/lib/Sema/SemaChecking.cpp
732–733

The bug report https://github.com/llvm/llvm-project/issues/63169 was encountered by a user hitting the previous 256-element limit in practice when using __builtin_dump_struct for reflection. I don't think we can reasonably prevent that from happening, other than -- as you say -- encouraging WG21 to give us a real reflection design we can implement.

yronglin updated this revision to Diff 555376.Sep 1 2023, 7:50 AM

Address Aaron's comments.

yronglin marked 4 inline comments as done.Sep 1 2023, 7:52 AM

Thank you for your review @aaron.ballman @rsmith , I will be happy to continue cook this patch once we reach a consensus.

clang/lib/Sema/SemaChecking.cpp
732–733

fixed.

dblaikie added inline comments.Sep 15 2023, 12:55 PM
clang/lib/Sema/SemaChecking.cpp
732–733

@rsmith what do you think we should do now? Support even larger values than uint16_t worth? (the tree PseudoObjectExpr suggestion you made?)

If we did make the uint16_t limit stand - perhaps that'd apply pressure to folks who have been using this for RTTI to consider other options - it wouldn't break any existing code (that's not already broken in clang, at least - with the overflow stuff happening) but might put pressure on further growth of such techniques?

@rsmith ping on this thread - wondering what you'd prefer the direction be here