This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Improve the handling of large arrays evaluation.
ClosedPublic

Authored by cor3ntin on Jul 21 2023, 7:12 AM.

Details

Summary

This is a temporary fix (for clang 17) that caps the size of
any array we try to constant evaluate:

There are two limits:
  * We cap to UINT_MAX the size of any constant-evaluated array,
    because the constant evaluator does not support size_t.
  * We cap to `-fconstexpr-steps` elements the size of each individual
    array and dynamic array allocations.
    This works out because the number of constexpr steps already limits
    how many array elements can be initialized, which makes this new
    limit conservatively generous.
    This ensures that the compiler does not crash when attempting to
    constant-fold valid programs.

If a given array reaches the limit, constant evaluation will fail: either
the program is ill-formed until a bigger limit is given, or constant
folding fails and the array is evaluated at runtime.

Fixes #63562

Diff Detail

Event Timeline

cor3ntin created this revision.Jul 21 2023, 7:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2023, 7:12 AM
cor3ntin requested review of this revision.Jul 21 2023, 7:12 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
shafik added a subscriber: shafik.Jul 21 2023, 2:32 PM

Did you evaluate trying to use StepsLeft at all to limit the allocation? Should the limit be settable via the command line, like what we do with steps via -fconstexpr-steps?

cor3ntin updated this revision to Diff 543472.Jul 24 2023, 4:47 AM
  • Use -fconstexpr-steps to limit the size of the array.
  • Apply the logic to both constant evaluation and constant folding: if constant evaluation fails, the limit can now be increased.
cor3ntin added inline comments.Jul 24 2023, 4:49 AM
clang/lib/AST/ExprConstant.cpp
6753–6757

Note that the computation for nothrow is incorrect.

aaron.ballman added inline comments.Jul 24 2023, 5:16 AM
clang/include/clang/Basic/DiagnosticASTKinds.td
356

I think the use of %0 twice here is a bit of a surprise. The first time it's telling you the array bounds, but the second time it's telling you a step limit; these are different things. Array bounds contribute to the step limit, but there's no reason to assume that setting the step limit to the array bounds will fix anything (or even be larger than the current step limit).

clang/lib/AST/ExprConstant.cpp
1039
6544–6546

Unnecessary changes.

6753–6757

Good catch -- are you planning to fix in a follow-up? If not, can you file an issue so we don't lose track of this? (Bonus points for a test case showing that we get this wrong.)

clang/test/SemaCXX/cxx2a-constexpr-dynalloc-limits.cpp
2
53

The wording of this note is a bit unfortunate given that there's no loop in sight. :-(

60

If you set the limit to 1025 do you actually succeed?

80–83
cor3ntin marked 2 inline comments as done.Jul 24 2023, 5:27 AM
cor3ntin added inline comments.
clang/include/clang/Basic/DiagnosticASTKinds.td
356

"use '-fconstexpr-steps' to increase this limit" then?

(The idea was that -fconstexpr-steps needs to be at least that value; I agree that the program could still fail a few instructions later.)

clang/lib/AST/ExprConstant.cpp
6753–6757

I'll file an issue, because it's unclear to me whether this is the only problem with nothrow allocation, or how they should work in general.
Even assuming we fix that, `(T*) new(TOO_BIG, nothrow)` still complains about casting a void* pointer.

I.e., I already looked into it and decided it was way out of scope.

clang/test/SemaCXX/cxx2a-constexpr-dynalloc-limits.cpp
53

Line 32, i.e., the case where the allocation succeeds but we can't do much with the array.

60

Yes... but no, see previous comment. :)
Maybe I can add `constexpr S<std::size_t> s4(100);`, which would have no error at all.

aaron.ballman added inline comments.Jul 24 2023, 5:41 AM
clang/include/clang/Basic/DiagnosticASTKinds.td
356

Yeah, I think this is a case where it's better to just mention the option but not suggest a value.

clang/lib/AST/ExprConstant.cpp
6753–6757

SGTM!

clang/test/SemaCXX/cxx2a-constexpr-dynalloc-limits.cpp
2

Should we add -Wvla to show that when constant folding fails, we're getting a VLA?

53

Ahhh, I see. Hmm, okay, not as bad as I had thought.

60

I think that'd be a good test to add as well, yes.

cor3ntin updated this revision to Diff 543490.Jul 24 2023, 5:46 AM

Address Aaron's comments and add documentation

cor3ntin updated this revision to Diff 543504.Jul 24 2023, 6:14 AM

Add some VLA tests

cor3ntin updated this revision to Diff 543530.Jul 24 2023, 7:07 AM

Add more VLA tests

danilaml added inline comments.
clang/lib/AST/ExprConstant.cpp
1026
1027
aaron.ballman added subscribers: thieta, tstellar, Restricted Project.Jul 24 2023, 7:30 AM

The changes generally seem reasonable to me, but I'm not 100% sure of the impacts of tying this to constexpr steps. That's a vague measure that was used mostly as an escape hatch for recursive constexpr evaluation. It seems reasonable to also tie it to array initialization (where each element of the array is one "step"), but it's not clear to me whether we're going to start rejecting code we used to accept or not. Because this is fixing a crash with code that's becoming more common (mostly through STL constexpr constructors), I see the appeal of getting this into Clang 17, and I think the current changes (tying to the max constexpr steps instead of constexpr steps remaining) are pretty conservative. However, because it's not a regression (this problem has existed for quite a while in Clang) and because we branch tomorrow, I think the most conservative approach would be to land this just after the Clang 17 branch. If post-commit testing doesn't bring up surprises before we start getting closer to putting out release candidates, we can cherry-pick this onto the release branch for early rc testing to widen the coverage and hopefully get the changes into 17. WDYT? CC @tstellar @thieta

The UINT_MAX thing seems like a straightforward bug; if we have time to fix it properly, I'd prefer not to add weird workarounds.

The release note says "unless they are part of a constant expression", but I don't see any code in the implementation that distinguishes folding from constant expression evaluation. Unless this is just referring to the fact that bailing out of folding doesn't produce an error? We might want to consider using a stricter bound for optional folding, though.

How likely is it that we could add some sort of optimization for new expressions so we don't represent each element separately in memory? I know there's no solution in general, but in the cases people actually care about, all/almost all the elements are identical.

The UINT_MAX thing seems like a straightforward bug; if we have time to fix it properly, I'd prefer not to add weird workarounds.

Not sure what you mean. Making sure we use size_t for all array extents is not something that can be fixed overnight; more importantly, it does not help:
even if it did not overflow, the allocation would still fail.

The release note says "unless they are part of a constant expression", but I don't see any code in the implementation that distinguishes folding from constant expression evaluation. Unless this is just referring to the fact that bailing out of folding doesn't produce an error? We might want to consider using a stricter bound for optional folding, though.

I need to update the release notes/commit message; they are not reflective of the current design, which always looks at -fconstexpr-steps in all modes of evaluation. It's much cleaner that way.

How likely is it that we could add some sort of optimization for new expressions so we don't represent each element separately in memory? I know there's no solution in general, but in the cases people actually care about, all/almost all the elements are identical.

I don't think that would make sense in actual code, though some sort of sparse array is something I considered. We already delay allocation in some cases, but we do need to create elements to destroy them, read them, etc. So while it may be possible, it seems... complicated.
I think a more viable long-term fix might be to not crash on allocation failure, and/or to have a way to limit the allocation to some portion of the available memory.

cor3ntin updated this revision to Diff 543606.Jul 24 2023, 9:46 AM

Update release notes and commit message

cor3ntin edited the summary of this revision. (Show Details)Jul 24 2023, 9:47 AM

Not sure what you mean. Making sure we use size_t for all array extents is not something that can be fixed overnight; more importantly, it does not help: even if it did not overflow, the allocation would still fail.

Oh, right, it would be sizeof(APValue) * UINT_MAX, which would break in any practical usage.

I don't think that would make sense in actual code, and having some sort of sparse array is something I considered. And we already delay allocation in some cases, but we do need to create elements to destroy them, read them, etc. So while it may be possible, it seems... complicated.

Definitely not simple, sure.

I think a more viable long term fix might be to not crash on allocation failure, and/or to have a way to limit the allocation to some portion of the available memory.

Making the compiler's behavior depend on the amount of memory installed in the user's computer seems like a non-starter. I think we just have to stick with some combination of:

  • Try to reduce excessive memory usage in constant folding.
  • Add strict memory usage limits for optional constant folding.
  • Maybe consider disobeying the standard slightly in certain cases: the standard requires that we constant-fold the initializers for all global variables, but that might not really be viable for globals that are expensive to evaluate.

Not sure what you mean. Making sure we use size_t for all array extents is not something that can be fixed overnight; more importantly, it does not help: even if it did not overflow, the allocation would still fail.

Oh, right, it would be sizeof(APValue) * UINT_MAX, which would break in any practical usage.

I don't think that would make sense in actual code, and having some sort of sparse array is something I considered. And we already delay allocation in some cases, but we do need to create elements to destroy them, read them, etc. So while it may be possible, it seems... complicated.

Definitely not simple, sure.

I think a more viable long term fix might be to not crash on allocation failure, and/or to have a way to limit the allocation to some portion of the available memory.

Making the compiler's behavior depend on the amount of memory installed in the user's computer seems like a non-starter. I think we just have to stick with some combination of:

  • Try to reduce excessive memory usage in constant folding.

100% agreed

  • Add strict memory usage limits for optional constant folding.

This seems reasonable to me, but I think this should be user controllable so that users with different resource constraints can override whatever default value we pick.

  • Maybe consider disobeying the standard slightly in certain cases: the standard requires that we constant-fold the initializers for all global variables, but that might not really be viable for globals that are expensive to evaluate.

If we can use an existing implementation limit (http://eel.is/c++draft/implimits) to justify our decision, great; otherwise, I think we should ask WG21 for an official escape hatch so that we don't have conformance issues.

Not sure what you mean. Making sure we use size_t for all array extents is not something that can be fixed overnight; more importantly, it does not help: even if it did not overflow, the allocation would still fail.

Oh, right, it would be sizeof(APValue) * UINT_MAX, which would break in any practical usage.

I don't think that would make sense in actual code, and having some sort of sparse array is something I considered. And we already delay allocation in some cases, but we do need to create elements to destroy them, read them, etc. So while it may be possible, it seems... complicated.

Definitely not simple, sure.

I think a more viable long term fix might be to not crash on allocation failure, and/or to have a way to limit the allocation to some portion of the available memory.

Making the compiler's behavior depend on the amount of memory installed in the user's computer seems like a non-starter. I think we just have to stick with some combination of:

  • Try to reduce excessive memory usage in constant folding.

100% agreed

  • Add strict memory usage limits for optional constant folding.

This seems reasonable to me, but I think this should be user controllable so that users with different resource constraints can override whatever default value we pick.

  • Maybe consider disobeying the standard slightly in certain cases: the standard requires that we constant-fold the initializers for all global variables, but that might not really be viable for globals that are expensive to evaluate.

If we can use an existing implementation limit (http://eel.is/c++draft/implimits) to justify our decision, great; otherwise, I think we should ask WG21 for an official escape hatch so that we don't have conformance issues.

Forgot to reply to that, it's already conforming

On the last point, I think an *explicit* limit would be nice, but I chatted with Core and they agreed the current wording was sufficient, and they didn't seem super enthusiastic about adding limits.

Not sure what you mean. Making sure we use size_t for all array extents is not something that can be fixed overnight; more importantly, it does not help: even if it did not overflow, the allocation would still fail.

Oh, right, it would be sizeof(APValue) * UINT_MAX, which would break in any practical usage.

I don't think that would make sense in actual code, and having some sort of sparse array is something I considered. And we already delay allocation in some cases, but we do need to create elements to destroy them, read them, etc. So while it may be possible, it seems... complicated.

Definitely not simple, sure.

I think a more viable long term fix might be to not crash on allocation failure, and/or to have a way to limit the allocation to some portion of the available memory.

Making the compiler's behavior depend on the amount of memory installed in the user's computer seems like a non-starter. I think we just have to stick with some combination of:

  • Try to reduce excessive memory usage in constant folding.

100% agreed

  • Add strict memory usage limits for optional constant folding.

This seems reasonable to me, but I think this should be user controllable so that users with different resource constraints can override whatever default value we pick.

  • Maybe consider disobeying the standard slightly in certain cases: the standard requires that we constant-fold the initializers for all global variables, but that might not really be viable for globals that are expensive to evaluate.

If we can use an existing implementation limit (http://eel.is/c++draft/implimits) to justify our decision, great; otherwise, I think we should ask WG21 for an official escape hatch so that we don't have conformance issues.

Forgot to reply to that, it's already conforming

Oh, excellent!

On the last point, I think an *explicit* limit would be nice, but I chatted with Core and they agreed the current wording was sufficient, and they didn't seem super enthusiastic about adding limits.

"or others" is an amazing amount of latitude for Core to give us, how handy! :-D

FWIW, I personally think having explicit limits is far better than relying on "or others" to do the heavy lifting, but we've got what we need at least.

@efriedma Do you still have objections to landing this now, with the goal of backporting later in the release process? Thanks!

The general approach seems fine. The multiplier for constexpr vs. constant folding can be left for a followup, and we can continue to consider other possible improvements elsewhere.

I guess I have one remaining question here: how does this interact with SFINAE? In other words, if we hit the limit analyzing the signature of a function, do we print an error, or silently remove the function from the overload list?

The general approach seems fine. The multiplier for constexpr vs. constant folding can be left for a followup, and we can continue to consider other possible improvements elsewhere.

I guess I have one remaining question here: how does this interact with SFINAE? In other words, if we hit the limit analyzing the signature of a function, do we print an error, or silently remove the function from the overload list?

SFINAE: https://godbolt.org/z/8can1q7GW (there is no difference between constant expressions that are invalid because they exceed limits and ones that are not constant for other reasons).

efriedma accepted this revision.Jul 27 2023, 10:45 AM

That's scary: it means we can silently behave differently from other compilers in a way that's hard to understand. Is there some way we can provide a diagnostic?

That said, it looks like that's an existing issue, so I'm fine with addressing it separately.

This revision is now accepted and ready to land.Jul 27 2023, 10:45 AM

That's scary: it means we can silently behave differently from other compilers in a way that's hard to understand. Is there some way we can provide a diagnostic?

That said, it looks like that's an existing issue, so I'm fine with addressing it separately.

It might be worth asking the committee for input; I agree with you that it's probably not a desirable behavior.

@efriedma I sent a mail to Core last night, and some discussion is happening. @rsmith made some good points, so hopefully we will be able to produce a hard error in that case.
But given this is preexisting, I'm going to land this, and hopefully we can merge it into the 17 branch in the next few weeks. Thanks!

cor3ntin updated this revision to Diff 545099.Jul 28 2023, 4:33 AM
cor3ntin edited the summary of this revision. (Show Details)

Rebase

This revision was landed with ongoing or failed builds.Jul 28 2023, 5:38 AM
This revision was automatically updated to reflect the committed changes.
antmo added a subscriber: antmo.Jul 28 2023, 7:27 AM

Hi, the clang-armv8-quick bot now fails on Clang::cxx2a-constexpr-dynalloc-limits.cpp:

https://lab.llvm.org/buildbot/#/builders/245/builds/11764

Could you please look at this?

Hi, the clang-armv8-quick bot now fails on Clang::cxx2a-constexpr-dynalloc-limits.cpp:

https://lab.llvm.org/buildbot/#/builders/245/builds/11764

Could you please look at this?

I landed https://reviews.llvm.org/D156542, which should fix the issue. I'm keeping an eye on it!