Page MenuHomePhabricator

[Sema] Delay evaluation of std::source_location::current() in default arguments
Needs ReviewPublic

Authored by ilya-biryukov on Jul 11 2022, 7:58 AM.

Details

Summary

... and class member initializers.

They must be evaluated in the context where default argument is actually
used during a call, not in a parameter list where default argument is specified.

We end up moving the evaluation to codegen stage. It would be nice to do
evaluation during semantic analysis, but that would imply that
CXXDefaultArgExpr will need to store an expression different from
ParmVarDecl::getDefaultArgument(). This would have its performance costs
and is much more complex than the current approach.

One theoretical drawback is that we will delay errors that happen during evaluation
of source_location::current calls. This should not happen in practice as calls to
std::source_location::current() will always succeed in conforming implementations.

Fixes #56379.

Diff Detail

Unit TestsFailed

TimeTest
210 msx64 debian > Clang.CodeGenCXX::builtin-source-location.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/16.0.0/include -nostdsysteminc -no-opaque-pointers -std=c++20 -fblocks /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCXX/builtin-source-location.cpp -triple x86_64-unknown-unknown -emit-llvm -o /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/CodeGenCXX/Output/builtin-source-location.cpp.tmp.ll

Event Timeline

ilya-biryukov created this revision.Jul 11 2022, 7:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 7:58 AM
ilya-biryukov requested review of this revision.Jul 11 2022, 7:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 7:58 AM

TODO: add example from cpprefence to the tests:

consteval int f() { return 42; }
consteval auto g() { return &f; }
consteval int h(int (*p)() = g()) { return p(); }
constexpr int r = h();  // OK
constexpr auto e = g(); // ill-formed: a pointer to an immediate function is
                        // not a permitted result of a constant expression

another example that actually breaks:

constexpr int foo(int i) {
    if (i == 1) return i/0;
}

consteval int bar(int i) {
    if (i == 1) return i/0;
      return i;
}

void baz(int i = foo(1), int j = bar(1)) {
}

void xxx() {
  baz();
}
  • Special-case the std::source_location::current calls, delay until codegen
ilya-biryukov added a subscriber: aaron.ballman.

This seems to mostly work and ready for another round of review, but I still need to update the codegen test, it seems it has caught an error with default-initilization of globals.
For reference, I have also tried forcing evaluation of relevant calls during semantic analysis, but did not finish it. See my attempt here if interested: https://reviews.llvm.org/D132941.

Current design of Clang does not seem to be friendly to this kind of change as this involves having different expressions inside CXXDefaultArgExpr even if the underlying parameter is the same.

@aaron.ballman could you take a look at this and share your thoughts too?

ilya-biryukov retitled this revision from [Sema] Fix evaluation context of immediate functions in default arguments to [Sema] Delay evaluation of std::source_location::current() in default arguments.Aug 30 2022, 7:31 AM
ilya-biryukov edited the summary of this revision. (Show Details)

I'm confused about this new strategy of special-casing source_location::current(). Isn't it wrong to eagerly evaluate _other_ calls in default args, as well? ISTM that source_location is simply _exposing_ the bug in where we evaluate these expressions, but that it's actually a more general problem?

I'm confused about this new strategy of special-casing source_location::current(). Isn't it wrong to eagerly evaluate _other_ calls in default args, as well? ISTM that source_location is simply _exposing_ the bug in where we evaluate these expressions, but that it's actually a more general problem?

+1 -- special casing things by name is almost always a red flag in the frontend (we do it from time to time when trying to add compatibility hacks for broken 3rd party headers though, which is reasonable). I don't think there's supposed to be anything special about std::source_location::current() evaluation. Default arguments are always evaluated at the site of the call, not the callee, so eager evaluation seems like the root cause of the problem.

I was also under impression that this should fall out of the C++ semantics, but after playing around with it I am not so sure anymore.

Isn't it wrong to eagerly evaluate _other_ calls in default args, as well?
ISTM that source_location is simply _exposing_ the bug in where we evaluate these expressions, but that it's actually a more general problem?

Yes, it will evaluate calls to other consteval functions and and I don't think there is a way to notice this if there are no errors in the consteval functions. constexpr and consteval functions should produce the same value when called with the same argument IIUC.
source_location::current() is special in that regard and it breaks that assumption. E.g. the fact that CXXDefaultArgExpr references the same expression from ParmVarDecl.
We could delay evaluation of other consteval functions until the default argument is actually used relatively cheaply for other functions, but source_location::current() is the only one that requires a complete revamp of how CXXDefaultArgExpr works (see D132941 for an attempt, it's not pretty, unless there is a simple way I missed).

I am also torn on how to interpret the standard. It might be that evaluating immediate function in default arguments in actually required.
[dcl.fct.default]p5 mentions we need to do full semantic checking of the initializer for default arguments:

The default argument has the same semantic constraints as the initializer in a declaration of a variable of the
parameter type, using the copy-initialization semantics (9.4). The names in the default argument are bound,
and the semantic constraints are checked, at the point where the default argument appears.

Additionally, the standard requires all immediate invocations to be evaluated (unless they are in immediate function context, which is not the case if default arguments are not for consteval function).
Hence, we should also evaluate the immediate invocations as part of the semantic checking we do for initializer of the default argument.

All major compilers certainly evaluate the consteval functions in the default arguments even without calls to the function that has those parameters:
https://gcc.godbolt.org/z/qTzrf6Msx

It might be an implementation quirk and a clarification would be nice. I am leaning towards thinking this behavior is actually standard-compliant.
This does not contradict the fact that default arguments must be evaluated in the context of the caller.
One can't see the difference for consteval functions as they must be replaced by constants.

Both GCC and MSVC seem to special-case std::source_location::current() from what I can tell.

I was also under impression that this should fall out of the C++ semantics, but after playing around with it I am not so sure anymore.

I'm also starting to question that. :-)

Isn't it wrong to eagerly evaluate _other_ calls in default args, as well?
ISTM that source_location is simply _exposing_ the bug in where we evaluate these expressions, but that it's actually a more general problem?

Yes, it will evaluate calls to other consteval functions and and I don't think there is a way to notice this if there are no errors in the consteval functions. constexpr and consteval functions should produce the same value when called with the same argument IIUC.
source_location::current() is special in that regard and it breaks that assumption. E.g. the fact that CXXDefaultArgExpr references the same expression from ParmVarDecl.
We could delay evaluation of other consteval functions until the default argument is actually used relatively cheaply for other functions, but source_location::current() is the only one that requires a complete revamp of how CXXDefaultArgExpr works (see D132941 for an attempt, it's not pretty, unless there is a simple way I missed).

I am also torn on how to interpret the standard. It might be that evaluating immediate function in default arguments in actually required.
[dcl.fct.default]p5 mentions we need to do full semantic checking of the initializer for default arguments:

The default argument has the same semantic constraints as the initializer in a declaration of a variable of the
parameter type, using the copy-initialization semantics (9.4). The names in the default argument are bound,
and the semantic constraints are checked, at the point where the default argument appears.

Additionally, the standard requires all immediate invocations to be evaluated (unless they are in immediate function context, which is not the case if default arguments are not for consteval function).
Hence, we should also evaluate the immediate invocations as part of the semantic checking we do for initializer of the default argument.

All major compilers certainly evaluate the consteval functions in the default arguments even without calls to the function that has those parameters:
https://gcc.godbolt.org/z/qTzrf6Msx

It might be an implementation quirk and a clarification would be nice. I am leaning towards thinking this behavior is actually standard-compliant.
This does not contradict the fact that default arguments must be evaluated in the context of the caller.
One can't see the difference for consteval functions as they must be replaced by constants.

Both GCC and MSVC seem to special-case std::source_location::current() from what I can tell.

https://eel.is/c++draft/expr.call#7.sentence-3 is what causes the argument to be evaluated at the call expression rather than at the declaration while https://eel.is/c++draft/dcl.fct.default#5 makes it clear that the semantic constraints are checked at the declaration location. I couldn't find explicit wording which says this, but I believe that semantic checking in this case also involves determining whether the immediate function is a valid or not (https://eel.is/c++draft/expr.const#14), which should explain the behavior of your example in https://gcc.godbolt.org/z/qTzrf6Msx.

However, https://eel.is/c++draft/dcl.constexpr#5 says that the behavior has to be the same whether the constexpr function is runtime evaluated or constant evaluated, and a function declared consteval is a constexpr function (https://eel.is/c++draft/dcl.constexpr#2.sentence-1). That sure does make std::source_location::current() a rather special function -- if the call were to be evaluated at runtime (which it can't because it's an immediate function), it would give different results.

There is another example we shouldn't make this specific for std::source_location::current(): https://github.com/llvm/llvm-project/issues/57459. I guess we can solve the issue too if we evaluate default argument at the caller position.

@aaron.ballman, that's my reading of the standard as well. Do you think we should proceed with the current approach or is there another direction worth pursuing to make source_location work?

There is another example we shouldn't make this specific for std::source_location::current(): https://github.com/llvm/llvm-project/issues/57459. I guess we can solve the issue too if we evaluate default argument at the caller position.

I think you're right and it would probably work automatically if we were to recreate the default argument expression on every call where it appears.
However, going this route for the particular purpose of checking module visibility looks like an overkill. FWIW, see my attempt at this to fix source_location::current() with immediate invocations D132941.
It's complicated, makes default arguments slower and encounters new failures modes, e.g. there new errors with lambdas in default arguments.

I suspect this could be done relatively cheaply and simply inside the MarkDeclarationsReferencedInExpr call at the bottom of Sema::CheckCXXDefaultArgExpr.
I tried prototyping this and it seems to fit nicely with the current design. We just need to add a few notes to link between the locations of parameter initializers and calls involving default arguments.
(It is still tricky to get right in all cases since we need to enumerate all positions where complete types are required, but certainly does not look impossible).

#include <stdio.h>
#include <source_location>


consteval int fn(std::source_location sl = std::source_location::current()) {
  return sl.line();
}

consteval int fn2(int line = fn()) {
  return line;
}

int main() {
  printf("fn=%d fn2=%d\n", fn(), fn2());
}

I believe this should print fn=14 fn2=14. I don't see how we can get that by special-casing calls to std::source_location::current, and indeed, with this version of the patch, the program instead prints fn=14 fn2=9.

I believe this should print fn=14 fn2=14. I don't see how we can get that by special-casing calls to std::source_location::current, and indeed, with this version of the patch, the program instead prints fn=14 fn2=9.

There is another issue in Clang that prevents it from producing correct results with this patch. We should not evaluate immediate calls inside default arguments of immediate function parameters as they are deemed to be inside the immediate function context. Clang currently evaluates these calls as all parameter scopes are PotentiallyEvaluatedIfUsed and ImmediateFunctionContext is a different value inside the same enumeration :)

Notice how GCC does not produce an error here as it does not attempt to run the invalid constant calculation:
https://gcc.godbolt.org/z/Encr3Kzbs
I believe the GCC behavior is correct here. If we fix this bug, Clang will stop producing an error too and we will have the behavior you described.

The results of calling fn() are also different if you do it inside a default argument for non-consteval function (I use constexpr here for static assert, but dynamic calls will give the same result):
https://gcc.godbolt.org/z/P1x8PGsh6
MSVC and GCC disagree about the value for a recursive call here, I would say that GCC behavior seems reasonable to me (it evaluates consteval calls at the first non-immediate function context).

I believe this should print fn=14 fn2=14. I don't see how we can get that by special-casing calls to std::source_location::current, and indeed, with this version of the patch, the program instead prints fn=14 fn2=9.

There is another issue in Clang that prevents it from producing correct results with this patch. We should not evaluate immediate calls inside default arguments of immediate function parameters as they are deemed to be inside the immediate function context. Clang currently evaluates these calls as all parameter scopes are PotentiallyEvaluatedIfUsed and ImmediateFunctionContext is a different value inside the same enumeration :)

OK...So...

If the correct rule is that consteval functions must be immediately evaluated even when used in a default arg of a non-consteval function (rather than deferring evaluation to the callsite, as would otherwise be the rule for default args!), then GCC's behavior for "fn3" in your example https://gcc.godbolt.org/z/P1x8PGsh6 makes sense. And, yea, also Clang's _current_ behavior for current() makes sense! Because current is not actually generating the source location itself -- the actual source location is coming from a builtin call in a default arg of current()...it's only a wrapper, just like fn() is a wrapper around current().

But, "makes sense" (from implementation point of view) doesn't mean "useful"...which is why you're adding a special case for it. And why GCC also did.

This is really quite awful.

The more I look at this, the more I feel like, if all the above behaviors are correct, source_location::current() simply shouldn't be marked consteval at all. If it was constexpr, instead, it'd work great _without_ needing any special-casing, because the "must evaluate immediately upon seeing function definition even in default arg position" wouldn't apply to it. And that's the behavior we want from it!

@aaron.ballman, that's my reading of the standard as well. Do you think we should proceed with the current approach or is there another direction worth pursuing to make source_location work?

There is another example we shouldn't make this specific for std::source_location::current(): https://github.com/llvm/llvm-project/issues/57459. I guess we can solve the issue too if we evaluate default argument at the caller position.

I think you're right and it would probably work automatically if we were to recreate the default argument expression on every call where it appears.
However, going this route for the particular purpose of checking module visibility looks like an overkill. FWIW, see my attempt at this to fix source_location::current() with immediate invocations D132941.
It's complicated, makes default arguments slower and encounters new failures modes, e.g. there new errors with lambdas in default arguments.

It is just an example. I just wanted to say we could solve more problems if we chose that way. The issue of the module may not so painful and it shouldn't be hard to handle it specially. (Although specially handling looks not good, this is the reason why I didn't fix it). But after all, I am not objecting (nor approving) this for the modules example.


For this patch, I agree with @jyknight, it'd better to be constexpr instead of consteval. I feel it may be better to reflect this with wg21 instead of hacking in compilers.

The more I look at this, the more I feel like, if all the above behaviors are correct, source_location::current() simply shouldn't be marked consteval at all. If it was constexpr, instead, it'd work great _without_ needing any special-casing, because the "must evaluate immediately upon seeing function definition even in default arg position" wouldn't apply to it. And that's the behavior we want from it!

I think it's true that this would give the desired behavior without any hacks on the side of the compiler.
I am not sure why consteval was chosen in the first place, I suspect the goal was to reduce runtime costs.

I prototyped a change that forces constexpr on current() and it definitely works wrt to the scope rules.
If we choose this approach, Clang will not be standard-compliant with C++20. Raising this with WG21 is definitely something we need to do.
@aaron.ballman would you be able to discuss this within the WG21 or provide pointers on how these things are done? I am a total stranger to this process.

In the meantime, what options do we have for existing C++20 Standard and existing STL libraries?
The ones I see are:

  1. Clang keeps the status quo and provides wrong results for consteval source_location::current().
  2. Clang internally switches "hacks up" current() to be constexpr even if it was marked as consteval, which means it can produce runtime calls and silently compile code where using consteval function current() would be an error.
  3. Same as (2), but try to mitigate incompatibilities, e.g. lower calls to current() to constants, do not do the codegen the body of current().
  4. Keep current() consteval, but special-case calls to it.

(1) does not look like an option.
I am not sure entirely various problems that (2) and (3) would entail, but this only seems reasonable if WG21 decides current() must be changed to constexpr in the long run.
A compiler change for (2) is very simple, so I would vote for this if there are not problematic implications for users, e.g. what are the compatibility guarantees when linking together code compiled both with Clang and GCC? Would be break any promises Clang has around those?
If we are to choose between (3) and (4), I would definitely go with (4) as the compiler changes are roughly equivalent, but treating current() as consteval function produces exactly the diagnostics you expect from consteval (e.g. prevents taking a pointer to it).

I'd say we should choose between (2) and (4).
Prefer (2) if WG21 decides to switch to constexpr or we are certain there are no potential issues (see comment about GCC compatibility above).
Prefer (4) otherwise.

@aaron.ballman, that's my reading of the standard as well. Do you think we should proceed with the current approach or is there another direction worth pursuing to make source_location work?

It is just an example. I just wanted to say we could solve more problems if we chose that way. The issue of the module may not so painful and it shouldn't be hard to handle it specially. (Although specially handling looks not good, this is the reason why I didn't fix it). But after all, I am not objecting (nor approving) this for the modules example.

As mentioned earlier, I also think this might be the right call.
I just wanted to mention it's a shift against the current Clang design and exposes a few other issues that need to be worked out and also makes processing default arguments slower.