This is an archive of the discontinued LLVM Phabricator instance.

Treat `std::move`, `forward`, and `move_if_noexcept` as builtins.
ClosedPublic

Authored by rsmith on Apr 7 2022, 4:54 PM.

Details

Summary

We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.

This change aims to reduce various costs associated with these functions

  • per-instantiation memory costs, compile time and memory costs due to

creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.

We now automatically infer the const and nothrow attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.

In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.

The same infrastructure is extended to the existing MSVC builtin
__GetExceptionInfo, which is now only recognized in namespace std
like it always should have been.

Diff Detail

Event Timeline

rsmith created this revision.Apr 7 2022, 4:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2022, 4:54 PM
rsmith requested review of this revision.Apr 7 2022, 4:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2022, 4:54 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Thank you for this, it's a great improvement! I think it's generally correct, but I did have some minor questions.

Most of these improvements are very small, but I measured a 3% decrease in -O0 object file size for a simple C++ source file using the standard library after this change.

Any chance you'd be interested in submitting this patch to http://llvm-compile-time-tracker.com/ (instructions at http://llvm-compile-time-tracker.com/about.php) so we can have an idea of how compile times are impacted? (It's not critical, but having some more compile time performance measurements would be helpful to validate that this actually saves compile time as expected.

clang/include/clang/Basic/DiagnosticSemaKinds.td
6594–6595

Thank you for making this one on by default :-)

clang/lib/Sema/SemaExpr.cpp
20542

I wonder how often these get passed around as function objects... e.g.,

template <typename Fn>
void func(Fn &&Call);

int main() {
  func(std::move<int>);
}

(I don't see evidence that this is a common code pattern, but C++ surprises me often enough that I wonder if this will cause issues.)

clang/lib/Sema/SemaInit.cpp
8218–8221

Do we need to make this call every time we've called FixOverloadedFunctionReference()? I guess I'm not seeing the changes that necessitated this call (or the one below).

clang/test/SemaCXX/builtin-std-move.cpp
13

Formatting looks like it's gone wonky in this file.

joerg added a subscriber: joerg.Apr 9 2022, 5:55 PM

As is, I think this conflicts with -ffreestanding assumptions or at the very least the spirit.

As is, I think this conflicts with -ffreestanding assumptions or at the very least the spirit.

Why? These functions are in <utility> which is not required in freestanding, but implementations are allowed to support more anyway (http://eel.is/c++draft/compliance#2). As the codegen doesn't emit a call to a library function but is purely using language facilities to reimplement the functionality, these don't seem to be in conflict with freestanding to me. If you could expound on what problems you see, that'd be helpful.

joerg added a comment.Apr 11 2022, 7:49 AM

The patch contains at least one user visible change that would be quite surprising: it is no longer possible to intentionally set a break point on std::move.

Thinking more about it, what about a slightly different implementation strategy? Provide a compiler built-in __builtin_std_move. If it is present, libc++ can alias it into namespace std using regular C++. Much of the name-matching and argument checking in various places disappears. The check to skip template instantiation can be done the same way. Over-all, it should be much less intrusive with the same performance benefits for an built-in-aware STL. It completely side-steps the question of ignoring the actual implementation of std::move in the source, because there is none.

rsmith updated this revision to Diff 422305.Apr 12 2022, 11:47 AM
  • Add support for -fno-builtin, -ffreestanding, -fno-builtin-std-move.
rsmith updated this revision to Diff 422311.Apr 12 2022, 12:10 PM
  • Document -fno-builtin-std-foo.

Any chance you'd be interested in submitting this patch to http://llvm-compile-time-tracker.com/ (instructions at http://llvm-compile-time-tracker.com/about.php) so we can have an idea of how compile times are impacted? (It's not critical, but having some more compile time performance measurements would be helpful to validate that this actually saves compile time as expected.

Nice idea, done:

The patch contains at least one user visible change that would be quite surprising: it is no longer possible to intentionally set a break point on std::move.

Yeah, I think it's reasonable to support -fno-builtin-std-move and the like -- done. The intended model here is to treat these std:: functions just like how we treat the functions in the global namespace for which we provide built-in semantics, so I think -fno-builtin and -ffreestanding and the like should all treat these functions the same way they treat malloc and strlen and such.

Thinking more about it, what about a slightly different implementation strategy? Provide a compiler built-in __builtin_std_move. If it is present, libc++ can alias it into namespace std using regular C++. Much of the name-matching and argument checking in various places disappears. The check to skip template instantiation can be done the same way. Over-all, it should be much less intrusive with the same performance benefits for an built-in-aware STL. It completely side-steps the question of ignoring the actual implementation of std::move in the source, because there is none.

This is an interesting idea, but I don't think it would work out well. As a superficial problem, C++ aliases (eg, using declarations) don't let you change the name of a function. More fundamentally, I think this would end up being a lot more complicated: we'd need to implement the rules for how to deduce parameter types and transform them into return types ourselves, we'd need to somehow support std::forward's explicit template argument, and in C++17 and before, it's valid to take the address of std::move and co, which would require us to invent a function definition for them in IR generation. This would also be a major deviation from how we treat all other standard library functions for which we have built-in knowledge.

clang/lib/Sema/SemaExpr.cpp
20542

I expect people will tell us if it this fires a lot... if it does, we can consider making the C++20 error a DefaultError ExtWarn.

clang/lib/Sema/SemaInit.cpp
8218–8221

We only need to do this in places that assume that the result doesn't have a placeholder type. Specifically, FixOverloadedFunctionReference can now take us from an expression whose type is the "overloaded function" builtin type to an expression whose type is the "builtin function" builtin type, after overload resolution picks a specific template specialization, which these places were not prepared for (they thought they'd always get a real function type).

In this case, we first build an initialization sequence (for example, for int &&(*p)(int&) = std::move;, we'd build this sequence: (1) resolve std::move to std::move<int>, then (2) decay std::move<int> to a function pointer), and then we apply it. If step (1) leaves us with a builtin function, we end up decaying *that* to a pointer, producing <builtin-function-type>*, which is an invalid type. The other case is similar: we first build an implicit conversion sequence then apply it and assume it will make sense. Other places that call FixOverloadedFunctionReference don't assume that they can blindly perform operations on its result, so they don't need this handling.

This never happened before because we never before had a builtin function that both requires overload resolution and uses a placeholder type as the type of a reference to the function. We do the latter here so that we can detect uses of the function that aren't calls, in order to trigger a real instantiation of the body in the rare cases where they're needed.

clang/test/SemaCXX/builtin-std-move.cpp
13

What are you referring to? The lines longer than 80 columns, or something else?

FWIW, long lines are common in our test/ files. Substantially more than half of them go over 80 columns:

$ rgrep '.\{81\}' clang/test  --files-with-match | wc -l
12481
$ rgrep '.\{81\}' clang/test  --files-without-match | wc -l
7194
rsmith updated this revision to Diff 422366.Apr 12 2022, 4:47 PM
  • Switch to a different example with underscores.
xbolva00 added inline comments.
clang/docs/CommandGuide/clang.rst
264

-fno-builtin=xxx,yyy,zzz

wdyt?

rsmith added inline comments.Apr 12 2022, 6:53 PM
clang/docs/CommandGuide/clang.rst
264

Seems reasonable to me to treat -fno-builtin-<blah> and -fno-builtin=<blah> as equivalent, and I certainly prefer the latter form. I think that's somewhat orthogonal to this patch, though: I'm not introducing -fno-builtin-blah here, only documenting it. (It's a GCC flag originally.)

Any chance you'd be interested in submitting this patch to http://llvm-compile-time-tracker.com/ (instructions at http://llvm-compile-time-tracker.com/about.php) so we can have an idea of how compile times are impacted? (It's not critical, but having some more compile time performance measurements would be helpful to validate that this actually saves compile time as expected.

Nice idea, done:

Awesome, thank you for that information! Not a massive win, but definitely seems like an improvement worth having.

The patch contains at least one user visible change that would be quite surprising: it is no longer possible to intentionally set a break point on std::move.

Yeah, I think it's reasonable to support -fno-builtin-std-move and the like -- done. The intended model here is to treat these std:: functions just like how we treat the functions in the global namespace for which we provide built-in semantics, so I think -fno-builtin and -ffreestanding and the like should all treat these functions the same way they treat malloc and strlen and such.

I can live with it. :-)

Thinking more about it, what about a slightly different implementation strategy? Provide a compiler built-in __builtin_std_move. If it is present, libc++ can alias it into namespace std using regular C++. Much of the name-matching and argument checking in various places disappears. The check to skip template instantiation can be done the same way. Over-all, it should be much less intrusive with the same performance benefits for an built-in-aware STL. It completely side-steps the question of ignoring the actual implementation of std::move in the source, because there is none.

This is an interesting idea, but I don't think it would work out well. As a superficial problem, C++ aliases (eg, using declarations) don't let you change the name of a function. More fundamentally, I think this would end up being a lot more complicated: we'd need to implement the rules for how to deduce parameter types and transform them into return types ourselves, we'd need to somehow support std::forward's explicit template argument, and in C++17 and before, it's valid to take the address of std::move and co, which would require us to invent a function definition for them in IR generation. This would also be a major deviation from how we treat all other standard library functions for which we have built-in knowledge.

Ah drat, great point about not being able to change the name of the function and the extra complexity.

Do you have ideas on how we can improve the debugging checkpoint behavior (if at all)?

clang/lib/Sema/SemaInit.cpp
8218–8221

Ah, thank you for that explanation, that helps a bunch!

clang/test/SemaCXX/builtin-std-move.cpp
13

Oh, the line length doesn't bother me (we smash all over that in tests, as you've seen). It was more the extra indentation within a namespace -- we don't usually indent like that and it caught me off guard. It's not a major deal though (the same indentation happens in clang/test/SemaCXX/builtin-std-move-nobuiltin.cpp).

Do you have ideas on how we can improve the debugging checkpoint behavior (if at all)?

I think we just live with it, like we do for other builtin functions. (There might be things we can do by emitting inlining info into the debug info. If we do that, we should presumably do it for all builtin lib functions.)

clang/test/SemaCXX/builtin-std-move.cpp
13

Gotcha.

This is just what my editor / fingers happened to do, but now I'm consciously thinking about it, I see a bit of nuance here: I think it makes sense to not indent a namespace that spans a whole file, because the indent doesn't add anything for the reader and loses two columns of useful screen real estate. But when the namespace is covering a relatively small scope it seems useful to me to indent it to contrast the contents of the namespace against the surrounding code that's outside the namespace, in the same way it makes sense to indent a class definition to contrast the class scope against surrounding code.

It looks like indenting namespace bodies is pretty common in the clang test suite: of the 1917 files that contain a line matching /^namespace.*$/, 1171 of those files have a next line that is indented, which I think makes sense because namespaces in tests will tend to be short, and intended to semantically separate a portion of the test from the rest of the file. (By contrast, only ~10% of the clang sources and headers have indented namespaces, and from a quick sampling it looks like they basically follow the same pattern: indented namespaces are mostly used for short scopes and unindented ones are mostly used for longer / whole-file scopes.)

rnk added a comment.Apr 14 2022, 4:13 PM

Generally speaking, this sounds like a good idea to me. One time in 2019 I used -ftime-trace+ClangBuildAnalyzer and it told me that std::unique_ptr was the most expensive template because it is instantiated so much. Those results don't even capture the -O0 object file size impact.

Do you have ideas on how we can improve the debugging checkpoint behavior (if at all)?

I think we just live with it, like we do for other builtin functions. (There might be things we can do by emitting inlining info into the debug info. If we do that, we should presumably do it for all builtin lib functions.)

Honestly, I don't think it's worth the debug info bytes to describe these inlined call sites. Debug info isn't free.

aaron.ballman accepted this revision.Apr 15 2022, 4:28 AM

Do you have ideas on how we can improve the debugging checkpoint behavior (if at all)?

I think we just live with it, like we do for other builtin functions. (There might be things we can do by emitting inlining info into the debug info. If we do that, we should presumably do it for all builtin lib functions.)

Okie dokie, so be it.

The changes LGTM aside from a previous question about diagnostic wording (feel free to accept the suggestion or not as you see fit). Thanks for working on this!

clang/docs/ReleaseNotes.rst
167–170

Probably worth moving down to the C++ language changes section because it's C++ specific.

clang/include/clang/Basic/DiagnosticSemaKinds.td
6594–6595

Thoughts on this one?

This revision is now accepted and ready to land.Apr 15 2022, 4:28 AM

Generally speaking, this sounds like a good idea to me. One time in 2019 I used -ftime-trace+ClangBuildAnalyzer and it told me that std::unique_ptr was the most expensive template because it is instantiated so much. Those results don't even capture the -O0 object file size impact.

Do you have ideas on how we can improve the debugging checkpoint behavior (if at all)?

I think we just live with it, like we do for other builtin functions. (There might be things we can do by emitting inlining info into the debug info. If we do that, we should presumably do it for all builtin lib functions.)

Honestly, I don't think it's worth the debug info bytes to describe these inlined call sites. Debug info isn't free.

+1 there - and also these operations/intrinsics produce no instructions, so far as I understand/know - so for now, LLVM's got to way to represent that anyway (there's some talk in DWARF about how to have multiple "states" for a single instruction location (so, eg, you could step in/out of an inlined function even though you stay at exactly the same instruction))

rsmith updated this revision to Diff 423164.Apr 15 2022, 2:06 PM
rsmith marked 3 inline comments as done.
  • Address review comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
6594–6595

Sorry, I was looking at your comments over email and didn't see the suggestion. Done.

This revision was landed with ongoing or failed builds.Apr 15 2022, 2:09 PM
This revision was automatically updated to reflect the committed changes.
vitalybuka reopened this revision.Apr 16 2022, 12:30 AM
vitalybuka added a subscriber: vitalybuka.

Reverted with e75d8b70370435b0ad10388afba0df45fcf9bfcc to recover bots

This revision is now accepted and ready to land.Apr 16 2022, 12:30 AM
This revision was automatically updated to reflect the committed changes.

Hi, this is also breaking Fuchsia's clang CI builders (https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8816531831869640417/overview). If this will be hard to address, would you mind reverting until a patch is ready?

Hi, this is also breaking Fuchsia's clang CI builders (https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8816531831869640417/overview). If this will be hard to address, would you mind reverting until a patch is ready?

I'm working on a fix. I'll revert if it takes me more than a few minutes. If you'd prefer to not wait, please feel free to revert it yourself.

I'm working on a fix. I'll revert if it takes me more than a few minutes. If you'd prefer to not wait, please feel free to revert it yourself.

Thanks for your patience, should be fixed in rGe43c93dd63cca295ef26ab69cd305816a71d45fd.

Hi, unfortunately there's a build failure on AIX: https://lab.llvm.org/buildbot/#/builders/214/builds/779/steps/9/logs/stdio. Could you take a look?

Hi, unfortunately there's a build failure on AIX: https://lab.llvm.org/buildbot/#/builders/214/builds/779/steps/9/logs/stdio. Could you take a look?

Hi, any feedback on the AIX break? Seems like this might be non-trivial to address (were on an older libc++ level), so perhaps this needs a revert till can be worked out.

Hi, unfortunately there's a build failure on AIX: https://lab.llvm.org/buildbot/#/builders/214/builds/779/steps/9/logs/stdio. Could you take a look?

Hi, any feedback on the AIX break? Seems like this might be non-trivial to address (were on an older libc++ level), so perhaps this needs a revert till can be worked out.

Sorry, I somehow missed this. What you're seeing is a symptom of a libc++ issue (PR12704) that was fixed >10 years ago (rGbff1bfc6be0615ba3036a861fd27b75c96e3297c). I don't think we have explicit rules on how old a version of libc++ we aim to support, but trying to use version 3.1 or earlier with trunk Clang seems at least a little unreasonable. That said, it's probably not too hard to work around this, so I will :) but I would encourage whoever is responsible for libc++ in AIX to look into an upgrade!

thakis added a subscriber: thakis.Apr 20 2022, 5:59 PM

I think your revert to fix the aix CI broke CI everywhere: http://45.33.8.238/linux/74239/step_7.txt

Please take a look and reland for now if it takes a while to fix.

I think your revert to fix the aix CI broke CI everywhere: http://45.33.8.238/linux/74239/step_7.txt

Please take a look and reland for now if it takes a while to fix.

Relanded with a workaround for super-old libc++ in rG72315d02c432a0fe0acae9c96c69eac8d8e1a9f6.

Thanks for getting this fixed this so quickly and sorry for the messy revert.

What you're seeing is a symptom of a libc++ issue (PR12704) that was fixed >10 years ago (rGbff1bfc6be0615ba3036a861fd27b75c96e3297c). I don't think we have explicit rules on how old a version of libc++ we aim to support, but trying to use version 3.1 or earlier with trunk Clang seems at least a little unreasonable. That said, it's probably not too hard to work around this, so I will :) but I would encourage whoever is responsible for libc++ in AIX to look into an upgrade!

Yeah, I'd tend to agree with you :) Thanks for accommodating us and identifying the libc++ issue as well, we'll definitely be looking into this.