This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Fix lifetime issues of temporaries.
ClosedPublic

Authored by Mordante on Nov 3 2021, 12:53 PM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project
Commits
rG1e78d5d008f9: [libc++] Fix lifetime issues of temporaries.
Summary

The ASAN build failed due to using pointers to a temporary whose
lifetime had expired.

Updating the libc++ Docker image to Ubuntu Focal caused some breakage.
This was temporary disabled in D112737. This re-enables two of these
tests.

Diff Detail

Event Timeline

Mordante requested review of this revision.Nov 3 2021, 12:53 PM
Mordante created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2021, 12:53 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript

These tests are outside my interest area, so I don't really care what happens with them; and I also don't know much about how format_args is meant to be used; but, does this proposed PR actually fix the dangling, or just hide it enough to fool the tool?

libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

Style: std::basic_format_args args = format_arg_store; (it's not an aggregate or sequence-container)

Do I understand correctly that std::make_format_args does not actually return a std::format_args? Who came up with that? 😛

But does this fix the dangling? https://en.cppreference.com/w/cpp/utility/format/make_format_args insists that "A formatting argument has reference semantics for user-defined types and does not extend the lifetime of args," so do you really need to do something like

auto args_tuple = std::make_tuple(true, CharT('a'), 42, string);
auto args = std::apply([](auto&&... ts) {
    return std::make_format_args<std::basic_format_context<OutIt, CharT>>(static_cast<decltype(ts)>(ts)...);
}, std::tuple<bool&&, CharT&&, int&&, std::basic_string<CharT>&>(args_tuple));

? (Or consider turning each of the following blocks into its own function, and just calling them like test_foo(std::make_format_args<...>(...)); which would eliminate the lifetime issues in a more realistic way: basically by treating format_args as a parameter-only type.)

Mordante added inline comments.Nov 4 2021, 10:57 AM
libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

Style: std::basic_format_args args = format_arg_store; (it's not an aggregate or sequence-container)

Do I understand correctly that std::make_format_args does not actually return a std::format_args? Who came up with that? 😛

Indeed

Returns an object that stores an array of formatting arguments and can be implicitly converted to std::basic_format_args<Context>.

The basic_format_args stores a pointer to the data of this array. Before the change that array was a temporary, after the change it's no longer a temporary. (The format functions are implemented like return vformat_to(out, fmt.str, make_format_args(args...));. So the temporary's lifetime isn't an issue there.)

But does this fix the dangling? https://en.cppreference.com/w/cpp/utility/format/make_format_args insists that "A formatting argument has reference semantics for user-defined types and does not extend the lifetime of args," so do you really need to do something like

auto args_tuple = std::make_tuple(true, CharT('a'), 42, string);
auto args = std::apply([](auto&&... ts) {
    return std::make_format_args<std::basic_format_context<OutIt, CharT>>(static_cast<decltype(ts)>(ts)...);
}, std::tuple<bool&&, CharT&&, int&&, std::basic_string<CharT>&>(args_tuple));

? (Or consider turning each of the following blocks into its own function, and just calling them like test_foo(std::make_format_args<...>(...)); which would eliminate the lifetime issues in a more realistic way: basically by treating format_args as a parameter-only type.)

No this should work; the values of the fundamental types are copied in the array so they won't dangle. The emphasis in the note is user-defined types. Using a temporary would be an issue for std::string, it's stored as a std::string_view in the array. That's why the argument string isn't a temporary.

Mordante updated this revision to Diff 384817.Nov 4 2021, 11:01 AM

Addresses review comment.

Quuxplusone added inline comments.Nov 4 2021, 1:11 PM
libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

No, this should work; the values of the fundamental types are copied in the array so they won't dangle. The emphasis in the note is user-defined types. Using a temporary would be an issue for std::string, it's stored as a std::string_view in the array.

Okay, everything you say is true for libc++'s implementation — we copy ints but merely reference/view std::strings — but does the Standard actually mandate that that happens, and if so, how? (Does it actually mandate value semantics for a whitelist of primitive types, reference semantics for "everything else"? Is there any gray area? e.g. what happens for int128_t, std::byte, non-const-void*, void(*)()?) Is it possible that a "maliciously conforming" implementation might capture your int argument by reference, even if libc++ never will? We do want the test suite to work even for non-libc++ implementations... although I think it's quite defensible in this case to say "we'll cross that bridge if we ever come to it."

(My vague unease with this is probably refactoring-related. What if somebody six months from now notices that we test make_format_args with a string lvalue but never with a string rvalue: will it be easy for them to add a test for string rvalues, or will they have to reinvent all the stuff we're going through now?)

Anyway, I'm not blocking this; I still say if you're happy with it (and Louis isn't unhappy with it) then it should be shipped.

Mordante marked 2 inline comments as done.Nov 5 2021, 11:22 AM
Mordante added inline comments.
libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

The value is stored in an exposition only variant in basic_format_arg http://eel.is/c++draft/format.arg. (Actually it will never be stored in a variant since optimizations are required http://eel.is/c++draft/format#args-1. I'm working on that requirement, but nothing public yet.)

In our implementation the values are always lvalue references. P2418 allows rvalue references, but Charlie noticed some issues with the wording. So for now I prefer to keep it as is. I did some work on P2418, but I wait for an LWG-issue to see the final direction of that paper. (P2418 has been accepted at the last plenary.)

So for now I'm happy and when I implement P2418 I need to see what the exact impact rvalue references have. (I expect no changes for this test, since I haven't implemented the handle class yet.)

ldionne accepted this revision.Nov 8 2021, 1:47 PM
ldionne added a subscriber: vitaut.

Thanks for the fix. I'd still like to discuss whether we want to file a LWG issue or if I'm misunderstanding the (lack of) importance of this in the user visible API.

libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

This sounds like a very very dangerous API to me -- it's taking stuff by reference but that's not obvious at all. And even worse, basic_format_args(const format-arg-store<Context, Args...>&) is implicit.

@Mordante Do you think this issue is worth solving? Perhaps users will never actually use this API? But shouldn't we at least make the constructor explicit? I don't have nearly as much context as you do on std::format, so I'm curious to hear what you think. Also CCing @vitaut

This revision is now accepted and ready to land.Nov 8 2021, 1:47 PM
This revision was automatically updated to reflect the committed changes.
Mordante marked an inline comment as done.
Mordante marked an inline comment as done.Nov 9 2021, 11:01 AM
Mordante added inline comments.
libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

This only happens when you create the format-arg-store. This can be useful when you want to format the same arguments multiple times. Then you call the vformat family of functions. When you just call the format family the library internals make sure the life-time issues can't happen. I don't know how common it is for users to use that part of the API.

I'm not sure how feasible it is to solve, using explicit isn't a real fix. I'm not sure it can be easily be fixed without API/ABI breaks. I'm not familiar with the implementation history of libfmt, but I assume it has used this method for multiple years.

If we want to propose a fix we should first find a good alternative that doesn't involve API/ABI breaks. If that's not possible we should discuss with MSVC STL. Their <format> implementation is shipped to customers. I don't know their stance API/ABI breaks(, but I can guess). AFAIK libstdc++ doesn't have an implementation of <format> so it wouldn't affect them.

My general feeling is:

  • This pitfall isn't great, but I don't think it will happen to the average developer who just uses the format functions with a parameter pack.
  • It's really late to make API/ABI breaks in C++20 so I expect it will need very strong arguments to convince the Committee. Judging by the recent range-based for statement discussion, I don't have high hopes for such a proposal being accepted.

So unless there's a good and simple solution I don't think it isn't worth trying to solve this issue.

vitaut added inline comments.Nov 10 2021, 2:43 PM
libcxx/test/std/utilities/format/format.formatter/format.context/format.context/ctor.pass.cpp
54

Who came up with that?

I did =)

I don't think it makes sense to make the conversion explicit because apart from changing API it will only mask the referency nature of basic_format_args and make failures more subtle. Unfortunately there is an inherent tradeoff between efficiency and safety here. To make the API completely safe we would have to make copies of all formatting arguments and not just eliminate conversion.

Fortunately this API is not supposed to be used by anyone except maybe writers of logging libraries and people who are testing the standard library.