This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Start using `arc4random()` to implement `std::random_device` on Apple
ClosedPublic

Authored by ldionne on Dec 20 2021, 8:26 AM.

Details

Summary

[libc++] Start using arc4random() to implement std::random_device on Apple

On Apple platforms, arc4random is faster than /dev/urandom, and it is
the recommended user-space RNG according to Apple's own OS folks.

This commit adds an ABI switch to guard ABI-break-protections in
std::random_device, and starts using arc4random instead of /dev/urandom
to implement std::random_device on Apple platforms.

Note that previously, std::random_device would allow passing a custom
token to its constructor, and that token would be interpreted as the name
of a file to read entropy from. This was implementation-defined and
undocumented. After this change, Apple platforms will be using arc4random()
instead, and any custom token passed to the constructor will be ignored.
This behavioral change will also impact other platforms that use the
arc4random() implementation, such as OpenBSD. This should be fine since
that is effectively a relaxation of the constructor's requirements.

rdar://86638350

Diff Detail

Event Timeline

ldionne created this revision.Dec 20 2021, 8:26 AM
ldionne requested review of this revision.Dec 20 2021, 8:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2021, 8:26 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

Turns out this is an ABI break:

  1. random_device contains this, so we'd be changing the layout of the type:
class random_device {
#ifdef _LIBCPP_USING_DEV_RANDOM
    int __f_;
#endif
    ....
};
  1. When using arc4random, std::random_device's constructor will throw whenever the token isn't exactly "/dev/urandom". Previously, it would work with any token that represents a path that we can std::open().

If we're willing to break (2), then I think there's definitely a way we can still roll this out by retaining the same field in the struct.

Are we willing to break (2)? Well, http://eel.is/c++draft/rand.device#lib:random_device,constructor says:

Throws: A value of an implementation-defined type derived from exception if the random_device cannot be initialized.
Remarks: The semantics of the token parameter and the token value used by the default constructor are implementation-defined.

So technically, users should not be expecting a whole lot of portability if they are using these custom tokens. In practice, I suspect users don't really provide a custom token cause we've never documented our "implementation-defined" behavior, but I could be wrong.

For the above reasons, and considering the benefits of using arc4random, I would be willing to give a shot at this. The strategy would basically be to keep the int __f_ around to preserve the layout of std::random_device, but not do anything with it. Since all the class methods are defined in the dylib, any old code compiled against the previous definition of std::random_device will still work when run against the new dylib, but it won't use the int __f_ in the struct. Any code compiled against the new library (headers + dylib) but running against an older dylib will also work, since it will still have the int __f_ member for the old dylib to use std::open().

I assume that's something along the lines of

  • #ifdef _LIBCPP_USING_DEV_RANDOM

+ #if defined(_LIBCPP_USING_DEV_RANDOM) || defined(APPLE)

With the appropriate comment and closing #endif.

I grepped for urandom and it didn't find documentation, so it seems it's only documented in the code.
It's however documented at https://en.cppreference.com/w/cpp/numeric/random/random_device/random_device
Which also mentions MSVC STL ignores the token completely.

I wonder whether it wouldn't be better to ignore the token for _LIBCPP_USING_ARC4_RANDOM instead of throwing an exception. (Probably in a separate patch.) That way even when Apple users now select a device their code will keep working after this change instead of suddenly throwing an exception.

libcxx/include/__config
381–382

This comment is somewhat misleading, since it doesn't work when the token is /dev/random.
But as commented below I think it would be good to document here that the token is ignored when using arc4random.

Documenting the behavior of token for the other devices would be nice.

ldionne marked an inline comment as done.Dec 21 2021, 12:37 PM

I assume that's something along the lines of

  • #ifdef _LIBCPP_USING_DEV_RANDOM

+ #if defined(_LIBCPP_USING_DEV_RANDOM) || defined(APPLE)

With the appropriate comment and closing #endif.

Yes, roughly.

I wonder whether it wouldn't be better to ignore the token for _LIBCPP_USING_ARC4_RANDOM instead of throwing an exception. (Probably in a separate patch.) That way even when Apple users now select a device their code will keep working after this change instead of suddenly throwing an exception.

I am really ambivalent on this. My understanding is that these __throws are meant to basically say "you should not ever use a non-default-constructed random_device". Failing at least ensures that users don't think they are using some other source of randomness -- they will definitely notice the change. Like I said, I would be surprised if anyone ever noticed the change, however if they did, I'm unsure whether it would be better to error out explicitly (so they can notice and fix their stuff), or to swallow the error and just disregard the token. I am currently leaning slightly towards the former, but I can easily be convinced otherwise. For now I'll update with my proposed patch if we do go for the former.

libcxx/include/__config
381–382

Under _LIBCPP_USING_DEV_RANDOM, using /dev/random as the parameter will work. I fixed the documentation issue in a separate NFC commit to keep those separate: https://github.com/llvm/llvm-project/commit/346ef5e5879e9143fe008c35c3dd6ea3c402634a

ldionne marked an inline comment as done.Dec 21 2021, 12:41 PM

I am really ambivalent on this. My understanding is that these __throws are meant to basically say "you should not ever use a non-default-constructed random_device". Failing at least ensures that users don't think they are using some other source of randomness -- they will definitely notice the change. Like I said, I would be surprised if anyone ever noticed the change, however if they did, I'm unsure whether it would be better to error out explicitly (so they can notice and fix their stuff), or to swallow the error and just disregard the token. I am currently leaning slightly towards the former, but I can easily be convinced otherwise. For now I'll update with my proposed patch if we do go for the former.

I guess what I should say here to justify my leaning is that I view the act of ignoring a user's custom token as being as much of an ABI break as suddenly starting to throw an exception. One of them is just waaaay more subtle, but I can imagine users being broken by either equally (e.g. if they were specifying a custom file to read entropy from and somehow the entropy provided in that file had important properties). We're kind of stretching it, but that's what's on my mind in preferring to break users explicitly instead of more subtly.

ldionne updated this revision to Diff 395740.Dec 21 2021, 12:48 PM
ldionne retitled this revision from [libc++] Use arc4random instead of /dev/urandom on Apple platforms to [libc++][ABI BREAK] Start using `arc4random()` to implement `std::random_device` on Apple.
ldionne edited the summary of this revision. (Show Details)

Update with ABI break mechanism and documentation

I assume software on Mac dynamically links libc++.so and that normally there's only one system version of libc++.so installed. I'm somewhat concerned about this use case:

  • A user has bought software that uses random_device("/dev/not_urandom")
  • The user update to a new MacOS which has the proposed changes
  • Now the application terminated due to an uncaught exception

When this scenario can happen it's possible to break working software by updating the random device. So I agree that not throwing is a kind of ABI-break, but I consider possibly breaking working code after a system upgrade by the user also an ABI-break and worse than the other.

Should we discuss this further on Discord?

I assume software on Mac dynamically links libc++.so and that normally there's only one system version of libc++.so installed. I'm somewhat concerned about this use case:

  • A user has bought software that uses random_device("/dev/not_urandom")
  • The user update to a new MacOS which has the proposed changes
  • Now the application terminated due to an uncaught exception

When this scenario can happen it's possible to break working software by updating the random device. So I agree that not throwing is a kind of ABI-break, but I consider possibly breaking working code after a system upgrade by the user also an ABI-break and worse than the other.

Should we discuss this further on Discord?

Right -- so I tried explaining this in my commit message here:

This behavior is preferred to ignoring the custom token in order to avoid surprising changes in behavior when a custom token was expected to result in the corresponding file being used for entropy.

Basically, what I'm suggesting is that it's better to fail fast and loud than to be silently incorrect. One could argue that the behavior if we silently ignore the custom token isn't really incorrect, though. I'll ignore the token instead.

I assume software on Mac dynamically links libc++.so and that normally there's only one system version of libc++.so installed. I'm somewhat concerned about this use case:

  • A user has bought software that uses random_device("/dev/not_urandom")
  • The user update to a new MacOS which has the proposed changes
  • Now the application terminated due to an uncaught exception

When this scenario can happen it's possible to break working software by updating the random device. So I agree that not throwing is a kind of ABI-break, but I consider possibly breaking working code after a system upgrade by the user also an ABI-break and worse than the other.

Should we discuss this further on Discord?

Right -- so I tried explaining this in my commit message here:

This behavior is preferred to ignoring the custom token in order to avoid surprising changes in behavior when a custom token was expected to result in the corresponding file being used for entropy.

Basically, what I'm suggesting is that it's better to fail fast and loud than to be silently incorrect. One could argue that the behavior if we silently ignore the custom token isn't really incorrect, though. I'll ignore the token instead.

I read the original message. I was just wondering whether the change can break a working executables after MacOS update. Just to make sure we considered that case and made a conscious decision to accept that issue. I think it would be good the mention whether or not this can break existing executables in the commit message.

ldionne updated this revision to Diff 398737.Jan 10 2022, 2:03 PM

Ignore the custom token instead of erroring out.

ldionne added a subscriber: emaste.Jan 10 2022, 2:05 PM

This version of the patch relaxes the std::random_device constructor so it doesn't error out when a custom token is passed, which eases transition to arc4random() for vendors who have already shipped std::random_device with a different implementation.

@emaste do you know who's responsible for OpenBSD? This is technically a small behavior change on their platform since an invalid custom token is going to be ignored instead of throwing an exception. That's most likely OK but we might as well double check with them if we can get a hold of someone.

ldionne retitled this revision from [libc++][ABI BREAK] Start using `arc4random()` to implement `std::random_device` on Apple to [libc++] Start using `arc4random()` to implement `std::random_device` on Apple.Jan 10 2022, 2:07 PM
ldionne edited the summary of this revision. (Show Details)
Mordante accepted this revision as: Mordante.Jan 11 2022, 8:50 AM

LGTM after the CI is green.

ldionne updated this revision to Diff 398992.Jan 11 2022, 9:39 AM

Try to fix CI. The test doesn't fail anymore on older macOS because the exception-throwing test isn't enabled anymore (since we're not using _LIBCPP_USING_DEV_RANDOM).

ldionne accepted this revision as: Restricted Project.Jan 12 2022, 8:23 AM
This revision is now accepted and ready to land.Jan 12 2022, 8:23 AM