This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt][AArch64] Add a workaround for Exynos 9810
ClosedPublic

Authored by bc-lee on Nov 24 2021, 4:08 AM.

Details

Summary

Big.LITTLE Heterogeneous architectures, as described by ARM [1],
require that the instruction set architecture of the big and little
cores be compatible. However, the Samsung Exynos 9810 is known to
have different ISAs in its core.
According to [2], some cores are ARMv8.2 and others are ARMv8.0.

Since LSE is for ARMv8.1 and later, it should be disabled
for this broken CPU.

[1] https://developer.arm.com/documentation/den0024/a/big-LITTLE-Technology
[2] https://github.com/golang/go/issues/28431

Diff Detail

Event Timeline

bc-lee created this revision.Nov 24 2021, 4:08 AM
bc-lee requested review of this revision.Nov 24 2021, 4:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2021, 4:08 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
bc-lee edited projects, added Restricted Project; removed Restricted Project.Nov 24 2021, 4:13 AM
bc-lee removed a subscriber: Restricted Project.
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2021, 4:13 AM
bc-lee updated this revision to Diff 389729.Nov 25 2021, 3:59 AM
bc-lee edited the summary of this revision. (Show Details)

I noticed that using __system_property_get from a constructor function probably doesn't work within the Bionic loader (linker[64]), but maybe Bionic ought to be fixed (and it shouldn't matter much in practice):

  • The Bionic dynamic loader calls the constructor functions before calling __system_properties_init. (see tmp_linker_so.call_constructors();)
  • However, a static executable calls __system_properties_init before the constructors, so static executables should be OK.
  • Maybe the dynamic loader ought to call __system_properties_init before constructors.
  • Calling __system_property_get before __system_properties_init is probably OK? It looks like SystemProperties::Find would return nullptr, and __system_property_get would return 0.
  • In any case, it wouldn't actually matter unless someone ran a new loader on an affected device+kernel.

If someone (e.g. LineageOS?) upgrades the platform to something with outline atomics, without fixing the kernel, then this bug could start to matter. I *think* Bionic can be fixed, though.

Also: is it OK to use a mix of LSE and non-LSE atomic operations to access the same memory concurrently? I would think that should be fine, but it seems worth mentioning.

bc-lee added a comment.Dec 1 2021, 3:07 PM

From what I understand, for dynamic executables, si->call_constructors() [1] calls the constructor of each loaded shared object. Right?

I put the LD_DEBUG=2 environment variable and found that libc.so's constructor is called before the main executable or any other shared object's constructor. And libc.so calls __libc_preinit_impl -> __libc_init_common -> __system_properties_init.
So the point is that __system_properties_init is initialized before this compiler-rt code is called. But I don't know if this is guaranteed behavior.

Of course, I think this code should work fine even if __system_properties_init is not yet called.

[1] https://cs.android.com/android/platform/superproject/+/9aafec59ccad4b3851269e4119561efd8a2f5e65:bionic/linker/linker_main.cpp;l=508

From what I understand, for dynamic executables, si->call_constructors() [1] calls the constructor of each loaded shared object. Right?

I put the LD_DEBUG=2 environment variable and found that libc.so's constructor is called before the main executable or any other shared object's constructor. And libc.so calls __libc_preinit_impl -> __libc_init_common -> __system_properties_init.
So the point is that __system_properties_init is initialized before this compiler-rt code is called. But I don't know if this is guaranteed behavior.

Of course, I think this code should work fine even if __system_properties_init is not yet called.

[1] https://cs.android.com/android/platform/superproject/+/9aafec59ccad4b3851269e4119561efd8a2f5e65:bionic/linker/linker_main.cpp;l=508

Yes, a dynamically-linked executable and its DSOs should all be fine. libc.so's constructors should run before the constructors of any DSO/exe that needs libc.so (indirectly or directly), and __libc_preinit_impl is a higher priority constructor that should run before a default-priority constructor like init_have_lse_atomics.

The trouble is that linker[64] has its own copy of a lot of libc state, including the state initialized by __system_properties_init. linker[64] calls its own constructor functions first before it calls its copy of __system_properties_init. Later on, linker_main locates all the needed DSOs, maps them, and calls their constructors. Maybe the linker ought to defer calling its own constructors until after it has initialized its copy of libc.

Of course, I think this code should work fine even if __system_properties_init is not yet called.

Currently it wouldn't, because __system_property_get would return 0 even if ro.arch were exynos9810, but only if someone is running a new copy of linker[64] on a device with a broken kernel. That should be unlikely. Also, only the atomic accesses in the linker[64] binary itself would be affected, and I suspect that that binary might not do any atomic accesses. But maybe you're suggesting that __system_property_get should initialize system properties on demand?

Anyway, I think this LLVM patch is fine.

bc-lee added a comment.Dec 2 2021, 1:04 AM

But maybe you're suggesting that __system_property_get should initialize system properties on demand?

Android Bionic libc or Linker is not my area of expertise, so it's hard to say, but it seems much easier to put the -fno-outline-atomics flag into the linker's .bp file instead of changing the behavior of __system_property_get.
And I think the hardware vendors are already checking that the linker is working fine. If they were using Outline-atomics with a misbehaving Kernel, even without this patch, they would have already run into this problem.

For reference, the last product to use the Exynos 9810 is the Samsung Galaxy Note 10 Lite, and I think the future OS update will be received at least once(Android 12).

bc-lee updated this revision to Diff 391252.Dec 2 2021, 1:43 AM

Fix the compile error with __FreeBSD__.

bc-lee added a comment.Dec 6 2021, 1:05 AM

Hi, can this patch able to be merged, or do I need to add more reviewers? Thanks.

... but it seems much easier to put the -fno-outline-atomics flag into the linker's .bp file instead of changing the behavior of __system_property_get.

That would work, though in principle we'd also have to use -mno-outline-atomics when compiling the parts of libc linked into the linker.

I believe Android 12 is using a version of Clang (clang-r416183b1) that doesn't enable -moutline-atomics by default:

If fixing the linker is necessary, then I would just defer calling constructors until after system properties are initialized.

srhines accepted this revision.Dec 8 2021, 1:31 AM

Ryan's comment above is correct that this won't affect the Android 12 platform image. The more important task will be backporting this patch for any necessary NDK, since apps compiled with outlined atomics would start causing problems otherwise. I will take care of that part once this is merged.

This revision is now accepted and ready to land.Dec 8 2021, 1:31 AM
bc-lee added a comment.Dec 8 2021, 5:24 AM

Thanks for the review. Since I don't have commit access, so I want someone else to apply this patch.

Stephen, could you land this patch? It can help fix downstream issues like https://crbug.com/1272795.

Stephen, could you land this patch? It can help fix downstream issues like https://crbug.com/1272795.

Yes, I'll try to land this ASAP. I was meaning to do this sooner, but got distracted. We also want to ensure we can put this in the NDK for our users too. Thanks for the reminder.

This revision was automatically updated to reflect the committed changes.

It was noted in other discussions that problems have only been reported on the Galaxy S9 and S9+ devices, running older versions of the device firmware (based on Android 8). Although I have no first-hand knowledge, presumably the kernel in the more recent builds has been updated to (correctly) not report HWCAP_ATOMICS as available.

So I don't think this is likely to ever affect any platform builds -- even with 3rd party firmwares, presumably they'd use the latest vendor kernel available, which has already been fixed.

Of course, since there are apparently a fair number of users still running the ancient and broken firmware version for some reason (despite Android 9 and 10 updates being available on those devices!), a workaround does seem appropriate for NDK.

Possibly the workaround should be version-limited, so that if the minimum version supported by compiler-rt is ever updated to Android 9 or later, we can get rid of this?

This workaround can also be improved using the ro.build.version.sdk property, provided there is adequate data that Crash is only occurs on a certain Android version.

Possibly the workaround should be version-limited, so that if the minimum version supported by compiler-rt is ever updated to Android 9 or later, we can get rid of this?

I think so too. Maybe we need a comment about this?
The minimum OS supported by the latest NDK(r24 beta) is KitKat (4.4, API level 19), so that will be a distant future.

Herald added a project: Restricted Project. · View Herald TranscriptMar 21 2022, 6:36 AM