This is an archive of the discontinued LLVM Phabricator instance.

[Builtins] Do not use tailcall for Thumb1
ClosedPublic

Authored by weimingz on Nov 6 2017, 2:27 PM.

Details

Summary

The b instruction in Thumb1 has limited range, which may cause link-time errors if the jump target is far away.
This patch guards the tailcalls for non-Thumb1

Diff Detail

Repository
rL LLVM

Event Timeline

weimingz created this revision.Nov 6 2017, 2:27 PM
weimingz updated this revision to Diff 121792.Nov 6 2017, 2:36 PM

add memcpy

peter.smith edited edge metadata.Nov 7 2017, 2:27 AM

I agree that we must avoid the 16-bit branch on Thumb1 and that the code sequence is fine for that purpose.

I do wonder if compiler-rt should define the aeabi_... forwarding functions in it? I would have expected the C-library to define them. For example the v6-m libgcc.a does not define the aeabi memcpy family, the forwarding is done in newlib. Similarly Arm's proprietary compiler's C library defines aeabi_memcpy that just falls through directly into memcpy. I'm guessing that there is some C-library that we support that does not define the aeabi_ function?

I agree that we must avoid the 16-bit branch on Thumb1 and that the code sequence is fine for that purpose.

I do wonder if compiler-rt should define the aeabi_... forwarding functions in it? I would have expected the C-library to define them. For example the v6-m libgcc.a does not define the aeabi memcpy family, the forwarding is done in newlib. Similarly Arm's proprietary compiler's C library defines aeabi_memcpy that just falls through directly into memcpy. I'm guessing that there is some C-library that we support that does not define the aeabi_ function?

I looked the MUSL C library: it implements aeabi_mem{cpy,move,set,clear}, but misses "aeabi_memcmp".
From the "C Library ABI for the ARM" [1], it only requires some aeabi specific constants or typedefs in libc's header file.
My understanding is, compiler-rt should define complete function sets to support other libc.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0039d/IHI0039D_clibabi.pdf

I agree that we must avoid the 16-bit branch on Thumb1 and that the code sequence is fine for that purpose.

I do wonder if compiler-rt should define the aeabi_... forwarding functions in it? I would have expected the C-library to define them. For example the v6-m libgcc.a does not define the aeabi memcpy family, the forwarding is done in newlib. Similarly Arm's proprietary compiler's C library defines aeabi_memcpy that just falls through directly into memcpy. I'm guessing that there is some C-library that we support that does not define the aeabi_ function?

I looked the MUSL C library: it implements aeabi_mem{cpy,move,set,clear}, but misses "aeabi_memcmp".
From the "C Library ABI for the ARM" [1], it only requires some aeabi specific constants or typedefs in libc's header file.
My understanding is, compiler-rt should define complete function sets to support other libc.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0039d/IHI0039D_clibabi.pdf

I'm not sure that the ABI will be that much help here. My understanding is that the toolchain is required to provide the aeabi_ family of functions. Whether they are implemented in a runtime library or in the C-library is an implementation detail. Given that compiler-rt will need to support more than one C-library (Musl, libc, newlib etc.) it probably has to assume that none of the aeabi_ specialisations of memcpy, memset etc. are implemented by the C-library.

Looking again at the default library search order, the C-library will be searched before compiler-rt so if the C-library does define an optimised __aeabi_ implementation it will get used in preference to the one in compiler-rt so I don't think it hurts to have these in compiler-rt.

Overall I don't have any objections to the change.

dalias added a subscriber: dalias.Nov 7 2017, 2:07 PM

In musl we do not define the __aeabi_* functions for the sake of having optimized implementations but simply for ABI conformance. They are specifically and intentionally NOT optimized since their ABIs (special rules about which registers they don't clobber) preclude writing them in C, and also make it hard to do the potentially-more-worthwhile optimizations like vectorization (since you don't have vector scratch registers). With this in mind, I would strongly recommend against ever generating calls to them. Our view on the musl side is that they exist just for minimal (i.e. non-performance) ABI compat with code that might have been compiled with a legacy compiler making use of them.

If compiler-rt is to provide optimized version, then, should MUSL just forwards the calls like memcpy ->__aeabi_memcpy ? Because libc is usually appear first in linking order, a standard implementation in libc will hide the optimized version in compiler-rt.

dalias added a comment.Nov 7 2017, 2:56 PM

Looking further at this, it looks like your current implementations of the __aeabi_* functions are simply wrong - they clobber registers that the ABI does not allow them to clobber.

rengolin edited edge metadata.Nov 7 2017, 3:15 PM
rengolin added a subscriber: joerg.

Given that compiler-rt will need to support more than one C-library (Musl, libc, newlib etc.) it probably has to assume that none of the __aeabi_ specialisations of memcpy, memset etc. are implemented by the C-library.

We probably still have to provided them in freestanding environments (at least that's true for standard memset/cpy/cmp). @joerg / @compnerd know better, though.

Looking again at the default library search order, the C-library will be searched before compiler-rt so if the C-library does define an optimised __aeabi_ implementation it will get used in preference to the one in compiler-rt so I don't think it hurts to have these in compiler-rt.

As Rich said, most C-libraries have them for compatibility purposes, not specialisation.

The fact that mem*4/8 redirect here is yet another indication that the original half-baked implementation we did to "make things work" has been grossly overlooked for 5 years.

It's a bit silly that we make such a bad effort in something so crucial to every program out there. Or maybe this is a side-effect of how RT is used by people (either from downstream patched copies or not at all).

This patch "makes things work" for Thumb1, sure, but I wouldn't use this in production for anything...

dalias added a comment.EditedNov 7 2017, 3:16 PM

This commit in musl has details on why calling memset, etc. doesn't work and violates the ABI:

https://git.musl-libc.org/cgit/musl/commit?id=e6def544358afd5648a428d2e02c147a1f901048

Looking further at this, it looks like your current implementations of the __aeabi_* functions are simply wrong - they clobber registers that the ABI does not allow them to clobber.

Those are not clobbering r0~r3, just moving the arguments around, as mem* functions are assumed to be in GNU ABI. (yet another horrible mess).

This commit in musl has details on why calling memset, etc. doesn't work and violates the ABI:

https://git.musl-libc.org/cgit/musl/commit?id=e6def544358afd5648a428d2e02c147a1f901048

oh, hum, that's a good point, actually.

Unfortunately, I think that "fix" is present in Android and Chromium, at least. Plus in RT for the past 5 years... :/

I don't think anyone uses it, though.

With respect to this review specifically, I think that the change should make compiler-rt no worse and for its likely use case of v6-m, there aren't any non-core registers for the C language functions to corrupt. With respect to the larger problem of many of the other __aeabi_ functions calling C functions I think we should address separately to this Review. I have raised pr35243 to cover the non-compliance https://bugs.llvm.org/show_bug.cgi?id=35243 please feel to comment/correct as necessary.

rengolin accepted this revision.Nov 8 2017, 3:44 AM

With respect to this review specifically, I think that the change should make compiler-rt no worse and for its likely use case of v6-m, there aren't any non-core registers for the C language functions to corrupt. With respect to the larger problem of many of the other __aeabi_ functions calling C functions I think we should address separately to this Review.

agreed, not for this review.

I have raised pr35243 to cover the non-compliance https://bugs.llvm.org/show_bug.cgi?id=35243 please feel to comment/correct as necessary.

Thanks!

This revision is now accepted and ready to land.Nov 8 2017, 3:44 AM
This revision was automatically updated to reflect the committed changes.