In a freestanding environment we don't want extra dependencies on the out-of-line
helpers that implement atomic operations, so don't enable outline atomics in
this situation.
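For context, here is a minimal sketch of the kind of code affected. It is an illustration, not taken from the patch: the function names `bump` and `check_bump` are made up, and the helper name in the comment is only an example of the naming scheme used by the runtime libraries.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// A plain C++11 atomic read-modify-write. When targeting AArch64 without LSE
// guaranteed at compile time, a compiler with outline atomics enabled may
// lower this to a call into the runtime library (a helper named along the
// lines of __aarch64_ldadd4_acq_rel) instead of an inline exclusive
// load/store loop. That call is the extra dependency discussed here.
std::uint32_t bump(std::atomic<std::uint32_t> &counter) {
    return counter.fetch_add(1, std::memory_order_acq_rel);
}

// Small self-check: fetch_add returns the value held before the addition.
void check_bump() {
    std::atomic<std::uint32_t> c{41};
    assert(bump(c) == 41);
    assert(c.load() == 42);
}
```

The source-level behaviour is the same either way; only the generated code, and whether it needs libgcc or compiler-rt at link time, differs.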
Details
- Reviewers
SjoerdMeijer t.p.northover ilinpv MaskRay
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The build failure has nothing to do with the patch; it is caused by https://reviews.llvm.org/D140612
From what I understand, in a freestanding environment the runtime library is not used, and the clang driver can detect this situation and turn outline atomics off by default (see clang/lib/Driver/ToolChains/Linux.cpp, Linux::IsAArch64OutlineAtomicsDefault -> GetRuntimeLibType).
The problem here is that the compiler might have a runtime library available, but we don't want a dependency on it in freestanding mode. This is the reason why -fno-builtin is implied by the freestanding option. So it seems logical to me to disable outline atomics too in this situation, even if the compiler has a runtime library available.
Outline atomics depend on runtime library availability (libgcc or compiler-rt). If no suitable library is available, they are disabled. So if the compiler does not depend on the runtime library in freestanding mode, you can remove the library and get rid of the outline atomics calls automatically.
If you do have a runtime library in freestanding mode, you can disable outline atomics by specifying the -mno-outline-atomics option. But disabling them just because of the -ffreestanding option would diverge from GCC's behaviour, which does not disable outline atomics in this case.
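To make the two positions concrete, here is a rough sketch of the default-selection logic being debated. Everything in it is hypothetical: the `Options` struct, the `Flag` enum, and both functions are illustrative names, not clang's actual driver code, and the assumption that an explicit flag always takes priority is mine rather than something stated in the patch.

```cpp
#include <cassert>

// Hypothetical tri-state for an explicit -moutline-atomics /
// -mno-outline-atomics on the command line.
enum class Flag { Unset, On, Off };

// Hypothetical model of the inputs discussed in this thread.
struct Options {
    Flag explicit_flag;        // explicit user choice, if any
    bool freestanding;         // -ffreestanding was passed
    bool runtime_lib_supports; // a suitable libgcc or compiler-rt was found
};

// Behaviour as described in the comments above: without an explicit flag,
// the default follows runtime library availability only.
bool outline_atomics_current(const Options &o) {
    if (o.explicit_flag != Flag::Unset) return o.explicit_flag == Flag::On;
    return o.runtime_lib_supports;
}

// Behaviour proposed by the patch: -ffreestanding also turns the default off.
// Assumption: an explicit flag still wins over -ffreestanding.
bool outline_atomics_proposed(const Options &o) {
    if (o.explicit_flag != Flag::Unset) return o.explicit_flag == Flag::On;
    if (o.freestanding) return false;
    return o.runtime_lib_supports;
}
```

The two functions only disagree when -ffreestanding is given, no explicit flag is present, and a runtime library is available, which is exactly the case under discussion.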
Outline atomics are dependent on runtime library availability ( libgcc or compiler-rt ).
I understand that, but we may be using a compiler that has a runtime library, and in freestanding mode we usually don't want it to be used.
I understand that you could disable it with an extra option and that, for now, this would differ from gcc, but it looks debatable to me whether gcc's behaviour is correct and expected; maybe someone needs to change it there too. (Please keep in mind that I might be wrong, and during the discussion please consider this patch as NFC so we can discuss here whether I am right or not :)). My point is that gcc is not a gold standard in these questions either, and in my mind such behaviour (disabling extra dependencies in freestanding mode without passing extra flags) would be expected in the very same way that -fno-builtin gets implied automatically when the -ffreestanding flag is passed.
I think we are on the same page here. I am not a freestanding mode user and respect the opinions of real users, but we need to keep the compilers aligned, so to make such changes we need GCC community consensus as well.
Great! I've added MaskRay to the reviewers list as a known expert in both clang and gcc; maybe he has some thoughts on this proposal :)
Also please keep in mind that, despite the difference in behaviour between gcc and clang, not implying outline atomics won't result in any problems, whereas implying them in cases where we don't want them might.
Offtopic: Outlining atomics seems to be a very CPU-specific thing. In my experience LSE performed roughly the same as the old exclusive load/store sequences. So adding an extra call plus an extra bit check (too bad IFUNCs are not used :)) every time an atomic is executed seems to be quite an extra load (for the CPU, TLB, dcache...), so I'm not sure that outline atomics are a win-win (at least on some CPUs). This is absolutely not a concern for this patch anyway, just some of my thoughts; I would be glad to hear other opinions :)
Outline atomics overhead is mostly negligible. "Various members in the Arm ecosystem have measured the performance impact of this indirection on a diverse set of systems and we were happy to find out that it was minimal compared to the benefit of using the LSE instructions for better scalability at large core counts." https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/making-the-most-of-the-arm-architecture-in-gcc-10
For IFUNC-based dispatch, Function Multi Versioning ( https://github.com/ARM-software/acle/blob/main/main/acle.md#function-multi-versioning ) with the lse feature can be used.
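The "extra call + extra bit check" mentioned above can be pictured roughly as follows. This is a portable model under stated assumptions, not the runtime library's implementation: `have_lse`, `ldadd_lse`, `ldadd_llsc` and `outlined_ldadd` are hypothetical names, the real helpers are written in assembly, and the flag they test (I believe it is called `__aarch64_have_lse_atomics`) is set once at startup from CPU feature detection.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime library's "CPU has LSE" flag.
bool have_lse = false;

// Stand-in for the LSE path: on real hardware this would be a single
// instruction such as LDADD.
std::uint32_t ldadd_lse(std::atomic<std::uint32_t> &v, std::uint32_t n) {
    return v.fetch_add(n, std::memory_order_acq_rel);
}

// Stand-in for the exclusive load/store retry loop, modelled here with a
// compare-exchange loop.
std::uint32_t ldadd_llsc(std::atomic<std::uint32_t> &v, std::uint32_t n) {
    std::uint32_t old = v.load(std::memory_order_relaxed);
    while (!v.compare_exchange_weak(old, old + n,
                                    std::memory_order_acq_rel,
                                    std::memory_order_relaxed)) {
        // On failure compare_exchange_weak reloads `old`; just retry.
    }
    return old;
}

// The out-of-line helper: one call from the user's code, one flag test,
// then the chosen implementation.
std::uint32_t outlined_ldadd(std::atomic<std::uint32_t> &v, std::uint32_t n) {
    return have_lse ? ldadd_lse(v, n) : ldadd_llsc(v, n);
}
```

An IFUNC or Function Multi Versioning approach would instead resolve the choice once, so later calls skip the per-call flag test.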