LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations.
Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.
Differential D100879
[Clang] Propagate guaranteed alignment for malloc and others xbolva00 on Apr 20 2021, 11:54 AM. Authored by
Details LLVM should be smarter about *known* malloc's alignment and this knowledge may enable other optimizations. Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang.
Diff Detail Event Timeline
Comment Actions GCC already knows about this guaranteed alignment, check: Alignment checks were optimized out. I can confirm that with this patch we do same. Comment Actions Please addd tests, including tests that we suppress this assumption under e.g. -fno-builtin-malloc Comment Actions Test looks good, thanks.
Comment Actions Hello. Nice idea. Unfortunately this blocks tail folding, making codesize a bit bigger: https://godbolt.org/z/rncPvbh8d Comment Actions Hmm.. I know probably where to fix it. diff --git a/llvm/lib/CodeGen/Analysis.cpp b/llvm/lib/CodeGen/Analysis.cpp index 5d1b00463dc8..425e1b4e8e2d 100644 --- a/llvm/lib/CodeGen/Analysis.cpp +++ b/llvm/lib/CodeGen/Analysis.cpp @@ -560,6 +560,8 @@ bool llvm::attributesPermitTailCall(const Function *F, const Instruction *I, // Following attributes are completely benign as far as calling convention // goes, they shouldn't affect whether the call is a tail call. + CallerAttrs.removeAttribute(Attribute::Alignment); + CalleeAttrs.removeAttribute(Attribute::Alignment); CallerAttrs.removeAttribute(Attribute::NoAlias); CalleeAttrs.removeAttribute(Attribute::NoAlias); CallerAttrs.removeAttribute(Attribute::NonNull); Your testcase works for me with this modification. Inspiration: https://github.com/llvm/llvm-project/commit/62ad2128255877ed41c714366861eee9c1da30dd If you want you can fix it (or I can do that later today). Comment Actions In https://bugzilla.mozilla.org/show_bug.cgi?id=1741454, oxalica discovered this breaks jemalloc-using code, since jemalloc does not guarantee 16-byte alignment. Comment Actions Not sure if jemalloc does the right thing here. (See comment above getNewAlign). You can force 16-byte alignment by configuring with --with-lg-quantum=4. Comment Actions Reproducing oxalica's comment at https://github.com/NixOS/nixpkgs/pull/146817, to make sure it is not missed: "The divergence here is about how to understand the sentence from C17, 7.22.3. There is already an proposal rephrasing it and making it clear that 8-byte allocation CAN result in only 8-byte alignment, not 16-byte-aligned. It also mentions that forcing jemalloc to have 16-byte alignment will significantly slow down its performance. This N2293 is already accepted in the latest working draft, and should be present in the next C2x standard. Clang should eventually change to support it." Comment Actions Just curious, did you try firefox with rpmalloc? (If you care about the best performance). “ It also mentions that forcing jemalloc to have 16-byte alignment will significantly slow down its performance.” It would be interesting to see real world data, like firefox benchmarks. glibc and macOS defaults to 16-byte alignment and I think they have good reasons too. Comment Actions Platforms are permitted to make stronger guarantees than the C standard requires, and it is totally reasonable for compilers to assume that malloc meets the target platform's documented guarantees. Arguably, libraries should not be replacing malloc if they fail to meet the platform's documented guarantees. With that said, it's probably not unreasonable for Clang to provide an option that controls its assumptions about malloc. We should think somewhat carefully about whether there are other behaviors we'd want to include in that, though. Comment Actions I think this patch was wrong -- we should not be assuming 16-byte alignment for an allocation smaller than 16 bytes. Comment Actions (For background: I'm a jemalloc maintainer and wrote N2293). I agree generally that assuming platform guarantees is reasonable for the compiler in targeting that platform, but I don't think I agree that the dlmalloc alignment is one of them. Supporting malloc replacement is a first-class feature for glibc malloc, and related projects avoid relying on alignment guarantees (e.g. libstdc++ at one point considered assuming that 8-byte allocs were 16-byte aligned, and decided not to). At least one Linux distribution using clang (Alpine) uses musl, which is a weak-alignment implementation (contrary to what I claimed in N2293; I screwed up my background research).
This makes sense to me; this is broadly how clang handles operator new alignment as well. Comment Actions Someone filed https://github.com/llvm/llvm-project/issues/53540, which covers some of the issues also discussed here. |
As a followup, I need to teach Clang about aligned_alloc and memalign.