⚙ D100879 [Clang] Propagate guaranteed alignment for malloc and others

xbolva00 requested review of this revision.Apr 20 2021, 11:54 AM

xbolva00 created this revision.

Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2021, 11:54 AM

Herald added a subscriber: cfe-commits. · View Herald Transcript

If this approach makes sense, I will update / add tests.

xbolva00 updated this revision to Diff 338950.Apr 20 2021, 11:55 AM

xbolva00 added inline comments.

clang/lib/CodeGen/CGCall.cpp
2059	As a followup, I need to teach Clang about aligned_alloc and memalign.

lebedev.ri added a subscriber: lebedev.ri.Apr 20 2021, 11:56 AM

lebedev.ri added inline comments.

clang/lib/CodeGen/CGCall.cpp
2060	malloc != new

xbolva00 added inline comments.Apr 20 2021, 11:59 AM

clang/lib/CodeGen/CGCall.cpp

2060

But the comment for the implementation of this value points on eg. gcc's docs:

https://clang.llvm.org/doxygen/Basic_2TargetInfo_8cpp_source.html line 67

// From the glibc documentation, on GNU systems, malloc guarantees 16-byte
// alignment on 64-bit systems and 8-byte alignment on 32-bit systems. See
// https://www.gnu.org/software/libc/manual/html_node/Malloc-Examples.html.
// This alignment guarantee also applies to Windows and Android. On Darwin,
// the alignment is 16 bytes on both 64-bit and 32-bit systems.

So this value is correct to be used for malloc and friends IMHO.

GCC already knows about this guaranteed alignment, check:
https://godbolt.org/z/sxn6K7Yq7

Alignment checks were optimized out. I can confirm that with this patch we do same.

Harbormaster completed remote builds in B99773: Diff 338948.Apr 20 2021, 1:27 PM

Harbormaster completed remote builds in B99774: Diff 338950.Apr 20 2021, 1:50 PM

Please addd tests, including tests that we suppress this assumption under e.g. -fno-builtin-malloc

Added tests

In D100879#2703979, @rjmccall wrote:

Please addd tests, including tests that we suppress this assumption under e.g. -fno-builtin-malloc

Done

Handle also aligned_alloc and memalign.

Fixed small typo in tests.

Harbormaster completed remote builds in B99950: Diff 339189.Apr 21 2021, 5:45 AM

Harbormaster completed remote builds in B99951: Diff 339191.Apr 21 2021, 5:53 AM

Harbormaster completed remote builds in B99952: Diff 339192.

Test looks good, thanks.

clang/lib/CodeGen/CGCall.cpp
2060	There's no immediate reason to have two different methods, but we should at least update the comment on `getNewAlign` to document that we're assuming it describes both `operator new` and `malloc`.

Updated comment for

getNewAlign()

xbolva00 marked an inline comment as done.Apr 23 2021, 12:22 AM

Thanks, LGTM.

This revision is now accepted and ready to land.Apr 23 2021, 12:56 AM

Harbormaster completed remote builds in B100485: Diff 339912.Apr 23 2021, 1:53 AM

This revision was landed with ongoing or failed builds.Apr 23 2021, 2:07 AM

Closed by commit rGc2297544c047: [Clang] Propagate guaranteed alignment for malloc and others (authored by xbolva00). · Explain Why

This revision was automatically updated to reflect the committed changes.

xbolva00 added a commit: rGc2297544c047: [Clang] Propagate guaranteed alignment for malloc and others.

xbolva00 added a reverting change: rG6914a0ed2b30: Revert "[Clang] Propagate guaranteed alignment for malloc and others".Apr 23 2021, 2:34 AM

Hello. Nice idea. Unfortunately this blocks tail folding, making codesize a bit bigger: https://godbolt.org/z/rncPvbh8d
I'm guessing it shouldn't? But the attribute isn't handled somewhere along the way. Any ideas where?

In D100879#2714431, @dmgreen wrote:

Hello. Nice idea. Unfortunately this blocks tail folding, making codesize a bit bigger: https://godbolt.org/z/rncPvbh8d
I'm guessing it shouldn't? But the attribute isn't handled somewhere along the way. Any ideas where?

Hmm.. I know probably where to fix it.

diff --git a/llvm/lib/CodeGen/Analysis.cpp b/llvm/lib/CodeGen/Analysis.cpp
index 5d1b00463dc8..425e1b4e8e2d 100644
--- a/llvm/lib/CodeGen/Analysis.cpp
+++ b/llvm/lib/CodeGen/Analysis.cpp
@@ -560,6 +560,8 @@ bool llvm::attributesPermitTailCall(const Function *F, const Instruction *I,
 
   // Following attributes are completely benign as far as calling convention
   // goes, they shouldn't affect whether the call is a tail call.
+  CallerAttrs.removeAttribute(Attribute::Alignment);
+  CalleeAttrs.removeAttribute(Attribute::Alignment);
   CallerAttrs.removeAttribute(Attribute::NoAlias);
   CalleeAttrs.removeAttribute(Attribute::NoAlias);
   CallerAttrs.removeAttribute(Attribute::NonNull);

Your testcase works for me with this modification.

Inspiration: https://github.com/llvm/llvm-project/commit/62ad2128255877ed41c714366861eee9c1da30dd

If you want you can fix it (or I can do that later today).

xbolva00 mentioned this in D101230: [Analysis] Attribute alignment should not prevent tail call optimization.Apr 24 2021, 4:58 AM

xbolva00 mentioned this in rGef2dc7ed9f76: [Analysis] Attribute alignment should not prevent tail call optimization.Apr 24 2021, 10:58 AM

In https://bugzilla.mozilla.org/show_bug.cgi?id=1741454, oxalica discovered this breaks jemalloc-using code, since jemalloc does not guarantee 16-byte alignment.

In D100879#3144751, @collares wrote:

In https://bugzilla.mozilla.org/show_bug.cgi?id=1741454, oxalica discovered this breaks jemalloc-using code, since jemalloc does not guarantee 16-byte alignment.

Not sure if jemalloc does the right thing here. (See comment above getNewAlign). You can force 16-byte alignment by configuring with --with-lg-quantum=4.

Reproducing oxalica's comment at https://github.com/NixOS/nixpkgs/pull/146817, to make sure it is not missed:

"The divergence here is about how to understand the sentence from C17, 7.22.3. There is already an proposal rephrasing it and making it clear that 8-byte allocation CAN result in only 8-byte alignment, not 16-byte-aligned. It also mentions that forcing jemalloc to have 16-byte alignment will significantly slow down its performance.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2293.htm

This N2293 is already accepted in the latest working draft, and should be present in the next C2x standard. Clang should eventually change to support it."

Just curious, did you try firefox with rpmalloc? (If you care about the best performance).

“ It also mentions that forcing jemalloc to have 16-byte alignment will significantly slow down its performance.”

It would be interesting to see real world data, like firefox benchmarks. glibc and macOS defaults to 16-byte alignment and I think they have good reasons too.

Platforms are permitted to make stronger guarantees than the C standard requires, and it is totally reasonable for compilers to assume that malloc meets the target platform's documented guarantees. Arguably, libraries should not be replacing malloc if they fail to meet the platform's documented guarantees.

With that said, it's probably not unreasonable for Clang to provide an option that controls its assumptions about malloc. We should think somewhat carefully about whether there are other behaviors we'd want to include in that, though.

I think this patch was wrong -- we should not be assuming 16-byte alignment for an allocation smaller than 16 bytes.

(For background: I'm a jemalloc maintainer and wrote N2293).

In D100879#3145361, @rjmccall wrote:

Platforms are permitted to make stronger guarantees than the C standard requires, and it is totally reasonable for compilers to assume that malloc meets the target platform's documented guarantees. Arguably, libraries should not be replacing malloc if they fail to meet the platform's documented guarantees.

I agree generally that assuming platform guarantees is reasonable for the compiler in targeting that platform, but I don't think I agree that the dlmalloc alignment is one of them. Supporting malloc replacement is a first-class feature for glibc malloc, and related projects avoid relying on alignment guarantees (e.g. libstdc++ at one point considered assuming that 8-byte allocs were 16-byte aligned, and decided not to). At least one Linux distribution using clang (Alpine) uses musl, which is a weak-alignment implementation (contrary to what I claimed in N2293; I screwed up my background research).

With that said, it's probably not unreasonable for Clang to provide an option that controls its assumptions about malloc. We should think somewhat carefully about whether there are other behaviors we'd want to include in that, though.

This makes sense to me; this is broadly how clang handles operator new alignment as well.

Someone filed https://github.com/llvm/llvm-project/issues/53540, which covers some of the issues also discussed here.

jyknight added a reverting change: D118804: Revert "[Clang] Propagate guaranteed alignment for malloc and others".Feb 2 2022, 8:48 AM

jyknight added a reverting change: rG9545976ff160: Revert "[Clang] Propagate guaranteed alignment for malloc and others".Feb 8 2022, 11:35 AM

This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Propagate guaranteed alignment for malloc and others
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 339953

clang/include/clang/Basic/TargetInfo.h

clang/lib/CodeGen/CGCall.cpp

clang/test/CodeGen/alloc-fns-alignment.c

This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Propagate guaranteed alignment for malloc and others ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 339953

clang/include/clang/Basic/TargetInfo.h

clang/lib/CodeGen/CGCall.cpp

clang/test/CodeGen/alloc-fns-alignment.c

[Clang] Propagate guaranteed alignment for malloc and others
ClosedPublic