This is an archive of the discontinued LLVM Phabricator instance.

[clang] Allow 'nomerge' attribute for function pointers
ClosedPublic

Authored by eddyz87 on Jun 14 2023, 6:09 PM.

Details

Summary

Allow specifying 'nomerge' attribute for function pointers,
e.g. like in the following C code:

extern void (*foo)(void) __attribute__((nomerge));
void bar(long i) {
  if (i)
    foo();
  else
    foo();
}

With the goal to attach 'nomerge' to both calls done through 'foo':

@foo = external local_unnamed_addr global ptr, align 8
define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 {
  ; ...
  %0 = load ptr, ptr @foo, align 8, !tbaa !5
  ; ...
if.then:
  tail call void %0() #1
  br label %if.end
if.else:
  tail call void %0() #1
  br label %if.end
if.end:
  ret void
}
; ...
attributes #1 = { nomerge ... }

Report a warning in case if 'nomerge' is specified for a variable that
is not a function pointer, e.g.:

t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes]
    2 | int j __attribute__((nomerge));
      |                      ^

The intended use-case is for BPF backend.

BPF provides a sort of "standard library" functions that are called
helpers. BPF also verifies usage of these helpers before program
execution. Because of limitations of verification / runtime model it
is important to keep calls to some of such helpers from merging.

An example could be found by the link [1], there input C code:

if (data_end - data > 1024) {
    bpf_for_each_map_elem(&map1, cb, &cb_data, 0);
} else {
    bpf_for_each_map_elem(&map2, cb, &cb_data, 0);
}

Is converted to bytecode equivalent to:

if (data_end - data > 1024)
  tmp = &map1;
else
  tmp = &map2;
bpf_for_each_map_elem(tmp, cb, &cb_data, 0);

However, BPF verification/runtime requires to use the same map address
for each particular bpf_for_each_map_elem() call.

The 'nomerge' attribute is a perfect match for this situation, but
unfortunately BPF helpers are declared as pointers to functions:

static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164;

Hence, this commit, allowing to use 'nomerge' for function pointers.

[1] https://lore.kernel.org/bpf/03bdf90f-f374-1e67-69d6-76dd9c8318a4@meta.com/

Diff Detail

Event Timeline

eddyz87 created this revision.Jun 14 2023, 6:09 PM
eddyz87 published this revision for review.Jun 15 2023, 6:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2023, 6:49 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
eddyz87 added a subscriber: rnk.

Hi @aaron.ballman, @rnk,

I see that you were involved in the discussion when nomerge attribute was added in D79121.
Could you please take a look at this proposed change?
It would be useful for BPF back-end and I tried to explain the use-case in the revision description.

Hi @jemarch, @dfaust,

You might be interested in this discussion.

rnk added a comment.Jun 15 2023, 11:44 AM

The purpose of the attribute is really limited to preserving source location information on instructions, and this isn't really a supported usage. The BPF backend and verifier needs to learn to tolerate valid LLVM transforms if it wants to be a real LLVM backend. Of course, you can do what you like.

Considered in the context of the original use case, I think it's reasonable to allow the attribute on function pointers for the same reasons we allow it on function declarations. It makes it easy to work the attribute onto the direct call sites of the function without modifying tons of source code. However, I'd like to see clearer documentation on the limitations.

clang/include/clang/Basic/AttrDocs.td
549–577

This statement of the attribute having "no effect on indirect calls" is slightly confusing now that we talk about function pointers immediately afterward. Can you please rework this a bit, and clarify that when applied to function pointers, the attribute only takes effect when the call target is directly the variable which carries the attribute? For example, this has no effect:

void (*fp)() __attribute__((nomerge));
void callit() {
  auto tmp = fp;
  tmp();
  (*fp)(); // I think TargetDecl will be null in the code, tell me if I'm wrong
}

Thank you for the review!

The purpose of the attribute is really limited to preserving source location information on instructions, and this isn't really a supported usage.

But it would prevent merging, right? I understand that using this attribute might block some legitimate optimizations.
Or do you mean that it is a "best effort" thing and not all transformations might honor it?

The BPF backend and verifier needs to learn to tolerate valid LLVM transforms if it wants to be a real LLVM backend. Of course, you can do what you like.

Well, I agree with that. The issue here is with the map API design: it is polymorphic, but it is expected that at load time verifier can replace all polymorphic calls with static calls.
People know it and write code in a way that allows verifier to infer which static functions to call. So I have limited options here:

  • either adjust IR at early stages of pipeline so that specific calls are not altered;
  • or do nothing and recommend users to use [[clang::nomerge]] on statement level or insert something like asm volatile ("" :::"memory") here and there.

But, yeah, I understand that is is not a C language semantics and headache is mine.

Considered in the context of the original use case, I think it's reasonable to allow the attribute on function pointers for the same reasons we allow it on function declarations. It makes it easy to work the attribute onto the direct call sites of the function without modifying tons of source code. However, I'd like to see clearer documentation on the limitations.

Please see my "inline" response.

clang/include/clang/Basic/AttrDocs.td
549–577

I can make it more elaborate, would the text as below be fine?
Regarding TargetDecl value it is not null both times:

  • (VarDecl 'tmp' (ImplicitCastExpr (DeclRefExpr (Var fp))))
  • (VarDecl 'fp')

`nomerge` attribute can also be used as function attribute to prevent all
calls to the specified function from merging. It has no effect on indirect
calls to such functions. For example:

.. code-block:: c++

[[clang::nomerge]] void foo(int) {}

void bar(int x) {
  auto *ptr = foo;
  if (x) foo(1); else foo(2); // will not be merged
  if (x) ptr(1); else ptr(2); // indirect call, can be merged
}

`nomerge` attribute can also be used for pointers to functions to
prevent calls through such pointer from merging. In such case the
effect applies only to a specific function pointer. For example:

.. code-block:: c++

[[clang::nomerge]] void (*foo)(int);

void bar(int x) {
  auto *ptr = foo;
  if (x) foo(1); else foo(2); // will not be merged
  if (x) ptr(1); else ptr(2); // 'ptr' has no 'nomerge' attribute,
                              // can be merged
}
rnk accepted this revision.Jun 21 2023, 1:40 PM

The purpose of the attribute is really limited to preserving source location information on instructions, and this isn't really a supported usage.

But it would prevent merging, right? I understand that using this attribute might block some legitimate optimizations.
Or do you mean that it is a "best effort" thing and not all transformations might honor it?

Essentially, yes.

Anyway, I think this looks good, please apply the doc update.

clang/include/clang/Basic/AttrDocs.td
549–577

Looks good to me, thanks!

This revision is now accepted and ready to land.Jun 21 2023, 1:40 PM
eddyz87 updated this revision to Diff 533824.Jun 22 2023, 5:14 PM

Rebase, nomerge documentation updated.

eddyz87 marked 2 inline comments as done.Jun 22 2023, 5:16 PM
This revision was landed with ongoing or failed builds.Jun 26 2023, 3:23 PM
This revision was automatically updated to reflect the committed changes.