This is an archive of the discontinued LLVM Phabricator instance.

[Globals] Treat nobuiltin fns as maybe-derefined.
ClosedPublic

Authored by fhahn on Mar 1 2021, 2:09 PM.

Details

Summary

Fixes PR49022.

Callsites could be marked as builtin while calling nobuiltin
functions. This can lead to problems, if local optimizations apply
transformations based on the semantics of the builtin, but then IPO
treats the function as nobuiltin and applies a transform that breaks
builtin semantics (assumed earlier).

To avoid this, mark such functions as maybey-derefined, to avoid IPO
transforms on them that may break assumptions of earlier calls.

Fixes #57075
Fixes #48366

Diff Detail

Event Timeline

fhahn created this revision.Mar 1 2021, 2:09 PM
fhahn requested review of this revision.Mar 1 2021, 2:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2021, 2:09 PM

I'm not thrilled to pretend they may be derefinded but it seems practical and sufficient. Fine with me.

rsmith added a comment.Mar 1 2021, 3:29 PM

I'm not thrilled to pretend they may be derefinded but it seems practical and sufficient. Fine with me.

I think it's feasible to argue that this really is a derefinement issue -- or at least the dual of one. In this case, we have two different notional definitions of the function:

  1. LLVM's idealized model of the builtin function.
  2. The actual implementation.

... where (2) is (ideally) a refinement of (1). In a normal derefinement bug, we can't use positive properties of (2) because we might end up picking a different refinement of (1) as the actual definition. In this case, we can't use negative properties of (1) (such as "parameter is noundef") after refining to (2) because those properties might not be true of (2). I suppose one way to get a bit more comfortable with calling this derefinement would be to argue that when the optimizer makes transformations based on known builtin semantics, what it's notionally doing is replacing the function definition with the idealized builtin definition (which is a derefinement step, at least if we've already seen the real definition), performing analysis / transformation with that derefined definition, then re-refining back to the actual definition.

llvm/include/llvm/IR/GlobalValue.h
147

I don't think nobuiltin is relevant; I think what matters is whether the function is a builtin function. We have the same miscompile if we removed all the builtin / nobuiltin annotations from the IR generated for the PR49022 example, and this change does nothing to address that case.

jdoerfert added inline comments.Mar 1 2021, 3:38 PM
llvm/include/llvm/IR/GlobalValue.h
147

So you are saying we can legally define our own operator delete(void*) = _ZdlPv without a nobuiltin attribute and that would still need to stop IPO from assuming it is the idealized version we all know and love?

fhahn added inline comments.Mar 2 2021, 7:54 AM
llvm/include/llvm/IR/GlobalValue.h
147

I think the problem is how would we detect that a given global value may be used with builtin semantics?

Do frontends create function definitions for recognized builtins without nobuiltin? If frontends need to do so, can they add an attribute to indicate that this functions may be used with some builtin semantics?

llvm/include/llvm/IR/GlobalValue.h
147

I think the problem is how would we detect that a given global value may be used with builtin semantics?

Do frontends create function definitions for recognized builtins without nobuiltin? If frontends need to do so, can they add an attribute to indicate that this functions may be used with some builtin semantics?

How can frontends know about 'future builtins' ? Should they already annotate all functions as 'nobuiltin' (except for the known builtins).
What about the following: https://www.godbolt.org/z/To69x5
A 'c' program where we happen to use the encoded 'new/delete' names. I don't think llvm should optimize those.

ychen added a subscriber: ychen.May 15 2021, 12:25 AM
ychen added inline comments.May 15 2021, 10:56 PM
llvm/include/llvm/IR/GlobalValue.h
147

I have the same thoughts as @jdoerfert. If there is no nobuiltin annotation for the builtin definition, the optimization to trap for the PR49022 example should be expected and not a bug.

I agree with @rsmith 's that it is whether the function is a builtin function matter, not the annotations. However, it seems that builtin annotation is not checked anywhere and only emitted for C++ global replacement new/delete, so basically it is a no-op for LLVM. nobuiltin annotation is emitted for all builtin definition with inline specifier (https://github.com/llvm/llvm-project/blob/d56729b4a4393da9c65bdfe762b51f8b7b0ce0ca/clang/lib/CodeGen/CodeGenModule.cpp#L2097). So nobuiltin is almost the same as saying this is a builtin definition. To achieve what we want, just make sure nobuiltin annotation is also emitted for any builtin definition without inline specifier. This inhibits IPO for all builtin definitions, how much performance would be lost due to that?

ychen added a comment.Sep 16 2021, 4:57 PM

I think we should move forward with this approach. The reason being that: the derefinement *only* happens for replaceable builtins. For library provided version of these, the derefinement should not happen (the implementation must provide expected semantics) otherwise it is should be a UB. For replacement definition, we already decorate the function with "nobuiltin" (and it seems this is the use case "nobuiltin" is designed for (https://groups.google.com/g/llvm-dev/c/0HA_Td1y5Po).

PS: I think implementing what @rsmith has suggested should also work: use TargetLibararyInfo to query if a function definition is builtin. But it seems to do more than we needed (i.e, for library definitions where derefinement cannot happen, we don't need this?), and querying TargetLibararyInfo has some compile-time impact.

thoughts?

I found this problem (PR49022) when running some regression test for an out of tree target.
So, what is the status of patch?
Any reason this hasn't been committed?

fhahn updated this revision to Diff 454581.Aug 22 2022, 11:42 AM

Rebase and ping. This also popped up in anohter issue: https://github.com/llvm/llvm-project/issues/57075

Herald added a project: Restricted Project. · View Herald TranscriptAug 22 2022, 11:42 AM
fhahn edited the summary of this revision. (Show Details)Aug 22 2022, 11:44 AM
ychen accepted this revision.Aug 22 2022, 11:54 AM

I also saw similar problems in internal codes recently.

LGTM.

This revision is now accepted and ready to land.Aug 22 2022, 11:54 AM
This revision was landed with ongoing or failed builds.Aug 23 2022, 5:45 AM
This revision was automatically updated to reflect the committed changes.
w2yehia added a subscriber: w2yehia.Sep 8 2022, 9:29 AM
w2yehia added inline comments.
llvm/include/llvm/IR/GlobalValue.h
147

just wanna point out that _ZdlPv and _Znwm fall under the reserved names, and so the compiler might be allowed to assume certain stuff about such symbols.

C11 standard (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
7.1.3 Reserved Identifiers:
All identifiers that begin with an underscore and either an uppercase letter or another

underscore are always reserved for any use

syzaara added subscribers: Florian, syzaara.EditedSep 12 2022, 10:34 AM

@fhahn
Hello. I am uncertain whether having ‘nobuiltin’ attribute on a function declaration should imply that it returns true for mayBeRedefined. The description for ‘nobuiltin’ in the llvm reference does not imply that the function definition may change. We should still be able to enable IPO transformations for functions with nobuiltin if the transformations do not make assumptions about the semantics of the function. There is also no statement that ‘nobuiltin’ applies only to library functions. If the attribute is used on non library user function, all IPO transformations are then blocked on this function. One suggested workaround could be to remove the ‘nobuiltin’ attribute from user functions that are not library calls in InferFunctionAttrs.cpp. with something like:

+++ b/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
@@ -21,7 +21,7 @@ static bool inferAllPrototypeAttributes(
   Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
  bool Changed = false;
 
- for (Function &F : M.functions())
+ for (Function &F : M.functions()) {
   // We only infer things using the prototype and the name; we don't need
   // definitions. This ensures libfuncs are annotated and also allows our
   // CGSCC inference to avoid needing to duplicate the inference from other
@@ -32,7 +32,12 @@ static bool inferAllPrototypeAttributes(
     Changed |= inferNonMandatoryLibFuncAttrs(F, GetTLI(F));
    Changed |= inferAttributesFromOthers(F);
   }
-
+  if (!F.isDeclaration()) {
+   LibFunc LibFn;
+   if (F.hasFnAttribute(Attribute::NoBuiltin) && !GetTLI(F).getLibFunc(F, LibFn))
+    F.removeFnAttr(Attribute::NoBuiltin);
+  }
+ }
  return Changed;
 }
syzaara added a comment.EditedSep 19 2022, 7:02 AM

ping
Hi @fhahn , could you please provide your thoughts on the earlier comment?

@fhahn
Hello. I am uncertain whether having ‘nobuiltin’ attribute on a function declaration should imply that it returns true for mayBeRedefined. The description for ‘nobuiltin’ in the llvm reference does not imply that the function definition may change. We should still be able to enable IPO transformations for functions with nobuiltin if the transformations do not make assumptions about the semantics of the function. There is also no statement that ‘nobuiltin’ applies only to library functions. If the attribute is used on non library user function, all IPO transformations are then blocked on this function. One suggested workaround could be to remove the ‘nobuiltin’ attribute from user functions that are not library calls in InferFunctionAttrs.cpp. with something like:

+++ b/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
@@ -21,7 +21,7 @@ static bool inferAllPrototypeAttributes(
   Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
  bool Changed = false;
 
- for (Function &F : M.functions())
+ for (Function &F : M.functions()) {
   // We only infer things using the prototype and the name; we don't need
   // definitions. This ensures libfuncs are annotated and also allows our
   // CGSCC inference to avoid needing to duplicate the inference from other
@@ -32,7 +32,12 @@ static bool inferAllPrototypeAttributes(
     Changed |= inferNonMandatoryLibFuncAttrs(F, GetTLI(F));
    Changed |= inferAttributesFromOthers(F);
   }
-
+  if (!F.isDeclaration()) {
+   LibFunc LibFn;
+   if (F.hasFnAttribute(Attribute::NoBuiltin) && !GetTLI(F).getLibFunc(F, LibFn))
+    F.removeFnAttr(Attribute::NoBuiltin);
+  }
+ }
  return Changed;
 }