This is an archive of the discontinued LLVM Phabricator instance.

[Attributor] Derive AACallEdges attribute
ClosedPublic

Authored by kuter on Jun 10 2021, 1:51 PM.

Details

Summary

This attribute computes the optimistic live call edges using the attributor
liveness information. This attribute will be used for deriving a
inter-procedural function reachability attribute.

Diff Detail

Event Timeline

kuter created this revision.Jun 10 2021, 1:51 PM
kuter requested review of this revision.Jun 10 2021, 1:51 PM
Herald added a reviewer: sstefan1. · View Herald Transcript
Herald added a reviewer: baziotis. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
kuter added a comment.EditedJun 10 2021, 2:17 PM

The way CallGraph handles callbacks doesn't make sense to me.
In the CallGraph::populateCallGraphNode callbacks are handled by the forEachCallbackFunction function.
like this:

// Add reference to callback functions.
forEachCallbackFunction(*Call, [=](Function *CB) {
    Node->addCalledFunction(nullptr, getOrInsertFunction(CB));
});

Which uses the AbstractCallSite::getCallbackUses and checks if the Use is a function.
The source of the AbstractCallSite::getCallbackUses is given below.

void AbstractCallSite::getCallbackUses(       
    const CallBase &CB, SmallVectorImpl<const Use *> &CallbackUses) { 
  const Function *Callee = CB.getCalledFunction(); 
  if (!Callee)                                                      
    return; 

   ....                                                              
}

CB.getCalledFunction returns null if the call is indirect. And a callback is indirect, right ?
Also the getCallbackUses uses the callback metadata which is intended for the function definitions rather than calls.

Am I misunderstanding something ?

Can you please reupload after clang formatting the code :)

ormris removed a subscriber: ormris.Jun 11 2021, 2:53 PM
kuter updated this revision to Diff 351694.Jun 12 2021, 6:36 PM
  • Custom iterator for lazly creating AACallEdges
  • GraphTraits implementation
kuter added a comment.Jun 12 2021, 6:39 PM

Can you please reupload after clang formatting the code :)

Sorry, I thought arcanist did that automatically.

kuter updated this revision to Diff 351726.Jun 13 2021, 9:29 AM
  • Added DOTGraphTraits implementation
  • Added Test
  • Small fix
kuter edited the summary of this revision. (Show Details)Jun 13 2021, 9:37 AM
jdoerfert added inline comments.Jun 13 2021, 10:46 AM
llvm/include/llvm/Transforms/IPO/Attributor.h
3881

Maybe call it AACallEdgeIterator?

3935

Don't store the Attributor instance, let the user pass it where needed.

3961

Same as above.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8177

Rather than asking for liveness, use a generic value traversal to visit all potential values of Val. Each of which is a potential callee. Also make sure to handle the case where a callee is not known, e.g., a function pointer call. We need to add them to the call graph as unknown callees too.

8182

checkForAllCallLikeInstructions

8193

Return at least the number as well.

kuter updated this revision to Diff 351736.Jun 13 2021, 12:22 PM
  • Reanme custom iterator
  • Use checkForAllCallLikeInstructions
kuter marked 2 inline comments as done.Jun 13 2021, 12:23 PM
kuter added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
3935

I wanted not to but we need this to have a GraphTraits implementation.
GraphTraits::child_begin only takes the node itself and we need Attributor to create the AACallEdges.
I don't think there are any other way to inject the Attributor inside GraphTraits::child_begin

jdoerfert added inline comments.Jun 13 2021, 2:21 PM
llvm/include/llvm/Transforms/IPO/Attributor.h
3935

Then store it in AACallGraphNode I gues.

kuter updated this revision to Diff 351760.Jun 13 2021, 6:51 PM

Move attributor instance to the AACallGraphNode

kuter added inline comments.Jun 14 2021, 8:11 PM
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8177

Ok when I add the genericValueTraversal should we still keep the getCallbackUses function since it uses the callback metadata ?
There is also a callees meta data that is ment to annotate a call, it gurantees that the called function is one of the functions it specifies.
This seems useful for our purposes. getCallbackUses doesn't use it.

I was planning to add a flag to tell if there are any calls to a unknown function somethling like bool hasUnknownCallee()

Also for us to be able to say that function x can't reach function b, we need to make sure that we know all functions that function x might ever call. Meaning that if there are any functions that x can reach that has a unknown callee we can't tell for sure, right ?

jdoerfert added inline comments.Jun 14 2021, 9:30 PM
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8177

I would have done getCallbackUses and then on the argument operands a genericValueTraversal. In the code above that means genericValueTraversal for Val. See also callees metadata below

If we have an unknown caller, we can reach any "visible" function. For now, you can assume any function but in reality you could exclude internal functions that don't have their address taken, for example.

So, how to use callees metadata is a good question. We want to expand what an AbstractCallSite is, I think. So first, an AbstractCallSite needs to get a new "kind", which is a direct call site with a "potential" callee. This is somewhat independent and we want to use it also as part of the checkForAllCallSites. To make this happen we'll need to either go through metadata uses (which is hard, IIRC) or use the AACallEdge in checkForAllCallSites. For the latter we also track in the AACallEdgeFunction node what all callers are, basically the callers should notify the AACallEdgeFunction of the Callee. If we do it that way, we can handle callee metadata here. I would suggest to have a new AbstractCallSite interface that allows you to iterate over a set of AbstractCallSites that a CallBase "exposes". For a direct call, that's just the direct callee. For a call with callees metadata, it will be one AbstractCallSite per callee node specified. And for all callbacks annotations, it would be one AbstractCallSite per callback annotation on the callee, basically line 8166-8168 above. This new interface would replace the getCallbackUses above and we could also (for now at least) ignore the genericValueTraversal thing. Certainly remove the isdead part.
Does this make sense?

kuter updated this revision to Diff 352471.EditedJun 16 2021, 10:07 AM
  • Add genericValueTraveral
  • Add test a test for call with select (Doesn't work :/)
  • Track if there are any unknown callees.
kuter added inline comments.Jun 16 2021, 10:14 AM
llvm/test/Transforms/Attributor/callgraph.ll
63

This edge is missing. genericValueTraversal doesn't seem to be returning the func4.

More comments, also, please fix the clang tidy warnings, they make it hard to read.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8152
8170

You need to bail if the return value is false, IIRC.

8173
8192

You need to bail if the return value is false.

llvm/test/Transforms/Attributor/callgraph.ll
63

That would be a problem, can you try to figure out what's going on?

kuter updated this revision to Diff 352488.Jun 16 2021, 11:29 AM

Fix code styling.

there is not much I can do with the clang tidy since.

  • I can't rename nodes_begin, it is needed for GraphTraits.
  • And you should never desconstruct a AACallGraphNode.
kuter updated this revision to Diff 352536.Jun 16 2021, 1:21 PM

Fix the problem with the missing edge.

kuter marked an inline comment as done.Jun 16 2021, 1:22 PM
kuter updated this revision to Diff 352541.Jun 16 2021, 1:36 PM

Asume that there are unknown callees if we haven't looked at all values/instructions.

kuter updated this revision to Diff 352542.Jun 16 2021, 1:37 PM

clang-format

kuter marked 8 inline comments as done.Jun 16 2021, 1:39 PM

Can you add a test case with a callback and one with an unknown function pointer call. Maybe also one with a function pointer and callee metadata + a FIXME.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8158
kuter updated this revision to Diff 352611.Jun 16 2021, 8:17 PM
  • Handle callees metadata..
  • Added test cases for callees and callback attributes.
  • Small fix
kuter updated this revision to Diff 352613.Jun 16 2021, 8:20 PM

clang-format

kuter updated this revision to Diff 352615.Jun 16 2021, 8:21 PM

small fix

jdoerfert accepted this revision.Jun 17 2021, 6:38 AM

LGTM, some nits. Thanks for fixing this up so quickly.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8179

We should not do this if we have callee metadata. Except maybe (later) if we can improve the callee metadata.

8192

No cast, a Function is a Value.

8224

Documentation.

This revision is now accepted and ready to land.Jun 17 2021, 6:38 AM
kuter retitled this revision from [WIP][Attributor] Derive AACallEdges attribute to [Attributor] Derive AACallEdges attribute.Jun 17 2021, 2:05 PM
This revision was landed with ongoing or failed builds.Jun 17 2021, 2:29 PM
This revision was automatically updated to reflect the committed changes.
hokein added a subscriber: hokein.Jun 18 2021, 3:03 AM

I encountered two issues while integrating this patch to our internal codebase -- And the fixes look like trivial, I landed them directly to unblock my integration. Feel free to improve them if they don't suit.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
8146

undefined behavior detected by asan: HasUnknownCallee value might be undefined, fixed in 3f5d53a525c62c507a482fd5f4c08451835b342d.

8206

hard-coding a filename looks like not a common pattern in LLVM AFAIK, and the test might fail on some sandbox environment. I landed 7670938bbad8755a34a282febc852651255c69b3 to improve this behavior.