This is an archive of the discontinued LLVM Phabricator instance.

[Inliner] Teach inliner to merge 'min-legal-vector-width' function attribute
ClosedPublic

Authored by craig.topper on Jul 10 2018, 5:08 PM.

Details

Summary

When we inline a function with a min-legal-vector-width attribute we need to make sure the caller also ends up with at least that vector width.

In the future we may want to have heuristics to block inlining for different vector widths possibly with another attribute, but we haven't defined that yet.

I've based this entirely on the stack-probe-size merging code.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Jul 10 2018, 5:08 PM
chandlerc added inline comments.Jul 23 2018, 4:26 PM
lib/IR/Attributes.cpp
1685–1686 ↗(On Diff #154904)

I feel like we're going to want to do something a bit more nuanced than this...

For example, consider a function doing dynamic dispatch based on CPUID detection. It will look like:

void do_algo() {
  if (has_feature_X())
    do_algo_with_X();
  else if (has_feature_Y())
    do_algo_with_Y();
  else
    do_algo_generic();
}

I don't think we're going to want to promote the min legal width of this wrapper to be the largest of all the things it calls, even if they are viable for inlining....

Right now, these usually are subtarget selecting and I think we *block* inlining in that case. But now that we can talk about vector width, I could imagine the above selecting a 256-bit algorithm when running on a Skylake CPU, but a 128-bit algorithm when running on older CPUs, and not needing an target features to differ between the two. Just the vector min length.

If we need a heuristic, the one I would suggest goes along the lines of:

  1. Explicit attributes always win, we don't adjust them.
  2. An implicit attribute can be promoted iff the callee post dominates the entry of the caller. That is, the callee is not *predicated* in some way that might select one callee instead of another.

Would #2 still be too restrictive for the use cases you have in mind?

craig.topper added inline comments.Jul 23 2018, 4:49 PM
lib/IR/Attributes.cpp
1685–1686 ↗(On Diff #154904)

This code is the code that gets called after the inlining decision has been made right? My immediate goal was just to get always_inline to propagate correctly so the intrinsics get propagated.

Heuristics would need to go into something like TTI::areInlineCompatible, but we don't have a CallSite there for #2.

chandlerc added inline comments.Jul 23 2018, 5:12 PM
lib/IR/Attributes.cpp
1685–1686 ↗(On Diff #154904)

Ok, that makes more sense. Maybe some comments here explaining that while this may not be *desirable*, there is a pretty clear semantic transformation to merge the two attributes?

(Also, the always_inline thing is a great "0" in my heuristics above! =] So hopefully will be easy to at least write an inital cut at TTI::areInlineCompatible that gets past basic sanity by inlining intrinsics and such w/o pushing much further.

Improve comment.

As I mentioned on IRC always_inline doesn't go through inline cost analysis including attribute compatibility checking. So this patch is the minimum necessary to make intrinsics propagate their vector width to the function they get inlined into.

Add back the test case.

This revision is now accepted and ready to land.Jul 24 2018, 10:58 AM
This revision was automatically updated to reflect the committed changes.