Add an IR expansion pass for the experimental reductions
ClosedPublic

Authored by aemerson on Apr 19 2017, 2:34 PM.

Details

Summary

This is an IR expansion pass intended to allow targets to opt-in to using the experimental reduction intrinsics introduced in D30086.

Its purpose is to see the effects of switching to the intrinsics in the IR, so this pass should be added to a target's pass config late, just before codegen. The expansion should result in the same shufflevector sequence form that targets currently expect reductions to be in.

Diff Detail

Repository
rL LLVM
aemerson created this revision.Apr 19 2017, 2:34 PM

Adding some target people - I think they ought to care about this more than I do. :-)

lib/CodeGen/ExpandReductions.cpp
82

I'm not a huge fan of this - I would prefer not to rely on the invalidation semantics. Maybe collect all relevant instructions into a vector first, then do the replacement?
(But if others disagree, this is fine.)

84

How do you expect this to happen?

92

What do we expect to happen in this case?

96

Please annotate the fallthrough here. (And perhaps it would be better to rewrite this to avoid it)

105

What about the internal instructions?

117

I'd expect a target query somewhere, regarding whether the intrinsic needs to be expanded.

aemerson marked 4 inline comments as done.Apr 26 2017, 3:18 AM
aemerson added a subscriber: RKSimon.
aemerson added inline comments.
lib/CodeGen/ExpandReductions.cpp
82

I can do that, but I've seen this kind of thing done before in other places.

84

Sorry, not entirely sure what you mean? This is an early exit if the given instruction isn't an intrinsic call.

92

As no in-tree target currently supports ordered reductions, and given that for SVE we want to enable support completely without using this expansion pass, I decided against trying to handle ordered reductions here. We just skip the intrinsic if we find it's an ordered reduction.

If other targets want to experiment with ordered I think they can implement expansion via some scalarization method here.

117

My expectation was that targets wouldn't need, at least at first, that level of granularity. @RKSimon what do you think about this?

Please add full context to the diff - especially as its dependent on another (in progress) patch.

lib/CodeGen/ExpandReductions.cpp
117

Are we guaranteeing that the reductions will match the ones supported by TargetTransformInfo::getReductionCost ?

test/CodeGen/Generic/expand-experimental-reductions.ll
1

I'd prefer to see the full reduction codegen here - regenerate with utils\update_test_checks.py ?

aemerson marked 3 inline comments as done.Apr 26 2017, 5:47 AM
aemerson added inline comments.
lib/CodeGen/ExpandReductions.cpp
117

Do you mean useReductionIntrinsic()? If so, I suppose it comes down to the exact use case of this expansion. Michael originally asked for this so that targets could check the effect of using the intrinsics at the IR level only, and at a very late stage converting them into the shuffle form we have now. For that, I don't see why you would care about which individual intrinsics are expanded, rather than a simple on/off decision.

If however there might be more uses of this, for example in future, if we want to enable intrinsic forms for *all* targets as a canonical form, and then use this pass with TTI to make a target dependent decision on which codegen-level form is preferred, then I think a TTI hook would make sense.

I can add a hook anyway, perhaps defaulting to "expand all intrinsics" unless the target overrides it.

RKSimon added inline comments.Apr 26 2017, 6:35 AM
lib/CodeGen/ExpandReductions.cpp
96

You've marked it as done by LLVM_FALLTHROUGH is still missing from these - you will get warnings on some buildbots

aemerson added inline comments.Apr 26 2017, 6:39 AM
lib/CodeGen/ExpandReductions.cpp
96

I might be misunderstanding what the "Done" means. I used it to mean I'll address this in the next patch when I upload it. I haven't got around to that yet.

mkuper added inline comments.Apr 26 2017, 10:45 AM
lib/CodeGen/ExpandReductions.cpp
82

I'd suggest getting another reviewer's opinion on this.

84

Sorry, I misread this is dyn_cast<Instruction>, ignore.

92

The issue is that target-independent intrinsics are, by definition, supposed to be handled by any target. I shouldn't see a backend crash if I write IR that has the ordered intrinsic, and try to compile it for x86.

Having said that - this is fine for now, but if we ever want to make these intrinsics non-experimental, this will have to be dealt with somehow. Please add a TODO.

117

I can add a hook anyway, perhaps defaulting to "expand all intrinsics" unless the target overrides it.

That's exactly what I'd expect, thanks.

aemerson updated this revision to Diff 98285.May 9 2017, 7:56 AM

Addressed review comments, rewritten the pass a bit to be somewhat neater. D30086 is now committed now so this is ready to go if it looks ok.

mkuper accepted this revision.May 9 2017, 3:38 PM

LGTM, with a nit.

lib/CodeGen/ExpandReductions.cpp
138

I don't believe you should ever be in the situation you don't have a TTI here. So it should be safe to just do:

const auto *TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
This revision is now accepted and ready to land.May 9 2017, 3:38 PM
aemerson marked an inline comment as done.May 10 2017, 2:21 AM

Thanks, I'll make that change and commit.

This revision was automatically updated to reflect the committed changes.