This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Respect the convergent attribute.
ClosedPublic

Authored by jlebar on Feb 22 2016, 3:33 PM.

Details

Summary

Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

for (int i = 0; i < N; ++i) {
  if (i == 0) convergent_op();
  foo();
}

into

int i = 0;
if (N % 2 == 1) {
  convergent_op();
  foo();
  ++i;
}
for (; i < N - 1; i += 2) {
  if (i == 0) convergent_op();
  foo();
  foo();
}.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Diff Detail

Repository
rL LLVM

Event Timeline

jlebar updated this revision to Diff 48741.Feb 22 2016, 3:33 PM
jlebar retitled this revision from to [LoopUnroll] Respect the convergent attribute..
jlebar updated this object.
jlebar added a reviewer: resistor.
jlebar added a subscriber: llvm-commits.

Friendly ping. I'm happy to send this to someone else if you like; you've just been interested and had really good feedback on the rest of this stuff.

jlebar added a reviewer: rnk.Mar 11 2016, 5:31 PM
mzolotukhin accepted this revision.Mar 11 2016, 6:27 PM
mzolotukhin added a reviewer: mzolotukhin.

Hi Justin,

The patch looks good to me, but maybe we need to add a test with pragma and convergent call - I think it's uncovered now.

Michael

lib/Transforms/Utils/LoopUnroll.cpp
275–278 ↗(On Diff #48741)

I think this should go in a separate commit.

This revision is now accepted and ready to land.Mar 11 2016, 6:27 PM
jlebar added inline comments.Mar 14 2016, 4:13 PM
lib/Transforms/Utils/LoopUnroll.cpp
275–278 ↗(On Diff #48741)

Since you approved this patch, I'm going to submit as one commit, I hope that's OK. Other passes already remove said prelude in the non-convergent case -- the only reason I think you'd want this change is if you're also taking the rest of this patch.

Maybe if this were a huge change and there were some readability benefit to be had from splitting it up, but that also doesn't seem to be the case here. Keeping this all together also makes our life a bit easier if we have to revert (which is possible if this prolog removal is somehow wrong).

maybe we need to add a test with pragma and convergent call - I think it's uncovered now.

Done, thanks.

mzolotukhin added inline comments.Mar 14 2016, 4:19 PM
lib/Transforms/Utils/LoopUnroll.cpp
275–278 ↗(On Diff #48741)

The reason I proposed it is that this is a change that touches general code, while all the rest is here to deal with 'convergent' attribute. If for some (theoretical) reason we later decide to revert the 'convergent' part, we still might want to keep the generic part. But that's unlikely to actually happen so yeah, you an go ahead with a single patch.

This revision was automatically updated to reflect the committed changes.
jlebar added inline comments.Mar 14 2016, 4:21 PM
lib/Transforms/Utils/LoopUnroll.cpp
275–278 ↗(On Diff #48741)

Ah, interesting, you saw this in the reverse way!

Thanks, just submitted. And thank you very much for the review -- it's much appreciated.