Page MenuHomePhabricator

[LoopIdiom] Introduce 'left-shift until bittest' idiom
ClosedPublic

Authored by lebedev.ri on Nov 8 2020, 11:49 AM.

Details

Summary

The motivation here is the following inner loop in fp16/fp24 -> fp32 expander,
that runs as part of the floating-point DNG decompression in RawSpeed library:
https://github.com/darktable-org/rawspeed/blob/cd380bb9a209bd2e7a0e7022b0cab04528d151e7/src/librawspeed/decompressors/DeflateDecompressor.cpp#L112-L115

while (!(fp32_fraction & (1 << 23))) {
  fp32_exponent -= 1;
  fp32_fraction <<= 1;
}

(https://godbolt.org/z/r13YMh)
As one might notice, that loop is currently uncountable, and that whole code stays scalar.
Yet, it is rather trivial to make that loop countable:
https://godbolt.org/z/do8WMz
and we can prove that via alive2:
https://alive2.llvm.org/ce/z/7vQnji (ha nice, isn't it?)
... and that allow for the whole fp16->fp32 code to vectorize:
https://godbolt.org/z/7hYr13

Now, while i'd love to get there, i feel like i should take it in steps.

For now, this introduces support for the most basic case,
where the bit position is known as a variable,
and the loop *will* go away (has no live-outs other than the recurrence,
no extra instructions in the loop).

I have added sufficient (i believe) test coverage,
and alive2 is happy with those transforms.

Thoughts?

Diff Detail

Event Timeline

lebedev.ri created this revision.Nov 8 2020, 11:49 AM
lebedev.ri requested review of this revision.Nov 8 2020, 11:49 AM
lebedev.ri edited the summary of this revision. (Show Details)Nov 8 2020, 12:42 PM
liuz added a subscriber: liuz.Nov 8 2020, 5:52 PM
jdoerfert added inline comments.Nov 9 2020, 8:02 AM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2088

Drive by: Should we move this "early exit" before step 2?

lebedev.ri marked an inline comment as done.Nov 9 2020, 8:14 AM
lebedev.ri added inline comments.
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2088

We can't do that, because we can't do this check before we've canonicalized the comparison predicate
(see previous if), which we can't do until after we've matched the bit-mask
(because in next patches said matching will result in other predicate changes)

lebedev.ri marked an inline comment as done.Nov 12 2020, 5:42 AM

Note that i'm mostly fine with doing this in post-commit-review mode,
i'm reasonably confident with the base change itself,
i'm mostly only looking for the high-level review.

Rebase, NFC.

Do recurrence analysis afterwards, not when matching backedge condition.

lebedev.ri edited the summary of this revision. (Show Details)Nov 19 2020, 10:21 AM
craig.topper added inline comments.Nov 19 2020, 1:51 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2066

Why are we adding a whole new matcher infrastructure for phis? Why can't we use getIncomingValueForBlock?

(This is in a need of a bit of rebase once i'm ready to post the final patch to actually rewrite the loop into countable.)

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2066

There is no particular reason other than that
m_c_And(m_Value(), m_Constant())
is preferred to
I.getOpcode() == Instruction::And && (isa<Constant>I.getOperand(0) || isa<Constant>I.getOperand(1)),
because it's just that much more readable.
Of course, i can use old good methodology..

lebedev.ri updated this revision to Diff 309874.Dec 7 2020, 4:40 AM

Rebased. Small tidyups.

Originally, i allowed the wrong instruction to liveout - in my motivational example
i need %x.next to liveout, but i allowed only %x.curr.
So let's just allow both of them to liveout.

lebedev.ri marked an inline comment as done.

@craig.topper thank you for taking a look!
Dropping PHI abstraction - on a second look it's more decent that i remember.

craig.topper added inline comments.Dec 14 2020, 2:17 PM
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2077

Should this be add -1 to match the code?

2174

This is called ".leadingonepos" in the comment block.

lebedev.ri marked 2 inline comments as done.

Fixup values names / comments, NFC otherwise.

llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
2066

On a second look, let's drop it for now.

2174

Argh, i was sure i fixed that already :)

This revision is now accepted and ready to land.Dec 18 2020, 12:53 PM

LGTM

Thank you for the review!
Note that there are 3 essential followups here, don't bother with D91725/D91726, but it would be good to get an agreement on D92754.

Rebased, NFC.

This revision was landed with ongoing or failed builds.Dec 23 2020, 11:28 AM
This revision was automatically updated to reflect the committed changes.