This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix sign-extends for type-shrinking
ClosedPublic

Authored by mssimpso on Dec 6 2016, 9:02 AM.

Details

Summary

This patch ensures the correct minimum bit width during type-shrinking. Previously when type-shrinking, we always sign-extended values back to their original width. However, if we are going to sign-extend, and the sign bit is unknown, we have to increase the minimum bit width by one bit so the sign-extend will fill the upper bits correctly. If the sign bit is known to be zero, we can perform a zero-extend instead. This should fix PR31243.

Reference: https://llvm.org/bugs/show_bug.cgi?id=31243

Event Timeline

mssimpso updated this revision to Diff 80434.Dec 6 2016, 9:02 AM
mssimpso retitled this revision from to [SLP] Fix sign-extends for type-shrinking.
mssimpso updated this object.
mssimpso added reviewers: mzolotukhin, mkuper.
mssimpso added subscribers: llvm-commits, mcrosier.
mkuper added inline comments.Dec 6 2016, 10:54 AM
test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
13

Can you make the test more explicit by checking the extends we actually want to appear?

mssimpso added inline comments.Dec 6 2016, 11:01 AM
test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
13

Sure. I'll upload a new version.

mssimpso updated this revision to Diff 80453.Dec 6 2016, 11:23 AM
mssimpso marked an inline comment as done.

Addressed Michael's comments.

  • Made expected test output more explicit.
mkuper added inline comments.Dec 6 2016, 11:30 AM
test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
13

Shouldn't we also have a test where we increase the bit-width because we don't know the sign bit?
I'm a bit surprised this didn't affect any of our existing tests.

mssimpso added inline comments.Dec 6 2016, 11:49 AM
test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
13

That's a good idea - I'll add another test for the sext case.

mssimpso updated this revision to Diff 80786.Dec 8 2016, 11:08 AM
mssimpso marked an inline comment as done.

Addressed Michael's comments.

  • Reduced the existing test case. The test now uses -slp-threshold to force vectorization.
  • Added another test for the sext case.
  • Added FIXME's in both the source and test explaining why the sext case is currently suboptimal, but correct.
mssimpso updated this revision to Diff 80802.Dec 8 2016, 12:02 PM

Renamed test variables.

mkuper accepted this revision.Dec 9 2016, 11:46 AM
mkuper edited edge metadata.

LGTM with a couple of nits.

lib/Transforms/Vectorize/SLPVectorizer.cpp
1936

Could you rename this? ExTy makes it look like a type (yes, it's the "type of the extensions", but...)

3542

Can you use all_of?

This revision is now accepted and ready to land.Dec 9 2016, 11:46 AM
mssimpso added inline comments.Dec 9 2016, 11:51 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
1936

Sure, I'll just go with Extend.

3542

Yes, good point.

Gerolf requested changes to this revision.Dec 9 2016, 2:52 PM
Gerolf added a reviewer: Gerolf.
Gerolf added a subscriber: Gerolf.

Michael, could you also lend this your expert eye? Do you agree with the extra memory (pair vs vector) and compile-time (look at the loop tree) investment for this fix? Shortening the types may not be worth it at least on some architectures.

Thanks
Gerolf

lib/Transforms/Vectorize/SLPVectorizer.cpp
1939–1940

I would expect the TTI function to make use of the extra information of the sign bit, but can't see that code. Or is this intentional?

3521

This looks convoluted. You define a lambda and then use the inverted value. I think what you want i hasSignBitSet[] which returns is KnownOne. Then you use this as IsSigned = hasSignBitSet(). I think in your code you mixup concepts and intended use.

This revision now requires changes to proceed.Dec 9 2016, 2:52 PM

Michael, could you also lend this your expert eye? Do you agree with the extra memory (pair vs vector) and compile-time (look at the loop tree) investment for this fix? Shortening the types may not be worth it at least on some architectures.

Gerolf,

In terms of compile-time, the fix adds a single pass over the *roots* of non-store-rooted trees. The number of roots is equal to the vectorization factor, so this will be minimal. In terms of memory usage, the fix adds a bool for each instruction in the vectorizable trees that we can type-shrink. This also seems minimal to me.

lib/Transforms/Vectorize/SLPVectorizer.cpp
1939–1940

I'm not sure how the TTI function could use sign bit information here. We know that if we have a zext, then the sign bit is zero. If we have a sext, the sign bit can be either one or zero. The TTI function returns the expect cost of the pair of instructions:

%0 = extractelement ...
%1 = zext/sext %0 ...

We're giving the TTI function the kind of extend we are performing. Is there are case you have in mind that I'm not thinking of?

3521

I'm not sure this is what we want. We're checking if the sign bit is known to be zero, not known to be one. We set IsSigned only if we can't prove the sign bit to be zero (even though in reality it may be). Perhaps the "IsSigned" name is a bit confusing? I'll rename it when rewriting the code below to use "any_of" to address Michael's comment.

To clarify what's happening, we're checking the roots to determine if we know the sign bit is zero. If so, we zext. If we can't prove the sign bit to be zero, we add one to the max bit width and sext.

mssimpso updated this revision to Diff 81084.Dec 12 2016, 7:02 AM
mssimpso edited edge metadata.
mssimpso marked 3 inline comments as done.

Addressed comments from Michael and Gerolf. Thanks!

  • Rewrote sign bit computation to use "all_of" (Michael), and renamed variable to make the logic less confusing (Gerolf).
  • Renamed ExTy to Extend (Michael).
mkuper accepted this revision.Dec 12 2016, 11:36 AM
mkuper edited edge metadata.

Still LGTM.

Gerolf / Michael Z, are you ok with continuing the discussion post-commit, if you still have objections?

Gerolf, correct me if I'm wrong, but it sounds like your main issue was not with this patch, but rather with the whole approach to type shortening.
I believe we should keep it as is, but even if we do end up deciding to remove it, I think we should fix the functional bug we have (PR31243) first.

mzolotukhin accepted this revision.Dec 12 2016, 12:15 PM
mzolotukhin edited edge metadata.

I agree with Michael K here - first we need to fix the stability issue, then we can discuss if there is anything we want to do another way.

Michael Z

Thanks everyone. I'll go ahead and commit the bug fix for now, and then we can continue the discussion.

This revision was automatically updated to reflect the committed changes.