This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem
ClosedPublic

Authored by ctetreau on Apr 30 2020, 1:07 PM.

Details

Summary

getLogBase2 tries to iterate over the number of vector elements. Since
the number of elements of a scalable vector is unknown at compile time,
we must return null if the input type is scalable.

Identified by test LLVM.Transforms/InstCombine::nsw.ll

Diff Detail

Event Timeline

ctetreau created this revision.Apr 30 2020, 1:07 PM
Herald added a project: Restricted Project. · View Herald Transcript

Hi @ctetreau ,

you say:

getLogBase2 tries to walk its input vector type to do its work. This
cannot be done for scalable vectors

I agree with you if we imagine the input scalable vector consisting of different powers of two, say [2^c1, 2^c2, 2^c3, ...]

I am thinking at the situation in which a scalable vector is the splat of a power of 2, say [2^c, 2^c, ..., 2^c]. I think in this case returning nullptr might lead to a missed optimization opportunity? For example, I have highlighted a place in the code where the power of 2 is used to simplify udiv. I think we want to have that optimization for scalable vectors too: surely we can retrieve the scalable vector "splat of C" from the scalable vector "splat of 2^C"?

Thank you for working on this!

Regards,

Francesco

llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
857–859

Wouldn't your changes get to LLVM unreachable here? Would it make sense to have a test that checks that X udiv Y don't crash the compiler when both X and Y are scalable vector and Y is a splat of a power of 2?

getLogBase2 should already work for scalable splats, I think? m_APInt should match vectors.

This revision is now accepted and ready to land.Apr 30 2020, 4:14 PM

I am thinking at the situation in which a scalable vector is the splat of a power of 2, say [2^c, 2^c, ..., 2^c]. I think in this case returning nullptr might lead to a missed optimization opportunity? For example, I have highlighted a place in the code where the power of 2 is used to simplify udiv. I think we want to have that optimization for scalable vectors too: surely we can retrieve the scalable vector "splat of C" from the scalable vector "splat of 2^C"?

My current mandate is to fix all the misuses of getNumElements(). Right now we're in a situation where llvm does some transformations that maybe speeds things up sometimes, but other times causes a correctness issue or crash. I'm more concerned with fixing the correctness issues and crashes; the optimizations can be reintroduced after the compiler works. Step 1 in "make it work, then make if fast" is "make it work" after all. For many of these fixes that I make, there's something the code can do to take the optimization opportunity with scalable vectors. However, if I spend the time for each one to find all of these I'll never get this done.

ctetreau updated this revision to Diff 261796.May 4 2020, 6:14 AM

Add requested test case

ctetreau marked an inline comment as done.May 4 2020, 6:18 AM
ctetreau added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
857–859

Test case added. It looks like the fixed width version gets here but the scalable version does not enter this function.

ctetreau updated this revision to Diff 261834.May 4 2020, 8:48 AM

fix new test

Looked a little more; I guess m_APInt doesn't match scalable vectors at the moment. Something to look at in the future.

I am thinking at the situation in which a scalable vector is the splat of a power of 2, say [2^c, 2^c, ..., 2^c]. I think in this case returning nullptr might lead to a missed optimization opportunity? For example, I have highlighted a place in the code where the power of 2 is used to simplify udiv. I think we want to have that optimization for scalable vectors too: surely we can retrieve the scalable vector "splat of C" from the scalable vector "splat of 2^C"?

My current mandate is to fix all the misuses of getNumElements(). Right now we're in a situation where llvm does some transformations that maybe speeds things up sometimes, but other times causes a correctness issue or crash. I'm more concerned with fixing the correctness issues and crashes; the optimizations can be reintroduced after the compiler works. Step 1 in "make it work, then make if fast" is "make it work" after all. For many of these fixes that I make, there's something the code can do to take the optimization opportunity with scalable vectors. However, if I spend the time for each one to find all of these I'll never get this done.

Sure, I understand. My comment was mostly raised by the message "this cannot be done for scalable vectors" in the summary. I think that you should amend the commit message and add a comment in the code before if (!isa<FixedVectorType>(Ty)) saying something along the lines of FIXME: we can extract pow of 2 of splat constant for scalable vectors.

Thank you!

Francesco

ctetreau updated this revision to Diff 262172.May 5 2020, 10:59 AM
ctetreau edited the summary of this revision. (Show Details)

address code review issues

... My comment was mostly raised by the message "this cannot be done for scalable vectors" in the summary. ...

By "this", I mean "iterating over the fixed number of vector elements". I've updated the commit message to be more explicit about it. I also added the FIXME comment.

fpetrogalli accepted this revision.May 5 2020, 11:58 AM

LGTM, thank you!

ctetreau updated this revision to Diff 262222.May 5 2020, 2:07 PM

fix test typo

Harbormaster completed remote builds in B55847: Diff 262225.
This revision was automatically updated to reflect the committed changes.