Page MenuHomePhabricator

[InstSimplify] Add constant fold for extractelement + splat for scalable vectors
ClosedPublic

Authored by CarolineConcatto on May 26 2021, 9:22 AM.

Details

Summary

This patch allows that scalable vector can fold extractelement and splat of a constant
only when the lane index is lower than the minimum number of elements of the vector.

Diff Detail

Event Timeline

CarolineConcatto requested review of this revision.May 26 2021, 9:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2021, 9:22 AM
Matt added a subscriber: Matt.May 26 2021, 9:28 AM
CarolineConcatto edited the summary of this revision. (Show Details)May 26 2021, 9:29 AM

The implementation in this patch was previously in:
https://reviews.llvm.org/D101916.

foad added a comment.May 26 2021, 11:45 AM

Thanks for splitting it up. I like small patches.

llvm/lib/IR/ConstantFold.cpp
910–911

You should be able to remove the two lines above handling the CAZ case, because your new code will handle it.

Thanks for splitting it up. I like small patches.

Definitely agree. :)

For the test where the index is presumed invalid (i32 -1), is that enforced somehow? Is there a hard limit somewhere that says i32 0xffffffff must be invalid?

Hi @foad,
I don't know if I understood correct your suggestion.
I did not change any line.
I have tried some combinations of IF removals, but they all broke some tests

llvm/lib/IR/ConstantFold.cpp
910–911

a

910–911

Hi @foad
I believe we should keep these 2 tests:
!CIdx->uge(ValSVTy->getMinNumElements())
because it limits the index to be lower than the minimum width of the scalable vector.

and it should be only for scalable vector because of this

return CAZ->getElementValue(CIdx->getZExtValue());

transformation.
If we change to be vector type many other tests fail. For instance this one:
CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
Transforms/LoopVectorize/X86/gather_scatter.ll

sdesmalen added inline comments.Jun 1 2021, 5:37 AM
llvm/lib/IR/ConstantFold.cpp
910–911

Hi @CarolineConcatto, I think @foad meant removing the lines:

if (auto *CAZ = dyn_cast<ConstantAggregateZero>(Val))
  return CAZ->getElementValue(CIdx->getZExtValue());

If Val is a constant aggregate zero (<=> all zeroes), then this is a splat value that's recognized by getSplatValue, so your new lines added in this patch will handle the same case making the above lines redundant.

  • Remove vector of zeros check by a splat vector check
CarolineConcatto marked an inline comment as done.Jun 1 2021, 6:05 AM

Hi @foad,
Sorry for my miss interpretation.
I believe now it is as you suggested.

llvm/lib/IR/ConstantFold.cpp
910–911

Thank you @sdesmalen, that makes sense.
Change made.

CarolineConcatto marked an inline comment as done.Jun 2 2021, 7:58 AM

Thanks for splitting it up. I like small patches.

Definitely agree. :)

For the test where the index is presumed invalid (i32 -1), is that enforced somehow? Is there a hard limit somewhere that says i32 0xffffffff must be invalid?

Sorry, @spatel
I've missed your comment above
I think the reason it says is invalid is because of the test:
uge(ValSVTy->getMinNumElements() -> unsigned greater or equal

For the test where the index is presumed invalid (i32 -1), is that enforced somehow? Is there a hard limit somewhere that says i32 0xffffffff must be invalid?

I think the reason it says is invalid is because of the test:
uge(ValSVTy->getMinNumElements() -> unsigned greater or equal

Hmm...but those are uint64_t types/compares, so I don't see how it would automatically be considered invalid to have a "i32 -1". I don't think it really matters to this patch. I was just curious if there's some implicit semantic difference being verified between the tests "extractconstant_shuffle_maybe_out_of_range" and "extractconstant_shuffle_invalid_index".

So no objection from me, but someone more directly involved with scalable vectors should probably give final approval.

sdesmalen accepted this revision.Jun 3 2021, 12:41 AM

The patch looks fine to me with the nit addressed.

llvm/lib/IR/ConstantFold.cpp
909–910

nit: s/!CIdx->uge/CIdx->ult/

This revision is now accepted and ready to land.Jun 3 2021, 12:41 AM
llvm/lib/IR/ConstantFold.cpp
909–910

Hey @sdesmalen
I may be completely wrong, but I believe your suggestion is not possible
https://llvm.org/doxygen/classllvm_1_1ConstantInt.html
If I looked the documentation correctly llvm::constant only has UGE

sdesmalen added inline comments.Jun 3 2021, 1:46 AM
llvm/lib/IR/ConstantFold.cpp
909–910

Ah I hadn't realised that. In that case, could you write it more explicitly as

CIdx->getZExtValue() < ValSVTy->getMinNumElements()
dmgreen added a subscriber: dmgreen.Jun 3 2021, 1:58 AM
dmgreen added inline comments.
llvm/lib/IR/ConstantFold.cpp
909–910

Do we know the type of CIdx is always less that 64bits?

uge() is just a convenience wrapper around APInt::uge(). You can use CIdx->getValue().ult(..) if you feel the ult is better than the !uge

sdesmalen added inline comments.Jun 3 2021, 2:00 AM
llvm/lib/IR/ConstantFold.cpp
909–910

Agreed, that's better, my main concern was the !uge.

  • replace the use of !CIdx->ult() by !CIdx->getValue().uge(..)
CarolineConcatto marked 2 inline comments as done.Jun 3 2021, 3:09 AM
foad accepted this revision.Jun 3 2021, 8:35 AM

The patch looks OK to me. I added some inline comments but you don't need to do anything about them, they are just observations.

llvm/lib/IR/ConstantFold.cpp
909–910

Or you could implement ConstantInt::ult.

909–910

I don't really understand why this special case for scalable types exists here. Why not just fall through to the call to Val->getAggregateElement below? It looks like getAggregateElement already has some support for scalable types.

  • update test vector_splat_indices_nxv2i64_ext0 in gep-vector-indices.ll
Matt added inline comments.Jun 8 2021, 8:38 AM
llvm/test/Transforms/InstSimplify/ConstProp/extractelement-vscale.ll
15

(Tiny nit:) Perhaps "extracting" would fit a little bit better here?

  • s/extract/extracting in test commemts
CarolineConcatto marked an inline comment as done.Jun 9 2021, 12:23 AM
  • remove scalable vector check because getAggregateElement has safeguards for scalable vectors