This patch allows that scalable vector can fold extractelement and splat of a constant
only when the lane index is lower than the minimum number of elements of the vector.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The implementation in this patch was previously in:
https://reviews.llvm.org/D101916.
Thanks for splitting it up. I like small patches.
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
910–911 | You should be able to remove the two lines above handling the CAZ case, because your new code will handle it. |
Definitely agree. :)
For the test where the index is presumed invalid (i32 -1), is that enforced somehow? Is there a hard limit somewhere that says i32 0xffffffff must be invalid?
Hi @foad,
I don't know if I understood correct your suggestion.
I did not change any line.
I have tried some combinations of IF removals, but they all broke some tests
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
910–911 | a | |
910–911 | Hi @foad and it should be only for scalable vector because of this return CAZ->getElementValue(CIdx->getZExtValue()); transformation. |
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
910–911 | Hi @CarolineConcatto, I think @foad meant removing the lines: if (auto *CAZ = dyn_cast<ConstantAggregateZero>(Val)) return CAZ->getElementValue(CIdx->getZExtValue()); If Val is a constant aggregate zero (<=> all zeroes), then this is a splat value that's recognized by getSplatValue, so your new lines added in this patch will handle the same case making the above lines redundant. |
Hi @foad,
Sorry for my miss interpretation.
I believe now it is as you suggested.
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
910–911 | Thank you @sdesmalen, that makes sense. |
Sorry, @spatel
I've missed your comment above
I think the reason it says is invalid is because of the test:
uge(ValSVTy->getMinNumElements() -> unsigned greater or equal
Hmm...but those are uint64_t types/compares, so I don't see how it would automatically be considered invalid to have a "i32 -1". I don't think it really matters to this patch. I was just curious if there's some implicit semantic difference being verified between the tests "extractconstant_shuffle_maybe_out_of_range" and "extractconstant_shuffle_invalid_index".
So no objection from me, but someone more directly involved with scalable vectors should probably give final approval.
The patch looks fine to me with the nit addressed.
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | nit: s/!CIdx->uge/CIdx->ult/ |
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | Hey @sdesmalen |
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | Ah I hadn't realised that. In that case, could you write it more explicitly as CIdx->getZExtValue() < ValSVTy->getMinNumElements() |
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | Do we know the type of CIdx is always less that 64bits? uge() is just a convenience wrapper around APInt::uge(). You can use CIdx->getValue().ult(..) if you feel the ult is better than the !uge |
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | Agreed, that's better, my main concern was the !uge. |
The patch looks OK to me. I added some inline comments but you don't need to do anything about them, they are just observations.
llvm/lib/IR/ConstantFold.cpp | ||
---|---|---|
909–910 | Or you could implement ConstantInt::ult. | |
909–910 | I don't really understand why this special case for scalable types exists here. Why not just fall through to the call to Val->getAggregateElement below? It looks like getAggregateElement already has some support for scalable types. |
llvm/test/Transforms/InstSimplify/ConstProp/extractelement-vscale.ll | ||
---|---|---|
15 | (Tiny nit:) Perhaps "extracting" would fit a little bit better here? |
- remove scalable vector check because getAggregateElement has safeguards for scalable vectors
nit: s/!CIdx->uge/CIdx->ult/