Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1831 ↗ | (On Diff #84576) | This isn't a question about this patch (and maybe I'm not the person who should be reviewing it :-) ), but I'm now confused about how this works at all. What's the benefit of calling SimplifyDemandedVectorElts() on an instruction when you're demanding all elements, the instruction you're calling this on will also demand all the elements of its operands, and throwing UndefElts away? Is this just so that things have a chance to happen in the middle of the tree? |
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
1475 | Add a TODO for the AVX512 versions? | |
1495 | Can you make this more verbose? It's a bit opaque unless you already know what the layout looks like. | |
1497 | i -> Lane, j -> Elt (or something similar)? | |
1527 | Maybe make UndefElts an array of 2 APInts? This will make this code slightly prettier. |
lib/Transforms/InstCombine/InstCombineCalls.cpp | ||
---|---|---|
1831 ↗ | (On Diff #84576) | Yeah this probably has no value for packs. Its useful for things like scalar intrinsics because they demand less elements from their inputs no matter what future users demand from the intrinsic. I think this means PSHUFB doesn't need code in InstCombineCalls either. |
I've removed the demanded elts handling of pshufb/vpermil from InstCombiner::visitCallInst
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
1527 | UndefElts is a pass by reference argument that is used by the caller so I can't do much to change it. I agree that the code is messy but it is probably as good as it can be. |
Removed unnecessary handling in InstCombiner::visitCallInst, cleaned up code as requested.
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
1527 | Sorry, I meant "make UndefElts2 and UndefElts3 a 2-element array". |
One more minor naming nit, other than that, LGTM.
lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp | ||
---|---|---|
1496 | i -> OpNum? |
Add a TODO for the AVX512 versions?