Implemente patterns to extract W vector elements and convert to QP
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I think we should review the code sequences for signed conversions and make more consistent use of loops.
lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
3165 ↗ | (On Diff #146270) | This probably needs let Predicates = [IsBigEndian, HasP9Vector] right? |
3168 ↗ | (On Diff #146270) | Is this sequence actually correct? We convert a vector of 4 4-byte integers into a vector of 2 8-byte double precision floating point values. Then we treat it as a signed 8-byte integer and convert it to a 16-byte floating point value. Shouldn't the outer instruction be xscvdpqp? In any case, vextsw2d -> xscvsdqp is a much lower latency sequence than this. Why not use that? |
3176 ↗ | (On Diff #146270) | Is there no mul function in TableGen? i.e. can't we just write !mul(Idx, 4)? |
3194 ↗ | (On Diff #146270) | Same note regarding the predicate. |
3196 ↗ | (On Diff #146270) | To be consistent, I think you should write these as a neat for-loop as you did above. The element would be Idx and the splat index would be !sub(3, Idx). Wouldn't that work? |
3207 ↗ | (On Diff #146270) | Same thing here, a for-loop would be nicer and more consistent. |
lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
3165 ↗ | (On Diff #146270) | This is within a Predicates = [HasP9Vector] section so is not needed here. |
3168 ↗ | (On Diff #146270) | I guess there is no need for us to convert to double precision here. Will update. |
3176 ↗ | (On Diff #146270) | There is only !add(a,b,...) |
3194 ↗ | (On Diff #146270) | this is within a Predicates = [HasP9Vector] code section. |
3196 ↗ | (On Diff #146270) | Unfortunately there is no !sub() operator. |
3207 ↗ | (On Diff #146270) | I agree. I just couldn't find a way to do it with just !add() |
lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
3167 ↗ | (On Diff #146407) | It is actually word 1 that doesn't need the splat. Word 0 does need a splat. |
3196 ↗ | (On Diff #146407) | This should actually be foreach Idx = [[0,3],[1,2],[3,0]] shouldn't it? For LE word element 2, VEXTSW2D will sign extend it into LE doubleword element 1 which is where XSCVSDQP needs it to be - so a splat is not needed. LE word element 3 on the other hand will need a splat since the input is in the left half of LE doubleword 1 and it needs to be in the right half. |
LGTM. Feel free to address the minor nit on the commit.
lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
3202 ↗ | (On Diff #146597) | Nit: For consistency, move this up before the loop since that's the order of definitions in the big-endian block above. |