This is an archive of the discontinued LLVM Phabricator instance.

[Power9]Legalize and emit code for W vector extract and convert to Quad-Precision
ClosedPublic

Authored by lei on May 7 2018, 10:59 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

lei created this revision.May 7 2018, 10:59 AM
lei updated this revision to Diff 146270.May 10 2018, 7:21 PM
nemanjai requested changes to this revision.May 11 2018, 4:30 AM

I think we should review the code sequences for signed conversions and make more consistent use of loops.

lib/Target/PowerPC/PPCInstrVSX.td
3165 ↗(On Diff #146270)

This probably needs let Predicates = [IsBigEndian, HasP9Vector] right?

3168 ↗(On Diff #146270)

Is this sequence actually correct? We convert a vector of 4 4-byte integers into a vector of 2 8-byte double precision floating point values. Then we treat it as a signed 8-byte integer and convert it to a 16-byte floating point value. Shouldn't the outer instruction be xscvdpqp?

In any case, vextsw2d -> xscvsdqp is a much lower latency sequence than this. Why not use that?

3176 ↗(On Diff #146270)

Is there no mul function in TableGen? i.e. can't we just write !mul(Idx, 4)?

3194 ↗(On Diff #146270)

Same note regarding the predicate.

3196 ↗(On Diff #146270)

To be consistent, I think you should write these as a neat for-loop as you did above. The element would be Idx and the splat index would be !sub(3, Idx). Wouldn't that work?

3207 ↗(On Diff #146270)

Same thing here, a for-loop would be nicer and more consistent.

This revision now requires changes to proceed.May 11 2018, 4:30 AM
lei added inline comments.May 11 2018, 6:55 AM
lib/Target/PowerPC/PPCInstrVSX.td
3165 ↗(On Diff #146270)

This is within a Predicates = [HasP9Vector] section so is not needed here.

3168 ↗(On Diff #146270)

I guess there is no need for us to convert to double precision here. Will update.

3176 ↗(On Diff #146270)

There is only !add(a,b,...)

3194 ↗(On Diff #146270)

this is within a Predicates = [HasP9Vector] code section.

3196 ↗(On Diff #146270)

Unfortunately there is no !sub() operator.

3207 ↗(On Diff #146270)

I agree. I just couldn't find a way to do it with just !add()

lei added inline comments.May 11 2018, 8:22 AM
lib/Target/PowerPC/PPCInstrVSX.td
3165 ↗(On Diff #146270)

will add!

3194 ↗(On Diff #146270)

will add!

lei updated this revision to Diff 146364.May 11 2018, 11:10 AM
lei marked 2 inline comments as done.

This should address all review comments.

lei updated this revision to Diff 146407.May 11 2018, 1:46 PM
lei marked an inline comment as done and an inline comment as not done.

Update to use foreach loop for LE patterns.

nemanjai added inline comments.May 14 2018, 5:37 AM
lib/Target/PowerPC/PPCInstrVSX.td
3167 ↗(On Diff #146407)

It is actually word 1 that doesn't need the splat. Word 0 does need a splat.

3196 ↗(On Diff #146407)

This should actually be foreach Idx = [[0,3],[1,2],[3,0]] shouldn't it? For LE word element 2, VEXTSW2D will sign extend it into LE doubleword element 1 which is where XSCVSDQP needs it to be - so a splat is not needed. LE word element 3 on the other hand will need a splat since the input is in the left half of LE doubleword 1 and it needs to be in the right half.

lei updated this revision to Diff 146597.May 14 2018, 6:59 AM

Address review comments

lei marked 3 inline comments as done.May 14 2018, 6:59 AM
lei added inline comments.
lib/Target/PowerPC/PPCInstrVSX.td
3167 ↗(On Diff #146407)

You are right!

3196 ↗(On Diff #146407)

Yes. This is true... I forgot to check this when I switched xvcvsxwdp for vextsw2d.

nemanjai accepted this revision.May 18 2018, 6:41 AM

LGTM. Feel free to address the minor nit on the commit.

lib/Target/PowerPC/PPCInstrVSX.td
3202 ↗(On Diff #146597)

Nit: For consistency, move this up before the loop since that's the order of definitions in the big-endian block above.

This revision is now accepted and ready to land.May 18 2018, 6:41 AM
This revision was automatically updated to reflect the committed changes.
lei marked 2 inline comments as done.