This is an archive of the discontinued LLVM Phabricator instance.

[Power9]Legalize and emit code for W vector extract and convert to Quad-Precision
ClosedPublic

Authored by lei on May 7 2018, 10:59 AM.

Diff Detail

Event Timeline

lei created this revision.May 7 2018, 10:59 AM
lei updated this revision to Diff 146270.May 10 2018, 7:21 PM
nemanjai requested changes to this revision.May 11 2018, 4:30 AM

I think we should review the code sequences for signed conversions and make more consistent use of loops.

lib/Target/PowerPC/PPCInstrVSX.td
3164–3165

This probably needs let Predicates = [IsBigEndian, HasP9Vector] right?

3167

Is this sequence actually correct? We convert a vector of 4 4-byte integers into a vector of 2 8-byte double precision floating point values. Then we treat it as a signed 8-byte integer and convert it to a 16-byte floating point value. Shouldn't the outer instruction be xscvdpqp?

In any case, vextsw2d -> xscvsdqp is a much lower latency sequence than this. Why not use that?

3175

Is there no mul function in TableGen? i.e. can't we just write !mul(Idx, 4)?

3194–3216

Same note regarding the predicate.

3196

To be consistent, I think you should write these as a neat for-loop as you did above. The element would be Idx and the splat index would be !sub(3, Idx). Wouldn't that work?

3207

Same thing here, a for-loop would be nicer and more consistent.

This revision now requires changes to proceed.May 11 2018, 4:30 AM
lei added inline comments.May 11 2018, 6:55 AM
lib/Target/PowerPC/PPCInstrVSX.td
3164–3165

This is within a Predicates = [HasP9Vector] section so is not needed here.

3167

I guess there is no need for us to convert to double precision here. Will update.

3175

There is only !add(a,b,...)

3194–3216

this is within a Predicates = [HasP9Vector] code section.

3196

Unfortunately there is no !sub() operator.

3207

I agree. I just couldn't find a way to do it with just !add()

lei added inline comments.May 11 2018, 8:22 AM
lib/Target/PowerPC/PPCInstrVSX.td
3164–3165

will add!

3194–3216

will add!

lei updated this revision to Diff 146364.May 11 2018, 11:10 AM
lei marked 2 inline comments as done.

This should address all review comments.

lei updated this revision to Diff 146407.May 11 2018, 1:46 PM
lei marked an inline comment as done and an inline comment as not done.

Update to use foreach loop for LE patterns.

nemanjai added inline comments.May 14 2018, 5:37 AM
lib/Target/PowerPC/PPCInstrVSX.td
3167

It is actually word 1 that doesn't need the splat. Word 0 does need a splat.

3196

This should actually be foreach Idx = [[0,3],[1,2],[3,0]] shouldn't it? For LE word element 2, VEXTSW2D will sign extend it into LE doubleword element 1 which is where XSCVSDQP needs it to be - so a splat is not needed. LE word element 3 on the other hand will need a splat since the input is in the left half of LE doubleword 1 and it needs to be in the right half.

lei updated this revision to Diff 146597.May 14 2018, 6:59 AM

Address review comments

lei marked 3 inline comments as done.May 14 2018, 6:59 AM
lei added inline comments.
lib/Target/PowerPC/PPCInstrVSX.td
3167

You are right!

3196

Yes. This is true... I forgot to check this when I switched xvcvsxwdp for vextsw2d.

nemanjai accepted this revision.May 18 2018, 6:41 AM

LGTM. Feel free to address the minor nit on the commit.

lib/Target/PowerPC/PPCInstrVSX.td
3202

Nit: For consistency, move this up before the loop since that's the order of definitions in the big-endian block above.

This revision is now accepted and ready to land.May 18 2018, 6:41 AM
This revision was automatically updated to reflect the committed changes.
lei marked 2 inline comments as done.