Page MenuHomePhabricator

[MVE] [ARM] Select VQABS

Authored by anwel on Nov 13 2019, 6:56 AM.



Adds a pattern to to use a VQABS instruction if an equivalent multi-instruction construct is found.

Diff Detail

Event Timeline

anwel created this revision.Nov 13 2019, 6:56 AM
SjoerdMeijer added inline comments.Nov 13 2019, 7:13 AM

Should this be HasMVEFloat?

simon_tatham added inline comments.Nov 13 2019, 7:22 AM

No – VQABS is an integer instruction. The point is that it takes the absolute value of a signed integer, and gives you back something that fits in the same signed integer type, which means it has to map the largest negative value (say -128) to one less than its true absolute value (e.g. +127) or else it still ends up negative.


These magic numbers like 3712 and 2688 and so on could do with a comment explaining what they represent. From context it looks as if they're some kind of non-literal immediate encoding used by special node types like ARMvcmpz, ARMvmovImm and so on – but what real numbers do they represent?


This integer 3711 is different from the 3712 two lines above it. But in the other two patterns, the two corresponding numbers are identical (two copies of 2688, and two of 1664). If that's deliberate, could you add a comment saying why?


Since all three of these patterns look very similar, it ought to be possible to fold them all up into a class or multiclass. If you pass an MVEVectorVTInfo as one of the template parameters to the class, you should be able to extract both the vector type and the predicate type that goes with it (e.g. MVE_v16i8.Vec is v16i8, and MVE_v16i8.Pred is v16i1).

You might have to pass those magic integer constants as extra template parameters, though.

SjoerdMeijer added inline comments.Nov 13 2019, 7:31 AM

Ah yes, but think I got confused by the


in the pseudo-code of the instruction. And not that it really matters, but I guess that means the test just needs -mattr=+mve instead of -mattr=+mve.fp

anwel updated this revision to Diff 229780.Nov 18 2019, 2:51 AM

Wrapped vqabs pattern into a multiclass as suggested by @simon_tatham

anwel marked 10 inline comments as done.Nov 18 2019, 3:01 AM
anwel added inline comments.

Thanks for making me aware of that, I removed the superfluous .fp


A very good suggestion as this file needs more explanatory comments in it, I've added some now.


The difference emerges because the s16 and s32 variants used once ARMvmovImm ... to represent INT_MIN and once ARMvmvnImm ... to represent INT_MAX (by negating INT_MIN), while s8 uses ARMvmovImm ... both times and thus in the used encoding has to decrease the magic number for INT_MIN by one to represent INT_MAX.
In the new version of the patch this should be hopefully be more clear.


Just updated the patch to do exactly this, it indeed looks much nicer.

Thanks, this does look much nicer!

Now I can read it more easily, I can see that what you're actually matching is effectively the vectorization of a pair of nested ternary-expressions, so that each lane is being independently transformed via the function

x > 0 ? x 
      : (x == INT_MIN ? INT_MAX
                      : -x)

which it's reasonably clear does implement the same function as the vqabs instruction.

I only have one small nitpick left.


This (i32 12), and the (i32 0) in the ARMvcmp a couple of lines below, are the only remaining magic numbers that have no explanation here. I think they deserve a comment explaining them: 12 and 0 are respectively the Arm architecture's condition-code encodings for GT and EQ, so this ARMvcmpz is testing if each lane is greater than zero, and the ARMvcmp below is testing if each lane is equal to (the corresponding lane of) int_min.

anwel updated this revision to Diff 230063.Nov 19 2019, 6:40 AM
anwel marked 4 inline comments as done.

Added a comment that depicts what expression the tree pattern matches.

anwel marked 2 inline comments as done.Nov 19 2019, 6:43 AM
anwel added inline comments.

Thanks for reminding me, I decided to just add a comment that explains what expression the tree structure will match - hopefully that should make it clear what happens.

simon_tatham accepted this revision.Nov 19 2019, 6:50 AM

Good idea – the extra comment definitely helps!

This revision is now accepted and ready to land.Nov 19 2019, 6:50 AM
anwel closed this revision.Nov 20 2019, 6:05 AM
anwel marked an inline comment as done.

Pushed this as commit