This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Load all vector and FP constants in Select()
ClosedPublic

Authored by jonpa on Feb 14 2019, 6:51 PM.

Details

Reviewers
uweigand
Summary

(Continued from and replacing https://reviews.llvm.org/D58142 and https://reviews.llvm.org/D57926)

Hmm. Actually, I'm now wondering why we need to reject anything in the first place. Can't we improve isFPImmLegal to accept *anything* that can be constructed via any of the vector instructions (VGBM, VGM, VREPI)?

OK - did a new attempt with this patch to follow this broader principle.

So in end, we just need two routines:

  • can a FP immediate or BUILD_VECTOR be loaded?

isVectorConstantLegal()

  • actually load a FP immediate or BUILD_VECTOR

loadVectorConstant()

Since the current code for for VREP / VGM is based on finding the smallest splat, I wanted to reuse BuildVectorSDNode::isConstantSplat(). For an APFloat, it was not simple to build a temporary BVN (DAG pointer not available), but the finding of the splat without any undefined bits was not that much work to implement.

To try VGBM, the int bits are used, and they are either found with isConstantSplat() called with 128 as minimum splat size, or for APFloat, with a conversion to APInt.

Added a new struct type to wrap this called SystemZVectorConstantInfo.

This should handle any and all constants, and all previously removed tests have now be restored.

as a general principle if we have an instruction that can do something, we should be using it, if it's possible without a lot of overhead ...

There actually seems to be a few less VGMGs now on SPEC after all, and it seems that this is because a scalar FP constant can now reuse a vector splat constant of the same value. This is what the new test vec-const-19.ll checks.

Opcode differences on SPEC over all files:
master <> patch

vgmg           :                 3982                 3945      -37
mdbr           :                 6957                 6949       -8
vgbm           :                 3885                 3893       +8 ?
wfmdb          :                19561                19569       +8
  • Could we actually handle FP128 as well with a present FeatureVectorEnhancements1?
  • Still a little unsure about the way to do the actual selection in loadVectorConstant(). It seems that the new node must be in the DAG before it is selected, so ReplaceNode() needs to be called first, and then SelectCode(). The bitcast handling varies depending on if VecVT and VT are the same. The fact that VGBM is a machine node, but ROTATE_MASK and REPLICATE are not also makes for some more handling. I guess one could consider selecting them all as machine nodes directly, and one could do an explicit comparison of VT and VecVT perhaps

Diff Detail

Event Timeline

jonpa created this revision.Feb 14 2019, 6:51 PM

Could we actually handle FP128 as well with a present FeatureVectorEnhancements1?

Sure, why not ...

Still a little unsure about the way to do the actual selection in loadVectorConstant(). It seems that the new node must be in the DAG before it is selected, so ReplaceNode() needs to be called first, and then SelectCode(). The bitcast handling varies depending on if VecVT and VT are the same. The fact that VGBM is a machine node, but ROTATE_MASK and REPLICATE are not also makes for some more handling. I guess one could consider selecting them all as machine nodes directly, and one could do an explicit comparison of VT and VecVT perhaps

Maybe it would be simplest to add back SystemZISD::BYTE_MASK and then just handle everything as ISD nodes? I agree you should only create a BITCAST node if the types are actually different.

Another question: it seems odd to have to call getTargetLowering in ::Select, that seems a bit of a layering violation. Why does ​isVectorConstantLegal have to be a member function of SystemZTargetLowering in the first place? Can't this be just a stand-alone function?

jonpa updated this revision to Diff 187625.Feb 20 2019, 11:05 AM

Added handling for fp128 on z14, with some new tests that checks that this is working in fp-const-11.ll. NFC on spec on z14.

Maybe it would be simplest to add back SystemZISD::BYTE_MASK and then just handle everything as ISD nodes? I agree you should only create a BITCAST node if the types are actually different.

Another question: it seems odd to have to call getTargetLowering in ::Select, that seems a bit of a layering violation. Why does isVectorConstantLegal have to be a member function of SystemZTargetLowering in the first place? Can't this be just a stand-alone function?

I followed those suggestions and it indeed looks better now, thanks.

This looks generally good to me. Some additional options to possibly make the code simpler occurred to me:

  • This looks a bit repetitve to me:
if (VCI.Opcode == SystemZISD::BYTE_MASK)
  Op = CurDAG->getNode(SystemZISD::BYTE_MASK, DL, VCI.VecVT,
                       CurDAG->getConstant(VCI.Mask, DL, MVT::i32));
else if (VCI.Opcode == SystemZISD::REPLICATE)
  Op = CurDAG->getNode(SystemZISD::REPLICATE, DL, VCI.VecVT,

Would it make sense to instead store in VCI just the Opcode and an Operands array? And then just create a DAG node with that Opcode and that list of (i32) operands, without checking specifically which Opcode it is?

  • Would it make sense to not just store FBImm or BVN in the VCI, but instead perform the equivalent of getIntBits() and getSplat() in the two constructors and store the result values of those in the VCI instead? This would eliminate those functions (which are completely different for FBImm and BVN anyway) and maybe just make the logic simpler overall.
lib/Target/SystemZ/SystemZISelDAGToDAG.cpp
1149

We're now so late that I don't think we need the isOpaque flag any more.

lib/Target/SystemZ/SystemZISelLowering.h
676

Do we really need to convert to APInt here just to determine the type?

683

This function (and possibly the type?) should probably be at least in the lvm::SystemZ:: namespace, to avoid polluting the common llvm:: namespace. Or else (maybe better?) make it a member function of SystemZVectorConstantInfo?

jonpa updated this revision to Diff 188063.Feb 23 2019, 3:46 PM
jonpa marked 6 inline comments as done.

Patch updated per review, thanks.

Would it make sense to instead store in VCI just the Opcode and an Operands array? And then just create a DAG node with that Opcode and that list of (i32) operands, without checking specifically which Opcode it is?

Yes, that seems to simplify things. I however am somewhat wary to create DAG nodes during all the queries during legalization, so I instead first make a vector of unsigned and then make the actual DAG operands in loadVectorConstant().

Would it make sense to not just store FBImm or BVN in the VCI, but instead perform the equivalent of getIntBits() and getSplat() in the two constructors and store the result values of those in the VCI instead? This would eliminate those functions (which are completely different for FBImm and BVN anyway) and maybe just make the logic simpler overall.

That seems all good to me, except I am not sure if we would like to avoid computing the splat in the cases were it is not needed (BYTE_MASK case). If that was not a concern, it would probably be better to just have the constructors find the various data members common for both cases and then forget if it was a BVN or ConstantFP.

lib/Target/SystemZ/SystemZISelDAGToDAG.cpp
1149

right

lib/Target/SystemZ/SystemZISelLowering.h
676

Seems that we could do either

(&FPImm.getSemantics() == &APFloat::IEEEquad())

or

APFloat::getSizeInBits(FP.getSemantics()) > 64
683

Seems to work fine to make isVectorConstantLegal() a member function. I am guessing the rest is fine, since SystemZ is part of the name, or?

Yes, that seems to simplify things. I however am somewhat wary to create DAG nodes during all the queries during legalization, so I instead first make a vector of unsigned and then make the actual DAG operands in loadVectorConstant().

Yes, that's what I meant. This is looking good to me now ...

That seems all good to me, except I am not sure if we would like to avoid computing the splat in the cases were it is not needed (BYTE_MASK case). If that was not a concern, it would probably be better to just have the constructors find the various data members common for both cases and then forget if it was a BVN or ConstantFP.

I don't think this will make much of a difference compile-time wise, so I'd prefer to go with the version where the code looks simpler ...

jonpa updated this revision to Diff 188220.Feb 25 2019, 10:51 AM

I don't think this will make much of a difference compile-time wise, so I'd prefer to go with the version where the code looks simpler ...

OK. It does seem to simplify things to just do the work in the constructors - patch updated.

uweigand accepted this revision.Feb 26 2019, 1:23 AM

OK, this version LGTM. Thanks!

This revision is now accepted and ready to land.Feb 26 2019, 1:23 AM
jonpa closed this revision.Feb 26 2019, 8:51 AM

Thanks for help and review!

r354896