(Continued from and replacing https://reviews.llvm.org/D58142 and https://reviews.llvm.org/D57926)
Hmm. Actually, I'm now wondering why we need to reject anything in the first place. Can't we improve isFPImmLegal to accept *anything* that can be constructed via any of the vector instructions (VGBM, VGM, VREPI)?
OK - did a new attempt with this patch to follow this broader principle.
So in the end, we just need two routines:
- can a FP immediate or BUILD_VECTOR be loaded?
isVectorConstantLegal()
- actually load a FP immediate or BUILD_VECTOR
loadVectorConstant()
Since the current code for VREP / VGM is based on finding the smallest splat, I wanted to reuse BuildVectorSDNode::isConstantSplat(). For an APFloat, it was not simple to build a temporary BVN (the DAG pointer is not available), but finding the splat without any undefined bits was not much work to implement.
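For illustration only, here is a minimal standalone sketch of the "smallest splat" idea for a fully defined 128-bit constant. It is not the code in the patch: the function name smallestSplatBits and the plain byte-array representation are assumptions made for this example, and it does not use APInt, APFloat or BuildVectorSDNode.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): returns the smallest element
// width in bits (8, 16, 32, 64 or 128) such that the 16-byte constant is
// that element repeated across the whole vector. All bits are assumed to
// be defined, i.e. there are no undef elements to skip over.
unsigned smallestSplatBits(const std::array<std::uint8_t, 16> &Bytes) {
  for (std::size_t WidthBytes = 1; WidthBytes < 16; WidthBytes *= 2) {
    bool Repeats = true;
    // The constant is a splat of WidthBytes if every byte equals the byte
    // one element earlier.
    for (std::size_t I = WidthBytes; I < 16 && Repeats; ++I)
      Repeats = Bytes[I] == Bytes[I - WidthBytes];
    if (Repeats)
      return static_cast<unsigned>(WidthBytes * 8);
  }
  return 128; // No smaller repeating element: the full vector is the "splat".
}
```

With this sketch, the bit pattern of a double repeated twice is reported as a 64-bit splat, and an all-zero vector as an 8-bit splat.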
To try VGBM, the int bits are used, and they are either found with isConstantSplat() called with 128 as minimum splat size, or for APFloat, with a conversion to APInt.
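As a rough sketch of the VGBM check on the full 128 bits: the constant qualifies when every byte is either 0x00 or 0xff, and the 16-bit immediate then has one bit per byte. The name byteMaskFor, the byte-array input, and the convention that byte 0 is the leftmost byte and maps to the most significant mask bit are assumptions for this example, not taken from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): returns true and sets Mask if
// every byte of the 128-bit constant is 0x00 or 0xff. Byte 0 is assumed to
// be the leftmost byte and to correspond to the most significant bit of the
// 16-bit mask.
bool byteMaskFor(const std::array<std::uint8_t, 16> &Bytes,
                 std::uint16_t &Mask) {
  std::uint16_t Result = 0;
  for (std::size_t I = 0; I < 16; ++I) {
    if (Bytes[I] == 0xff)
      Result |= static_cast<std::uint16_t>(1u << (15 - I));
    else if (Bytes[I] != 0x00)
      return false; // A mixed byte cannot be produced by a byte mask.
  }
  Mask = Result;
  return true;
}
```

Mask is only written on success, so a caller can try this check first and fall back to the splat-based forms when it returns false.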
Added a new struct type to wrap this called SystemZVectorConstantInfo.
This should handle any and all constants, and all previously removed tests have now been restored.
As a general principle, if we have an instruction that can do something, we should be using it, if it's possible without a lot of overhead ...
There actually seem to be slightly fewer VGMGs now on SPEC after all, and it seems that this is because a scalar FP constant can now reuse a vector splat constant of the same value. This is what the new test vec-const-19.ll checks.
Opcode differences on SPEC over all files:
opcode   master   patch   diff
vgmg :     3982    3945    -37
mdbr :     6957    6949     -8
vgbm :     3885    3893     +8 ?
wfmdb :   19561   19569     +8
- Could we actually handle FP128 as well when FeatureVectorEnhancements1 is present?
- Still a little unsure about the way to do the actual selection in loadVectorConstant(). It seems that the new node must be in the DAG before it is selected, so ReplaceNode() needs to be called first, and then SelectCode(). The bitcast handling varies depending on whether VecVT and VT are the same. The fact that VGBM is a machine node, but ROTATE_MASK and REPLICATE are not, also requires some extra handling. I guess one could consider selecting them all as machine nodes directly, and one could perhaps do an explicit comparison of VT and VecVT.
We're now so late that I don't think we need the isOpaque flag any more.