The lowerBUILD_VECTOR() handling of constant splats has been refactored into two methods: analyzeBVNForConstantReplication() is used both during lowering and in Select() to analyze the BVNs, and SystemZDAGToDAGISel::tryReplicateConstantSplat() then performs the actual instruction selection.
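To illustrate the kind of analysis involved, here is a minimal standalone sketch of finding the narrowest element width at which a 16-byte constant is a splat. It is not the code from this patch: the helper name smallestSplatBytes() and the byte-array representation are assumptions made for the example, and the real analysis works on BUILD_VECTOR nodes rather than raw bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not taken from the patch: returns the smallest element
// width in bytes (1, 2, 4 or 8) at which the 16-byte constant is one element
// repeated, or 0 if it is not a splat at any of those widths.
static unsigned smallestSplatBytes(const std::array<uint8_t, 16> &Bytes) {
  for (unsigned Width = 1; Width <= 8; Width *= 2) {
    bool IsSplat = true;
    // A constant has period Width if every byte equals the byte Width earlier.
    for (std::size_t I = Width; I < Bytes.size() && IsSplat; ++I)
      IsSplat = Bytes[I] == Bytes[I - Width];
    if (IsSplat)
      return Width;
  }
  return 0;
}
```

For example, two copies of the big-endian encoding of the double 1.0 (0x3FF0000000000000) are a splat only at the 8-byte width, while sixteen identical bytes are already a splat at the 1-byte width.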
This is a continuation of the handling of constant BVNs during legalization, with the aim of exposing more constant vectors to Combine2. The same problem with FP constants that was seen with VGBM persists, and this time it seems to have more impact on SPEC.
I see

               spec-llvm_A_master/   spec-llvm
vrepg          :      3279      3621     +342
vgmg           :       344        25     -319
larl           :    153427    153664     +237
ldeb           :      8511      8728     +217
vl             :     22873     23025     +152
vst            :     24059     24131      +72
vlrepf         :       688       709      +21
vmrhg          :      1154      1170      +16
vmrhf          :       728       740      +12
vgbm           :      3887      3898      +11
...
Spill|Reload   :    189582    189793     +211
There are many more larls, which should be due to many more ConstantFP vectors loaded from the constant pool. It seems these are the FP splats present in SPEC:
206 BUILD_VECTOR ConstantFP:f64<1.000000e+00>, ConstantFP:f64<1.000000e+00>
 96 BUILD_VECTOR ConstantFP:f64<5.000000e-01>, ConstantFP:f64<5.000000e-01>
 17 BUILD_VECTOR ConstantFP:f64<2.000000e+00>, ConstantFP:f64<2.000000e+00>
 12 BUILD_VECTOR ConstantFP:f64<-2.000000e+00>, ConstantFP:f64<-2.000000e+00>
  8 BUILD_VECTOR undef:f64, ConstantFP:f64<2.000000e+00>
  8 BUILD_VECTOR ConstantFP:f32<nan>, ConstantFP:f32<0.000000e+00>, ConstantFP:f32<nan>, ConstantFP:f32<0.000000e+00>
  4 BUILD_VECTOR ConstantFP:f64<1.250000e-01>, ConstantFP:f64<1.250000e-01>
  4 BUILD_VECTOR ConstantFP:f32<1.000000e+00>, ConstantFP:f32<1.000000e+00>, ConstantFP:f32<1.000000e+00>, ConstantFP:f32<1.000000e+00>
  4 BUILD_VECTOR ConstantFP:f32<1.000000e+00>, ConstantFP:f32<0.000000e+00>, ConstantFP:f32<1.000000e+00>, ConstantFP:f32<0.000000e+00>
  3 BUILD_VECTOR ConstantFP:f32<0.000000e+00>, ConstantFP:f32<1.000000e+00>, ConstantFP:f32<0.000000e+00>, ConstantFP:f32<1.000000e+00>
  2 BUILD_VECTOR ConstantFP:f32<nan>, ConstantFP:f32<nan>, ConstantFP:f32<nan>, ConstantFP:f32<nan>
  1 BUILD_VECTOR ConstantFP:f64<INF>, ConstantFP:f64<INF>
  1 BUILD_VECTOR ConstantFP:f64<7.812500e-03>, ConstantFP:f64<7.812500e-03>
  1 BUILD_VECTOR ConstantFP:f32<5.000000e-01>, ConstantFP:f32<5.000000e-01>, ConstantFP:f32<5.000000e-01>, ConstantFP:f32<5.000000e-01>
  1 BUILD_VECTOR ConstantFP:f32<0.000000e+00>, ConstantFP:f32<5.000000e-01>, ConstantFP:f32<0.000000e+00>, ConstantFP:f32<5.000000e-01>
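As a side note on why these particular values matter: the f64 constants in the list above are powers of two (or their negation, or infinity), so their IEEE-754 bit patterns consist of a single contiguous run of one bits. That is the shape a generate-mask instruction can materialize without a constant pool load, which may be what the drop in vgmg reflects. The standalone sketch below checks this for a few of the values. The helper names bitsOf(), isContiguousRun() and fitsSImm16() are hypothetical and not part of the patch, and wrap-around masks are deliberately not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a double as its 64-bit IEEE-754 pattern.
static uint64_t bitsOf(double D) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit double");
  uint64_t V;
  std::memcpy(&V, &D, sizeof(V));
  return V;
}

// Hypothetical helper: true if V is one contiguous run of 1 bits
// (wrap-around runs are not considered in this sketch).
static bool isContiguousRun(uint64_t V) {
  if (V == 0)
    return false;
  uint64_t Filled = V | (V - 1);       // set all bits below the run
  return (Filled & (Filled + 1)) == 0; // Filled must look like 0...01...1
}

// Hypothetical helper: true if V, read as a signed 64-bit value, fits in a
// signed 16-bit immediate (the replicate-immediate style of encoding).
static bool fitsSImm16(uint64_t V) {
  int64_t S = static_cast<int64_t>(V);
  return S >= -32768 && S <= 32767;
}
```

With these helpers, isContiguousRun(bitsOf(1.0)) and isContiguousRun(bitsOf(-2.0)) both hold, while fitsSImm16(bitsOf(1.0)) does not. A value such as 3.0 has two separate runs of ones and fails the contiguity check.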
Not sure what to do next - does this mean we should consider again improving the handling of ConstantFP nodes in the backend, or should we abandon this?
Tests with FP splats that are no longer supported have been deleted.
Note: tryReplicateConstantSplat() first calls SelectCode() on the bitcast and then on the REPLICATE / ROTATE_MASK. Not entirely sure if that's wise, or if getMachineNode() should be called instead.