A few days ago I described this issue on llvm-commits, along with a patch:
Hi,
I found that on SystemZ, for v2f32, four and not two scalar operations are emitted. This is because the v2f32 type is widened, which is good for memory-only operations, for instance. There is however no fp32 vector support on z13, so these operations will always be scalarized. If this is done after type legalization, four and not two operations are produced, which is particularly bad for an fp32 divide inside a vectorized loop.
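To make the four-versus-two count concrete, here is a toy model in C++. It is not LLVM code: the function names are hypothetical, and the legal width of 4 lanes assumes that v2f32 is widened to v4f32, which matches the four operations described above.

```cpp
#include <cstddef>

// Toy model, not LLVM code: counts the scalar operations emitted for a
// vector operation when the target has no vector support for the element
// type, so every lane ends up as one scalar operation.

// Stand-in for type legalization widening a too-narrow vector to the legal
// lane count (assumed to be 4 for f32 in this sketch).
constexpr std::size_t widenedLanes(std::size_t NumElts,
                                   std::size_t LegalLanes) {
  return NumElts < LegalLanes ? LegalLanes : NumElts;
}

// Scalarizing after widening: one scalar op per lane of the widened type,
// including the padding lanes that carry no useful data.
constexpr std::size_t opsIfUnrolledAfterWidening(std::size_t NumElts,
                                                 std::size_t LegalLanes) {
  return widenedLanes(NumElts, LegalLanes);
}

// Unrolling before type legalization: one scalar op per original element.
constexpr std::size_t opsIfUnrolledBeforeWidening(std::size_t NumElts) {
  return NumElts;
}
```

Under these assumptions, opsIfUnrolledAfterWidening(2, 4) evaluates to 4, while opsIfUnrolledBeforeWidening(2) evaluates to 2.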
In order to fix this, my patch unrolls these operations before type legalization (with a target DAG combine). They must also have the operation action of 'Expand', since otherwise other DAGCombiner methods may re-vectorize them. This happened in reduceBuildVecConvertToConvertBuildVec(), where I also needed to add a call to TLI.isOperationLegalOrCustom(Opcode, VT) to check the *result* type, which in this case would be v2f32. It would not work to mark all the operand VTs of SINT_TO_FP as 'Expand', because this is only true for the v2f32 result case.
I do get some failing regression tests, which I am not sure about:
Failing Tests (3):
    LLVM :: CodeGen/ARM/vdup.ll
    LLVM :: CodeGen/X86/2009-02-26-MachineLICMBug.ll
    LLVM :: CodeGen/X86/cvtv2f32.ll
Is it ok to add the check for the result type per what I did?
/Jonas
In short, this is something I did for the SystemZ target, but because I had to add an extra check in reduceBuildVecConvertToConvertBuildVec(), three tests failed.
I have now tried to regenerate the tests, but only one of them was successfully regenerated. I include that test diff, plus the two output diffs of the two remaining test cases.
I hope that either the tests have changed as expected (which I can't tell for sure myself), or that someone can give me a pointer on how to change the patch.
Thanks
/Jonas
This looks worse.