We were missing this fold in the DAG; I've copied it directly from llvm::ConstantFoldCastInstruction.
I've had to tweak the SystemZ reduced test case to prevent it from folding away, but the X86 test cases are all examples of the extra scalar conversions reported in PR39205.