Enable by default post-legalization store merging to non-X86 machines for all targets to allow merging stores from lowered intrinsics / calls.
Pre-legalization store merging cannot yet be disabled as nodes with custom lowering may be lowered during legalization obscuring some merge candidates.
Can you point to an example testcase where the run before legalization matters?