In situations like the example below, GlobalISel produces an unnecessary fmov from wzr.

  define void @test(float* %ptr, float %x) {
    %cmp = fcmp une float undef, 0.000000e+00
    br i1 %cmp, label %a, label %b

  a:
    %gep = getelementptr inbounds float, float* %ptr, i32 1
    store float 0.000000e+00, float* %gep, align 4
    ret void

  b:
    ret void
  }

In a lot of cases, matchFConstantToConstant in the pre-legalizer combiner handles this sort of thing.
However, that combine only works when all users of a G_FCONSTANT are stores. In this case, we have a G_FCMP which also uses the G_FCONSTANT.
This patch adds a post-legalization (lowering) combine which looks for G_STOREs that store a positive zero (+0.0). If such a store is found, and we know it can use wzr/xzr, then it's updated to store a G_CONSTANT 0 instead.
During selection, the constant 0 + store will be selected to a store of wzr/xzr.
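
To illustrate why the combine is restricted to positive zero, here is a minimal standalone sketch. It is not the code in this patch and does not use any LLVM API; the helper names (bitsOf, canStoreAsIntegerZero, selfCheck) are hypothetical. It assumes 32-bit IEEE-754 floats. Only +0.0 has an all-zero bit pattern, so only that value can be replaced by an integer zero store without changing the bytes written to memory.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the raw bit pattern of a 32-bit float.
static uint32_t bitsOf(float F) {
  static_assert(sizeof(uint32_t) == sizeof(float), "expected a 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Hypothetical predicate: a float store can be rewritten as an integer zero
// store only if every bit of the stored value is zero, i.e. the value is +0.0.
static bool canStoreAsIntegerZero(float F) { return bitsOf(F) == 0; }

// Usage example: +0.0 qualifies, while -0.0 (sign bit set) and 1.0 do not.
static void selfCheck() {
  assert(canStoreAsIntegerZero(0.0f));
  assert(!canStoreAsIntegerZero(-0.0f));
  assert(!canStoreAsIntegerZero(1.0f));
}
```

Storing -0.0 through wzr/xzr would drop the sign bit, which is why a check for zero alone would not be enough.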
This is a 0.1% code size improvement at -Os for CTMark/mafft/pairlocalalign. There are minor code size improvements in other CTMark tests as well.