In situations like the example below, GlobalISel produces an unnecessary fmov from wzr.

define void @test(float* %ptr, float %x) {
  %cmp = fcmp une float undef, 0.000000e+00
  br i1 %cmp, label %a, label %b
a:
  %gep = getelementptr inbounds float, float* %ptr, i32 1
  store float 0.000000e+00, float* %gep, align 4
  ret void
b:
  ret void
}

In a lot of cases, matchFConstantToConstant in the pre-legalizer combiner handles this sort of thing.
However, that combine only works when all users of a G_FCONSTANT are stores. In this case, we have a G_FCMP which also uses the G_FCONSTANT.
This patch adds a post-legalization (lowering) combine which looks for G_STOREs that store a positive zero (+0.0). If such a store is found, and we know it can use wzr/xzr, then it is updated to store a G_CONSTANT 0 instead.
During selection, the G_CONSTANT 0 and the store are selected to a single store of wzr/xzr.
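
The restriction to positive zero follows from the bit patterns: +0.0 is all zero bits in IEEE-754, so storing it writes the same bytes as storing integer 0, while -0.0 has the sign bit set. The standalone C++ sketch below illustrates this for 32-bit floats. It is not code from this patch, and the helper names floatBits and storesSameBytesAsIntegerZero are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a 32-bit float.
static std::uint32_t floatBits(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Hypothetical helper: true when storing `f` writes exactly the same bytes
// as storing the integer 0, i.e. when a zero-register store is equivalent.
static bool storesSameBytesAsIntegerZero(float f) {
  return floatBits(f) == 0u;
}
```

Under this check, 0.0f qualifies, but -0.0f does not (its bit pattern is 0x80000000), which is why a store of negative zero cannot be rewritten this way.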
This is a 0.1% code size improvement at -Os for CTMark/mafft/pairlocalalign. There are minor code size improvements in other CTMark tests as well.