Darwin platforms for both AArch64 and X86 can provide optimized bzero() routines. In this case, it may be preferable to use bzero in place of a memset of 0.
This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can be generated by platforms which may want to use bzero.
To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The conditions for this are largely a port of the bzero case in AArch64SelectionDAGInfo::EmitTargetCodeForMemset.
The only difference in comparison to the SelectionDAG code is that, when compiling for minsize, this will fire for all memsets of 0. The original code notes that it's not beneficial to do this for small memsets; however, using bzero here will save a mov from wzr. For minsize, I think that it's preferable to prioritise omitting the mov.
This also fixes a bug in the libcall legalization code which would delete instructions which could not be legalized. It also adds a check to make sure that we actually get a libcall name.
Code size improvements (Darwin):
- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)