Currently, a move of constant zero is lowered into two MIR instructions after instruction selection:
```
v1 = copy wzr/xzr
v2 = copy v1
```
These two copies are coalesced in a later pass.
One problem with this arises in the Machine-Sink pass, which runs before the copy propagation pass. Machine-Sink can break a critical edge if at least two cheap MIR instructions can be sunk along that path. Thus, we may end up with an MBB that contains only a single mov wzr/xzr instruction, which makes it difficult for block placement to produce a good layout. For example, the test case below, copy-zero-reg.ll, has a loop unrolled by two. Sinking the mov wzr/xzr makes it impossible to find a fallthrough for every MBB, and the currently generated code has a block that looks like this:
```
// BB#1:
mov w9, wzr
cbnz w8, .LBB0_5
b .LBB0_6
```
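The edge-splitting condition described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Machine-Sink implementation: `Edge`, `EdgeSplitHeuristic`, and `worthSplittingForCheapInstr` are hypothetical names, and blocks are represented by plain integers. The idea is that the first cheap instruction that wants to sink along a critical edge only records the edge, and a second cheap instruction targeting the same edge makes the split look worthwhile.

```cpp
#include <set>
#include <utility>

// Hypothetical stand-in for a critical edge, identified by the numbers of
// its source and destination blocks. Not an LLVM type.
using Edge = std::pair<int, int>;

// Toy model of the "at least two cheap instructions" rule described above.
class EdgeSplitHeuristic {
  // Edges that some cheap instruction has already asked to sink along.
  std::set<Edge> candidates_;

public:
  // Called once per cheap instruction that could sink along from -> to.
  // Returns false for the first request on an edge (the edge is only
  // recorded) and true for every later request on the same edge.
  bool worthSplittingForCheapInstr(int from, int to) {
    // insert().second is false when the edge was already in the set.
    return !candidates_.insert(Edge{from, to}).second;
  }
};
```

Under this model, the two copies produced for a zero move could together satisfy the condition for one edge, which would be consistent with the split block containing only the mov wzr/xzr shown above.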
The performance impact of this patch is shown below:
|spec2000/vpr| +1.4% |
|spec2006/libquantum| +4.9% |
|spec2006/perlbench| +1.2% |
|spec2017/blender| +3.5% |
|spec2017/deepsjeng| -1.1% |