So far, we haven't exposed the allocation of whole-wave
registers to regalloc. We hand-picked them for various
whole-wave-mode operations. With a future patch, we
want the allocator to efficiently allocate them rather
than using the custom pre-allocation pass.
Any live-range split of virtual registers involved in
whole-wave operations requires the resulting COPY
introduced by the split to be performed for all
lanes. This isn't implemented in the compiler yet.
This patch identifies all such copies and
manipulates the exec mask around them to enable all
lanes without affecting the value of the exec mask
elsewhere.
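
As a rough illustration of the save/enable/restore idea described above, here is a toy C++ model. It is only a sketch under stated assumptions: `Wave`, `VReg`, `laneCopy`, `wholeWaveCopy`, and the 8-lane wave size are hypothetical names invented for this example, and none of it is the actual pass or an LLVM API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy wave size (assumption for the example; real waves are wider).
constexpr std::size_t kLanes = 8;

// Hypothetical model of a vector register: one value per lane.
using VReg = std::array<int, kLanes>;

// Hypothetical model of wave state: bit i of exec set => lane i is active.
struct Wave {
  std::uint32_t exec;
};

// An ordinary copy only writes the lanes that are active in exec.
void laneCopy(const Wave &w, VReg &dst, const VReg &src) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (w.exec & (1u << i))
      dst[i] = src[i];
}

// A whole-wave copy: save exec, enable every lane, copy, restore exec.
void wholeWaveCopy(Wave &w, VReg &dst, const VReg &src) {
  const std::uint32_t saved = w.exec;
  w.exec = (1u << kLanes) - 1; // all lanes on
  laneCopy(w, dst, src);
  w.exec = saved;              // exec is unchanged for later code
  assert(w.exec == saved);
}
```

With only lanes 0 and 2 active, `laneCopy` leaves the inactive lanes of the destination untouched, while `wholeWaveCopy` transfers every lane and leaves `exec` as it found it.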
I'm still not convinced that this is needed in the -O0 flow.
By now, VGPR allocation is done in the -O0 flow, and we no longer have any virtual registers. This pass acts on virtual registers to determine whether WWM copies need exec manipulation.