I wasn't sure what the best strategy was for the wave size
difference. I went for an explicit, enforced builtin for each. The
other option would be to just assume wave64, and IRGen the different
mangling + zext. I didn't see an obvious way to check the wave size
where builtins are emitted, and it might be beneficial to force you to
acknowledge the wave size difference? Or it might be an unnecessary
complication.
The behavior is also slightly odd when directly specifying
-target-feature to cc1 for +/- the size (since we have both positive
and negative forms of both sizes), but this is probably unimportant.
We're also still missing a predefined macro for the wave size, which
we probably need.
So the implication here is that wave32 is the preferred choice on newer architectures, and hence the default when available?