Special-case the 1D computation as wave-uniform. Insert a
readfirstlane during codegen prepare to assert that the result is
uniform even if the operand is not. It would be better if the DAG
divergence analysis could maintain the same property after the
division is optimized into shifts.
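To illustrate why such a result can be uniform while its operand is divergent, here is a minimal host-side sketch. It assumes the 1D computation is a workitem ID divided by the wave size, and that lanes of a wave hold contiguous workitem IDs. The names `WaveSize`, `waveIndexDiv`, `waveIndexShift` and `isUniformAcrossWave` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wave size for illustration; real targets use 32 or 64.
constexpr uint32_t WaveSize = 64;
constexpr uint32_t WaveSizeLog2 = 6;

// Wave index written as a division of the (divergent) workitem ID.
constexpr uint32_t waveIndexDiv(uint32_t WorkitemIdX) {
  return WorkitemIdX / WaveSize;
}

// The same value once the division has been turned into a shift.
constexpr uint32_t waveIndexShift(uint32_t WorkitemIdX) {
  return WorkitemIdX >> WaveSizeLog2;
}

// Returns true if every lane of the given wave computes the same index,
// in both the division form and the shift form.
bool isUniformAcrossWave(uint32_t Wave) {
  const uint32_t First = waveIndexDiv(Wave * WaveSize);
  for (uint32_t Lane = 0; Lane < WaveSize; ++Lane) {
    const uint32_t Id = Wave * WaveSize + Lane; // differs per lane
    if (waveIndexDiv(Id) != First || waveIndexShift(Id) != First)
      return false;
  }
  return true;
}
```

The operand differs in every lane, yet both forms produce one value per wave. That is the property a pattern-based check loses sight of once the division has been rewritten as a shift.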
Addresses issue 54010.
There's currently a potential pass ordering issue because m_Mul
doesn't automatically handle commuted operands. The position of the
group ID in the multiply is non-canonical after the lowering for
uniform-work-group-size replaces the operand.
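The operand-order problem can be sketched with a toy model. This is not LLVM code: `Node`, `Kind`, `matchMulOrdered` and `matchMulCommuted` are hypothetical stand-ins that mimic the difference between an ordered matcher such as m_Mul and a commutative one such as m_c_Mul from llvm/IR/PatternMatch.h.

```cpp
#include <cassert>

// Toy operand kinds; these are illustrative, not LLVM value classes.
enum class Kind { GroupId, GroupSize, Mul };

// Toy expression node with up to two operands.
struct Node {
  Kind K;
  const Node *LHS;
  const Node *RHS;
};

// Ordered match: succeeds only if the operands appear as (L, R).
bool matchMulOrdered(const Node &N, Kind L, Kind R) {
  return N.K == Kind::Mul && N.LHS && N.RHS && N.LHS->K == L &&
         N.RHS->K == R;
}

// Commutative match: tries both operand orders.
bool matchMulCommuted(const Node &N, Kind L, Kind R) {
  return matchMulOrdered(N, L, R) || matchMulOrdered(N, R, L);
}
```

With a multiply whose operands are (group size, group ID), the ordered form looking for (group ID, group size) fails, while the commutative form still matches. Whether a commutative matcher is the right fix for this patch is not settled here.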
Can this constant be defined somewhere?