Supposedly clauses can be formed up until vmcnt/lgkmcnt are
saturated.
The limit for vmcnt on gfx9 is fairly large (a 6-bit counter), so
this requires looking ahead 64 instructions. I don't know how much
of an impact this has on compile time. Reducing the lookahead
distance should be conservatively correct, at the cost of extra
nops inserted into unreasonably large clauses.
Another issue is that this doesn't account for instructions that
increase the counter by more than 1, but this should be
conservatively correct.
gfx10 has 6 bits for LGKMCNT.
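
As a rough sketch of where the 64-instruction figure comes from: the
lookahead is taken to be the number of distinct values an N-bit wait
counter can hold. The helper names below are hypothetical and do not
come from the original source, and the sketch assumes every instruction
increments the counter by exactly 1, which is the simplification noted
above.

```cpp
#include <cassert>

// Hypothetical helper (not from the original source): the largest value a
// wait counter with `bits` bits can hold before it saturates.
constexpr unsigned max_counter_value(unsigned bits)
{
   return (1u << bits) - 1u;
}

// Hypothetical helper: how many instructions to look ahead when forming a
// clause. Assumes each instruction bumps the counter by exactly 1, so the
// bound is simply the number of representable counter values.
unsigned clause_lookahead(unsigned bits)
{
   assert(bits > 0 && bits < 32); // keep the shift well defined
   return max_counter_value(bits) + 1u;
}
```

With a 6-bit counter, clause_lookahead(6) evaluates to 64, which matches
the lookahead quoted for vmcnt on gfx9 and would also apply to the 6-bit
LGKMCNT on gfx10 under the same assumption.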