I think this manages not to break the DAG handling with the divergent
predicates because the standalone divergent patterns end up with a
higher priority than the pattern on the instruction definition.
The 16-bit versions don't work yet.
This looks OK to me, but I'm not as familiar with how this will impact SDAG. Might want to have someone else take a look too.
Yeah, let's take this.
r366254, though irritatingly I had to split the tests so they don't fail in a release build.