The test case is based on an example from:
http://www.gdcvault.com/play/1023026/Taming-the-Jaguar-x86-Optimization
  bool ArraySearch(int count, uint32_t needle, uint32_t haystack[]) {
    bool found = false;
    for (int i = 0; i < count; ++i)
      found |= needle == haystack[i];
    return found;
  }
...so I think the pattern could show up in any "all_of" or "any_of" style of loop. I checked that the loop vectorizer is ok with the change in this case. x86 scalar and vector codegen looks fine with it too.
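For illustration only, here is a minimal sketch of what an "all_of" style counterpart might look like. The function name ArrayAllEqual and its parameters are hypothetical and do not come from the talk or from this patch; the sketch just assumes the same shape of loop, with a bool accumulator combined with a compare result using '&' instead of '|'.

```cpp
#include <cstdint>

// Hypothetical "all_of" style loop (not from the referenced talk or the patch):
// a bool accumulator is and-ed with the result of each comparison.
bool ArrayAllEqual(int count, uint32_t value, const uint32_t haystack[]) {
  bool all = true;
  for (int i = 0; i < count; ++i)
    all &= value == haystack[i];  // accumulate without an early exit
  return all;
}
```

As with ArraySearch, there is no early exit, so the accumulator is carried through the loop and updated with a logic op on each iteration.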
I'm hoping that D27933 will be approved so we don't have to check the commuted variant (unary cast op is less complex than phi, so it goes to the right side).