The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487:
...and those are shown in the IR test file and x86 codegen file.
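Matching the usubo pattern is harder than uaddo because we have 2 independent values (the subtract and the compare, which share only their operands) rather than a def-use relationship between the math op and the compare.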
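For illustration only, here is a rough source-level sketch of the two shapes being matched. The function names are hypothetical and not taken from the patch; the actual matching happens on IR, not on C++ source.

```cpp
#include <cstdint>
#include <utility>

// uaddo shape: the overflow check consumes the result of the add,
// so the compare can be found by walking the add's users (def-use).
std::pair<uint32_t, bool> uaddo_pattern(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;     // the add defines 'sum'
    bool overflow = sum < a;  // the compare uses 'sum'
    return {sum, overflow};
}

// usubo shape: the subtract and the compare both read only 'a' and 'b'.
// Neither value uses the other, so there is no def-use edge linking them.
std::pair<uint32_t, bool> usubo_pattern(uint32_t a, uint32_t b) {
    uint32_t diff = a - b;    // independent value #1
    bool borrow = a < b;      // independent value #2
    return {diff, borrow};
}
```

In the second function, a matcher that starts from the compare has to search for a subtract of the same operands (or the reverse), which is the extra difficulty described above.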
I replicated the codegen tests for AArch64 to show that forming usubo should be generally good (and that lines up with the existing uaddo sibling transform).
There's a potential regression seen in the AMDGPU test file when trying to form a SAD (sum of absolute differences) op though, so I added a hack to try to avoid that.
If I'm seeing the PPC changes correctly, this is a small improvement on those tests.