vec_rlnm was implemented according to the old ABI, which was wrong. The ABI team has fixed the issue (the fix is not published yet), so we need to re-implement these builtins according to the new ABI.
From (old implementation):
__builtin_altivec_vrlwnm(a, b) & c;
To (new implementation):
vector unsigned int OneByte = { 0x8, 0x8, 0x8, 0x8 };
__builtin_altivec_vrlwnm(a, ((c << OneByte) | b));
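To make the operand packing in the new implementation concrete, here is a minimal scalar sketch of what ((c << OneByte) | b) computes in each 32-bit lane. The helper name pack_rlnm_operands is hypothetical and not part of altivec.h, and the sketch assumes b holds a small value that fits in the low byte of each lane. It only models the packing, not the rotate-and-mask done by the instruction itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar helper, not part of altivec.h: computes, lane by lane,
   the same value as the vector expression ((c << OneByte) | b). */
void pack_rlnm_operands(const uint32_t b[4], const uint32_t c[4],
                        uint32_t out[4]) {
  assert(b != NULL && c != NULL && out != NULL);
  for (size_t i = 0; i < 4; ++i) {
    /* Shift the contents of c up by one byte, then OR in b
       (assumed here to fit in the low byte of the lane). */
    out[i] = (c[i] << 8) | b[i];
  }
}
```

For example, a lane with b = 1 and c = 0x0104 packs to 0x010401: the two low bytes of c end up in bytes 1 and 2, and b stays in byte 0.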
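Don't hard-code the names of intermediate results. Value names are not stable across build configurations (release builds of clang drop them), so I imagine this already fails on some build bot. Instead, capture the name in a FileCheck variable, something like: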
// CHECK-BE: %[[RES1:.+]] = shl <4 x i32
Feel free to use the captured result as an operand of the next instruction, like:
// CHECK-BE: %[[RES2:.+]] = or <4 x i32> %[[RES1]]