The condition needs to be different for right-shifts, else we may lose information in some cases.
Can you check the equivalent combine in GlobalISel? I think it only handles left shifts, but it should be generalized.
I took a look and it seems to be a different combine. This one explicitly tries to reduce a 64-bit shift to 32 bits, but matchCombineTruncOfShl/applyCombineTruncOfShl simply moves the trunc onto the operand of the shift:
// Fold trunc (shl x, K) -> shl (trunc x), K
//    => K < VT.getScalarSizeInBits()
I don't think this can work on right shifts unless we add another trunc in front of the shl again.
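To illustrate the difference, here is a minimal sketch that uses plain 64-bit and 32-bit integers in place of the IR values. The helper names are hypothetical and this is not the GlobalISel code itself; it only shows that moving the trunc onto the operand preserves the result for shl (with K below the narrow width) but can drop bits for lshr:

```cpp
#include <cstdint>

// trunc (shl x, K): shift in 64 bits, then keep the low 32 bits.
constexpr uint32_t truncOfShl(uint64_t X, unsigned K) {
  return static_cast<uint32_t>(X << K);
}

// shl (trunc x), K: keep the low 32 bits first, then shift in 32 bits.
constexpr uint32_t shlOfTrunc(uint64_t X, unsigned K) {
  return static_cast<uint32_t>(X) << K;
}

// trunc (lshr x, K): bits from the high half move into the low half.
constexpr uint32_t truncOfLshr(uint64_t X, unsigned K) {
  return static_cast<uint32_t>(X >> K);
}

// lshr (trunc x), K: the high half is discarded before the shift.
constexpr uint32_t lshrOfTrunc(uint64_t X, unsigned K) {
  return static_cast<uint32_t>(X) >> K;
}

// Example input with only bit 32 set, shifted by K = 4 (< 32).
constexpr uint64_t HighBit = uint64_t{1} << 32;

// Left shift: the low 32 result bits depend only on the low 32 input bits.
static_assert(truncOfShl(HighBit, 4) == shlOfTrunc(HighBit, 4),
              "shl fold preserves the result");

// Right shift: bit 32 lands at bit 28 of the result, but is lost if we
// truncate first.
static_assert(truncOfLshr(HighBit, 4) != lshrOfTrunc(HighBit, 4),
              "lshr fold loses information");
```

So a right-shift version would need a different condition or a different shape of rewrite, not just the same fold with lshr substituted for shl.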
Maybe a separate combine (AMDGPU-specific, since I think it only benefits us?) would be better for reducing 64-bit right shifts? I wrote it in D136319.