As explained in the comment:
For a FLAT instruction the hardware decides whether to access
global/scratch/shared memory based on the high bits of vaddr,
ignoring the offset field, so we have to ensure that when we add
remainder to vaddr it still points into the same underlying object.
The easiest way to do that is to make sure that we split the offset
into two pieces that are both >= 0 or both <= 0.
In particular FLAT (as opposed to SCRATCH and GLOBAL) instructions have
an unsigned immediate offset field, so we can't use it to help split a
negative offset.
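
To make the "both >= 0 or both <= 0" rule concrete, here is a minimal standalone sketch of such a split. It is not the in-tree code: the name `splitOffsetSameSign` and the parameters `numBits` (magnitude bits available in the immediate field) and `allowNegative` (whether the field is signed) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper, not the LLVM implementation.
// Splits `offset` into {imm, remainder} such that imm + remainder == offset
// and the two pieces never have opposite signs.
std::pair<int64_t, int64_t> splitOffsetSameSign(int64_t offset,
                                                unsigned numBits,
                                                bool allowNegative) {
  assert(numBits > 0 && numBits < 63 && "field width out of range for this sketch");
  const int64_t d = int64_t{1} << numBits;

  if (allowNegative) {
    // Signed division in C++ truncates toward zero, so both pieces take
    // the sign of `offset` (or are zero).
    int64_t remainder = (offset / d) * d;
    return {offset - remainder, remainder};
  }

  if (offset >= 0) {
    // Unsigned field: the low bits go into the immediate, the rest stays
    // in the address.
    int64_t imm = offset & (d - 1);
    return {imm, offset - imm};
  }

  // Unsigned field and negative offset: the immediate cannot help, so the
  // whole offset stays in the address.
  return {0, offset};
}
```

With an assumed 12-bit field, an offset of -5000 splits into -904 and -4096 when the field is signed, and into 0 and -5000 when it is unsigned, which is the FLAT case described above.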
Can you add a FIXME to check whether this is still needed when we know the address space isn't FLAT_ADDRESS?