In some cases the instruction that sets up the iteration count for a
tail-predicated loop cannot be moved before the dlstp, which prevents
tail predication entirely. This patch checks whether the operand of the
mov can be used instead and, if so, uses it.
Note for review: I am not sure whether modifying getCount() and adding
TPNumElements is the best approach. Please give feedback if you can
think of a better one.
Would it be better to just store the Register instead?