Currently whenever we sink any instruction, we do clearKillFlags for every use of every use operand for that instruction, apparently there are a lot of duplication, therefore compile time penalties.
This patch collect all the interested registers first, do clearKillFlags for it all together at once at the end, so we only need to do clearKillFlags once for one register, duplication is avoided.
The speedup for one of our testcase is 10x
Suggestions are welcome.