Fast register allocator skips bundled MIs, as the main assignment
loop uses MachineBasicBlock::iterator (= MachineInstrBundleIterator)
This was causing SIInsertWaitcnts to crash which expects all
instructions to have registers assigned.
This patch makes sure to set everything inside bundle to the same
assignments done on BUNDLE header.
Could we fix allocateInstruction instead of having a somewhat parallel way of assigning registers?