For AMDGPU, the insertion point for a block may not be the first
non-PHI instruction. This happens when a block starts with EXEC
mask manipulation related to control flow (converging lanes).
Use SkipPHIsAndLabels to determine the block insertion point
so that the target can skip any block prologue instructions.
I would suggest using SkipPHIsAndLabels() to determine the insertion point here, so that we insert after any prologue instructions. I would also suggest fixing the other two getFirstNonPHI() uses in this file.
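
To illustrate why the two queries can disagree, here is a minimal, self-contained toy model. It is only a sketch and does not use LLVM's real classes: `ToyBlock`, `Kind`, `firstNonPHI`, and `skipPHIsAndLabels` are hypothetical names that mimic the idea behind getFirstNonPHI() and SkipPHIsAndLabels(), with `Kind::Prologue` standing in for whatever the target treats as a block prologue (for example, EXEC mask setup).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions; NOT LLVM's real classes.
enum class Kind { PHI, Label, Prologue, Normal };

struct ToyBlock {
  std::vector<Kind> Insts;

  // Mimics the idea of getFirstNonPHI(): stop at the first instruction
  // that is not a PHI, even if it is a prologue instruction.
  std::size_t firstNonPHI() const {
    std::size_t I = 0;
    while (I < Insts.size() && Insts[I] == Kind::PHI)
      ++I;
    return I;
  }

  // Mimics the idea of SkipPHIsAndLabels(): starting at index I, also
  // step over labels and anything marked as block prologue.
  std::size_t skipPHIsAndLabels(std::size_t I) const {
    assert(I <= Insts.size() && "start index out of range");
    while (I < Insts.size() &&
           (Insts[I] == Kind::PHI || Insts[I] == Kind::Label ||
            Insts[I] == Kind::Prologue))
      ++I;
    return I;
  }
};
```

In this model, a block laid out as PHI, PHI, Prologue, Normal has its first non-PHI at the prologue instruction, while the skipping variant lands on the Normal instruction after it. Inserting at the former position would place new code ahead of the prologue, which is the situation the change avoids.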