The current implementation of matchSwap in SIShrinkInstructions searches the entire use_nodbg_operands set to find a
pattern that can be turned into a v_swap instruction. This approach leads to O(N^3) compile time in SIShrinkInstructions.
In practice, however, the matching pattern only occurs among nearby instructions in the same basic block. This change limits the search to a maximum of
16 instructions, which makes the compile-time cost linear.
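
For illustration only, here is a minimal standalone sketch of the bounded-scan idea. It is not the code in this patch and uses no LLVM APIs: `Mov`, `kMaxScan`, and `findSwapEnd` are hypothetical names, registers are modeled as non-negative integers, and the pattern is assumed to be `mov t, x; mov x, y; mov y, t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a register move "dst = src" (hypothetical, not an LLVM type).
// Register ids are assumed to be non-negative.
struct Mov {
  int dst;
  int src;
};

// Assumed scan limit, mirroring the 16-instruction bound described above.
constexpr std::size_t kMaxScan = 16;

// Starting from block[first] == "mov t, x", look at no more than kMaxScan
// following instructions for "mov x, y" and then "mov y, t".
// Returns the index of the final move, or -1 if no swap is found.
int findSwapEnd(const std::vector<Mov> &block, std::size_t first) {
  assert(first < block.size());
  const int t = block[first].dst;
  const int x = block[first].src;
  int y = -1; // unknown until "mov x, y" has been seen

  const std::size_t end = std::min(block.size(), first + 1 + kMaxScan);
  for (std::size_t i = first + 1; i < end; ++i) {
    const Mov &m = block[i];
    if (y < 0) {
      // Still looking for the middle move "mov x, y".
      if (m.dst == x && m.src != t && m.src != x) {
        y = m.src;
        continue;
      }
      // Any other write to t or x breaks the pattern.
      if (m.dst == t || m.dst == x)
        return -1;
    } else {
      // Looking for the closing move "mov y, t".
      if (m.dst == y && m.src == t)
        return static_cast<int>(i);
      // A write to any of the three registers breaks the pattern.
      if (m.dst == t || m.dst == x || m.dst == y)
        return -1;
    }
  }
  return -1; // not found within the window
}
```

Because each candidate start inspects a constant number of following instructions, the total work in this sketch grows linearly with the number of instructions in the block.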