There should be no intersection between SDWA operands and potential MIs. E.g.:
v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0
v_add_u32 v3, v4, v2
In this example we might fold the 2nd instruction into the 3rd (producing v_add_u32_sdwa) and then try to fold the 1st instruction into the 2nd, which has already been destroyed.
To solve this problem we keep track of every SDWA operand that should be "pulled out", i.e. an operand created from an MI that is itself matched by another SDWA operand.
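One possible reading of that bookkeeping, as a minimal standalone sketch rather than the actual pass code: `SketchOperand`, `ParentMI`, `TargetMI` and `findPulledOut` are hypothetical names, and instructions are modeled as plain integer ids instead of MachineInstr pointers. An operand is flagged when the instruction it was created from is also the instruction another operand wants to fold into.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SDWA operand; not the LLVM class.
struct SketchOperand {
  int ParentMI; // id of the instruction the operand was created from
  int TargetMI; // id of the instruction the operand would be folded into
};

// Returns the indices of operands whose parent instruction is also the
// fold target of a different operand. These are the operands that cannot
// be applied naively, because one of the two folds would touch an
// instruction that the other fold rewrites or destroys.
std::vector<std::size_t>
findPulledOut(const std::vector<SketchOperand> &Ops) {
  std::vector<std::size_t> PulledOut;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    for (std::size_t J = 0; J < Ops.size(); ++J) {
      if (I != J && Ops[J].TargetMI == Ops[I].ParentMI) {
        PulledOut.push_back(I);
        break; // one conflicting operand is enough to flag Ops[I]
      }
    }
  }
  return PulledOut;
}
```

With the three instructions above numbered 1 to 3, the operands are {1, 2} and {2, 3}; the second one is flagged because its parent (instruction 2) is the fold target of the first. The quadratic scan is only for clarity; a map keyed by instruction would do the same lookup in one pass.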
shared_ptr should not be used anywhere