A few more places still need to be updated, but this is most of them.
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2846–2847 | Shouldn't we always return V_ADD_I32_e32 here? | |
2850 | Same here. | |
3893 | It needs assert(Inst.getOpcode() == AMDGPU::S_ADD_I32 || Inst.getOpcode() == AMDGPU::S_SUB_I32) |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2846–2847 | One of these is dead. We always select S_ADD_I32/S_SUB_I32. We don't select these with uses of the SCC value, so I don't think it matters which we use. I tried swapping this to always select s_add_u32 for add, but that has a similar problem later: we select s_addk_i32 from s_add_i32, and not from s_add_u32. That's formed a lot later, where it's more questionable to make assumptions about how SCC is getting used. We're missing optimizations to try to use SCC conditions, but ideally there would be some. |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
3897 | S_ADD_U32 can now reach this point, and you do not want to convert it into V_SUB_U32_e64. |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
3897 | S_ADD_U32 is never selected, so it never reaches here. I added it to the other switch so that in case it is selected, it will hit the assert here. |
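
To make the concern at line 3897 concrete, here is a minimal standalone sketch, not the actual SIInstrInfo.cpp code. The `Opcode` enum, `isScalarAddSubI32`, and `pickVectorAddSub` are hypothetical names for illustration, and `V_ADD_U32_e64` as the add counterpart of `V_SUB_U32_e64` is an assumption. It shows why an "is this the add opcode?" test would silently turn S_ADD_U32 into a subtract unless the assert guards the entry.

```cpp
#include <cassert>

// Hypothetical stand-in for the AMDGPU opcode enum; only the names
// discussed in this review are modeled.
enum class Opcode {
  S_ADD_I32,
  S_SUB_I32,
  S_ADD_U32,
  V_ADD_U32_e64, // assumed add counterpart of V_SUB_U32_e64
  V_SUB_U32_e64
};

// True only for the two opcodes the lowering is meant to handle.
constexpr bool isScalarAddSubI32(Opcode Opc) {
  return Opc == Opcode::S_ADD_I32 || Opc == Opcode::S_SUB_I32;
}

// Picks the vector opcode for a scalar add/sub. Without the assert,
// S_ADD_U32 would fail the IsAdd test and be lowered as a subtract.
Opcode pickVectorAddSub(Opcode Opc) {
  assert(isScalarAddSubI32(Opc) && "unexpected scalar add/sub opcode");
  const bool IsAdd = Opc == Opcode::S_ADD_I32;
  return IsAdd ? Opcode::V_ADD_U32_e64 : Opcode::V_SUB_U32_e64;
}
```

With this shape, an S_ADD_U32 that slipped through selection would trip the assert in a debug build instead of being miscompiled.
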
Skip s_add_u32. It's still used in one case where the carry is used. Cleaning this up will be a more involved, separate step.