Diff Detail
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3416 | Is this really necessary given that we're generally moving away from using G_EXTRACT at all? |
- Removed G_EXTRACT checks
- Refactored parts of code dealing with setting OP_SEL_0 and OP_SEL_1 bits.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3411 | MI cannot be null | |
3432 | Don't need = 0 | |
3439 | MI cannot be null | |
3447 | Ditto | |
3450 | Move comment into else if and fix weird line break | |
3464–3471 | This matching is a bit heavy. Seems like we're missing some combines? This at least could use some comments for the pattern being matched | |
3481–3482 | Why would this pattern appear? |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3464–3471 | The Or+Shl+And come from RegBankSelect when it breaks down G_BUILD_VECTOR(_TRUNC) instructions that form <2 x 16> operands. There are other cases where such combine could help but it's a separate issue. | |
3481–3482 | Added comment to clarify. Register is s32 but fneg that we want to match is for s16. |
Seems reasonable overall, but I am a bit worried about testing all the possible corner cases.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3457 | I don't understand how this can be safe, when you never even look at MI->getOperand(2). Did you mean to test the mask elements for equality, e.g. ShuffleMask[0] == 1 instead of ShuffleMask[0] != 0? | |
3475 | This is NegLo ^= ... | |
3478 | Are there tests for this case? If HiSrc is undef then I am a bit surprised that we would ever match m_GShl(m_Reg(HiSrc), m_SpecificICst(16)) above. Also, do we need an equivalent case for when LoSrc is undef? | |
3490 | NegHi ^= ... |
- Fix handling of G_SHUFFLE_VECTOR
- Handled case where low part of <2 x 16> could also be undef.
- Added tests for new cases
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3457 | I mistakenly thought that ShuffleMask.size() == 2 would be enough of a restriction. That was a bug. Also, a lot more cases are covered now. | |
3478 | Cases where hi part is undef are common when operations on <N x 16> vectors with odd number of elements are broken down into multiple packed instructions. Arguments for last packed instruction will be undef for high part. Added new test insert_with_undef which has case with both lo and hi as undef. As for matching above, |
Oops, this was at least mostly re-implemented. Is there anything here main doesn't have already?
MI cannot be null