Diff Detail
Event Timeline
Trivial nitpick, but essentially LGTM.
lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
386–387 | clang-format? | |
690–696 | Out of curiosity: The unsigned case of this can be implement using v_and_b32 as well, which would be more compact for small source sizes if s4 or similar was legal, since it can be encoded as VOP2. This is a micro-optimization that I believe doesn't even apply right now, but in general where is that kind of thing expected to be handled ultimately? Here in the instruction selector, or elsewhere in a separate machine instruction combining pass? | |
lib/Target/AMDGPU/AMDGPUInstructionSelector.h | ||
67 | selectG_EXT perhaps? |
lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
690–696 | It would only really be smaller if the mask value is an inline immediate. For this and the scalar 64-bit case, you could make a code size tradeoff by materializing the constant for multiple uses. We do similar things in SIShrinkInstructions, but I think most of the cases that appear there are due to the lack of context in SelectionDAG. I would probably want to try to handle it up front here. |
r364308
lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
386–387 | I took this part, but in general I think clang-format mangles BuildMI formatting |
selectG_EXT perhaps?