AMDGPU : Add V_SAD_U32 instruction patterns and it's corresponding LIT tests.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/CodeGen/AMDGPU/sad.ll | ||
---|---|---|
18–19 ↗ | (On Diff #66496) | I think we should also do a test case that selects from the result of the two sub operations, because I think this will give you different DagNodes. For example: %sub0 = sub i32 %a, %b |
You should add more tests using i32 vectors, as well as i8/i16 types. The packed sads won't be matched yet, but you should still see the legalized to i32 select to this
test/CodeGen/AMDGPU/sad.ll | ||
---|---|---|
1 ↗ | (On Diff #66539) | Remove the -mcpu=SI |
4 ↗ | (On Diff #66539) | This should be GCN. You should also check the operands |
You should also test when there are multiple uses of the components. You might want to add hasOneUse checks
Wouldn't you only want to not match the pattern when all the intermediate nodes had more than one use?
For the first pattern, there are 4 instructions in the input (min, max, add sub), so
min or max multiple use -> min, sad
min and max multiple use -> min, max, sad
sub multiple use -> sub, min, max, sad
so I think the first pattern just needs a hasOneUse on the sub. The other one should be similar
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
3411–3413 ↗ | (On Diff #66539) | I think you can remove most of the i32s. Only one should be sufficient |
3411–3413 ↗ | (On Diff #66539) | Probably just need to keep the ones on the named operands and not the results |
- Added LIT tests for i8, i16, i32 vector
- Add hasOneUse() for V_SAD pattern and added multiple use of component LIT tests.
- Remove unnecessary i32s from the V_SAD pattern.
lib/Target/AMDGPU/AMDGPUInstructions.td | ||
---|---|---|
621 ↗ | (On Diff #67765) | sub_su? |
test/CodeGen/AMDGPU/sad.ll | ||
34 ↗ | (On Diff #67765) | The sad is not supposed to be emitted if the sub has multiple uses |
42–48 ↗ | (On Diff #67765) | It's easiest to add another use by storing the vale, rather than adding a lot more instructions |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
3417–3419 ↗ | (On Diff #67897) | For this pattern the hasOneUse check belongs on the select. |
test/CodeGen/AMDGPU/sad.ll | ||
1 ↗ | (On Diff #67897) | Space before < |
70 ↗ | (On Diff #67897) | You should end the label lines with the :, this prevents it from accidentally matching for something like max_pat12 |
104–110 ↗ | (On Diff #67897) | It's simpler to use a store of a value instead of adding more complex uses for the multi use. It also helps to avoid adding instruction types similar to the ones you want to write checks for |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
3417–3419 ↗ | (On Diff #68242) | The hasOneUse belongs on the select here |
test/CodeGen/AMDGPU/sad.ll | ||
163–164 ↗ | (On Diff #68242) | I don't think it's necessary to have multiple use checks for all of the vector versions |
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
3417–3419 ↗ | (On Diff #68242) | Why do we have to apply hasOneUse on "select" instead of "sub" for the second pattern? |
- Apply hasOneUse on select for the second pattern.
- Remove checks for all of the vector versions.
You should add some more tests with some constant operands. For the sub patterns, that may require looking for add of negative constants
lib/Target/AMDGPU/AMDGPUInstructions.td | ||
---|---|---|
628 ↗ | (On Diff #68724) | select_oneuse should not go into the Properties block since it isn't commutative or associative |
test/CodeGen/AMDGPU/sad.ll | ||
101–114 ↗ | (On Diff #68724) | There's no point to testing multiple uses of the final result |
132 ↗ | (On Diff #68724) | This is to specific for a not check. Not checks can be fragile especially when checking the operands. It would be better to check for the separate subs, select and add |
This patch only adds constant test case for the first pattern. For the second pattern, it cannot handle it. As for an improvement, I will create a second patch for handling the constant case for the second pattern.
lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
3418–3419 ↗ | (On Diff #68900) | These operands should be indented to the same level as the other select operand to make it clear it's part of the select, and the add src2 should be on the next line after those |
test/CodeGen/AMDGPU/sad.ll | ||
41 ↗ | (On Diff #68900) | You should ad another test where the compared values don't match, and another where the second sub uses a different value to make sure those aren't matched |
test/CodeGen/AMDGPU/sad.ll | ||
---|---|---|
41 ↗ | (On Diff #68900) | What do you mean the compared values don't match? If the second sub uses a different value, this is incorrect logic for the v_sad_u32 second pattern? |
- Add indentation for v_sad_u32 patterns for better code understanding.
- Added one test where the compared values don't match, and another one where the second sub uses a different value to make sure those aren't matched.