This bug was exposed by rL360395.
Details
- Reviewers
  kzhuravl, michel.daenzer, arsenm
- Commits
  rZORGa77555e09b2f: [AMDGPU] Fixed handling of imemdiate i1 literals
  rZORG64d446c667b5: [AMDGPU] Fixed handling of imemdiate i1 literals
  rGa77555e09b2f: [AMDGPU] Fixed handling of imemdiate i1 literals
  rG64d446c667b5: [AMDGPU] Fixed handling of imemdiate i1 literals
  rG05791d90c916: [AMDGPU] Fixed handling of imemdiate i1 literals
  rL360689: [AMDGPU] Fixed handling of imemdiate i1 literals
Event Timeline
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2526 | I don’t understand where this is coming from. There should be no 1-bit immediates anywhere? |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2526 | The combiner sometimes produces "xor x, true", and even "add x, true". This is not the first time we have hit this; we have even implemented lowering for it. |
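For readers unfamiliar with why "add x, true" can stand in for "xor x, true": on a 1-bit type, addition wraps modulo 2, so the two operations agree on every input. The sketch below models i1 as `bool`; the helper names `addI1` and `xorI1` are hypothetical and are not taken from the patch or from LLVM.

```cpp
// Illustrative model only: an i1 value is represented as a bool.

// 1-bit addition wraps modulo 2, so only the low bit of the sum survives.
constexpr bool addI1(bool X, bool Y) {
  return ((static_cast<unsigned>(X) + static_cast<unsigned>(Y)) & 1u) != 0;
}

// 1-bit xor is true exactly when the operands differ.
constexpr bool xorI1(bool X, bool Y) { return X != Y; }

// With the constant operand set to true, both forms act as a logical NOT.
static_assert(addI1(false, true) == xorI1(false, true), "false case agrees");
static_assert(addI1(true, true) == xorI1(true, true), "true case agrees");
static_assert(addI1(true, true) == false, "1 + 1 wraps to 0 in one bit");
```

Either form therefore leaves an i1 immediate operand in the DAG, which is the case discussed in this thread.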
Why does this return false? A 1-bit immediate is either 0 or -1, both of which can be represented as inline constants everywhere.
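As a rough illustration of the point above, the sketch below sign-extends a 1-bit immediate and checks it against the inline integer constant range, assumed here to be -16 to 64 inclusive. The helper names `signExtendI1` and `isInlineIntConstant` are hypothetical; this is not the code in SIInstrInfo.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend an i1 immediate to 64 bits.
// true becomes -1 (all ones), false becomes 0.
constexpr int64_t signExtendI1(bool Bit) {
  return Bit ? int64_t{-1} : int64_t{0};
}

// Hypothetical helper: integer inline constants are assumed to cover
// the range [-16, 64].
constexpr bool isInlineIntConstant(int64_t Imm) {
  return Imm >= -16 && Imm <= 64;
}

// Both possible i1 values fall inside the inline constant range.
static_assert(isInlineIntConstant(signExtendI1(false)), "0 is inline");
static_assert(isInlineIntConstant(signExtendI1(true)), "-1 is inline");
```

Under that assumption neither value of an i1 immediate needs a separate literal encoding.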
FWIW, this fixes the regression I reported, without regressing any other tests I run.
Technically yes, but the query is about a VOP literal, while an i1 normally ends up in a SOP instruction.
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2526 | Yes, these are extended, but we have a situation where the predicate is checked before the extension. While it technically can fit a VOP literal, I do not believe that would be a good idea. The bool operand should go to a SOP instruction, so I preferred to return false. At the very least, we will cancel unsuitable pattern matching earlier this way. |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2526 | But SOP instructions have the same inline immediate values. This isn't a property of the immediate or the instruction. |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2526 | OK, I see this is breaking not at the point I thought. In the bool case these still end up materialized with regular int64_t operands. This is only happening in the DAG patterns. Saying this is true should be fine, since it still works with the normal immediate folding heuristics. |
test/CodeGen/AMDGPU/xor3-i1-const.ll | ||
---|---|---|
10–20 | I was unable to reduce it further. Combine conditions are peculiar. |