This was causing a bug where non-truncating stores would be selected instead of truncating ones.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir | ||
---|---|---|
106 ↗ | (On Diff #210230) | This is breaking this, producing a non-truncating store instead of the correct trunc store |
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir | ||
55 ↗ | (On Diff #210230) | This is breaking this, the correct trunc store now fails to select |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
3314 ↗ | (On Diff #210230) | Should also continue? Could make this a select on the match type to the one addPredicate call |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) | Yeah, just seen this doesn't work. I'm not familiar with this code, but how do non-truncstore predicates end up on trunc stores such that it breaks? |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) | I don't exactly understand what the expected relationship between predicate bits set on a PatFrag and the PatFrags which used it is. It's possible something is wrong with the AMDGPU PatFrags. This system is pretty confusing, because each property is added in a different layer of PatFrags defined in TargetSelectionDAG.td. The AMDGPU PatFrags reproduce the same hierarchy, with the additional address space predicates. There is also the extra IsTruncStore bit which needs to be manually set on the base truncstore PatFrag. For some reason the derivative PatFrags do not have it set, as it should be implied. However, they still need to set IsStore, although I would expect these to both behave the same way. |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) | I'm also not sure the way this loop is structured makes sense. It seems to assume too much about what different predicates can be combined. The atomic predicates also don't work correctly |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) | That else is suspicious. isTruncStore() means that the predicate needs to be added but !isTruncStore() can mean either no predicate is needed or a non-trunc store predicate is needed. I believe it needs to be: if (Predicate.isTruncStore()) add relevant predicate for trunc store if (Predicate.isNonTruncStore()) add relevant predicate for non-trunc store |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) |
The input we start with is pretty weird too and it's mostly a straight import of that. Suppose we start with the pattern: (truncstorei8 node:$x) when the PatFrag is expanded, we get: (truncstore node:$x)<<code-to-check-memvt==i8>> where the <<some C++ code>> is tablegen's way of representing predicates (this isn't quite true, it uses a short-hand name instead of the C++ when it's printing but the data in the pattern is the C++ code). Then after further expansion we get: (unindexedstore node:$x)<<code-to-check-memvt==i8>><<code-to-check-for-truncating>> (st node:$x)<<code-to-check-memvt==i8>><<code-to-check-for-truncating>><<code-to-check-for-unindexed>> That last line is how it looks on input to the GlobalISelEmitter before runOnPattern(). The first thing we needed to do was abstract out the C++ as it needed to be different for DAGISel vs GISel. To do that, each predicate was given a field in PatFrag. For these three predicates it was: ValueType MemoryVT = ?; bit IsTruncStore = ?; bit IsUnindexed = ?; Only one of these is set for each PatFrag as each one directly corresponds to the original C++ code. For example: IsTruncStore = 0 emits this code for DAGISel: !cast<StoreSDNode>(N)->isTruncatingStore();
Each predicate can only be one C++ fragment. If you need multiple fragments (which happens when you have PatFrags inside PatFrag expansions) you have multiple Predicates, each with a different field set corresponding to a different fragment.
Could you elaborate on this? |
llvm/utils/TableGen/GlobalISelEmitter.cpp | ||
---|---|---|
3314 ↗ | (On Diff #210230) |
I didn't spend much time looking at it to focus on one broken thing at a time, but the AMDGPU patterns for atomic_load/atomic_store don't work. They should be easy to handle as they ignore the ordering type and just need to preserve the atomicness in the MemOperand. I think the combination of IsStore and IsAtomic wasn't behaving as expected |
Thanks, it's only exposed with another patch though so I can't add that test here.
llvm/test/TableGen/address-space-patfrags.td | ||
---|---|---|
123 ↗ | (On Diff #213137) | Just seen this incorrect comment, will fix in the commit. |