gfx9 has always had global and scratch loads to LDS, but we
did not handle them. They were available via the 'lds'
encoding bit. In gfx940 this bit was reused as 'svs', which
resulted in new '_lds' opcodes, effectively pushing the
bit into the opcode, but functionally they are the same. These
instructions are also available on gfx10.
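
To make the difference concrete, here is a toy sketch of the two spellings: the LDS variant selected by a trailing 'lds' modifier, versus the LDS variant folded into the mnemonic. The function name, its parameters, and the omission of operands are assumptions made for illustration; this is not the backend's actual table or printer code.

```python
# Toy model only: the names and rules here are illustrative assumptions,
# not the LLVM AMDGPU backend's actual instruction tables.

def asm_spelling(base: str, to_lds: bool, lds_in_mnemonic: bool) -> str:
    """Return the assembly spelling of a global/scratch load (operands omitted).

    base            -- plain mnemonic, e.g. "global_load_dword"
    to_lds          -- True if the load writes to LDS
    lds_in_mnemonic -- True for the spelling with _lds_ in the mnemonic,
                       False for the spelling with a trailing 'lds' modifier
    """
    if not to_lds:
        return base
    if lds_in_mnemonic:
        # Fold the modifier into the name:
        # global_load_dword -> global_load_lds_dword
        head, sep, tail = base.partition("_load_")
        if not sep:
            raise ValueError(f"not a load mnemonic: {base!r}")
        return f"{head}_load_lds_{tail}"
    # Modifier style: same mnemonic, extra 'lds' token.
    return f"{base} lds"
```

Either spelling names the same operation; only the place where the LDS selection is recorded differs.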
Repository: rG LLVM Github Monorepo
Event Timeline
A potentially better alternative is to use the gfx940 names, with _LDS_ in the mnemonic instead of a modifier. This is logically a different opcode anyway. The only downside is that it is not compatible with the documentation and sp3. But it was not implemented before and therefore not used, so there should be no compatibility problem in practice. Well, it would also be different from MUBUF. Given the difference in both semantics and addressing mode, I personally would prefer these to be different opcodes. At the pseudo level it is certainly easier to have separate ops for this.
Preferences?
One more thing to note: it is already incompatible with sp3 because we prohibit unused vdst, while sp3 enforces it.
It's probably better to have separate opcodes. In general, I think the way we try to force all of these subtarget changes onto the same generic pseudos is more trouble than it's worth. It requires more and more code to verify and make use of the features, and it would be cleaner to move towards separate instruction definitions per subtarget.
I remember your idea about switchable instruction tables per subtarget, but this is not really that. This is more about asm syntax compatibility: whether to make all of gfx9/gfx10 the same for these instructions, or to follow the spec, which was amended for gfx940. It was amended purely due to encoding considerations, but semantically these are still the same instructions and do exactly the same thing. So I believe using the same pseudos is warranted here.
From what I heard at today's meeting, we are leaning towards spec compatibility, and the patch does exactly that.