See bug 39293: https://bugs.llvm.org/show_bug.cgi?id=39293
Known issues:
- This is a simple implementation that assumes lds_direct support is not needed in codegen. If codegen support is required in the future, the implementation will have to be completely rewritten; for example, m0 would have to be added to USES for every instruction that accesses lds_direct.
- It is not clear whether lds_direct should be disabled for SDWA. AMD documentation is silent about SDWA, but SP3 seems to disable it.
- AMD documentation explicitly states which opcodes do not support lds_direct. However, the list appears to be incomplete, as SP3 disables lds_direct for most 'rev' opcodes.
- The enumeration of opcodes which do not support lds_direct is already long and may need to be expanded further. I considered adding a TSFlags bit to label the corresponding opcodes, but there are not many free bits left in TSFlags and I was reluctant to spend one on this.
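
The trade-off in the last item can be sketched as follows. This is a minimal standalone illustration, not code from the patch: the Opcode enum, the function names, the choice of bit 63 and the membership of the "rejects lds_direct" set are all hypothetical placeholders.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the real opcode enum; illustration only.
enum class Opcode { V_ADD_F32, V_SUBREV_F32, V_LSHLREV_B32, V_MOV_B32 };

// Option 1: an explicit enumeration of opcodes that reject lds_direct.
// Costs no flag bits, but the list grows with every new exception.
// Which opcodes appear here is a placeholder, not the documented list.
inline bool rejectsLdsDirectByList(Opcode Op) {
  switch (Op) {
  case Opcode::V_SUBREV_F32:
  case Opcode::V_LSHLREV_B32:
    return true;
  default:
    return false;
  }
}

// Option 2: a single bit in a per-instruction flags word.
// The bit position is an assumption made for this sketch.
constexpr std::uint64_t NoLdsDirectBit = std::uint64_t(1) << 63;

// Hypothetical per-opcode flags, standing in for generated tables.
inline std::uint64_t tsFlagsFor(Opcode Op) {
  switch (Op) {
  case Opcode::V_SUBREV_F32:
  case Opcode::V_LSHLREV_B32:
    return NoLdsDirectBit;
  default:
    return 0;
  }
}

// The check itself becomes a mask test, at the cost of one scarce bit.
inline bool rejectsLdsDirectByFlag(std::uint64_t TSFlags) {
  return (TSFlags & NoLdsDirectBit) != 0;
}

// Sanity check that both options classify every sample opcode identically.
inline void checkBothOptionsAgree() {
  const Opcode All[] = {Opcode::V_ADD_F32, Opcode::V_SUBREV_F32,
                        Opcode::V_LSHLREV_B32, Opcode::V_MOV_B32};
  for (Opcode Op : All)
    assert(rejectsLdsDirectByList(Op) == rejectsLdsDirectByFlag(tsFlagsFor(Op)));
}
```

The flag variant keeps the validation code constant in size, while the list variant keeps the flags word untouched; this patch takes the list approach.
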
What is missing:
- This patch does not enable lds_direct for v_readfirstlane_b32, v_readlane_b32 and v_writelane_b32. This will be implemented by a separate patch.
- Assembler documentation should be updated with a description of lds_direct. This will be added by a separate patch.