This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Add imm variants for surface and texture instructions
ClosedPublic

Authored by asavonic on Oct 21 2021, 9:44 AM.

Details

Summary

Texture/sampler/surface target operands can be either a register or
an immediate (an index of .texref, .samplerref or .surfref).

TableGen declarations for these instructions used to have only
Int64Regs operands, so this caused issues when machine verifier
is turned on:

*** Bad machine code: Expected a register operand. ***
- function:    bar
- basic block: %bb.0  (0x55b144d99ab8)
- instruction: %4:int32regs = SULD_1D_I32_TRAP 0, killed %2:int32regs
- operand 1:   0

The solution is to duplicate these instructions for all possible
operand types (i16imm and Int64Regs). Since this would
essentially double the amount code in TableGen, the patch also
does some refactoring for the original instructions to keep
things manageable.

Diff Detail

Event Timeline

asavonic created this revision.Oct 21 2021, 9:44 AM
asavonic requested review of this revision.Oct 21 2021, 9:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2021, 9:44 AM
tra added a comment.Oct 21 2021, 10:10 AM

LGTM in general.

I wonder if we could find a less verbose way to translate _R ->_I enums. E.g figure out if we can define _I variants as _R + fixed offset.

LGTM in general.

I wonder if we could find a less verbose way to translate _R ->_I enums. E.g figure out if we can define _I variants as _R + fixed offset.

It is possible, but we have to rename opcodes to make it work. GRAD and LEVEL variants break the ordering.

TEX_1D_F32_F32_GRAD_II	= 2995,     // <---- these 4 are ordered correctly
TEX_1D_F32_F32_GRAD_IR	= 2996,
TEX_1D_F32_F32_GRAD_RI	= 2997,
TEX_1D_F32_F32_GRAD_RR	= 2998,
TEX_1D_F32_F32_II	= 2999,             // <---- these are not ordered
TEX_1D_F32_F32_IR	= 3000,
TEX_1D_F32_F32_LEVEL_II	= 3001,
TEX_1D_F32_F32_LEVEL_IR	= 3002,
TEX_1D_F32_F32_LEVEL_RI	= 3003,
TEX_1D_F32_F32_LEVEL_RR	= 3004,
TEX_1D_F32_F32_RI	= 3005,
TEX_1D_F32_F32_RR	= 3006,

Although I think that this can make this code a bit more fragile. If someone changes the tablegen file and accidentally breaks the ordering, this pass will silently break.

Please let me know if the patch is acceptable, or we need to change anything.

tra accepted this revision.Nov 9 2021, 11:28 AM

The changes look good in general. Thank you for cleaning this up.

I'm still a bit concerned about all the boilerplate changes that we don't have good test coverage for. Granted, that's been the way since the initial implementation of texture/surface instructions and this patch is clearly an improvement.
We may need to eventually autogenerate the tests for all instruction variants, similarly to what we do for *MMA instructions. https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/NVPTX/wmma.py

This revision is now accepted and ready to land.Nov 9 2021, 11:28 AM

The changes look good in general. Thank you for cleaning this up.

I'm still a bit concerned about all the boilerplate changes that we don't have good test coverage for. Granted, that's been the way since the initial implementation of texture/surface instructions and this patch is clearly an improvement.
We may need to eventually autogenerate the tests for all instruction variants, similarly to what we do for *MMA instructions. https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/NVPTX/wmma.py

Agree. Independent test suite based on the PTX spec would be a really nice thing to have. I checked the patch several times and it seems there are no more errors, but it is really hard to verify without the tests.

llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp