This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix assertion failure on mad with negative immediate addend
ClosedPublic

Authored by foad on Jun 23 2022, 5:38 AM.

Details

Summary

Without this, the new test case would fail with:

AMDGPUInstPrinter.cpp:545: void llvm::AMDGPUInstPrinter::printImmediate64(uint64_t, const llvm::MCSubtargetInfo &, llvm::raw_ostream &): Assertion `isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882' failed.

Diff Detail

Event Timeline

foad created this revision.Jun 23 2022, 5:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2022, 5:38 AM
foad requested review of this revision.Jun 23 2022, 5:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2022, 5:38 AM
foad added a reviewer: Restricted Project.Jun 23 2022, 5:42 AM

The problem was introduced by D127253. I'm not sure what the best fix is.

Since v_mad_u64_u32 is an unsigned instruction, the hardware will zero-extend a 32-bit literal operand to 64 bits, so perhaps this pattern added by D127253 should not be using as_i64imm (which sign extends):

def : GCNPat <
      (ThreeOpFragSDAG<mul, add> i32:$src0, i32:$src1, (i32 imm:$src2)),
      (EXTRACT_SUBREG (inst $src0, $src1, (i64 (as_i64imm $src2)), 0 /* clamp */), sub0)
      >;

On the other hand, the assert in printImmediate64 does not make much sense to me so I am happy to remove it.

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551

Perhaps I should truncate Imm to 32 bits here. I'm not sure.

foad added a comment.Jun 23 2022, 5:45 AM

Maybe we should be more careful about which 64-bit integer operands are signed/unsigned. We could have separate signed/unsigned versions of as_i64imm and printImmediate64.

The immediate could be truncated. We do not really care about upper bits anyway.

arsenm added inline comments.Jun 23 2022, 1:27 PM
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551

I think the truncation should be done at selection time. Ideally the verifier would enforce this (plus we should try to handle the s_mov_b64 case the comment references)

foad added inline comments.Jun 24 2022, 4:23 AM
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551

I don't understand the comment about s_mov_b64. How is it different from any other b64 operand, which zero-extends a 32-bit literal?

If we're going to verify stuff properly then we need to use different types for signed and unsigned 64-bit operands. I would prefer to do a short term fix first to avoid the assertion failures caused by D127253 (or revert it).

foad added a comment.Jun 24 2022, 4:54 AM

Alternative quick fix:

diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b07c0a67ecf5..bd938d829953 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -542,7 +542,7 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));
 
     // In rare situations, we will have a 32-bit literal in a 64-bit
     // operand. This is technically allowed for the encoding of s_mov_b64.

(I don't understand why Imm == 0x3fc45f306dc9c882 was allowed in the assertion, on targets that don't have FeatureInv2PiInlineImm.)

Alternative quick fix:

diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b07c0a67ecf5..bd938d829953 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -542,7 +542,7 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));
 
     // In rare situations, we will have a 32-bit literal in a 64-bit
     // operand. This is technically allowed for the encoding of s_mov_b64.

(I don't understand why Imm == 0x3fc45f306dc9c882 was allowed in the assertion, on targets that don't have FeatureInv2PiInlineImm.)

Alternative quick fix:

diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b07c0a67ecf5..bd938d829953 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -542,7 +542,7 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));

This is a better quick fix

(I don't understand why Imm == 0x3fc45f306dc9c882 was allowed in the assertion, on targets that don't have FeatureInv2PiInlineImm.)

Probably laziness for querying the feature

foad updated this revision to Diff 439742.Jun 24 2022, 7:12 AM

Switch to the better quick fix.

arsenm accepted this revision.Jun 24 2022, 7:38 AM
This revision is now accepted and ready to land.Jun 24 2022, 7:38 AM
rampitec accepted this revision.Jun 24 2022, 9:09 AM

LGTM, but I really think we just need to truncate the immediate at selection.

foad added a comment.Jun 27 2022, 1:35 AM

LGTM, but I really think we just need to truncate the immediate at selection.

You mean as_i64imm should zero extend *instead* of sign extending?

This revision was landed with ongoing or failed builds.Jun 27 2022, 1:49 AM
This revision was automatically updated to reflect the committed changes.

LGTM, but I really think we just need to truncate the immediate at selection.

You mean as_i64imm should zero extend *instead* of sign extending?

In this context, yes. Because we essentially ignore high bits anyway.