This is an archive of the discontinued LLVM Phabricator instance.

Differential D128435

[AMDGPU] Fix assertion failure on mad with negative immediate addend
Closed · Public

Authored by foad on Jun 23 2022, 5:38 AM.

Details

Reviewers

rampitec
nhaehnle
arsenm

Group Reviewers

Restricted Project

Commits

rG77e63b25f9e9: [AMDGPU] Fix assertion failure on mad with negative immediate addend

Summary

Without this, the new test case would fail with:

AMDGPUInstPrinter.cpp:545: void llvm::AMDGPUInstPrinter::printImmediate64(uint64_t, const llvm::MCSubtargetInfo &, llvm::raw_ostream &): Assertion `isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882' failed.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

foad created this revision. Jun 23 2022, 5:38 AM

Herald added a project: Restricted Project. Jun 23 2022, 5:38 AM

Herald added subscribers: kosarev, jsilvanus, kerbowa and 8 others.

foad requested review of this revision. Jun 23 2022, 5:38 AM

Herald added subscribers: llvm-commits, wdng.

The problem was introduced by D127253. I'm not sure what the best fix is.

Since v_mad_u64_u32 is an unsigned instruction, the hardware will zero-extend a 32-bit literal operand to 64 bits, so perhaps this pattern added by D127253 should not be using as_i64imm (which sign extends):

def : GCNPat <
      (ThreeOpFragSDAG<mul, add> i32:$src0, i32:$src1, (i32 imm:$src2)),
      (EXTRACT_SUBREG (inst $src0, $src1, (i64 (as_i64imm $src2)), 0 /* clamp */), sub0)
      >;

On the other hand, the assert in printImmediate64 does not make much sense to me so I am happy to remove it.
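To make the sign-extension concern concrete, here is a minimal C++ sketch of the two ways a 32-bit literal can be widened to 64 bits. The helper names `signExtend32` and `zeroExtend32` are hypothetical and are not LLVM APIs; the addend -1234567 is the one used in the new test case.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs): two ways to widen a 32-bit literal.
constexpr uint64_t signExtend32(int32_t V) {
  // Widen as a signed value first, so the upper 32 bits copy the sign bit.
  return static_cast<uint64_t>(static_cast<int64_t>(V));
}

constexpr uint64_t zeroExtend32(int32_t V) {
  // Reinterpret as unsigned first, so the upper 32 bits are zero.
  return static_cast<uint64_t>(static_cast<uint32_t>(V));
}

// The addend from the new test case, mad_i32_vvi_neg.
constexpr int32_t Addend = -1234567;

// Sign extension yields the value that appears in the GFX10/GFX11 checks.
static_assert(signExtend32(Addend) == 0xffffffffffed2979ULL,
              "sign-extended addend");
// Zero extension keeps only the low 32 bits set.
static_assert(zeroExtend32(Addend) == 0x00000000ffed2979ULL,
              "zero-extended addend");
```

The two results differ only in the upper 32 bits, which is why a non-negative addend such as 1234567 never hit the problem.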

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551	Perhaps I should truncate Imm to 32 bits here. I'm not sure.

Maybe we should be more careful about which 64-bit integer operands are signed/unsigned. We could have separate signed/unsigned versions of as_i64imm and printImmediate64.

Harbormaster completed remote builds in B171584: Diff 439359. Jun 23 2022, 6:33 AM

The immediate could be truncated. We do not really care about upper bits anyway.

arsenm added inline comments. Jun 23 2022, 1:27 PM

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551	I think the truncation should be done at selection time. Ideally the verifier would enforce this (plus we should try to handle the s_mov_b64 case the comment references)

foad mentioned this in D127253: [AMDGPU] Use v_mad_u64_u32 for IMAD32. Jun 24 2022, 4:07 AM

foad added inline comments. Jun 24 2022, 4:23 AM

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
551	I don't understand the comment about s_mov_b64. How is it different from any other b64 operand, which zero-extends a 32-bit literal? If we're going to verify stuff properly then we need to use different types for signed and unsigned 64-bit operands. I would prefer to do a short term fix first to avoid the assertion failures caused by D127253 (or revert it).

Alternative quick fix:

diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b07c0a67ecf5..bd938d829953 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -542,7 +542,7 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));
 
     // In rare situations, we will have a 32-bit literal in a 64-bit
     // operand. This is technically allowed for the encoding of s_mov_b64.

(I don't understand why Imm == 0x3fc45f306dc9c882 was allowed in the assertion, on targets that don't have FeatureInv2PiInlineImm.)
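The effect of the changed assertion can be checked in isolation. The sketch below uses stand-in templates `fitsUnsigned` and `fitsSigned`, which are hypothetical names written here to mirror what `isUInt<N>` and `isInt<N>` are understood to test; it is not the LLVM implementation, and it assumes the usual two's-complement conversion from `uint64_t` to `int64_t`.

```cpp
#include <cstdint>

// Stand-ins for llvm::isUInt<N> / llvm::isInt<N> (hypothetical names,
// written for this sketch; only valid for 0 < N < 64).
template <unsigned N> constexpr bool fitsUnsigned(uint64_t X) {
  static_assert(N > 0 && N < 64, "sketch only handles 0 < N < 64");
  return X < (UINT64_C(1) << N);
}

template <unsigned N> constexpr bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "sketch only handles 0 < N < 64");
  return X >= -(INT64_C(1) << (N - 1)) && X < (INT64_C(1) << (N - 1));
}

// Condition checked by the assertion before the change.
constexpr bool oldAssertCondition(uint64_t Imm) {
  return fitsUnsigned<32>(Imm) || Imm == 0x3fc45f306dc9c882ULL;
}

// Condition checked by the assertion after the change. The cast assumes
// two's-complement wrapping, which mainstream compilers provide.
constexpr bool newAssertCondition(uint64_t Imm) {
  return fitsUnsigned<32>(Imm) || fitsSigned<32>(static_cast<int64_t>(Imm));
}

// Sign-extended -1234567, as produced for the new test case.
constexpr uint64_t NegAddend = 0xffffffffffed2979ULL;

static_assert(!oldAssertCondition(NegAddend), "old condition rejects it");
static_assert(newAssertCondition(NegAddend), "new condition accepts it");
static_assert(oldAssertCondition(0x12d687) && newAssertCondition(0x12d687),
              "small positive literals pass both");
```

In other words, the new condition accepts any 64-bit value that is either the zero extension or the sign extension of some 32-bit literal.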

In D128435#3607680, @foad wrote:

Alternative quick fix:

diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b07c0a67ecf5..bd938d829953 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -542,7 +542,7 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));

This is a better quick fix

(I don't understand why Imm == 0x3fc45f306dc9c882 was allowed in the assertion, on targets that don't have FeatureInv2PiInlineImm.)

Probably laziness for querying the feature

Switch to the better quick fix.

arsenm accepted this revision. Jun 24 2022, 7:38 AM

This revision is now accepted and ready to land. Jun 24 2022, 7:38 AM

Harbormaster completed remote builds in B171861: Diff 439742. Jun 24 2022, 8:16 AM

LGTM, but I really think we just need to truncate the immediate at selection.

In D128435#3608508, @rampitec wrote:

LGTM, but I really think we just need to truncate the immediate at selection.

You mean as_i64imm should zero extend *instead* of sign extending?

This revision was landed with ongoing or failed builds. Jun 27 2022, 1:49 AM

Closed by commit rG77e63b25f9e9: [AMDGPU] Fix assertion failure on mad with negative immediate addend (authored by foad).

This revision was automatically updated to reflect the committed changes.

foad added a commit: rG77e63b25f9e9: [AMDGPU] Fix assertion failure on mad with negative immediate addend.

In D128435#3611418, @foad wrote:

In D128435#3608508, @rampitec wrote:

LGTM, but I really think we just need to truncate the immediate at selection.

You mean as_i64imm should zero extend *instead* of sign extending?

In this context, yes. Because we essentially ignore high bits anyway.
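The claim that the high bits are ignored can be illustrated with a small model. The sketch below assumes a simplified view of the instruction, a 32x32 to 64-bit unsigned multiply plus a 64-bit addend with wraparound, and models the `sub0` extraction as taking the low 32 bits. `madU64U32` and `low32` are hypothetical helper names, not LLVM or hardware definitions.

```cpp
#include <cstdint>

// Simplified model (an assumption for this sketch): 32x32 -> 64-bit
// unsigned multiply, plus a 64-bit addend, wrapping modulo 2^64.
constexpr uint64_t madU64U32(uint32_t A, uint32_t B, uint64_t C) {
  return static_cast<uint64_t>(A) * B + C;
}

// Models extracting sub0, the low 32 bits of the 64-bit result.
constexpr uint32_t low32(uint64_t V) { return static_cast<uint32_t>(V); }

// The addend -1234567 widened in the two possible ways.
constexpr uint64_t SExtAddend = 0xffffffffffed2979ULL;
constexpr uint64_t ZExtAddend = 0x00000000ffed2979ULL;

// The full 64-bit results differ in their upper halves...
static_assert(madU64U32(7, 9, SExtAddend) != madU64U32(7, 9, ZExtAddend),
              "upper 32 bits depend on the extension");
// ...but the low 32 bits, which are all the i32 pattern uses, agree.
static_assert(low32(madU64U32(7, 9, SExtAddend)) ==
                  low32(madU64U32(7, 9, ZExtAddend)),
              "low 32 bits are independent of the extension");
// Both match plain 32-bit wrapping arithmetic for 7 * 9 + (-1234567).
static_assert(low32(madU64U32(7, 9, ZExtAddend)) ==
                  static_cast<uint32_t>(7 * 9 - 1234567),
              "matches the i32 mul/add result");
```

Under this model, either extension gives the same i32 result, so the choice only affects which literal the printer and verifier have to accept.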

Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp	2 lines
llvm/test/CodeGen/AMDGPU/mad_u64_u32.ll	25 lines

Diff 440130

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp

@@ ... @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
   else if (Imm == DoubleToBits(4.0))
     O << "4.0";
   else if (Imm == DoubleToBits(-4.0))
     O << "-4.0";
   else if (Imm == 0x3fc45f306dc9c882 &&
            STI.getFeatureBits()[AMDGPU::FeatureInv2PiInlineImm])
     O << "0.15915494309189532";
   else {
-    assert(isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882);
+    assert(isUInt<32>(Imm) || isInt<32>(Imm));

     // In rare situations, we will have a 32-bit literal in a 64-bit
     // operand. This is technically allowed for the encoding of s_mov_b64.
     O << formatHex(static_cast<uint64_t>(Imm));
   }
 }

 void AMDGPUInstPrinter::printBLGP(const MCInst *MI, unsigned OpNo,
                                   const MCSubtargetInfo &STI,
                                   raw_ostream &O) {
   unsigned Imm = MI->getOperand(OpNo).getImm();
   if (!Imm)
     return;

Inline comments (line 551):

foad: Perhaps I should truncate Imm to 32 bits here. I'm not sure.

arsenm: I think the truncation should be done at selection time. Ideally the verifier would enforce this (plus we should try to handle the s_mov_b64 case the comment references)

foad: I don't understand the comment about s_mov_b64. How is it different from any other b64 operand, which zero-extends a 32-bit literal? If we're going to verify stuff properly then we need to use different types for signed and unsigned 64-bit operands. I would prefer to do a short term fix first to avoid the assertion failures caused by D127253 (or revert it).

llvm/test/CodeGen/AMDGPU/mad_u64_u32.ll

 ; GFX11-NEXT: v_mad_u64_u32 v[0:1], null, v3, v2, 0x12d687
 ; GFX11-NEXT: ; return to shader part epilog
   %mul = mul i32 %a, %b
   %add = add i32 %mul, 1234567
   %cast = bitcast i32 %add to float
   ret float %cast
 }

+define amdgpu_ps float @mad_i32_vvi_neg(i32 %a, i32 %b) {
+; GFX9-LABEL: mad_i32_vvi_neg:
+; GFX9: ; %bb.0:
+; GFX9-NEXT: v_mov_b32_e32 v2, 0xffed2979
+; GFX9-NEXT: v_mov_b32_e32 v3, -1
+; GFX9-NEXT: v_mad_u64_u32 v[0:1], s[0:1], v0, v1, v[2:3]
+; GFX9-NEXT: ; return to shader part epilog
+;
+; GFX10-LABEL: mad_i32_vvi_neg:
+; GFX10: ; %bb.0:
+; GFX10-NEXT: v_mad_u64_u32 v[0:1], null, v0, v1, 0xffffffffffed2979
+; GFX10-NEXT: ; return to shader part epilog
+;
+; GFX11-LABEL: mad_i32_vvi_neg:
+; GFX11: ; %bb.0:
+; GFX11-NEXT: v_mov_b32_e32 v2, v1
+; GFX11-NEXT: v_mov_b32_e32 v3, v0
+; GFX11-NEXT: v_mad_u64_u32 v[0:1], null, v3, v2, 0xffffffffffed2979
+; GFX11-NEXT: ; return to shader part epilog
+  %mul = mul i32 %a, %b
+  %add = add i32 %mul, -1234567
+  %cast = bitcast i32 %add to float
+  ret float %cast
+}
+
 define amdgpu_ps float @mad_i32_vcv(i32 %a, i32 %c) {
 ; GFX9-LABEL: mad_i32_vcv:
 ; GFX9: ; %bb.0:
 ; GFX9-NEXT: v_mad_u64_u32 v[0:1], s[0:1], v0, 42, v[1:2]
 ; GFX9-NEXT: ; return to shader part epilog
 ;
 ; GFX10-LABEL: mad_i32_vcv:
 ; GFX10: ; %bb.0:
