This is an archive of the discontinued LLVM Phabricator instance.


Differential D140883

[AMDGPU] Simplify getNumFlatOffsetBits. NFC.
Closed, Public

Authored by foad on Jan 3 2023, 5:04 AM.


Details

Reviewers

matejam
piotr
scott.linder

Group Reviewers

Restricted Project

Commits

rGf460c6658107: [AMDGPU] Simplify getNumFlatOffsetBits. NFC.

Summary

Previously we considered this field to be either N-bit unsigned or
N+1-bit signed, depending on the instruction. I think it's conceptually
simpler to say that the field is always N+1-bit signed, but some
instructions do not allow negative values.
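The two models in the summary accept exactly the same set of offsets. The following is a minimal sketch of that equivalence, not the code from the patch: `fitsSigned` and `fitsUnsigned` are hypothetical stand-ins for `isIntN` and `isUIntN`, and `legalOld`/`legalNew` are illustrative names that do not appear in the diff.

```cpp
#include <cstdint>

// Hypothetical stand-ins for isIntN / isUIntN, valid for 0 < N < 63.
constexpr bool fitsSigned(unsigned N, int64_t X) {
  return X >= -(INT64_C(1) << (N - 1)) && X < (INT64_C(1) << (N - 1));
}
constexpr bool fitsUnsigned(unsigned N, int64_t X) {
  return X >= 0 && X < (INT64_C(1) << N);
}

// Old model: the width depends on signedness
// (SignedBits when signed, SignedBits - 1 when unsigned).
constexpr bool legalOld(unsigned SignedBits, bool Signed, int64_t Offset) {
  return Signed ? fitsSigned(SignedBits, Offset)
                : fitsUnsigned(SignedBits - 1, Offset);
}

// New model: a single signed width, plus a flag that forbids negative values.
constexpr bool legalNew(unsigned SignedBits, bool AllowNegative,
                        int64_t Offset) {
  return fitsSigned(SignedBits, Offset) && (AllowNegative || Offset >= 0);
}
```

A non-negative value fits in N+1 signed bits exactly when it fits in N unsigned bits, which is why the two predicates agree for every offset.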

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

foad created this revision. Jan 3 2023, 5:04 AM

Herald added a project: Restricted Project. Jan 3 2023, 5:04 AM

Herald added subscribers: kosarev, kerbowa, hiraditya and 6 others.

foad requested review of this revision. Jan 3 2023, 5:04 AM

Herald added a project: Restricted Project. Jan 3 2023, 5:04 AM

Herald added subscribers: llvm-commits, wdng.

foad added reviewers: Restricted Project, matejam, piotr. Jan 3 2023, 5:05 AM

kosarev added inline comments. Jan 3 2023, 5:42 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7968	Is this actually an improvement? Wouldn't it be conceptually simpler to pass `FlatVariant` and get the actual width and, if needed, signedness?

Harbormaster completed remote builds in B205443: Diff 485962. Jan 3 2023, 5:58 AM

Is this actually an improvement?

I think so (of course!) because it simplifies getNumFlatOffsetBits without making its callers more complicated.

Wouldn't it be conceptually simpler to pass FlatVariant and get the actual width and, if needed, signedness?

Not sure what you mean by "actual width" here. I would not want to revert to getNumFlatOffsetBits returning different values for different flat variants on the same subtarget. But I suppose an additional patch to make getNumFlatOffsetBits return the "AllowNegative" flag as well as the field width might be an improvement.

arsenm added inline comments. Jan 3 2023, 6:35 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7942	I don't really like the AllowNegative naming. SignBitIgnored?
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
2507–2510	Can return ternary operator

By actual width I mean what can be passed to isIntN()/isUInt() without adjustments. The AllowNegative thing feels a bit alien to the ISA, effectively adding an extra concept while not eliminating the need for adjustments. Signedness, in contrast, looks rather natural and familiar, and if (Signed ? isIntN(...) : isUIntN(...)) is probably easier to understand than if (!isIntN(...) || (!AllowNegative && Op.getImm() < 0)).
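For comparison, the alternative described in this comment could look roughly like the sketch below. It is only an illustration of the idea: the `FlatVariant` enum, the `OffsetField` struct, and the plain `IsGFX10` flag are hypothetical simplifications, and the hardware-bug adjustments made in SIInstrInfo.cpp are left out.

```cpp
#include <cstdint>

// Hypothetical simplification of the SIInstrFlags flat variants.
enum class FlatVariant { Flat, Global, Scratch };

// "Actual width" plus signedness, usable without further adjustment.
struct OffsetField {
  unsigned Width;
  bool Signed;
};

// Widths follow the diff: 12-bit signed on GFX10, 13-bit signed otherwise,
// and one bit fewer when the field is treated as unsigned.
OffsetField getFlatOffsetField(bool IsGFX10, FlatVariant V) {
  const unsigned SignedWidth = IsGFX10 ? 12 : 13;
  const bool Signed = V != FlatVariant::Flat;
  return {Signed ? SignedWidth : SignedWidth - 1, Signed};
}

// Equivalent of `Signed ? isIntN(Width, Imm) : isUIntN(Width, Imm)`.
bool fitsField(OffsetField F, int64_t Imm) {
  const int64_t Limit = INT64_C(1) << (F.Signed ? F.Width - 1 : F.Width);
  return Imm < Limit && Imm >= (F.Signed ? -Limit : 0);
}
```

The caller then needs a single range check, at the cost of the returned width varying with the flat variant on the same subtarget.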

The AllowNegative thing feels a bit alien to the ISA, effectively adding an extra concept while not eliminating the need for adjustments.

Then I think we simply disagree about which one is conceptually simpler.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7942	I don't think "ignored" is right. In at least some cases (like hasNegativeScratchOffsetBug) it is definitely not ignored but leads to wrong behaviour.

In D140883#4023012, @foad wrote:

Is this actually an improvement?

I think so (of course!) because it simplifies getNumFlatOffsetBits without making its callers more complicated.

I agree that it is a bit simpler to conceptualize it as a signed field with a common size, especially as we already have to think about the "negative scratch offset bug" for some "signed" versions of the field. At that point, might as well just say it is always a signed field, and negative values are sometimes illegal.

Wouldn't it be conceptually simpler to pass FlatVariant and get the actual width and, if needed, signedness?

Not sure what you mean by "actual width" here. I would not want to revert to getNumFlatOffsetBits returning different values for different flat variants on the same subtarget. But I suppose an additional patch to make getNumFlatOffsetBits return the "AllowNegative" flag as well as the field width might be an improvement.

+1 to returning AllowNegative from getNumFlatOffsetBits, especially with structured bindings making it less awkward:

auto [OffsetSize, AllowNegative] = AMDGPU::getNumFlatOffsetBits(getSTI(), TSFlags);
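A self-contained sketch of that suggestion, assuming C++17, might look as follows. The function name `getFlatOffsetBitsAndSign` and the two boolean parameters are hypothetical replacements for `getSTI()` and `TSFlags`; the real function would take the subtarget and instruction flags, and the scratch-offset hardware bugs are ignored here.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical variant that returns the signed field width together with
// the AllowNegative flag. Widths follow the diff: 12 on GFX10, 13 otherwise.
std::pair<unsigned, bool> getFlatOffsetBitsAndSign(bool IsGFX10,
                                                   bool IsGlobalOrScratch) {
  const unsigned NumBits = IsGFX10 ? 12 : 13;
  return {NumBits, IsGlobalOrScratch};
}

// The caller unpacks both values at once with a structured binding.
bool isValidFlatOffset(bool IsGFX10, bool IsGlobalOrScratch, int64_t Imm) {
  auto [OffsetSize, AllowNegative] =
      getFlatOffsetBitsAndSign(IsGFX10, IsGlobalOrScratch);
  const int64_t Limit = INT64_C(1) << (OffsetSize - 1);
  const bool FitsSigned = Imm >= -Limit && Imm < Limit;
  return FitsSigned && (AllowNegative || Imm >= 0);
}
```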

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
4119
4120–4121	If the MSB is not always ignored, this comment should be updated/deleted

I agree that it is a bit simpler to conceptualize it as a signed field with a common size, especially as we already have to think about the "negative scratch offset bug" for some "signed" versions of the field. At that point, might as well just say it is always a signed field, and negative values are sometimes illegal.

The advantage of the original code is that it encapsulates (apart from the signedness and bug-handling bits, sadly) these specifics in a single place, getNumFlatOffsetBits(), whereas the proposed version effectively lets them leak to the surrounding logic. So on generating the error particularly the new code translates the signed non-negatives back to the precision-and-signedness representation.

So on generating the error particularly the new code translates the signed non-negatives back to the precision-and-signedness representation.

I only did that to make the patch NFC including the text of error messages. Really I'd prefer to reword the error message too, to something like "expected an N-bit signed offset" or "expected a non-negative N-bit signed offset".
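The difference between the two wordings can be sketched as below. Both helper names are hypothetical and `std::string` is used in place of `Twine` to keep the example standalone; the first function mirrors the message selection in the patch, the second shows the rewording floated in this comment.

```cpp
#include <string>

// Wording kept by the NFC patch: translate back to "N-1 bit unsigned"
// when negative values are not allowed.
std::string legacyMessage(unsigned OffsetSize, bool AllowNegative) {
  return std::string("expected a ") +
         (AllowNegative
              ? std::to_string(OffsetSize) + "-bit signed offset"
              : std::to_string(OffsetSize - 1) + "-bit unsigned offset");
}

// Possible rewording: always describe the field as N-bit signed and say
// explicitly when negative values are rejected.
std::string rewordedMessage(unsigned OffsetSize, bool AllowNegative) {
  return std::string("expected a ") + (AllowNegative ? "" : "non-negative ") +
         std::to_string(OffsetSize) + "-bit signed offset";
}
```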

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
4120–4121	In the absence of hardware bugs this might be true; I really don't know. I will see if I can find some good documentation. OTOH it is not a very useful comment for us, since the compiler never needs to rely on the bit being "ignored and forced to zero".

"expected a non-negative N-bit signed offset"

Might be worth ruminating on whether that would be an improvement as well before moving forward with this patch, on which, apart from noting that silicon-level bugs are probably not the best reasoning for user-exposed conceptual models, I've decided to have no opinion. :)

Things to take into consideration: that the offset bug will retire at some point and the chance for the hardware to support other offset representations in the future.

In D140883#4025487, @kosarev wrote:

I agree that it is a bit simpler to conceptualize it as a signed field with a common size, especially as we already have to think about the "negative scratch offset bug" for some "signed" versions of the field. At that point, might as well just say it is always a signed field, and negative values are sometimes illegal.

The advantage of the original code is that it encapsulates (apart from the signedness and bug-handling bits, sadly) these specifics in a single place, getNumFlatOffsetBits(), whereas the proposed version effectively lets them leak to the surrounding logic. So on generating the error particularly the new code translates the signed non-negatives back to the precision-and-signedness representation.

I'm not sure I follow the argument here, as the only thing being encapsulated is the field size as a function of the signedness, but you immediately say the signedness is sadly not encapsulated. What is being encapsulated more completely in the old version?

In D140883#4028873, @kosarev wrote:

"expected a non-negative N-bit signed offset"

Might be worth ruminating on whether that would be an improvement as well before moving forward with this patch, on which, apart from noting that silicon-level bugs are probably not the best reasoning for user-exposed conceptual models, I've decided to have no opinion. :)

Things to take into consideration: that the offset bug will retire at some point and the chance for the hardware to support other offset representations in the future.

I think this is the most compelling argument for leaving things as they are, along with the fact that anything we change here adds a small gap between the compiler writer/user's model and the model that the hardware specs document.

In D140883#4029620, @scott.linder wrote:

In D140883#4025487, @kosarev wrote:

I agree that it is a bit simpler to conceptualize it as a signed field with a common size, especially as we already have to think about the "negative scratch offset bug" for some "signed" versions of the field. At that point, might as well just say it is always a signed field, and negative values are sometimes illegal.

The advantage of the original code is that it encapsulates (apart from the signedness and bug-handling bits, sadly) these specifics in a single place, getNumFlatOffsetBits(), whereas the proposed version effectively lets them leak to the surrounding logic. So on generating the error particularly the new code translates the signed non-negatives back to the precision-and-signedness representation.

I'm not sure I follow the argument here, as the only thing being encapsulated is the field size as a function of the signedness, but you immediately say the signedness is sadly not encapsulated. What is being encapsulated more completely in the old version?

In D140883#4028873, @kosarev wrote:

"expected a non-negative N-bit signed offset"

Might be worth ruminating on whether that would be an improvement as well before moving forward with this patch, on which, apart from noting that silicon-level bugs are probably not the best reasoning for user-exposed conceptual models, I've decided to have no opinion. :)

Things to take into consideration: that the offset bug will retire at some point and the chance for the hardware to support other offset representations in the future.

I think this is the most compelling argument for leaving things as they are, along with the fact that anything we change here adds a small gap between the compiler writer/user's model and the model that the hardware specs document.

Also, just to clarify, even in light of the arguments against it I still vote for the change being made.

Herald added a subscriber: StephenFan. Jan 10 2023, 8:04 AM

foad added a reviewer: scott.linder. Jan 11 2023, 1:07 AM

LGTM

I'm open to renaming things, but I think AllowNegative is at least as clear as anything I can think of.

This revision is now accepted and ready to land. Jan 11 2023, 11:25 AM

Closed by commit rGf460c6658107: [AMDGPU] Simplify getNumFlatOffsetBits. NFC. (authored by foad). Jan 12 2023, 2:40 AM

This revision was automatically updated to reflect the committed changes.

foad added a commit: rGf460c6658107: [AMDGPU] Simplify getNumFlatOffsetBits. NFC.

Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp	23 lines
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp	18 lines
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h	7 lines
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp	6 lines

Diff 488562

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp


bool AMDGPUAsmParser::validateFlatOffset(const MCInst &Inst,

    const auto &Op = Inst.getOperand(OpNum);
    if (!hasFlatOffsets() && Op.getImm() != 0) {
      Error(getFlatOffsetLoc(Operands),
            "flat offset modifier is not supported on this GPU");
      return false;
    }

-   // For FLAT segment the offset must be positive;
+   // For FLAT segment the offset must not be negative;
    // MSB is ignored and forced to zero.

scott.linder: If the MSB is not always ignored, this comment should be updated/deleted

foad: In the absence of hardware bugs this might be true; I really don't know. I will see if I can find some good documentation. OTOH it is not a very useful comment for us, since the compiler never needs to rely on the bit being "ignored and forced to zero".

-   if (TSFlags & (SIInstrFlags::FlatGlobal | SIInstrFlags::FlatScratch)) {
-     unsigned OffsetSize = AMDGPU::getNumFlatOffsetBits(getSTI(), true);
-     if (!isIntN(OffsetSize, Op.getImm())) {
-       Error(getFlatOffsetLoc(Operands),
-             Twine("expected a ") + Twine(OffsetSize) + "-bit signed offset");
-       return false;
-     }
-   } else {
-     unsigned OffsetSize = AMDGPU::getNumFlatOffsetBits(getSTI(), false);
-     if (!isUIntN(OffsetSize, Op.getImm())) {
-       Error(getFlatOffsetLoc(Operands),
-             Twine("expected a ") + Twine(OffsetSize) + "-bit unsigned offset");
-       return false;
-     }
-   }
+   unsigned OffsetSize = AMDGPU::getNumFlatOffsetBits(getSTI());
+   bool AllowNegative =
+       TSFlags & (SIInstrFlags::FlatGlobal | SIInstrFlags::FlatScratch);
+   if (!isIntN(OffsetSize, Op.getImm()) || (!AllowNegative && Op.getImm() < 0)) {
+     Error(getFlatOffsetLoc(Operands),
+           Twine("expected a ") +
+               (AllowNegative ? Twine(OffsetSize) + "-bit signed offset"
+                              : Twine(OffsetSize - 1) + "-bit unsigned offset"));
+     return false;
+   }
    return true;
  }

  SMLoc AMDGPUAsmParser::getSMEMOffsetLoc(const OperandVector &Operands) const {
    // Start with second operand because SMEM Offset cannot be dst or src0.
    for (unsigned i = 2, e = Operands.size(); i != e; ++i) {
      AMDGPUOperand &Op = ((AMDGPUOperand &)*Operands[i]);


llvm/lib/Target/AMDGPU/SIInstrInfo.cpp


bool SIInstrInfo::isLegalFLATOffset(int64_t Offset, unsigned AddrSpace,

    if (!ST.hasFlatInstOffsets())
      return false;

    if (ST.hasFlatSegmentOffsetBug() && FlatVariant == SIInstrFlags::FLAT &&
        (AddrSpace == AMDGPUAS::FLAT_ADDRESS ||
         AddrSpace == AMDGPUAS::GLOBAL_ADDRESS))
      return false;

-   bool Signed = FlatVariant != SIInstrFlags::FLAT;
+   bool AllowNegative = FlatVariant != SIInstrFlags::FLAT;

arsenm: I don't really like the AllowNegative naming. SignBitIgnored?

foad: I don't think "ignored" is right. In at least some cases (like hasNegativeScratchOffsetBug) it is definitely not ignored but leads to wrong behaviour.

    if (ST.hasNegativeScratchOffsetBug() &&
        FlatVariant == SIInstrFlags::FlatScratch)
-     Signed = false;
+     AllowNegative = false;
    if (ST.hasNegativeUnalignedScratchOffsetBug() &&
        FlatVariant == SIInstrFlags::FlatScratch && Offset < 0 &&
        (Offset % 4) != 0) {
      return false;
    }

-   unsigned N = AMDGPU::getNumFlatOffsetBits(ST, Signed);
-   return Signed ? isIntN(N, Offset) : isUIntN(N, Offset);
+   unsigned N = AMDGPU::getNumFlatOffsetBits(ST);
+   return isIntN(N, Offset) && (AllowNegative || Offset >= 0);
  }

  // See comment on SIInstrInfo::isLegalFLATOffset for what is legal and what not.
  std::pair<int64_t, int64_t>
  SIInstrInfo::splitFlatOffset(int64_t COffsetVal, unsigned AddrSpace,
                               uint64_t FlatVariant) const {
    int64_t RemainderOffset = COffsetVal;
    int64_t ImmField = 0;
-   bool Signed = FlatVariant != SIInstrFlags::FLAT;
+   bool AllowNegative = FlatVariant != SIInstrFlags::FLAT;
    if (ST.hasNegativeScratchOffsetBug() &&
        FlatVariant == SIInstrFlags::FlatScratch)
-     Signed = false;
+     AllowNegative = false;

-   const unsigned NumBits = AMDGPU::getNumFlatOffsetBits(ST, Signed);
-   if (Signed) {
+   const unsigned NumBits = AMDGPU::getNumFlatOffsetBits(ST) - 1;
+   if (AllowNegative) {

kosarev: Is this actually an improvement? Wouldn't it be conceptually simpler to pass `FlatVariant` and get the actual width and, if needed, signedness?

      // Use signed division by a power of two to truncate towards 0.
-     int64_t D = 1LL << (NumBits - 1);
+     int64_t D = 1LL << NumBits;
      RemainderOffset = (COffsetVal / D) * D;
      ImmField = COffsetVal - RemainderOffset;

      if (ST.hasNegativeUnalignedScratchOffsetBug() &&
          FlatVariant == SIInstrFlags::FlatScratch && ImmField < 0 &&
          (ImmField % 4) != 0) {
        // Make ImmField a multiple of 4
        RemainderOffset += ImmField % 4;
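The signed branch of splitFlatOffset relies on C++ integer division truncating towards zero. A minimal standalone sketch of just that step is shown below; `splitSignedOffset` is a hypothetical helper name, `SignedBits` stands in for the value returned by getNumFlatOffsetBits, and the unaligned-scratch workaround and the non-negative branch are deliberately left out.

```cpp
#include <cstdint>
#include <utility>

// Returns {ImmField, RemainderOffset} such that
// ImmField + RemainderOffset == COffsetVal and ImmField fits in a
// SignedBits-wide signed field.
std::pair<int64_t, int64_t> splitSignedOffset(int64_t COffsetVal,
                                              unsigned SignedBits) {
  // Mirrors the patched code: NumBits excludes the sign bit.
  const unsigned NumBits = SignedBits - 1;
  // Signed division by a power of two truncates towards 0, so the
  // remainder part keeps the sign of COffsetVal.
  const int64_t D = INT64_C(1) << NumBits;
  const int64_t RemainderOffset = (COffsetVal / D) * D;
  const int64_t ImmField = COffsetVal - RemainderOffset;
  return {ImmField, RemainderOffset};
}
```

With a 13-bit field, D is 4096, so an offset of 5000 splits into an immediate of 904 and a remainder of 4096, and -5000 splits into -904 and -4096.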

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h

  /// \return The encoding that can be used for a 32-bit literal offset in an SMRD
  /// instruction. This is only useful on CI.s
  std::optional<int64_t> getSMRDEncodedLiteralOffset32(const MCSubtargetInfo &ST,
                                                       int64_t ByteOffset);

  /// For FLAT segment the offset must be positive;
  /// MSB is ignored and forced to zero.
  ///
- /// \return The number of bits available for the offset field in flat
- /// instructions.
- unsigned getNumFlatOffsetBits(const MCSubtargetInfo &ST, bool Signed);
+ /// \return The number of bits available for the signed offset field in flat
+ /// instructions. Note that some forms of the instruction disallow negative
+ /// offsets.
+ unsigned getNumFlatOffsetBits(const MCSubtargetInfo &ST);

  /// \returns true if this offset is small enough to fit in the SMRD
  /// offset field. \p ByteOffset should be the offset in bytes and
  /// not the encoded offset.
  bool isLegalSMRDImmOffset(const MCSubtargetInfo &ST, int64_t ByteOffset);

  bool splitMUBUFOffset(uint32_t Imm, uint32_t &SOffset, uint32_t &ImmOffset,
                        const GCNSubtarget *Subtarget,

llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp

std::optional<int64_t> getSMRDEncodedLiteralOffset32(const MCSubtargetInfo &ST,

    if (!isCI(ST) || !isDwordAligned(ByteOffset))
      return std::nullopt;

    int64_t EncodedOffset = convertSMRDOffsetUnits(ST, ByteOffset);
    return isUInt<32>(EncodedOffset) ? std::optional<int64_t>(EncodedOffset)
                                     : std::nullopt;
  }

- unsigned getNumFlatOffsetBits(const MCSubtargetInfo &ST, bool Signed) {
+ unsigned getNumFlatOffsetBits(const MCSubtargetInfo &ST) {
    // Address offset is 12-bit signed for GFX10, 13-bit for GFX9 and GFX11+.
    if (AMDGPU::isGFX10(ST))
-     return Signed ? 12 : 11;
+     return 12;

-   return Signed ? 13 : 12;
+   return 13;

arsenm: Can return ternary operator

  }

  // Given Imm, split it into the values to put into the SOffset and ImmOffset
  // fields in an MUBUF instruction. Return false if it is not possible (due to a
  // hardware bug needing a workaround).
  //
  // The required alignment ensures that individual address components remain
  // aligned if they are aligned to begin with. It also ensures that additional
