Details

Reviewers

ABataev
grosbach

Commits

rGbf66003a4f91: [MC,NVPTX] Add MCAsmPrinter support for unsigned-only data directives.

Summary

PTX does not support negative values in .bNN data directives and we must
emit such values in a way that ptxas can parse.

MCAsmInfo can now specify whether the target can emit negative values.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tra created this revision. Jul 8 2020, 1:36 PM

Herald added a project: Restricted Project. Jul 8 2020, 1:36 PM

Herald added subscribers: sanjoy.google, bixia, hiraditya, jholewinski.

tra mentioned this in D82881: [DEBUGINFO]Fix debug info for packed bitfields. Jul 8 2020, 1:47 PM

Harbormaster failed remote builds in B63480: Diff 276541! Jul 8 2020, 2:31 PM

Herald added a subscriber: ormris. Jul 8 2020, 2:31 PM

Updated existing test which produced data directive w/ negative value.

Harbormaster completed remote builds in B63514: Diff 276592. Jul 8 2020, 4:40 PM

Ping!

hfinkel added a subscriber: hfinkel. Jul 13 2020, 6:27 PM

hfinkel added inline comments.

llvm/lib/MC/MCExpr.cpp
71	Will uint64_t always be correct here? Shouldn't this depend on SizeInBytes (like the hex printing does)?

tra marked an inline comment as done. Jul 13 2020, 7:23 PM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	MCConstantExpr::getValue() returns int64_t so casting it to uint64_t should be safe. I guess I can find the matching unsigned type using std::make_unsigned. E.g: using unsigned_t = typename std::make_unsigned<decltype(Value)>::type; OS << static_cast<unsigned_t>(v);
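The cast proposed above can be sketched as a standalone helper. This is only an illustration under assumptions: `toUnsignedDecimal` is a hypothetical name that does not exist in LLVM, and the sketch returns a string instead of writing to a `raw_ostream`.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper, not part of LLVM: renders a signed integer as the
// decimal text of its same-width unsigned counterpart.
template <typename T>
std::string toUnsignedDecimal(T Value) {
  static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
                "expects a signed integer type");
  // std::make_unsigned picks the unsigned type with the same width as T.
  using unsigned_t = typename std::make_unsigned<T>::type;
  return std::to_string(static_cast<unsigned_t>(Value));
}
```

Because the width comes from the C++ type of the value and not from the directive, a 64-bit value always prints as a 64-bit number with this approach.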

Use std::make_unsigned to find the matching unsigned type.

hfinkel added inline comments. Jul 13 2020, 7:29 PM

llvm/lib/MC/MCExpr.cpp
71	I'm not worried about the cast being unsafe, at the C++ level, I'm worried about it printing a number larger than the relevant directive actually accepts. In your test, the directive is .b64, which presumably takes a 64-bit integer, so everything's fine. What if it were .b8 and the printed argument were 18446744073709551613 (or whatever)?

Harbormaster completed remote builds in B64086: Diff 277639. Jul 13 2020, 8:01 PM

Mask out unwanted bits in the unsigned representation of the Value.
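A minimal sketch of what such masking could look like, assuming the directive width is given in bytes. `maskToDirectiveWidth` is a hypothetical name, and the treatment of a size of 0 as "no size recorded" is an assumption of this sketch, not something the thread states.

```cpp
#include <cstdint>

// Hypothetical helper, not the code from this revision: keeps only the low
// SizeInBytes * 8 bits of the two's-complement representation of Value.
uint64_t maskToDirectiveWidth(int64_t Value, unsigned SizeInBytes) {
  uint64_t Bits = static_cast<uint64_t>(Value);
  // Assumption: 0 means "size unknown", and 8 or more means the full 64 bits.
  if (SizeInBytes == 0 || SizeInBytes >= 8)
    return Bits;
  uint64_t Mask = (uint64_t{1} << (SizeInBytes * 8)) - 1;
  return Bits & Mask;
}
```

With this, -3 in a one-byte directive would be emitted as 253 rather than 18446744073709551613.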

tra marked an inline comment as done. Jul 14 2020, 10:08 AM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	Got it. I've masked out the unwanted bits.

ABataev added inline comments. Jul 14 2020, 10:15 AM

llvm/lib/MC/MCExpr.cpp
71	Does cuda-gdb correctly handle the bitfields with this fix? Can you read/write values to/from a bitfield in the debugger?

Harbormaster failed remote builds in B64179: Diff 277887! Jul 14 2020, 10:59 AM

tra marked an inline comment as done. Jul 14 2020, 11:17 AM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp

Sort of. It seems to handle field bits within a byte and knows correct bit field length, but struggles with the field which crosses the byte boundary.

E.g:

    struct s {
      unsigned char a : 3;
      unsigned char b : 6;
    } __attribute__((__packed__)) b;

    (cuda-gdb) set var b.a=7   # This sets the field correctly.
    (cuda-gdb) p b
    $7 = {
      a = 7 '\a',
      b = 0 '\000'
    }
    (cuda-gdb) x/2bx &b
    0x7fffca800100: 0x07    0x00
    (cuda-gdb) set var b.a=8
    warning: Value does not fit in 3 bits.

    (cuda-gdb) set var b.b=0x3f   # This one loses the top bit.
    (cuda-gdb) p b
    $8 = {
      a = 0 '\000',
      b = 31 '\037'   # Only 5 bits are set.
    }
    (cuda-gdb) x/2bx &b
    0x7fffca800100: 0xf8    0x00  # Should've been 0xf8 0x01
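The expected bytes in the session above can be checked by hand. The sketch below assumes a little-endian layout with `a` in bits 0-2 and `b` in bits 3-8 of the two-byte struct, which is consistent with the memory dumps shown; `packFields` is a hypothetical helper written only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: packs a 3-bit field `a` followed by a 6-bit field `b`
// into two bytes, assuming a little-endian packed layout.
std::array<uint8_t, 2> packFields(unsigned A, unsigned B) {
  // `a` occupies bits 0-2; `b` occupies bits 3-8 and so crosses into byte 1.
  uint16_t Word = static_cast<uint16_t>((A & 0x7u) | ((B & 0x3fu) << 3));
  return {{static_cast<uint8_t>(Word & 0xffu),
           static_cast<uint8_t>(Word >> 8)}};
}
```

Under that layout, `packFields(0, 0x3f)` gives the bytes 0xf8 0x01, and the 0x01 in the second byte is the top bit of `b` that the debugger dropped.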

ABataev added inline comments. Jul 14 2020, 11:20 AM

llvm/lib/MC/MCExpr.cpp
71	Hm, did you try to emit it as hexadecimal?

hfinkel added inline comments. Jul 14 2020, 12:15 PM

llvm/lib/MC/MCExpr.cpp
71	> Hm, did you try to emit it as hexadecimal?

Good point. Maybe we want the option to just force hex printing instead of this new case? Then we can just reuse the existing logic above for that.

tra marked an inline comment as done. Jul 14 2020, 12:19 PM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	Makes no difference. I've run ptxas on .ptx with the constant represented as decimal or as hex. ptxas accepted both and produced bit-for-bit identical binaries.

hfinkel added inline comments. Jul 14 2020, 12:26 PM

llvm/lib/MC/MCExpr.cpp
71	In that case, I recommend that we go with Alexey's suggestion. Remove this logic and make the condition above `if (PrintInHex || (MAI && !MAI->supportsSignedData()))` (or something like that).

tra marked an inline comment as done. Jul 14 2020, 12:29 PM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	I'm concerned that the extra `0x` and always-on zero-padding will noticeably increase the size of PTX with the debug info as DWARF output seems to be predominantly `.b8`. Large builds like tensorflow already struggle with their binary size (we already run into ELF reloc overflows in debug builds unless we limit the number of GPUs we target) and this will contribute to the issue.
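A rough way to gauge that size concern is to compare operand lengths for one-byte values. The sketch below counts characters for plain decimal against the zero-padded `0xNN` form across all 256 byte values. The uniform distribution is purely an assumption for illustration, since real DWARF output will be skewed, and all function names here are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Characters needed to print one byte value in plain decimal.
std::size_t decimalLength(unsigned V) { return std::to_string(V).size(); }

// Characters needed for the zero-padded hex form "0xNN" of a one-byte value.
std::size_t paddedHexLength() { return 4; }

// Total operand length over all 256 byte values, printed in decimal.
std::size_t totalDecimalLength() {
  std::size_t Total = 0;
  for (unsigned V = 0; V < 256; ++V)
    Total += decimalLength(V);
  return Total;
}

// Total operand length over all 256 byte values, printed as padded hex.
std::size_t totalPaddedHexLength() { return 256 * paddedHexLength(); }
```

Under the uniform assumption, the decimal operands total 658 characters against 1024 for padded hex. Small values, which need only one or two decimal digits, would widen that gap further if they dominate.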

hfinkel added inline comments. Jul 14 2020, 12:35 PM

llvm/lib/MC/MCExpr.cpp
71	> I'm concerned that the extra `0x` and always-on zero-padding will noticeably increase the size of PTX with the debug info as DWARF output seems to be predominantly `.b8`. Large builds like tensorflow already struggle with their binary size (we already run into ELF reloc overflows in debug builds unless we limit the number of GPUs we target) and this will contribute to the issue.

Ah, okay. I can understand that. Can you make a test case where this affects the printing of a .b8 directive?

ABataev added inline comments. Jul 14 2020, 12:38 PM

llvm/lib/MC/MCExpr.cpp
71	Maybe just print the value as hexadecimal if it is negative?

ABataev added inline comments. Jul 14 2020, 12:41 PM

llvm/lib/MC/MCExpr.cpp
71	Also, I can check whether my other patch for bitfields works correctly.

tra marked an inline comment as done. Jul 14 2020, 12:42 PM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	The DWARF so far is the only way I've found to get MC to generate constant values in PTX. Global variables use appropriate .s8/.u8 directives. Arrays always use .b8 with unsigned byte values. Nothing else other than DWARF produces .bXX directives, AFAICT. MC tests run into the problem that they seem to rely on assembling/disassembling something and we can do neither for PTX. I'm not particularly familiar with MC, so I may be missing something. Ideas/suggestions are welcome.

ABataev added inline comments. Jul 14 2020, 12:55 PM

llvm/lib/MC/MCExpr.cpp
71	Yes, DWARF is the only place that produces .bXX directives.

Print negative values as hex.
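The behaviour this update settles on can be sketched without any LLVM dependencies. `printConstant` is a hypothetical stand-in for the constant case of `MCExpr::print`, and its `SupportsSignedData` parameter mirrors the new `MCAsmInfo` flag. The sketch deliberately omits the size-dependent zero-padding that the real hex path applies.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the constant-printing logic; returns the text
// instead of streaming it. Size-based zero-padding is omitted on purpose.
std::string printConstant(int64_t Value, bool PrintInHex,
                          bool SupportsSignedData) {
  // Force hex only when the value is negative and the target cannot parse
  // signed operands; non-negative values keep the compact decimal form.
  if (Value < 0 && !SupportsSignedData)
    PrintInHex = true;
  if (!PrintInHex)
    return std::to_string(Value);
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%" PRIx64, static_cast<uint64_t>(Value));
  return Buf;
}
```

For a target with `SupportsSignedData = false`, `printConstant(-1, false, false)` gives `0xffffffffffffffff`, while `printConstant(3, false, false)` still gives `3`, which keeps the common `.b8` case short.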

tra marked an inline comment as done. Jul 14 2020, 1:11 PM

tra added inline comments.

llvm/lib/MC/MCExpr.cpp
71	> Maybe just print the value as hexadecimal if it is negative?

Good idea. Done.

Harbormaster completed remote builds in B64217: Diff 277956. Jul 14 2020, 2:03 PM

tra edited the summary of this revision. Jul 20 2020, 11:02 AM

This revision was not accepted when it landed; it landed in state Needs Review. Jul 20 2020, 9:19 PM

Closed by commit rGbf66003a4f91: [MC,NVPTX] Add MCAsmPrinter support for unsigned-only data directives. (authored by tra).

This revision was automatically updated to reflect the committed changes.

Diff 279415

llvm/include/llvm/MC/MCAsmInfo.h

[...] protected:
   /// current section. If a data directive is set to null, smaller data
   /// directives will be used to emit the large sizes. Defaults to "\t.byte\t",
   /// "\t.short\t", "\t.long\t", "\t.quad\t"
   const char *Data8bitsDirective;
   const char *Data16bitsDirective;
   const char *Data32bitsDirective;
   const char *Data64bitsDirective;

+  /// True if data directives support signed values
+  bool SupportsSignedData = true;
+
   /// If non-null, a directive that is used to emit a word which should be
   /// relocated as a 64-bit GP-relative offset, e.g. .gpdword on Mips. Defaults
   /// to nullptr.
   const char *GPRel64Directive = nullptr;

   /// If non-null, a directive that is used to emit a word which should be
   /// relocated as a 32-bit GP-relative offset, e.g. .gpword on Mips or .gprel32
   /// on Alpha. Defaults to nullptr.
[...] public:
   bool hasSubsectionsViaSymbols() const { return HasSubsectionsViaSymbols; }

   // Data directive accessors.

   const char *getData8bitsDirective() const { return Data8bitsDirective; }
   const char *getData16bitsDirective() const { return Data16bitsDirective; }
   const char *getData32bitsDirective() const { return Data32bitsDirective; }
   const char *getData64bitsDirective() const { return Data64bitsDirective; }
+  bool supportsSignedData() const { return SupportsSignedData; }
   const char *getGPRel64Directive() const { return GPRel64Directive; }
   const char *getGPRel32Directive() const { return GPRel32Directive; }
   const char *getDTPRel64Directive() const { return DTPRel64Directive; }
   const char *getDTPRel32Directive() const { return DTPRel32Directive; }
   const char *getTPRel64Directive() const { return TPRel64Directive; }
   const char *getTPRel32Directive() const { return TPRel32Directive; }

   /// Targets can implement this method to specify a section to switch to if the
[...]

llvm/lib/MC/MCExpr.cpp

[...]
 void MCExpr::print(raw_ostream &OS, const MCAsmInfo *MAI, bool InParens) const {
   switch (getKind()) {
   case MCExpr::Target:
     return cast<MCTargetExpr>(this)->printImpl(OS, MAI);
   case MCExpr::Constant: {
     auto Value = cast<MCConstantExpr>(*this).getValue();
     auto PrintInHex = cast<MCConstantExpr>(*this).useHexFormat();
     auto SizeInBytes = cast<MCConstantExpr>(*this).getSizeInBytes();
+    if (Value < 0 && MAI && !MAI->supportsSignedData())
+      PrintInHex = true;
     if (PrintInHex)
       switch (SizeInBytes) {
       default:
         OS << "0x" << Twine::utohexstr(Value);
         break;
       case 1:
         OS << format("0x%02" PRIx64, Value);
         break;
       case 2:
         OS << format("0x%04" PRIx64, Value);
         break;
       case 4:
         OS << format("0x%08" PRIx64, Value);
         break;
       case 8:
         OS << format("0x%016" PRIx64, Value);
         break;
       }
     else
       OS << Value;
     return;
   }
   case MCExpr::SymbolRef: {
     const MCSymbolRefExpr &SRE = cast<MCSymbolRefExpr>(*this);
     const MCSymbol &Sym = SRE.getSymbol();
     // Parenthesize names that start with $ so that they don't look like
     // absolute names.
     bool UseParens =
[...]

llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXMCAsmInfo.cpp

[...] NVPTXMCAsmInfo::NVPTXMCAsmInfo(const Triple &TheTriple,
   Data16bitsDirective = nullptr; // not supported
   Data32bitsDirective = ".b32 ";
   Data64bitsDirective = ".b64 ";
   ZeroDirective = ".b8";
   AsciiDirective = nullptr; // not supported
   AscizDirective = nullptr; // not supported
   SupportsQuotedNames = false;
   SupportsExtendedDwarfLocDirective = false;
+  SupportsSignedData = false;

   // @TODO: Can we just disable this?
   WeakDirective = "\t// .weak\t";
   GlobalDirective = "\t// .globl\t";

   UseIntegratedAssembler = false;
 }

llvm/test/DebugInfo/NVPTX/packed_bitfields.ll

 ; RUN: llc < %s -mtriple=nvptx64-nvidia-cuda | FileCheck %s

 ; Produced at -O0 from:
 ; struct {
 ;   char : 3;
 ;   char a : 6;
 ; } __attribute__((__packed__)) b;

 ; Note that DWARF 2 counts bit offsets backwards from the high end of
 ; the storage unit to the high end of the bit field.

 ; CHECK: .section .debug_info
 ; CHECK: .b8 3 // Abbrev {{.*}} DW_TAG_structure_type
 ; CHECK: .b8 3 // DW_AT_decl_line
 ; CHECK-NEXT: .b8 1 // DW_AT_byte_size
 ; CHECK-NEXT: .b8 6 // DW_AT_bit_size
-; CHECK-NEXT: .b64 -1 // DW_AT_bit_offset
+; Negative offset must be encoded as an unsigned integer.
+; CHECK-NEXT: .b64 0xffffffffffffffff // DW_AT_bit_offset
 ; CHECK-NEXT: .b8 2 // DW_AT_data_member_location

 %struct.anon = type { i16 }

 @b = global %struct.anon zeroinitializer, align 1, !dbg !0

 !llvm.dbg.cu = !{!2}
 !llvm.module.flags = !{!10, !11, !12, !13}
	Show All 17 Lines
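The -1 in that test follows from the DWARF 2 convention quoted in its comment. The sketch below is a simplified reading of that note for a little-endian target, where the field's starting bit is measured from the low end of the storage unit. `dwarf2BitOffset` is a hypothetical helper, not the routine LLVM uses.

```cpp
#include <cstdint>

// Hypothetical helper: DWARF 2 style bit offset, counted from the high end
// of the storage unit to the high end of the field (little-endian reading).
int64_t dwarf2BitOffset(int64_t StorageBits, int64_t FieldStartBit,
                        int64_t FieldBits) {
  return StorageBits - (FieldStartBit + FieldBits);
}
```

For the struct in the test, the storage unit is a `char` (8 bits), the unnamed field takes bits 0-2, and `a` starts at bit 3 with 6 bits, so the offset is 8 - (3 + 6) = -1. The field ends one bit past the storage unit, which is why the offset is negative and has to be printed as `0xffffffffffffffff` for ptxas.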

This is an archive of the discontinued LLVM Phabricator instance.

[MC,NVPTX] Add MCAsmPrinter support for unsigned-only data directives.
ClosedPublic
