Details

Reviewers

craig.topper
RKSimon
pengfei

Commits

rG8d520973b02b: [X86] Use indirect addressing for high 2GB of x32 address space

Summary

Instructions that take immediate addresses sign-extend their operands, so they cannot be used when we actually need zero extension. Use indirect addressing to avoid the problem.
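
As a rough illustration of the mismatch described above, the following C++ sketch models the two ways a 32-bit pointer value can be widened to 64 bits. The function names are made up for this example and are not part of LLVM; the sketch only models the arithmetic, not the instruction encoding.

```cpp
#include <cassert>
#include <cstdint>

// Address formed from a bare 32-bit immediate address in 64-bit mode:
// the 32-bit value is sign-extended to 64 bits.
inline uint64_t signExtendedAddress(uint32_t Disp) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(Disp)));
}

// Address an x32 program means by the same 32-bit pointer:
// the value is zero-extended to 64 bits.
inline uint64_t zeroExtendedAddress(uint32_t Ptr) {
  return static_cast<uint64_t>(Ptr);
}

// A pointer can be used as an immediate address only if both agree.
inline bool directlyAddressable(uint32_t Ptr) {
  return signExtendedAddress(Ptr) == zeroExtendedAddress(Ptr);
}

inline void runExamples() {
  assert(directlyAddressable(0x7FFFFFFFu));   // top of the low 2GB
  assert(!directlyAddressable(0x80000000u));  // bottom of the high 2GB
  assert(signExtendedAddress(0x80000000u) == 0xFFFFFFFF80000000ull);
  assert(zeroExtendedAddress(0x80000000u) == 0x0000000080000000ull);
}
```

Under this model the two interpretations agree exactly for pointers below 2^31, which is why only the high 2GB needs a different addressing form.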

The functions in the test are modified versions of the functions with the same names in large-constants.ll, with i64 types changed to i32.

Fixes #55061

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hvdijk created this revision. Apr 25 2022, 10:14 AM

Herald added a project: Restricted Project. Apr 25 2022, 10:14 AM

Herald added subscribers: StephenFan, pengfei, hiraditya.

hvdijk requested review of this revision. Apr 25 2022, 10:14 AM

Herald added a subscriber: llvm-commits. Apr 25 2022, 10:14 AM

Clean up test slightly to load consecutive i32 values like the i64 version did

Harbormaster completed remote builds in B161206: Diff 424960. Apr 25 2022, 11:39 AM

efriedma added a subscriber: efriedma. Apr 25 2022, 12:20 PM

efriedma added inline comments.

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	The reasoning here seems strange. For example, suppose I write `void f(int a) { ((char*)0x80000000)[a] = a; }`. That has a base register, but sign-extension is wrong. I guess you're trying to allow negative pointer offsets here, but I think SelectionDAG is throwing away the distinction you need here. (At the IR level, it's easy to distinguish between the base of a GEP and the offset.)

hvdijk added inline comments. Apr 25 2022, 12:24 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	That does the right thing, that's okay to allow. That results in `movb %dil, -2147483648(%edi)`, and the fact that `%edi` is part of the address means `%edi - 2147483648` is calculated as a 32-bit value, and then zero-extended, so it calculates the exact same thing as `%edi + 2147483648` would if it were possible to do that directly. x86 is weird.
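
The wraparound argument in this comment can be checked with a small C++ model. This is only a sketch of the arithmetic, assuming the address is computed with a 32-bit address size as described; `effectiveAddress32` and `intendedAddress` are hypothetical names, not LLVM or hardware APIs.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit effective address: base register plus sign-extended
// displacement, truncated to 32 bits and then zero-extended to 64 bits.
inline uint64_t effectiveAddress32(uint32_t Base, int32_t Disp) {
  uint64_t Sum = static_cast<uint64_t>(Base) +
                 static_cast<uint64_t>(static_cast<int64_t>(Disp));
  uint32_t Truncated = static_cast<uint32_t>(Sum); // keep the low 32 bits
  return Truncated;                                // widening zero-extends
}

// What the source program asked for: Base + Offset in 32-bit pointer
// arithmetic, zero-extended to 64 bits.
inline uint64_t intendedAddress(uint32_t Base, uint32_t Offset) {
  return static_cast<uint32_t>(Base + Offset);
}

inline void runExamples() {
  // Displacement -2147483648 behaves like +2147483648 for any base value.
  const uint32_t Bases[] = {0u, 5u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t Base : Bases)
    assert(effectiveAddress32(Base, INT32_MIN) ==
           intendedAddress(Base, 0x80000000u));
}
```

Because the sum is reduced modulo 2^32 before it is widened, the sign of the displacement does not matter once a 32-bit register takes part in the address.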

efriedma added inline comments. Apr 25 2022, 12:31 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	Oh, wait, never mind, I think I see what you mean. The issue isn't the register; it's the prefix indicating 32-bit addressing, and that only gets emitted if there's a register operand. It seems like you should be able to fix the encoding somehow; the prefix has nothing to do with the operands. Or maybe there's some reason you can't... but in that case, please add a comment explaining.

hvdijk added inline comments. Apr 25 2022, 12:58 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	Huh, that seems to actually work when I try it, but it's something that GCC doesn't generate, something that GNU objdump disassembles confusingly as something involving %eiz, and something that llvm-objdump disassembles as if there is no address size override. Will work on that, I didn't think that was possible, thanks for the pointer.

craig.topper added inline comments. Apr 25 2022, 1:19 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	I think %eiz means there is a SIB byte but the index field in the SIB byte is 0b100 (no index), and there is a shorter encoding that doesn't use a SIB byte. It's there to disambiguate two encodings that would otherwise print the same string.
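
For readers unfamiliar with the SIB byte, the field layout mentioned here can be sketched in C++ as below. The struct and function names are invented for this illustration, and the sketch ignores the REX.X extension bit, so "no index" here assumes REX.X is clear.

```cpp
#include <cassert>
#include <cstdint>

// The three fields of an x86 SIB byte: scale (bits 7-6), index (bits 5-3),
// and base (bits 2-0).
struct SibFields {
  unsigned Scale;
  unsigned Index;
  unsigned Base;
};

inline SibFields decodeSib(uint8_t Sib) {
  SibFields F;
  F.Scale = (Sib >> 6) & 0x3u;
  F.Index = (Sib >> 3) & 0x7u;
  F.Base = Sib & 0x7u;
  return F;
}

// Index field 4 (binary 100) means "no index register".
inline bool hasNoIndex(const SibFields &F) { return F.Index == 4u; }

inline void runExamples() {
  // 0x25 = 00 100 101: no index; base field 5, which with ModRM.mod == 00
  // means a 32-bit displacement and no base register.
  SibFields Abs = decodeSib(0x25);
  assert(Abs.Scale == 0u && hasNoIndex(Abs) && Abs.Base == 5u);

  // 0x24 = 00 100 100: no index; base field 4.
  SibFields Sp = decodeSib(0x24);
  assert(hasNoIndex(Sp) && Sp.Base == 4u);
}
```

A disassembler that prints `%eiz` is making the "no index" SIB form visible in the text instead of leaving it implicit.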

hvdijk added inline comments. Apr 25 2022, 3:26 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	Trying to get it done, I'm realising this is something that isn't specific to ILP32, this is something we could and should do in LP64 mode as well, except that in LP64 mode we only generate more-complicated-than-necessary code (the same more-complicated-than-necessary code that I am generating for ILP32 here), not wrong code. A change to support this in both ILP32 and LP64 modes will probably end up being larger than I would have liked, but I will give it a go and see if the end result looks manageable enough.

Add a comment explaining why we do not use address size overrides here.

hvdijk added inline comments. Apr 26 2022, 5:19 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	Unfortunately, although I was able to get something that mostly appeared to work, both for LP64 and for ILP32 mode, by modifying getAddressOperands (X86ISelDAGToDAG.cpp) to set X86::EIZ as an index register just like the GNU disassembler shows it -- which already results in the intended machine code, and also correct textual assembly that we are able to assemble to the intended machine code -- the big problem with it that I was unable to resolve without massively invasive changes was the fact that EIZ is not part of the register classes, meaning that this approach causes MIR verification to fail. I have added a comment hoping to explain that. It seems like something worth pursuing as a later change at some point; for now, I think it is better to stick with this change so that we have a small simple change that results in correct code, even if it is suboptimal.

Harbormaster completed remote builds in B161486: Diff 425346. Apr 26 2022, 6:05 PM

craig.topper added inline comments. Apr 26 2022, 10:23 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	Instead of using X86::EIZ, could we suppress the sign extension of the immediate and zero-extend it instead, then detect in X86MCCodeEmitter that the immediate isn't an isInt<32>? I guess we'd need to do something for the assembly printer too, to print %eiz. So maybe that isn't a good idea.

hvdijk added inline comments. Apr 26 2022, 11:57 PM

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714	That's possible too, right. Or we could detect that the immediate isn't an isInt<32> during MachineInstr to MCInst lowering and address it there. Wherever we detect it, we do have to make sure we do not truncate the displacement to an i32 before we do that. That is, we will need to change `Disp`'s type in `X86ISelAddressMode` either way, but if we do not encode `EIZ` in the `MachineInstr`, we need to change the VT of our `TargetConstant`s as well to keep track of the original value. If we change it somewhere at the MCInst level, we also have the option of printing these instructions with an explicit `addr32` prefix, rather than using `EIZ`, if that makes things easier for us.
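
The alternative discussed in the last two comments, keeping the untruncated displacement and deciding late whether an address size override is needed, might look roughly like the following C++ sketch. Everything here is hypothetical: `DispKind` and `classifyAbsoluteDisp` are names invented for illustration, and the range checks are hand-written stand-ins for `isInt<32>`-style tests rather than code from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Possible outcomes for an absolute (register-free) displacement that is
// kept as a full 64-bit value instead of being truncated to i32 early.
enum class DispKind {
  Plain,        // fits a signed 32-bit field; sign extension is harmless
  NeedsAddr32,  // fits only as unsigned 32 bits; needs an address size override
  NotEncodable  // does not fit in 32 bits at all
};

inline DispKind classifyAbsoluteDisp(int64_t Disp) {
  if (Disp >= INT32_MIN && Disp <= INT32_MAX)
    return DispKind::Plain;
  if (Disp >= 0 && Disp <= static_cast<int64_t>(UINT32_MAX))
    return DispKind::NeedsAddr32;
  return DispKind::NotEncodable;
}

inline void runExamples() {
  assert(classifyAbsoluteDisp(0x7FFFFFFF) == DispKind::Plain);
  // Zero-extended high-2GB address: distinguishable only while untruncated.
  assert(classifyAbsoluteDisp(0xEEBEEBECll) == DispKind::NeedsAddr32);
  // The same bits after truncation to i32 look like a plain negative value.
  assert(classifyAbsoluteDisp(-289477652) == DispKind::Plain);
  assert(classifyAbsoluteDisp(0x100000000ll) == DispKind::NotEncodable);
}
```

The third assertion shows why the displacement must not be truncated before the check: once it is, the high-2GB address is indistinguishable from a small negative offset.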

glaubitz added a subscriber: glaubitz. Sep 16 2023, 2:45 AM

@hvdijk reverse-ping - what happened with this patch?

In D124406#4646976, @RKSimon wrote:

@hvdijk reverse-ping - what happened with this patch?

There was some talk about possible alternative approaches, but nothing that was clearly better, and nothing that I think anyone was pushing for me to update this patch to use (correct me if I am wrong). But it doesn't have approvals so I can't push it. It still applies to current LLVM unmodified (even with the typed pointers in the test: that's handled gracefully).

RKSimon mentioned this in rG3b7dfda79de2: [X86] Add test cases for gnux32 large constants Issue #55061. Sep 20 2023, 4:31 AM

Please can you rebase? I've added the current large-constant-x32 codegen at 3b7dfda79de2

Rebased.

I'd be happy with this as an initial fix for the x32 issue, and we can raise a ticket to do this more generally in X86MCCodeEmitter to improve x64 handling as well in the future.

What do the other reviewers think?

Harbormaster completed remote builds in B257487: Diff 557167. Sep 21 2023, 4:23 AM

LGTM - let's get this initial fix in and we can investigate improvements later on

This revision is now accepted and ready to land. Oct 11 2023, 7:27 AM

Closed by commit rG8d520973b02b: [X86] Use indirect addressing for high 2GB of x32 address space (authored by hvdijk). Oct 11 2023, 11:21 AM

This revision was automatically updated to reflect the committed changes.

hvdijk added a commit: rG8d520973b02b: [X86] Use indirect addressing for high 2GB of x32 address space.

Diff 557685

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp

     if (Val != 0 &&
         !X86::isOffsetSuitableForCodeModel(Val, M,
                                            AM.hasSymbolicDisplacement()))
       return true;
     // In addition to the checks required for a register base, check that
     // we do not try to use an unsafe Disp with a frame index.
     if (AM.BaseType == X86ISelAddressMode::FrameIndexBase &&
         !isDispSafeForFrameIndex(Val))
       return true;
+    // In ILP32 (x32) mode, pointers are 32 bits and need to be zero-extended to
+    // 64 bits. Instructions with 32-bit register addresses perform this zero
+    // extension for us and we can safely ignore the high bits of Offset.
+    // Instructions with only a 32-bit immediate address do not, though: they
+    // sign extend instead. This means only address the low 2GB of address space
+    // is directly addressable, we need indirect addressing for the high 2GB of
+    // address space.
+    // TODO: Some of the earlier checks may be relaxed for ILP32 mode as the
+    // implicit zero extension of instructions would cover up any problem.
+    // However, we have asserts elsewhere that get triggered if we do, so keep
+    // the checks for now.
+    // TODO: We would actually be able to accept these, as well as the same
+    // addresses in LP64 mode, by adding the EIZ pseudo-register as an operand
+    // to get an address size override to be emitted. However, this
+    // pseudo-register is not part of any register class and therefore causes
+    // MIR verification to fail.
+    if (Subtarget->isTarget64BitILP32() && !isUInt<31>(Val) &&
+        !AM.hasBaseOrIndexReg())
+      return true;
   }
   AM.Disp = Val;
   return false;

 }

 bool X86DAGToDAGISel::matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode &AM,
                                          bool AllowSegmentRegForX32) {
   SDValue Address = N->getOperand(1);

   // load gs:0 -> GS segment register.
   // load fs:0 -> FS segment register.
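
The condition added in this hunk can be read as a small predicate. The sketch below models it outside LLVM under stated assumptions: `isUInt31` is a hand-written stand-in for `isUInt<31>`, the boolean parameter stands in for `AM.hasBaseOrIndexReg()`, and the ILP32 subtarget check is assumed to be true. As in the diff, a `true` result means the offset is not folded into the address.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for isUInt<31>: true for values in [0, 2^31).
inline bool isUInt31(int64_t V) {
  return V >= 0 && V < (int64_t{1} << 31);
}

// Model of the new rejection rule for x32: an address made of only an
// immediate must lie in the low 2GB, because a bare 32-bit immediate
// address is sign-extended rather than zero-extended.
inline bool rejectsFoldForX32(int64_t Val, bool HasBaseOrIndexReg) {
  return !isUInt31(Val) && !HasBaseOrIndexReg;
}

inline void runExamples() {
  // Low 2GB, no register: can be used as an immediate address.
  assert(!rejectsFoldForX32(0x7FFFFFFF, /*HasBaseOrIndexReg=*/false));
  // High 2GB (seen as a negative i32), no register: rejected.
  assert(rejectsFoldForX32(-289477652, /*HasBaseOrIndexReg=*/false));
  // Same range with a register present: accepted, since the 32-bit
  // address computation zero-extends the result.
  assert(!rejectsFoldForX32(INT32_MIN, /*HasBaseOrIndexReg=*/true));
}
```

When the fold is rejected, instruction selection falls back to materializing the address in a register, which is the `movl $imm, %reg` followed by `(%reg)` pattern visible in the updated test.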

llvm/test/CodeGen/X86/large-constants-x32.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-linux-gnux32 | FileCheck %s

 define void @constant_expressions() {
 ; CHECK-LABEL: constant_expressions:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: movl -289477652, %eax
-; CHECK-NEXT: movl -289477644, %ecx
-; CHECK-NEXT: addl -289477648, %eax
-; CHECK-NEXT: addl -289477636, %ecx
-; CHECK-NEXT: addl %eax, %ecx
-; CHECK-NEXT: movl %ecx, -289477652
+; CHECK-NEXT: movl $-289477652, %eax # imm = 0xEEBEEBEC
+; CHECK-NEXT: movl (%eax), %ecx
+; CHECK-NEXT: movl $-289477644, %edx # imm = 0xEEBEEBF4
+; CHECK-NEXT: movl (%edx), %edx
+; CHECK-NEXT: movl $-289477648, %esi # imm = 0xEEBEEBF0
+; CHECK-NEXT: addl (%esi), %ecx
+; CHECK-NEXT: movl $-289477636, %esi # imm = 0xEEBEEBFC
+; CHECK-NEXT: addl (%esi), %edx
+; CHECK-NEXT: addl %ecx, %edx
+; CHECK-NEXT: movl %edx, (%eax)
 ; CHECK-NEXT: retq
 entry:
   %0 = load i32, i32* inttoptr (i32 add (i32 -289477652, i32 0) to i32*)
   %1 = load i32, i32* inttoptr (i32 add (i32 -289477652, i32 4) to i32*)
   %2 = load i32, i32* inttoptr (i32 add (i32 -289477652, i32 8) to i32*)
   %3 = load i32, i32* inttoptr (i32 add (i32 -289477652, i32 16) to i32*)
   %4 = add i32 %0, %1
   %5 = add i32 %2, %3
   %6 = add i32 %4, %5
   store i32 %6, i32* inttoptr (i32 add (i32 -289477652, i32 0) to i32*)
   ret void
 }


 define void @constant_expressions2() {
 ; CHECK-LABEL: constant_expressions2:
 ; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: movl -289477652, %eax
-; CHECK-NEXT: movl -289477644, %ecx
-; CHECK-NEXT: addl -289477648, %eax
-; CHECK-NEXT: addl -289477640, %ecx
-; CHECK-NEXT: addl %eax, %ecx
-; CHECK-NEXT: movl %ecx, -289477652
+; CHECK-NEXT: movl $-289477652, %eax # imm = 0xEEBEEBEC
+; CHECK-NEXT: movl (%eax), %ecx
+; CHECK-NEXT: movl $-289477644, %edx # imm = 0xEEBEEBF4
+; CHECK-NEXT: movl (%edx), %edx
+; CHECK-NEXT: movl $-289477648, %esi # imm = 0xEEBEEBF0
+; CHECK-NEXT: addl (%esi), %ecx
+; CHECK-NEXT: movl $-289477640, %esi # imm = 0xEEBEEBF8
+; CHECK-NEXT: addl (%esi), %edx
+; CHECK-NEXT: addl %ecx, %edx
+; CHECK-NEXT: movl %edx, (%eax)
 ; CHECK-NEXT: retq
 entry:
   %0 = load i32, i32* inttoptr (i32 -289477652 to i32*)
   %1 = load i32, i32* inttoptr (i32 -289477648 to i32*)
   %2 = load i32, i32* inttoptr (i32 -289477644 to i32*)
   %3 = load i32, i32* inttoptr (i32 -289477640 to i32*)
   %4 = add i32 %0, %1
   %5 = add i32 %2, %3
   %6 = add i32 %4, %5
   store i32 %6, i32* inttoptr (i32 -289477652 to i32*)
   ret void
 }
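
The `imm = 0x...` annotations in the first function's expected output can be cross-checked against the constant expressions in its IR with a few lines of C++. This is only an arithmetic aid; `pointerBits` is a name invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Reinterpret a signed 32-bit constant plus a byte offset as the 32-bit
// pointer value it denotes (arithmetic is modulo 2^32).
inline uint32_t pointerBits(int32_t Base, uint32_t Offset) {
  return static_cast<uint32_t>(Base) + Offset;
}

inline void runExamples() {
  const int32_t Base = -289477652;
  assert(pointerBits(Base, 0) == 0xEEBEEBECu);
  assert(pointerBits(Base, 4) == 0xEEBEEBF0u);
  assert(pointerBits(Base, 8) == 0xEEBEEBF4u);
  assert(pointerBits(Base, 16) == 0xEEBEEBFCu);
  // Every address used by the test lies in the high 2GB.
  assert(pointerBits(Base, 0) >= 0x80000000u);
}
```

All four addresses are at or above 0x80000000, so each load and store in the test exercises the new indirect-addressing path.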

This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use indirect addressing for high 2GB of x32 address space
ClosedPublic
