This is an archive of the discontinued LLVM Phabricator instance.

[X86] Use indirect addressing for high 2GB of x32 address space
ClosedPublic

Authored by hvdijk on Apr 25 2022, 10:14 AM.

Details

Summary

Instructions that take immediate addresses sign-extend their operands, so cannot be used when we actually need zero extension. Use indirect addressing to avoid problems.

The functions in the test are a modified versions of the functions by the same names in large-constants.ll, with i64 types changed to i32.

Fixes #55061

Diff Detail

Event Timeline

hvdijk created this revision.Apr 25 2022, 10:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2022, 10:14 AM
hvdijk requested review of this revision.Apr 25 2022, 10:14 AM
hvdijk updated this revision to Diff 424960.Apr 25 2022, 10:40 AM

Clean up test slightly to load consecutive i32 values like the i64 version did

efriedma added inline comments.
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

The reasoning here seems strange. For example, suppose I write void f(int a) { ((char*)0x80000000)[a] = a; }. That has a base register, but sign-extension is wrong.

I guess you're trying to allow negative pointer offsets here, but I think SelectionDAG is throwing away the distinction you need here. (At the IR level, it's easy to distinguish between the base of a GEP and the offset.)

hvdijk added inline comments.Apr 25 2022, 12:24 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

That does the right thing, that's okay to allow. That results in movb %dil, -2147483648(%edi), and the fact that %edi is part of the address means %edi - 2147483648 is calculated as a 32-bit value, and then zero-extended, so it calculates the exact same thing as %edi + 2147483648 would if it were possible to do that directly. x86 is weird.

efriedma added inline comments.Apr 25 2022, 12:31 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

Oh, wait, nevermind, I think I see what you mean. The issue isn't the register; it's the prefix indicating 32-bit addressing, and that only gets emitted if there's a register operand.

It seems like you should be able to fix the encoding somehow; the prefix has nothing to do with the operands. Or maybe there's some reason you can't... but in that case, please add a comment explaining.

hvdijk added inline comments.Apr 25 2022, 12:58 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

Huh, that seems to actually work when I try it, but it's something that GCC doesn't generate, something that GNU objdump disassembles confusingly as something involving %eiz, and something that llvm-objdump disassembles as if there is no address size override. Will work on that, I didn't think that was possible, thanks for the pointer.

craig.topper added inline comments.Apr 25 2022, 1:19 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

I think %eiz mean there is a SIB byte but the index field in the SIB byte is 0b100(no index) and there is a shorter encoding that doesn't use a SIB byte. It's there to disambiguate two encodings that would otherwise print the same string.

hvdijk added inline comments.Apr 25 2022, 3:26 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

Trying to get it done, I'm realising this is something that isn't specific to ILP32, this is something we could and should do in LP64 mode as well, except that in LP64 mode we only generate more-complicated-than-necessary code (the same more-complicated-than-necessary code that I am generating for ILP32 here), not wrong code. A change to support this in both ILP32 and LP64 modes will probably end up being larger than I would have liked, but I will give it a go and see if the end result looks manageable enough.

hvdijk updated this revision to Diff 425346.Apr 26 2022, 5:10 PM

Add a comment explaining why we do not use address size overrides here.

hvdijk added inline comments.Apr 26 2022, 5:19 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

Unfortunately, although I was able to get something that mostly appeared to work, both for LP64 and for ILP32 mode, by modifying getAddressOperands (X86ISelDAGToDAG.cpp) to set X86::EIZ as an index register just like the GNU disassembler shows it -- which already results in the intended machine code, and also correct textual assembly that we are able to assemble to the intended machine code -- the big problem with it that I was unable to resolve without massively invasive changes was the fact that EIZ is not part of the register classes, meaning that this approach causes MIR verification to fail. I have added a comment hoping to explain that. It seems like something worth pursuing as a later change at some point; for now, I think it is better to stick with this change so that we have a small simple change that results in correct code, even if it is suboptimal.

craig.topper added inline comments.Apr 26 2022, 10:23 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

Instead of using X86::EIZ, could we suppress the sign extending of the immediate and zero extend it instead. Then detect that the immediate isn't an isInt<32> in X86MCCodeEmitter? I guess we'd need to do something for the assembly printer too to print %eiz. So maybe that isn't a good idea.

hvdijk added inline comments.Apr 26 2022, 11:57 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1714

That's possible too, right. Or we could detect that the immediate isn't an isInt<32> during MachineInstr to MCInst lowering and address it there. Wherever we detect it, we do have to make sure we do not truncate the displacement to an i32 before we do that. That is, we will need to change Disp's type in X86ISelAddressMode either way, but if we do not encode EIZ in the MachineInstr, we need to change the VT of our TargetConstants as well to keep track of the original value.

If we change it somewhere at the MCInst level, we also have the option of printing these instructions with an explicit addr32 prefix, rather than using EIZ, if that makes things easier for us.

@hvdijk reverse-ping - what happended with this patch?

hvdijk added a comment.EditedSep 18 2023, 1:29 AM

@hvdijk reverse-ping - what happended with this patch?

There was some talk about possible alternative approaches, but nothing that was clearly better, and nothing that I think anyone was pushing for me to update this patch to use (correct me if I am wrong). But it doesn't have approvals so I can't push it. It still applies to current LLVM unmodified (even with the typed pointers in the test: that's handled gracefully).

Please can you rebase? I've added the current large-constant-x32 codegen at 3b7dfda79de2

hvdijk updated this revision to Diff 557167.Sep 21 2023, 3:27 AM

Rebased.

I'd be happy with this as an initial fix for the x32 issue and we raise a ticket to do this more generally in X86MCCodeEmitter to improve x64 handling as well in the future.

What do the other reviewers think?

RKSimon accepted this revision.Oct 11 2023, 7:27 AM

LGTM - let's get this initial fix in and we can investigate improvements later on

This revision is now accepted and ready to land.Oct 11 2023, 7:27 AM