Page MenuHomePhabricator

[X86] Zero-extend pointers to i64 for x86_64
ClosedPublic

Authored by hvdijk on Nov 12 2020, 5:15 AM.

Details

Summary

For LP64 mode, this has no effect as pointers are already 64 bits.
For ILP32 mode (x32), this extension is specified by the ABI.

Diff Detail

Event Timeline

hvdijk created this revision.Nov 12 2020, 5:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2020, 5:15 AM
hvdijk requested review of this revision.Nov 12 2020, 5:15 AM
pengfei added inline comments.Nov 12 2020, 9:56 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
3077

Is some risk here? It is used to truncate i1 only before.

llvm/test/CodeGen/X86/musttail-varargs.ll
348

I have some doubt on the implementation. Is caller always needs zeroext for the pointer even if it knows the upper bits are zero? Form line 309, the callers seems know the upper bits are zero.

pengfei added inline comments.Nov 12 2020, 10:08 PM
llvm/test/CodeGen/X86/x32-function_pointer-2.ll
16–17

This file should not be affected, right?

llvm/test/CodeGen/X86/x86-64-sret-return.ll
13

I saw other tests all use 64-bit instructions. In which case we may use 32-bit instruction?

hvdijk added inline comments.Nov 13 2020, 1:20 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
3077

If LocVT and ValVT are different, it is the responsibility of this function to take a LocVT DAG node and convert it to a ValVT DAG node, and LLVM will assert if it fails to do so. If isExtInLoc() returns true, LocVT and ValVT will be different, and there is no other code to handle that conversion, so any non-bit-vector extensions of return values would already be broken and would fail that assert. I think the fact that it did not cause problems before is simply because it never came up before for any type other than bit vectors.

llvm/test/CodeGen/X86/musttail-varargs.ll
348

This is a general missing optimization in LLVM that affects targets other than X86 as well, you are correct that this movl %edi, %edi is not needed if the high bits of %rdi are already guaranteed to be zero. LLVM can lose that information. I have not looked at this case in detail, but I have previously seen this be a problem where a node is a copy of a function parameter: since it is not the function parameter, merely a copy of its value, the fact that it is a zero-extended i32 value is not available. Since the generated code is correct, just suboptimal, I did not think it is necessary to fix that at the same time.

llvm/test/CodeGen/X86/x32-function_pointer-2.ll
16–17

This file is affected because callq can take either the copy of *%rsi like it did before, or *%rsi directly since it knows it is already zero-extended.

llvm/test/CodeGen/X86/x86-64-sret-return.ll
13

When we need to copy one 64-bit register to another 64-bit register, we can use the 64-bit move instructions to preserve the high bits, or the 32-bit move instructions to zero out the high bits. When the high bits are known to be zero, the 64-bit move instructions and the 32-bit move instructions have the same effect, since zeroing out the high bits when they are already zero does nothing.

xbolva00 added inline comments.
llvm/test/CodeGen/X86/sibcall.ll
83–84

Adjust fixme?

hvdijk updated this revision to Diff 305095.Nov 13 2020, 4:11 AM

Remove no longer necessary FIXME comment

hvdijk marked an inline comment as done.Nov 13 2020, 4:12 AM
hvdijk added inline comments.
llvm/test/CodeGen/X86/sibcall.ll
83–84

Good spot. This comment applied to the movl %edi, %eax that is no longer generated, so I have simply removed the comment.

pengfei added inline comments.Nov 13 2020, 5:02 AM
llvm/test/CodeGen/X86/x32-function_pointer-2.ll
16–17

I see. But this test should also pass without the patch since you just loosen check conditions.
I think it's better to make the conditions more strict. Or at least [[REG:.*]] in line 16 is not needed.

hvdijk updated this revision to Diff 305148.Nov 13 2020, 7:56 AM
hvdijk marked an inline comment as done.

Remove no longer used [[REG:.*]] from test

llvm/test/CodeGen/X86/x32-function_pointer-2.ll
16–17

Since both %rsi and %r[[REG]] are equally valid, without any reason why LLVM should prefer one over the other, I did not want to have a check that only permitted one of them since later changes to LLVM could arbitrarily change it; ideally, the check would be for *%r{si|[[REG]]}, but I think FileCheck does not support that. So out of those options, I would say remove the [[REG:.*]]; I have updated the patch to do that.

Hi Harald, thanks for thoroughly answering my questions. I still have doubts intention of this patch. Is there any bug related to it?
I had a look at the ABI and found it says "ILP32 binaries ... should conform to small code model or small position independent code model ...". IMHO, the ILP32 mode is designed for performance which always assumes the address bit 63~32 all zero and uses 32 bits register to reduce code size. Although the extension guarantees the correctness, I think it may go against the original intention of the design.
Nevertheless, I'm not expert on ABI, add @hjl.tools for a review.

hvdijk added a comment.EditedNov 14 2020, 12:49 AM

Hi Harald, thanks for thoroughly answering my questions. I still have doubts intention of this patch. Is there any bug related to it?

I run an x32 system. Things work well for things built with GCC, but break badly when using LLVM, because the ABI as implemented is incompatible between GCC and LLVM, as GCC-generated code will sometimes assume that the high bits of pointer parameters have been zeroed out as required by the ABI. This is especially visible when using Rust applications, as the Rust compiler is LLVM-based but most of the rest of my system is built with GCC. Pretty much any non-trivial Rust application crashes, for instance ripgrep, or the Rust compiler itself.

IMHO, the ILP32 mode is designed for performance which always assumes the address bit 63~32 all zero and uses 32 bits register to reduce code size.

Using 32 bits registers is not always possible, and when possible increases, not reduces, code size. When instructions can take either a 32-bit register or a 64-bit register, the 64-bit register is what you get by default, the 32-bit register requires an extra byte. For instance, mov (%edi),%eax is encoded as 0x67 0x8B 0x07, but mov (%rdi), %eax is encoded as just 0x8B 0x07. The idea of the ABI, I believe, is that called functions should be able to use 64 bit registers for memory operations. That said, I have not done any work to ensure that 64-bit registers are used where possible; both GCC and LLVM with my patch do generate code that unnecessarily uses 32-bit registers.

Was there ever a follow-up on https://lkml.org/lkml/2018/12/10/1145 ?

None that I am aware of. The current Linux kernel (5.9) fully supports x32 and does not mark it as broken, deprecated, or anything like that in any way.

hjl.tools added a comment.EditedNov 14 2020, 7:04 AM

Hi Harald, thanks for thoroughly answering my questions. I still have doubts intention of this patch. Is there any bug related to it?

I run an x32 system. Things work well for things built with GCC, but break badly when using LLVM, because the ABI as implemented is incompatible between GCC and LLVM, as GCC-generated code will sometimes assume that the high bits of pointer parameters have been zeroed out as required by the ABI. This is especially visible when using Rust applications, as the Rust compiler is LLVM-based but most of the rest of my system is built with GCC. Pretty much any non-trivial Rust application crashes, for instance ripgrep, or the Rust compiler itself.

IMHO, the ILP32 mode is designed for performance which always assumes the address bit 63~32 all zero and uses 32 bits register to reduce code size.

ILP32 psABI has


10.1 Parameter Passing

When a value of pointer type is returned or passed in a register, bits 32 to 63 shall be zero.

It is required for ILP32 since many system calls are shared between ILP32 and LP64.
Here is my x32 port from 2016.

ILP32 psABI has


10.1 Parameter Passing

When a value of pointer type is returned or passed in a register, bits 32 to 63 shall be zero.

It is required for ILP32 since many system calls are shared between ILP32 and LP64.
Here is my x32 port from 2016.

Thanks H.J. It's amazing you already did many works on x32 support. But these patches seem not been merged to mainline. What's the reason for not merging them? Does that mean the trunk code still has some flaws on x32 ABI supporting?

It is required for ILP32 since many system calls are shared between ILP32 and LP64.
Here is my x32 port from 2016.

Thanks H.J. It's amazing you already did many works on x32 support. But these patches seem not been merged to mainline. What's the reason for not merging them? Does that mean the trunk code still has some flaws on x32 ABI supporting?

I never finished x32 work. You can use my branch as a reference when working on x32.

I never finished x32 work. You can use my branch as a reference when working on x32.

I wish I'd known about your branch before I started my work that this is part of (https://lists.llvm.org/pipermail/llvm-dev/2020-October/146049.html) :) Thanks, that may prove very useful for the problems that still remain.

@pengfei With the confirmation that we do need this, could you take another look? Does this look okay now?

pengfei accepted this revision.Nov 29 2020, 5:24 PM

@pengfei With the confirmation that we do need this, could you take another look? Does this look okay now?

I'm OK with this patch. Can you please check if expensive check is happy with it?

This revision is now accepted and ready to land.Nov 29 2020, 5:24 PM
This revision was automatically updated to reflect the committed changes.

I'm OK with this patch. Can you please check if expensive check is happy with it?

Thanks, I saw tests pass when expensive checks are enabled so pushed it.