This is an archive of the discontinued LLVM Phabricator instance.

[X86] Always use rip-relative addressing on 64-bit when rematerializing all zeros/ones registers using a folded load.
ClosedPublic

Authored by craig.topper on Feb 22 2021, 11:01 AM.

Details

Summary

Previously we only used RIP relative when PIC was enabled. But
we know we're in small/kernel code model here so we should
be able to always use RIP-relative which will give a smaller
encoding.

None of our lit tests are affected by this for some reason. Maybe the
update_test_checks.py is too aggressive with regular expressions?
I'll work on a proper test before committing, but I wanted to make
sure I wasn't missing anything that would prevent us from making this
change.

Here's a godbolt link that demonstrates the current codegen https://godbolt.org/z/j3158o
Note in the non-PIC version the load from .LCPI0_0 doesn't use
RIP-relative addressing, but if you change the constant in the
source from 0.0 to 1.0 it will become RIP-relative.

Diff Detail

Event Timeline

craig.topper created this revision.Feb 22 2021, 11:01 AM
craig.topper requested review of this revision.Feb 22 2021, 11:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2021, 11:01 AM
MaskRay added a comment.EditedFeb 22 2021, 6:01 PM

LG.

% myllvm-mc -show-encoding a.s
        .text
.LCPI0_0:
        .quad   0
        ucomisd .LCPI0_0, %xmm0                 # encoding: [0x66,0x0f,0x2e,0x04,0x25,A,A,A,A]
                                        #   fixup A - offset: 5, value: .LCPI0_0, kind: reloc_signed_4byte
        ucomisd .LCPI0_0(%rip), %xmm0           # encoding: [0x66,0x0f,0x2e,0x05,A,A,A,A]
                                        #   fixup A - offset: 4, value: .LCPI0_0-4, kind: reloc_riprel_4byte

None of our lit tests are affected by this for some reason. Maybe the update_llc_test_checks.py is too aggressive with regular expressions?

Confirmed.

; CHECK-NEXT: ucomisd {{\.LCPI.*}}, %xmm0

.* hides the (%rip) difference.

FYI I raised https://bugs.llvm.org/show_bug.cgi?id=45091 about better exposing pointer offsets on x86 as we've missed some regressions in the past from this but nothing has happened on it

None of our lit tests are affected by this for some reason. Maybe the update_llc_test_checks.py is too aggressive with regular expressions?

Confirmed.

; CHECK-NEXT: ucomisd {{\.LCPI.*}}, %xmm0

.* hides the (%rip) difference.

It's not true. I looked at the code in llvm/utils/UpdateTestChecks/asm.py, the script can differentiate the rip from the pure address, e.g

cd llvm/utils/UpdateTestChecks/
python
>>> from UpdateTestChecks import asm
>>> class T():
...   x86_scrub_rip = True
...
>>> asm.scrub_asm_x86('ucomisd .LCPI0_0(%rip), %xmm0', T)
'ucomisd {{.*}}(%rip), %xmm0'
>>> asm.scrub_asm_x86('ucomisd .LCPI0_0, %xmm0', T)
'ucomisd {{\\.LCPI.*}}, %xmm0'

So the update_llc_test_checks.py isn't too aggressive to eliminate the difference.
I also check a test llvm/test/CodeGen/X86/vec_reassociate.ll, in which we can see the differences between X86 and X64.

FYI I raised https://bugs.llvm.org/show_bug.cgi?id=45091 about better exposing pointer offsets on x86 as we've missed some regressions in the past from this but nothing has happened on it

It seems the script also handles these cases by

# Detect stack spills and reloads and hide their exact offset and whether
# they used the stack pointer or frame pointer.
asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r'{{[-0-9]+}}(%\1{{[sb]}}p)\2', asm)

None of our lit tests are affected by this for some reason. Maybe the update_llc_test_checks.py is too aggressive with regular expressions?

Confirmed.

; CHECK-NEXT: ucomisd {{\.LCPI.*}}, %xmm0

.* hides the (%rip) difference.

It's not true. I looked at the code in llvm/utils/UpdateTestChecks/asm.py, the script can differentiate the rip from the pure address, e.g

cd llvm/utils/UpdateTestChecks/
python
>>> from UpdateTestChecks import asm
>>> class T():
...   x86_scrub_rip = True
...
>>> asm.scrub_asm_x86('ucomisd .LCPI0_0(%rip), %xmm0', T)
'ucomisd {{.*}}(%rip), %xmm0'
>>> asm.scrub_asm_x86('ucomisd .LCPI0_0, %xmm0', T)
'ucomisd {{\\.LCPI.*}}, %xmm0'

So the update_llc_test_checks.py isn't too aggressive to eliminate the difference.
I also check a test llvm/test/CodeGen/X86/vec_reassociate.ll, in which we can see the differences between X86 and X64.

Oh, I misunderstood @MaskRay 's comments. It's true {{\\.LCPI.*}} hides the difference, but we can regenerate the tests by the script.

FYI I raised https://bugs.llvm.org/show_bug.cgi?id=45091 about better exposing pointer offsets on x86 as we've missed some regressions in the past from this but nothing has happened on it

It seems the script also handles these cases by

# Detect stack spills and reloads and hide their exact offset and whether
# they used the stack pointer or frame pointer.
asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r'{{[-0-9]+}}(%\1{{[sb]}}p)\2', asm)

May be we can add an extra option like

if getattr(args, 'x86_keep_offset', False):
  asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r'{{[-0-9]+}}(%\1{{[sb]}}p)\2', asm)

update_llc_test_checks.py already has several '-x86_scrub_*- command switches that covers some of this

What do we want to do with this patch?

It appears to be stuck on our test check patterns hiding too much address math - is that something we should move away from by default? I proposed something similar on https://bugs.llvm.org/show_bug.cgi?id=45091 - would it help if we got that dealt with first?

Rebase on top of D99460 where I weakened regular expressions we get test failures now.

RKSimon accepted this revision.Mar 29 2021, 2:33 AM

LGTM

This revision is now accepted and ready to land.Mar 29 2021, 2:33 AM
This revision was landed with ongoing or failed builds.Mar 29 2021, 10:06 AM
This revision was automatically updated to reflect the committed changes.