This is an archive of the discontinued LLVM Phabricator instance.

[X86] don't allow X86_64 PIC mode addresses to be used as immediates
AbandonedPublic

Authored by rapidsna on Nov 1 2018, 5:34 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

rapidsna created this revision.Nov 1 2018, 5:34 PM

While that seems like a place to change, I would have thought that the code just a little farther down would have caught this....

const GlobalValue *GV = GA->getGlobal();
// If we require an extra load to get this address, as in PIC mode, we
// can't accept it.
if (isGlobalStubReference(Subtarget.classifyGlobalReference(GV)))
  return;

but I was remembering my older out-of-tree backend where I have made changes. An older classifyGlobalReference was for both functions and data. Today, we have both classifyGlobalReference and classifyGlobalFunctionReference. Today's classifyGlobalReference then ends up in TargetMachine.cpp/shouldAssumeDSOLocal which then sends you off into classifyLocalReference. With no GOTPCREL, isGlobalStubReference() returns false.

I wonder if it needs to check GV to see if it is function with a dyn_cast_or_null<function> and call classifyGlobalFunctionReference instead?

Thanks for having look at this. I had also looked at that part, but the problem here has to do with local linkage functions which do not use GOTOFF relocations so classifyLocalReference() just returns MO_NO_FLAG. That's the reason why the if (isGlobalStubReference(Subtarget.classifyGlobalReference(GV))) does not catch this. However, these local linkage functions still need to be accessed in a RIP-relative way in the PIC mode and thus should not be used as immediate values.

I think the logic in the code is that (1) return the function for any PIC mode addresses, and if not (2) assuming non-PIC mode, see if we still cannot use this address as an immediate value, . So I think it makes sense to fix this at (1) before going down to (2).

I will appreciate any comments.

Yes I see. Thanks for double checking.

I still wonder if the call to classifyGlobalReference shouldn't be a call to classifyGlobalFunctionReference if the GV is a function, but that seems like a separate question (which we don't need to answer at the moment).

rnk requested changes to this revision.Nov 27 2018, 11:10 AM

I don't think this is the correct fix, and it would need a test.

This is the code in question:

#include <stdio.h>
int offset = 0;
static void foo2() { printf("foo2 is called\n"); }
void foo() {
  __asm__ volatile("movq %0,%%gs:(%1)" : : "ir"((void *)&foo2), "r"(offset));
  printf("foo is called\n");
}

Between the two alternative constraints, i and r, LLVM chooses i, which, if you don't look inside the assembly, seems reasonable. Then you end up generating, essentially, "MOV 64-bit immediate to GS offset 0".

I think the correct fix is probably somewhere in the inline assembly specific codepath, not this highly general one. We absolutely want to fold displacements into RIP relative address modes.

This revision now requires changes to proceed.Nov 27 2018, 11:10 AM
rapidsna added a comment.EditedNov 30 2018, 9:23 PM

@rnk Thanks for looking into this.

I'm not sure I understand:

The change I made is inside the function void X86TargetLowering::LowerAsmOperandForConstraint(, and under case 'i': { which specifically handles operands of the inline assembly constraint i. Inside this routine DOES have the PIC checking routine in order not to emit immediates in the case of PIC modes. There are already checks for other PIC styles but a check for RIPRel is missing. So I still believe this is the right place to fix this problem.

I will be happy to have any comment and will soon add the test code.

rnk added a comment.Dec 3 2018, 3:44 PM

@rnk Thanks for looking into this.

I'm not sure I understand:

The change I made is inside the function void X86TargetLowering::LowerAsmOperandForConstraint(, and under case 'i': { which specifically handles operands of the inline assembly constraint i. Inside this routine DOES have the PIC checking routine in order not to emit immediates in the case of PIC modes. There are already checks for other PIC styles but a check for RIPRel is missing. So I still believe this is the right place to fix this problem.

I see, that sounds reasonable then. Sorry, the diff is uploaded without -U9999 context, so I assumed it was in one of a more general routine.

rapidsna updated this revision to Diff 179500.EditedDec 25 2018, 11:52 PM

Thanks for looking into this @rnk. I am adding a unit test for this patch. Let me know if the test looks okay.

rnk accepted this revision.Dec 26 2018, 1:11 PM

lgtm, thanks! Do you need someone to commit this?

This revision is now accepted and ready to land.Dec 26 2018, 1:11 PM

@rnk thanks for looking into this!! yes, I need someone to commit it on my behalf.

rnk requested changes to this revision.Dec 27 2018, 11:29 AM

It looks like this was actually intentionally removed back in rL107727 to fix https://llvm.org/pr4752. This change breaks the test that was added for it, llvm/test/CodeGen/X86/2010-07-06-asm-RIP.ll. I guess, since this is GCC inline asm, the question is, how does GCC handle the 'i' constraint? I suspect that this new behavior may be closer to GCC, but it'll require more analysis.

This revision now requires changes to proceed.Dec 27 2018, 11:29 AM
rnk added a comment.Dec 27 2018, 11:42 AM

Here is the original test case from the PR:

#include <stdio.h>
int offset = 0;
static void foo2() { printf("foo2 is called\n"); }
void foo() {
  __asm__ volatile("movq %0,%%gs:(%1)" : : "ir"((void *)&foo2), "r"(offset));
  printf("foo is called\n");
}

If you compile with GCC, you will see that it generates the same assembly that LLVM does:

foo:
.LFB15:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    offset(%rip), %eax
#APP
# 21 "t.c" 1
        movq $foo2,%gs:(%eax)
# 0 "" 2
#NO_APP
        leaq    .LC1(%rip), %rdi
        call    puts@PLT
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc

So, I think LLVM is already correct here. You can confirm on compiler explorer: https://gcc.godbolt.org/z/rOHCb6

rapidsna abandoned this revision.Dec 27 2018, 6:23 PM

So with 'i' it should actually work as a PC-relative mode. Gotcha, that totally makes sense. Thanks for confirming this @rnk.