This is an archive of the discontinued LLVM Phabricator instance.

[InlineAsm] Support call function label in x86 inline asm with PIC
Needs Review · Public

Authored by xiangzhangllvm on Sep 7 2021, 5:58 PM.

Details

Summary

In the Linux PIC model, the Global Address (GA) of a Global Variable (GV) is obtained by loading its GOT slot.
Different instructions interpret a Global Address differently.

For example, on X86:

  1. To assign a value to GV, we can use "MOV GV, ..."; it corresponds to "MOV (Global Address) ...".
  2. To get the address of GV, we can use "LEA GV, ..."; it corresponds to "LEA (Global Address) ...".
  3. But to call a label (GV can be a label), we use "CALL GV", which is not equivalent to "CALL (Global Address)".
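
The three cases above can be mirrored in plain C++ as a rough sketch. The names `GV`, `callee`, `store_to_gv`, `address_of_gv`, and `call_label` are illustrative and do not come from the patch; the instruction names in the comments only indicate which kind of use each function stands for.

```cpp
#include <cassert>

// Illustrative globals; the names are not taken from the patch.
int GV = 3;
int callee() { return 42; }

// Case 1 (MOV-like): the use accesses the memory at GV's address.
void store_to_gv(int value) { GV = value; }

// Case 2 (LEA-like): the use needs only the address, with no memory access.
int *address_of_gv() { return &GV; }

// Case 3 (CALL-like): the use transfers control to the symbol's address;
// it does not read the bytes stored there as data.
int call_label() { return callee(); }
```

In a normal compilation each of these uses is lowered on its own, which is why the compiler can pick the right form for each one.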

So the Global Address of a Global Variable clearly has to be handled case by case.
Normal IR/MIR does this, because each use of the Global Address there has a single purpose.
Inline asm is different: it is represented by only one IR/MIR node but may contain
many instructions that use the same or different Global Addresses for multiple purposes.
What is more, LLVM does not distinguish the individual instructions inside the inline asm IR/MIR.

TODO: Other targets may also have this problem and would need to be fixed too. It is an architectural defect in LLVM inline asm.

Take a concrete example, test.c (clang t.c -fasm-blocks -S -fpic -emit-llvm test.c ):

extern void sincos();
int GV=3;
int Arr[10] = {1,};
void foo() {
 asm {
   lea r8, Arr
   mov rax, GV
   lea r9, sincos
   call sincos
   ret }
}

It generates the following wrong code:

movq    sincos@GOTPCREL(%rip), %rsi
movq    GV@GOTPCREL(%rip), %rdx
movq    Arr@GOTPCREL(%rip), %rcx
#APP

leaq    (%rcx), %r8
movq    (%rdx), %rax         //  Unlike "MOV GV", CALL does not read the contents of the function sincos
leaq    (%rsi), %r9          //  CALL is also unlike "LEA sincos": LEA gets the address without a dereference
callq   *(%rsi)              //  Should call sincos directly --> callq   *%rsi  --> in the normal case (non-large code model) it should be "call sincos@PLT"
retq

#NO_APP
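
The extra level of indirection in the listing above can be modelled in C++ as a sketch. Here `got_GV` and `got_callee` are hypothetical stand-ins for GOT slots, and `callee` stands in for `sincos`; none of these names come from the patch, and the assembly in the comments only shows the shape of each sequence.

```cpp
#include <cassert>

int GV = 3;
int callee() { return 42; }  // stand-in for sincos

// Hypothetical GOT slots: each one stores the address of a symbol.
int *const got_GV = &GV;
int (*const got_callee)() = &callee;

// GOT load, then movq (%reg), %dst:
// MOV wants the contents, so one dereference after the GOT load is right.
int mov_like() {
  int *addr = got_GV;
  return *addr;
}

// GOT load, then leaq (%reg), %dst:
// LEA wants the address itself, which is what the GOT load produced.
int *lea_like() { return got_GV; }

// GOT load, then callq *%reg:
// CALL must jump to the loaded address. Treating that address as a memory
// operand, as in `callq *(%rsi)`, would read the callee's code bytes as a
// pointer instead.
int call_like() {
  int (*target)() = got_callee;
  return target();
}
```

The point of the model is that the same GOT load feeds all three uses, but only the MOV-like use should dereference the loaded value.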

Event Timeline

xiangzhangllvm created this revision. Sep 7 2021, 5:58 PM
xiangzhangllvm requested review of this revision. Sep 7 2021, 5:58 PM
Herald added a project: Restricted Project. Sep 7 2021, 5:58 PM
xiangzhangllvm edited the summary of this revision. Sep 7 2021, 6:45 PM

The failing test lower-em-sjlj-indirect-setjmp.ll is unrelated to this patch (it still fails with this patch removed).

xiangzhangllvm edited the summary of this revision.

Update for clang-format

Update {NUM:P} in LangRef

I have a general question. Can we handle it in X86DAGToDAGISel::PreprocessISelDAG()? We can create a new "X86ISD::Wrapper" and replace the corresponding memory operand of the INLINEASM node with the "X86ISD::Wrapper" node. Here is a pseudo example.

Transform the DAG

t0: ch = EntryToken
    t12: i64 = X86ISD::WrapperRIP TargetGlobalAddress:i64<void (...)* @bar> 0 [TF=5]
  t14: i64,ch = load<(load (s64) from got)> t0, t12, undef:i64
t10: ch,glue = inlineasm t0, TargetExternalSymbol:i64'call qword ptr ${0:P}
      ret', MDNode:ch<0x6b0f848>, TargetConstant:i64<13>, TargetConstant:i32<196622>, t14, TargetConstant:i32<12>, Register:i32 $df, TargetConstant:i32<12>, Register:i16 $fpsw, TargetConstant:i32<12>, Register:i32 $eflags

to

t0: ch = EntryToken
    t12: i64 = X86ISD::WrapperRIP TargetGlobalAddress:i64<void (...)* @bar> 0 [TF=5]
  t14: i64,ch = load<(load (s64) from got)> t0, t12, undef:i64
  t120: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<void (...)* @bar> 0
t10: ch,glue = inlineasm t0, TargetExternalSymbol:i64'call qword ptr ${0:P}
      ret', MDNode:ch<0x6b0f848>, TargetConstant:i64<13>, TargetConstant:i32<196622>, t120, TargetConstant:i32<12>, Register:i32 $df, TargetConstant:i32<12>, Register:i16 $fpsw, TargetConstant:i32<12>, Register:i32 $eflags
llvm/docs/LangRef.rst
5047

Is this accurate? It seems we print the symbol operand only in PIC mode and in the Intel dialect.

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
1189

The syntax should be ensured by the front end when it emits IR, right?

1222

Drop brace.

1246

Is it possible that there are multiple memory operands?

1253

This code is X86 target specific. It assumes the SIB memory addressing mode of X86. I think we should refine this function in the X86 target.

1254

Why do we assume the 2nd operand is the index operand? Is it always true?

1255

Why do we assume the index register is noreg?
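
For context on the last two questions, and assuming the code under review follows LLVM's usual five-operand x86 memory reference (base, scale, index, displacement, segment, matching the order of the `X86::AddrBaseReg` through `X86::AddrSegmentReg` enumerators), the sketch below mirrors that order. In it the index register is operand number 2 counting from zero. The `MemRef` struct and `effectiveAddress` helper are hypothetical illustrations, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Operand positions within an x86 memory reference, mirroring LLVM's order.
enum AddrOperand { BaseReg = 0, ScaleAmt = 1, IndexReg = 2, Disp = 3, SegmentReg = 4 };

// Hypothetical value-level model: register fields hold register contents, and
// hasIndex == false plays the role of "the index register is noreg".
struct MemRef {
  std::uint64_t base;
  std::uint64_t scale;
  std::uint64_t index;
  bool hasIndex;
  std::int64_t disp;
};

// Computes base + scale * index + disp; the segment is ignored in this sketch.
std::uint64_t effectiveAddress(const MemRef &M) {
  std::uint64_t addr = M.base + static_cast<std::uint64_t>(M.disp);
  if (M.hasIndex)
    addr += M.scale * M.index;
  return addr;
}
```

A reference such as `(%rsi)` has only a base, so code that handles just that shape would see no index register, but other shapes can carry one.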

1401

Maybe create a target function to determine whether we want to transform operands for call instructions. The function returns false by default.
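
One way to read this suggestion, as a hedged sketch: a hook on a target interface that defaults to false and is overridden by X86. `TargetHooks`, `X86Hooks`, `shouldTransformInlineAsmCallOperands`, and `wantsCallOperandTransform` are hypothetical names chosen for illustration; they are not existing LLVM APIs and not what the patch implements.

```cpp
#include <cassert>

// Hypothetical interface; not an existing LLVM class.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Default: targets do not rewrite call operands in inline asm.
  virtual bool shouldTransformInlineAsmCallOperands() const { return false; }
};

// Hypothetical X86 override that opts in.
struct X86Hooks : TargetHooks {
  bool shouldTransformInlineAsmCallOperands() const override { return true; }
};

// Target-independent code would query the hook instead of hard-coding X86.
bool wantsCallOperandTransform(const TargetHooks &T) {
  return T.shouldTransformInlineAsmCallOperands();
}
```

With this shape, targets that do not override the hook keep their current behaviour.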

1486

Drop brace.

1511

Drop the brace.

llvm/lib/Target/X86/X86AsmPrinter.cpp
611

Do we have any test case where MI->getOperand(OpNo) is a local symbol?

llvm/test/CodeGen/X86/inline-asm-call.ll
5

Is it necessary to mix -code-model=large with i386-unknown?

gcc -mcmodel=large -m32 t2.c -m32
cc1: error: code model ‘large’ not supported in the 32 bit mode

47

Add "nounwind" to avoid CFI instructions.

82

rsi holds the GOT base. What does $sincos@GOT mean? Is it the offset of the sincos entry from the GOT base?

pengfei added inline comments. Oct 2 2021, 8:08 AM
llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
1441

Maybe we can use j here? Changing to J while leaving i looks odd.

1455

ditto.

xiangzhangllvm added a comment. Edited Oct 7 2021, 6:02 PM

I have a general question. Can we handle it in X86DAGToDAGISel::PreprocessISelDAG()? We can create a new "X86ISD::Wrapper" and replace the corresponding memory operand of the

PreprocessISelDAG was my first idea. The reason I did not do it there is that in IR/SDNode the GV corresponds to only one value/node, while we may have different uses of this GV (e.g. call, lea, ...) that are lowered differently according to their users (IRs/SDNodes). For InlineAsm it is different: multiple uses may occur in one user (the InlineAsm IR/SDNode), so it is hard to distinguish them during lowering.
Another reason I do it here is that I want to limit the impact of the change. This is a good place: the IR/SDNode optimizations have finished and we are just emitting the inline asm, so all the related info is easy to get.

Adding a Wrapper is a way to resolve the one "value/node" problem, but it still needs lowering to add the "PLT" flag for call GV, which I remember is added by checking its user (this is not clear from the wrapper).
Let me take a look. Thanks for the suggestion and all your reviews.

xiangzhangllvm added inline comments. Oct 7 2021, 11:54 PM
llvm/docs/LangRef.rst
5047

I