This is an archive of the discontinued LLVM Phabricator instance.

[InlineAsm][bugfix] Correct function addressing in inline asm
ClosedPublic

Authored by xiangzhangllvm on Sep 14 2022, 11:30 PM.

Details

Summary

In Linux PIC model, there are 4 cases about value/label addressing:

Case 1: Function call or Label jmp inside the module.
Case 2: Data access (such as global variable, static variable) inside the module.
Case 3: Function call or Label jmp outside the module.
Case 4: Data access (such as global variable) outside the module.

Due to current llvm inline asm architecture designed to not "recognize" the asm
code, there are quite troubles for us to treat mem addressing differently for
same value/adress used in different instuctions.
For example, in pic model, call a func may in plt way or direclty pc-related,
but lea/mov a function adress may use got.

This patch fix/refine the case 1 and case 3 in inline asm.
(I prefer to fix/refine case 2 and case 4 problems later, in another patch)
Due to currently inline asm didn't support jmp the outsider lable, this patch
mainly focus on fix the function call addressing bugs in inline asm.

Example:
Before this fix we can't successfully/correctly build the test https://godbolt.org/z/jGvEET3q3 with clang
(The test is the source file of lit test llvm/test/CodeGen/X86/inline-asm-function-call-pic.ll )

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2022, 11:30 PM
xiangzhangllvm requested review of this revision.Sep 14 2022, 11:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2022, 11:30 PM
llvm/test/CodeGen/X86/inline-asm-p-constraint.ll
14

@jonpa I think here you may mis-understand lea, so I updated here.
And I think the mov (%rdi) should be mov %rdi in constrain of "p" you add before.

At first I updated the 'p' action to "direct action" (without load: mov %rdi), but it affect systemZ tests which I am not familiar with. So I want to discuss with you about the action of "p" here.

"p" should be address which equal with "direct mem (without load)".
So in DAG build (pls refer the change in SelectionDAGBuilder.cpp) the OpInfo.isIndirect should always be false;

Update comments

xiangzhangllvm edited the summary of this revision. (Show Details)Sep 15 2022, 12:12 AM
xiangzhangllvm edited the summary of this revision. (Show Details)
xiangzhangllvm edited the summary of this revision. (Show Details)Sep 15 2022, 12:17 AM
xiangzhangllvm edited the summary of this revision. (Show Details)Sep 15 2022, 12:22 AM
xiangzhangllvm edited the summary of this revision. (Show Details)Sep 15 2022, 12:30 AM
jonpa added inline comments.
llvm/test/CodeGen/X86/inline-asm-p-constraint.ll
14

Yes, as far as I understand "p" (address) is not "indirect" - the address is specified at the source level and not used to access memory. It's when you pass an address to a memory operand the "indirect" flag is added.

a few style comments

llvm/include/llvm/CodeGen/TargetLowering.h
3533

Add a description comment

3533

Add a description comment

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

Add a description comment

llvm/include/llvm/IR/InlineAsm.h
339

update assertion message?

356–357

add an assertion message

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
1325

auto *TGA = dyn_cast<>

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8884

isa_and_nonnull<Function> ?

+ @nickdesaulniers who once worked for branch issue of inline asm.

clang/test/CodeGen/ms-inline-asm-functions.c
25 ↗(On Diff #460309)

Why is it affected? Is missing a * before the label?

35 ↗(On Diff #460309)

Due to currently inline asm didn't support jmp the outsider lable, this patch
mainly focus on fix the function call addressing bugs in inline asm.

This seems a counter example.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8839

No need to define function that only called once.

8889

Can we just pass the string and handle it in the function?

clang/test/CodeGen/ms-inline-asm-functions.c
25 ↗(On Diff #460309)

The dllimport mark it is a inter linkage, so we can directly call this function.

llvm/test/CodeGen/X86/inline-asm-p-constraint.ll
14

Yes, but now I pass 'p' for "ptr %Ptr" in ""mov $1", it still load the context of "ptr %Ptr".
So this is a problem.
I prefer to fix it.

clang/test/CodeGen/ms-inline-asm-functions.c
35 ↗(On Diff #460309)

This will be bug in pic model.

llvm/include/llvm/CodeGen/TargetLowering.h
3533

Sure, I'll add it, thanks.

llvm/include/llvm/IR/InlineAsm.h
339

Ok, let me update it. Before this patch the function is belong to Mem kind.
Currently we split them.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8839

The other target must have some bugs, so I create this function for further use.

8884

OK, good style!

8889

Yes, fold upper code into a function call is better!

LuoYuanke added inline comments.Sep 17 2022, 1:02 AM
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

It seems each target has it's own defintion for the return value. Is it interpreted by target when it is used?

llvm/test/CodeGen/X86/inline-asm-function-call-pic.ll
13

Remove the original line number of the source code?

LuoYuanke added inline comments.Sep 17 2022, 1:12 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8892

What's the difference between C_Address and C_Memory?

llvm/test/CodeGen/X86/inline-asm-function-call-pic.ll
72

Do we need these attribute for the test case?

xiangzhangllvm added inline comments.Sep 17 2022, 6:05 PM
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

Yes, Each target has it's own definition of this function to handle call instruction. This is a virtual function on the base class TargetSubtargetInfo. The target will call their self's classifyGlobalFunctionReference for handle function.

315

Sure

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8892

This is added by jonpa, It mean address (mem without load), I think the implement of it is still need to refine, pls refer our discussion on test llvm/test/CodeGen/X86/inline-asm-p-constraint.ll.

llvm/test/CodeGen/X86/inline-asm-function-call-pic.ll
72

Let me update it.

xiangzhangllvm edited the summary of this revision. (Show Details)Sep 17 2022, 6:06 PM
xiangzhangllvm marked 11 inline comments as done.Sep 20 2022, 12:47 AM
xiangzhangllvm added inline comments.
clang/test/CodeGen/ms-inline-asm-functions.c
25 ↗(On Diff #460309)

typo: inter linkage --> internal linkage

RKSimon added inline comments.Sep 26 2022, 3:28 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8754

Pass SDValue by value not reference

xiangzhangllvm added inline comments.Sep 26 2022, 5:36 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8754

Make sense! Here we not change the Op, thanks for reviewing!

xiangzhangllvm edited the summary of this revision. (Show Details)
xiangzhangllvm marked an inline comment as done.
pengfei added inline comments.Sep 27 2022, 2:06 AM
clang/test/CodeGen/ms-inline-asm-functions.c
25 ↗(On Diff #460309)

I understand it on the opposite side, the dllimport means it is always an external function, so it must be an indirect call here.
https://github.com/MicrosoftDocs/cpp-docs/blob/main/docs/build/importing-into-an-application-using-declspec-dllimport.md

llvm/lib/IR/InlineAsm.cpp
65
xiangzhangllvm added inline comments.Sep 27 2022, 3:44 AM
clang/test/CodeGen/ms-inline-asm-functions.c
25 ↗(On Diff #460309)

Thank you point out it! Seems I mix it with "dllexport", the dllexport mean this is self module symbol. let me check here.

llvm/lib/IR/InlineAsm.cpp
65

Good catch! let's enhance here. Thank you so much!

escape dllimport function call

ping, our customer are waiting for this fix. thanks.

pengfei added inline comments.Oct 9 2022, 2:26 AM
llvm/include/llvm/CodeGen/TargetLowering.h
3536

How about the target branch is used by other instructions? E.g.
call void asm "lea r8, $0\0A\09call qword ptr ${0:P}\0A\09ret"

3540

tranch -> branch

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

Where's X86 implementation? I didn't find it.

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
411

This change seems not related. Better not to modify it.

1321–1322

This seems need re-format?

1324

Didn't see we check AdjustFlag here.

1325

Comment not addressed.

llvm/lib/Target/X86/X86ISelLowering.cpp
32342

I don't think we have call $1,... in all case.

32352–32357

The code seems fragile. How about using regex:

std::string Call = formatv("call.*\\${?{0}", OpNo);
for (auto &AsmStr : AsmStrs)
  if (Regex(Call).match(AsmStr))
    return true;
return false;
llvm/test/CodeGen/X86/inline-asm-function-call-pic.ll
32

The comments don't match with actual one, please remove it.

63

ditto.

xiangzhangllvm added inline comments.Oct 9 2022, 6:05 PM
llvm/include/llvm/CodeGen/TargetLowering.h
3536

We shouldn't generate the different kind uses (target branch use and data use) with same operand index. Even they are some value.
For example the intel inline asm will generate different index for different uses (even same value).
For user who manually write the index should appoint different index (with different constraints) for different kind uses.

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

Exist code, Not in the patch:
Target/X86/X86Subtarget.cpp: X86Subtarget::classifyGlobalFunctionReference

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
411

Sorry, this refine should be at line 1325

1321–1322

Clang format ?

llvm/lib/Target/X86/X86ISelLowering.cpp
32342

here handle 3 case: $1, ${1: and $1 in the end of string.

32352–32357

good idea! Let me have a try, thanks a lot!

pengfei added inline comments.Oct 9 2022, 8:17 PM
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
315

Got it, thanks!

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
1321–1322

Yes.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8843

I see the GlobalISel handles argument and result separately https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp#L289-L290. No sure if there's problem to handle together here, e.g., a constrain is both argument and output https://godbolt.org/z/98sne5h4x or other corner cases?

xiangzhangllvm added inline comments.Oct 9 2022, 11:27 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8843

Here ArgNo is different conception with the "ArgNo" in https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp#L289-L290.
It mean the index of OpInfo, it both work for GlobalISel and non-GlobalISel.
Maybe "OpNo" is more clear. Let me rename it .

RKSimon added inline comments.Oct 10 2022, 5:25 AM
llvm/include/llvm/CodeGen/TargetLowering.h
3543

Why not ArrayRef<StringRef> ?

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8755

SDValue has a bool operator so we don't need to use getNode()

if (Op && Op.getOpcode() == ISD::GlobalAddress) {
xiangzhangllvm added inline comments.Oct 10 2022, 6:00 PM
llvm/include/llvm/CodeGen/TargetLowering.h
3543

Because we collect SmallVector<StringRef> AsmStrs in
SelectionDAGBuilder.cpp: (line 8808)

IA->collectAsmStrs(AsmStrs);

So here we keep use the same type for AsmStrs.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
8755

That is better, let me try, thanks so much!

xiangzhangllvm marked 11 inline comments as done.Oct 10 2022, 11:07 PM
xiangzhangllvm added inline comments.
llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
1321–1322

Do clang format. No change.

llvm/lib/Target/X86/X86ISelLowering.cpp
32352–32357

Enhance in getInstrStrFromOpNo.

xiangzhangllvm marked an inline comment as done.
xiangzhangllvm added inline comments.Oct 12 2022, 1:02 AM
llvm/lib/IR/InlineAsm.cpp
65

The StringRef has bug about do split for empty string. So I handle the empty case.

llvm/lib/Target/X86/X86ISelLowering.cpp
32352–32357

Enhance in getInstrStrFromOpNo.

32352–32357

The regex can't handle the "$1" from "$12", and not clear for readability, So I choose to enhance in getInstrStrFromOpNo (remove the label affect too).

pengfei added inline comments.Oct 12 2022, 1:18 AM
llvm/lib/IR/InlineAsm.cpp
72

When split failed, the size of AsmStrs is 1 rather than empty.

xiangzhangllvm added inline comments.Oct 12 2022, 1:57 AM
llvm/lib/IR/InlineAsm.cpp
72

That is really a defect for inline asm use various delimiters : (
Let me update here, thanks.

I think there is no big problem after so long time reviews.
Currently only X86 backend implement TLI.isInlineAsmTargetBranch(AsmStrs, OpNo).
It is safe for other targets to go the old way (however, it is not suggested to keep the bug) if they do not implement TLI.isInlineAsmTargetBranch.

pengfei accepted this revision.Oct 12 2022, 7:50 PM

LGTM :)

This revision is now accepted and ready to land.Oct 12 2022, 7:50 PM

LGTM :)

Thank you so much!!
And also thanks for all the review !!

This revision was landed with ongoing or failed builds.Oct 13 2022, 7:18 PM
This revision was automatically updated to reflect the committed changes.
uabelho added inline comments.
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
317

Shouldn't the already existing X86 version of this method be marked with "override" now that it exist in the base class?

Same with the version you added in AArch64 in
https://reviews.llvm.org/rGd0269dd059b7a9f080f76e1c54285fe58c8c938e
?

Bots currently fail:
https://lab.llvm.org/buildbot/#/builders/57/builds/22874

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h
317

Yes, thank you, sorry for not see this warning in time.