This is an archive of the discontinued LLVM Phabricator instance.

[X86][MS-InlineAsm] Add constraint *m for memory access w/ global var
ClosedPublic

Authored by skan on Nov 3 2021, 4:41 AM.

Details

Summary

Constraint *m should be used when the address of a variable is passed
as a value. And the constraint is missing for MS inline assembly when sth
is written to the address of the variable.

The missing would cause FE delete the definition of the static varible,
and then result in "undefined reference to xxx" issue.

Diff Detail

Event Timeline

skan created this revision.Nov 3 2021, 4:41 AM
skan requested review of this revision.Nov 3 2021, 4:41 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 3 2021, 4:41 AM
skan updated this revision to Diff 384401.Nov 3 2021, 4:45 AM

clang-format the diff

How about the GV is function pointer? I believe @xiangzhangllvm has lots of experience on it :)

clang/test/CodeGen/ms-inline-asm-static-variable.c
9

IIRC, we have limitation on the number. Can you check if it works when we have more than 10 global variables.

clang/test/CodeGen/ms-inline-asm-variables.c
35

Unrelated change.

llvm/test/CodeGen/X86/ms-inline-asm-array.ll
16–23

These seem unnecessary.

skan marked 3 inline comments as done.Nov 3 2021, 4:58 PM
skan added inline comments.
clang/test/CodeGen/ms-inline-asm-static-variable.c
9

sorry, which number?

clang/test/CodeGen/ms-inline-asm-variables.c
35

Just delete a blank line, which is done by auto clang-format. I think it's okay.

skan marked 2 inline comments as done.Nov 3 2021, 4:59 PM
skan added inline comments.
llvm/test/CodeGen/X86/ms-inline-asm-array.ll
16–23

This is necessary b/c we need to check the inline-asm IR can be rerwritten to correct assmbly after constraint "*m" is added.

xiangzhangllvm added inline comments.Nov 3 2021, 6:37 PM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1759

Let me generally tell out my understand here, (If wrong PLS correct me)
Here from the comments we can see, the old code want to keep the origin symbol of global variable to let linker (relocation) handle it. Here you describe it with a pointer (with decl), it change to form of $ID <--> (decl), So which need constrain it with "*m". But if the pointer can not be access from BaseReg(Rip) + Index(Ip) how do you descript the pointer you generate out ?

2554–2555

The change here looks too arbitrary. For global address it is ok to drop the base, it mainly fetch from offset. but if here not global variable?

pengfei added inline comments.Nov 3 2021, 8:50 PM
clang/test/CodeGen/ms-inline-asm-static-variable.c
9

$0. I mean if we have arr1, arr2, ..., we should have $1, $2, ... but the max number is $9 if I remember it correctly.

skan marked 3 inline comments as done.Nov 3 2021, 10:18 PM
skan added inline comments.
clang/test/CodeGen/ms-inline-asm-static-variable.c
9

We does not touch that part of the code. If there was this limitation before, there is now.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1759

I think you may misunderstand this code.

This code handles the memory that can not be represented by Disp(RIP) b/c there is already a BaseReg or IndexReg there.

Before this patch, the memory is represented like arr[edx*4] and there is no identifer bound to it. And after this patch, we bind the memory to identifer arr.

2554–2555

RIP and IndexReg can never be used together according to design of X86 instruction. The BaseReg is set to RIP b/c we add the constaint "*m" in the MS-inline assembly. So we can drop it safely.

I acknowledge it's not the best way to do it, but it's simplest. Similary, you can see the line 2560-2562. RSP can never be a IndexReg, but we just swap the BaseReg w/ IndexReg b/c it is not handled well in previous phases.

xiangzhangllvm added inline comments.Nov 3 2021, 11:46 PM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1759

Yes, that is why I mean t change ( arr[edx*4] ) to form of $ID <--> (decl). So what the problem if we let the old form ( arr[edx*4] ) being ?

2554–2555

RIP and IndexReg can never be used together according to design of X86 instruction.

Good answer!

skan marked 3 inline comments as done.Nov 4 2021, 1:18 AM
skan added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1759

Then the definition of arr would be deleted by C Front end, and result in "undefined reference to xxx" issue during link stage.

xiangzhangllvm accepted this revision.Nov 4 2021, 2:16 AM
This revision is now accepted and ready to land.Nov 4 2021, 2:16 AM
thakis added a subscriber: thakis.Nov 4 2021, 7:26 PM

Looks like this breaks tests on Mac: http://45.33.8.238/mac/38266/step_11.txt

Please take a look and revert for now if it takes a while to fix.

skan added a comment.Nov 4 2021, 8:00 PM

Looks like this breaks tests on Mac: http://45.33.8.238/mac/38266/step_11.txt

Please take a look and revert for now if it takes a while to fix.

I can quickly fix it.

It's the issue of ambiguous triple. You are using the -mtriple=x86_64-apple-mac* while I am using -mtriple=x86_64-unknown-linux-gnu

skan added a comment.Nov 4 2021, 8:09 PM

Looks like this breaks tests on Mac: http://45.33.8.238/mac/38266/step_11.txt

Please take a look and revert for now if it takes a while to fix.

I can quickly fix it.

It's the issue of ambiguous triple. You are using the -mtriple=x86_64-apple-mac* while I am using -mtriple=x86_64-unknown-linux-gnu

Fixed by 6d03227c16ee

@skan This change seems don't work for -fpic, see https://godbolt.org/z/h3nWoerPe
I don't have any idea to handle these cases, I suggest we should revert this patch first.