This is an archive of the discontinued LLVM Phabricator instance.

[llvm-ml] Add support for extern proc
ClosedPublic

Authored by ayzhao on May 13 2022, 1:27 PM.

Details

Summary

EXTERN PROC isn't really well documented in MSVC, so after poking around
it seems as if it's just a regular extern symbol.

Interestingly enough, under MSVC the following is allowed:

extern foo:proc

mov eax, foo

MSVC will output:

mov eax, 0

while llvm-ml will currently output:

mov eax, dword ptr [foo]

(since foo is an extern)

Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.

Diff Detail

Event Timeline

ayzhao created this revision.May 13 2022, 1:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 1:27 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
ayzhao requested review of this revision.May 13 2022, 1:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 1:27 PM
epastor added inline comments.May 13 2022, 1:41 PM
llvm/lib/MC/MCParser/MasmParser.cpp
6038

Can we check this before doing the lookup? It seems odd to do the complicated check first.

ayzhao updated this revision to Diff 429346.May 13 2022, 1:48 PM

Reorder if statements

ayzhao marked an inline comment as done.May 13 2022, 1:48 PM
epastor accepted this revision.May 13 2022, 2:03 PM
This revision is now accepted and ready to land.May 13 2022, 2:03 PM
epastor added inline comments.May 13 2022, 2:04 PM
llvm/test/tools/llvm-ml/indirect_branch.asm
270

This (and the label) could probably be hoisted outside of the ifdef.

ayzhao updated this revision to Diff 429350.May 13 2022, 2:07 PM

move label outside ifdef

ayzhao marked an inline comment as done.May 13 2022, 2:07 PM
This revision was landed with ongoing or failed builds.May 13 2022, 2:21 PM
This revision was automatically updated to reflect the committed changes.
hans added a subscriber: hans.May 16 2022, 3:57 AM

Interestingly enough, under MSVC the following is allowed:

extern foo:proc

mov eax, foo

MSVC will output:

mov eax, 0

while llvm-ml will currently output:

mov eax, dword ptr [foo]

(since foo is an extern)

Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.

Maybe I'm just not familiar enough with MASM, but both outputs seem odd to me. I would expect MSVC's output to be accompanied by a relocation, but it seems that's not the case?

"mov eax, foo" reads like "move the address of foo into eax" to me, which seems like it would be a sensible thing to do. What llvm-ml does is more like "load from foo into eax" which seems odd if "foo" is a function.

In any case, if we can't support it in a sensible way, can we at least error about it? And there should be a test.

Interestingly enough, under MSVC the following is allowed:

extern foo:proc

mov eax, foo

MSVC will output:

mov eax, 0

while llvm-ml will currently output:

mov eax, dword ptr [foo]

(since foo is an extern)

Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.

Maybe I'm just not familiar enough with MASM, but both outputs seem odd to me. I would expect MSVC's output to be accompanied by a relocation, but it seems that's not the case?

"mov eax, foo" reads like "move the address of foo into eax" to me, which seems like it would be a sensible thing to do. What llvm-ml does is more like "load from foo into eax" which seems odd if "foo" is a function.

It seems that the instruction for "move the address of foo into eax" should have a rip relative address anyways, just because foo is an extern. See: https://godbolt.org/z/Wsjq3KdvT

Based on that, I think llvm-ml is doing the right thing currently.

In any case, if we can't support it in a sensible way, can we at least error about it? And there should be a test.

Interestingly enough, under MSVC the following is allowed:

extern foo:proc

mov eax, foo

MSVC will output:

mov eax, 0

while llvm-ml will currently output:

mov eax, dword ptr [foo]

(since foo is an extern)

Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.

Maybe I'm just not familiar enough with MASM, but both outputs seem odd to me. I would expect MSVC's output to be accompanied by a relocation, but it seems that's not the case?

"mov eax, foo" reads like "move the address of foo into eax" to me, which seems like it would be a sensible thing to do. What llvm-ml does is more like "load from foo into eax" which seems odd if "foo" is a function.

It seems that the instruction for "move the address of foo into eax" should have a rip relative address anyways, just because foo is an extern. See: https://godbolt.org/z/Wsjq3KdvT

Based on that, I think llvm-ml is doing the right thing currently.

Better godbolt link that removes some unwanted inlining: https://godbolt.org/z/Kz4oTeYG5

In any case, if we can't support it in a sensible way, can we at least error about it? And there should be a test.

hans added a comment.May 17 2022, 5:11 AM

Okay, but even if llvm-ml's output is better, it should still have a test. And since it differs from MSVC's output an error or at least a warning is probably a good idea.