This is an archive of the discontinued LLVM Phabricator instance.

[x86][inline-asm][llvm]Amend size directive deduction mechanism of unsized memory operands
AbandonedPublic

Authored by coby on Apr 28 2017, 12:16 AM.

Details

Summary

This is an extension of the work being carried by the following change:
https://reviews.llvm.org/D26586
This commit handles cases where the size qualifier of an indirect memory reference operand in Intel syntax is missing (e.g. "vaddps xmm1, xmm2, [a]").
GCC will deduce the size qualifier based on the possible matches:
"vaddps xmm1, xmm2, [a]" matches only “XMMWORD PTR” qualifier.
"vaddps xmm1, xmm2, [a]{1to4}" matches only “DWORD PTR” qualifier.
"mov rax, [a]" matches only "QWORD PTR"

Currently, size directive will be deduced based on the size of the memory operand (apart from those cases which were handled by D26586).
For example:
"vaddps xmm1, xmm2, [a]"
"char a;" will imply "BYTE PTR" qualifier
"short a;" will imply "WORD PTR" qualifier.

This commit aligns LLVM to GCC’s behavior.

This is the LLVM part of the review.
Clang part can be found here:
https://reviews.llvm.org/D32638

Diff Detail

Repository
rL LLVM

Event Timeline

coby created this revision.Apr 28 2017, 12:16 AM
coby edited the summary of this revision. (Show Details)Apr 28 2017, 12:20 AM
rnk edited edge metadata.May 3 2017, 4:38 PM

GCC doesn't support Intel inline asm, so I'm confused how we can be aligning to their behavior. Are we matching gas, gcc, or msvc here?

coby added a comment.EditedMay 3 2017, 11:24 PM

@rnk

consider the following code:

char c;
void foo() {
__asm("mov rax, [c]");
}

compiled with -masm=intel, using GCC 7.1
is lowered to (intel style disassembly):
mov rax, qword ptr c_address
similarly - you can achieve the same effect by using the following directive (and the 'noprefix' option) - ".intel_syntax noprefix"

Now, as GCC merely pass the bucket to AS, you may say it is AS we are matching
Equivalently, you may say we actually match GCC, as inline-asm is (also) a high-level-source-code entity - i.e. we temper inline-asm here only - we do not affect actions being carried at an assembly level.
I'm more keen to support the second statement, but it's hardly matters.

rnk added a comment.May 4 2017, 8:41 AM

I think we had a user file this bug at https://bugs.llvm.org//show_bug.cgi?id=28266

Is this change testable from LLVM?

rnk added a subscriber: mcrosier.May 4 2017, 9:06 AM

rL166316 was the commit that originally added the AOK_SizeDirective logic for inline asm. @mcrosier said "This allows the backend to do proper instruction selection." I think now that we have the unsized mem operand matching loop, we don't need this rewrite at all, it's just getting in the way.

rnk added a comment.May 4 2017, 9:31 AM
In D32636#746140, @rnk wrote:

rL166316 was the commit that originally added the AOK_SizeDirective logic for inline asm. @mcrosier said "This allows the backend to do proper instruction selection." I think now that we have the unsized mem operand matching loop, we don't need this rewrite at all, it's just getting in the way.

I take this back, we do need it. I had a vague memory that MSVC doesn't do any of this memory operand size disambiguation based on frontend information, but that's not true. There are a few instructions that take registers, but can have varying memory operand size, like movzx:

void t13() {
  char i = 1;
  short j = 2;
  __asm movzx eax, [i]
  __asm movzx eax, [j]
}

MSVC's object file disassembles as:

00000000: 0F B6 05 00 00 00  movzx       eax,byte ptr [_i]
          00
00000007: 0F B7 05 00 00 00  movzx       eax,word ptr [_j]
          00

We need the size from the frontend to disambiguate this.

rnk added a comment.May 4 2017, 11:35 AM

Does rL302179 solve the problems you were having? It should reduce the number of places we need to do these ugly rewrites, and make inline asm parsing more consistent with standalone asm parsing.

coby abandoned this revision.Jun 26 2017, 6:25 AM

superseded by rL302179