This is an archive of the discontinued LLVM Phabricator instance.

Win64: Don't use REX prefix for direct tail calls
ClosedPublic

Authored by hans on Sep 8 2016, 11:46 AM.

Details

Summary

The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.

Diff Detail

Repository
rL LLVM

Event Timeline

hans updated this revision to Diff 70733.Sep 8 2016, 11:46 AM
hans retitled this revision from to Win64: Don't use REX prefix for direct tail calls.
hans updated this object.
hans added reviewers: majnemer, rnk.
hans added a subscriber: llvm-commits.

Hmm, looks like we are over REXing there too:

void __declspec(dllimport) h();
void f() {
  h();
}

turns into:

f:                                      # @f
.Ltmp0:
.seh_proc f
# BB#0:                                 # %entry
.Ltmp1:
	.seh_endprologue
	rex64 jmpq	*__imp_h(%rip)  # TAILCALL
                                        # encoding: [0x48,0xff,0x25,A,A,A,A]
                                        #   fixup A - offset: 3, value: __imp_h-4, kind: reloc_riprel_4byte
	.seh_handlerdata
	.text
.Ltmp2:
	.seh_endproc

My reading indicates that the unwinder will consider this as a tail call. If anything, our REX prefix is probably confusing it.

I imagine that MSVC omits it here?

hans added a comment.Sep 8 2016, 1:45 PM

Hmm, looks like we are over REXing there too:

void __declspec(dllimport) h();
void f() {
  h();
}

turns into:

f:                                      # @f
.Ltmp0:
.seh_proc f
# BB#0:                                 # %entry
.Ltmp1:
	.seh_endprologue
	rex64 jmpq	*__imp_h(%rip)  # TAILCALL
                                        # encoding: [0x48,0xff,0x25,A,A,A,A]
                                        #   fixup A - offset: 3, value: __imp_h-4, kind: reloc_riprel_4byte
	.seh_handlerdata
	.text
.Ltmp2:
	.seh_endproc

My reading indicates that the unwinder will consider this as a tail call. If anything, our REX prefix is probably confusing it.

I imagine that MSVC omits it here?

No, they emit it:

d:\src\tmp>type a.cc
void __declspec(dllimport) h();
void f() {
  h();
}

d:\src\tmp>cl /c /Ox a.cc && dumpbin /disasm a.obj
Microsoft (R) C/C++ Optimizing Compiler Version 18.00.31101 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

a.cc
Microsoft (R) COFF/PE Dumper Version 12.00.31101.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file a.obj

File Type: COFF OBJECT

?f@@YAXXZ (void __cdecl f(void)):
  0000000000000000: 48 FF 25 00 00 00  jmp         qword ptr [__imp_?h@@YAXXZ]
                    00

  Summary

          64 .debug$S
          2F .drectve
           7 .text$mn

It seems that not all indirect jumps should be REX prefixed: https://github.com/dotnet/coreclr/blob/master/src/unwinder/amd64/unwinder_amd64.cpp#L1378

Hmm, it seems for this specific one they can handle both with and without REX prefix.

Does TAILJMPm64_REX emit something that would match https://github.com/dotnet/coreclr/blob/master/src/unwinder/amd64/unwinder_amd64.cpp#L1389 ?

Yes, I believe TAILJMPm64_REX is what we use in your dllimport example, and it seems that matches both if-statements here.

Perhaps the unwinder used to be stricter but then they had to change it to accept more non-rexed versions?

majnemer accepted this revision.Sep 8 2016, 3:20 PM
majnemer edited edge metadata.

LGTM

This revision is now accepted and ready to land.Sep 8 2016, 3:20 PM
This revision was automatically updated to reflect the committed changes.