This is an archive of the discontinued LLVM Phabricator instance.

This patch attempts to primitive support for Win64 asan
ClosedPublic

Authored by wang0109 on Jun 1 2016, 3:43 PM.

Details

Summary

Some known issues are:

  • When "head" include instructions that involve branching, the "cut and paste" approach may break down in a way that function interception still work but calling back the original function does not work.
  • The jmp [rip -8] saves some bytes in the "head" but finding the safe zone of 0xCC is not implemented yet. So it may stomp on preceding codes.
  • The shadow offset is not working yet on Win64. More complexity maybe involved since there are some differences regarding virtual address space between Window 8 and Windows 8.1/10.

Diff Detail

Repository
rL LLVM

Event Timeline

wang0109 updated this revision to Diff 59295.Jun 1 2016, 3:43 PM
wang0109 retitled this revision from to This patch attempts to primitive support for Win64 asan.
wang0109 updated this object.
wang0109 added reviewers: etienneb, rnk.
wang0109 added a subscriber: llvm-commits.
kcc added a subscriber: kcc.Jun 1 2016, 3:55 PM
kcc added inline comments.
lib/asan/asan_interceptors.cc
733 ↗(On Diff #59295)

define SANITIZER_WIN64 similar to SANITIZER_LINUX and use it instead of _WIN64

734 ↗(On Diff #59295)

Then define SANITIZER_INTERCEPT_MEMCPY similar to SANITIZER_INTERCEPT_MEMCHR
Make it 0 under SANITIZER_WIN64, and there explain why.

lib/interception/interception_win.cc
46 ↗(On Diff #59295)

I hope someone else can review this code :)

wang0109 updated this revision to Diff 59326.Jun 1 2016, 6:12 PM
  • Define the platform SANITIZER_WINDOWS64

and the double interception was due to memcpy and memmov
has the same address on windows 64. So don't try to intercept
twice.

kcc added a comment.Jun 1 2016, 6:24 PM

LGTM for non-windows code.
Please let someone else review the code in lib/interception/interception_win.cc

etienneb added inline comments.Jun 2 2016, 5:06 AM
lib/interception/interception_win.cc
48 ↗(On Diff #59326)

I'm used to look instructions encoding on this site: http://ref.x86asm.net/

0x68: PUSH	imm16/32
0xC7: MOV	r/m16/32/64	imm16/32
0x44:     MOD:  [sib]+disp8
0x24:     SIB:     ESP (no scale index)
0x04:     offset (+4)
0xC3: RETN

It seems right to me.

68 ↗(On Diff #59326)

*(jmp_from + 5) -> jmp_from[5]
and below

80 ↗(On Diff #59326)

0xFF: JMP (opcode 4, encoded in 0x25)
0x25: [RIP/EIP]+disp32
... : offset -8

82 ↗(On Diff #59326)

*jmp_from + x -> jmp_from[x]

303 ↗(On Diff #59326)

indent is strange here.

lib/sanitizer_common/sanitizer_platform_interceptors.h
88 ↗(On Diff #59326)

problamatic -> problematic

wang0109 updated this revision to Diff 59397.Jun 2 2016, 9:15 AM
  • Update diff to fix array access format and typos, also improve some comments
rnk added inline comments.Jun 2 2016, 9:27 AM
lib/interception/interception_win.cc
68–71 ↗(On Diff #59326)

Would this be cleaner with array indexing?

jmp_from[5] = '\xc7';
jmp_from[6] = '\x44;
jmp_from[7] = '\x24;
...

Obviously doesn't apply to the store of the immediate.

85 ↗(On Diff #59326)

ditto

177–178 ↗(On Diff #59326)

These can be 4 byte checks, the displacement is in the fifth byte and doesn't need to be checked.

183 ↗(On Diff #59326)

We don't need three exclamation points, we know the table above isn't complete, it's just good enough for what we're trying to do.

275 ↗(On Diff #59326)

Can we do this now? There really isn't much difference between the 32-bit and 64-bit code, just different trampolines, which I think you could abstract pretty easily.

293 ↗(On Diff #59326)

It's probably simpler to use this code here:

jmp [rip+5] # not sure about precise offset
.quad target_address

To help with the 32/64 code merge, I'd create two entry points, WriteTrampolineJumpInstruction and WriteInterceptorJumpInstruction which behave differently on 32 and 64bit. On 64-bit, they would both delegate to WriteIndirectJumpInstruction, which would take two pointers, a code pointer a pointer to the address to jump through. A prototype:

void WriteTrampolineJumpInstruction(char *jmp_from, char *to) {
  // Emit an indirect jump through immediately following bytes:
  // jmp_from:
  //   jmp [rip + 5]
  //   .quad to
  // Store the address.
  uptr *indirect_target = (uptr *)(jmp_from + 6);
  *indirect_target = (uptr)to;
  // Write the indirect jump.
  WriteIndirectJumpInstruction(jmp_from, indirect_target);
}

void WriteInterceptorJumpInstruction(char *jmp_from, char *to) {
  // Emit an indirect jump through immediately preceding bytes:
  //   .quad to
  // jmp_from:
  //   jmp [rip - 8]
  // Store the address.
  uptr *indirect_target = (uptr *)(jmp_from - 8);
  *indirect_target = (uptr)to;
  // Write the indirect jump.
  WriteIndirectJumpInstruction(jmp_from, indirect_target);
}

For 32-bit they would both write a normal 5 byte jump

wang0109 updated this revision to Diff 59434.Jun 2 2016, 11:26 AM
  • Merge implementations of 32bit/64bit windows interception.
rnk edited edge metadata.Jun 3 2016, 12:56 PM

This is pretty close

lib/interception/interception_win.cc
38 ↗(On Diff #59434)

This will be unused in the win64 build, so you should do:

#ifndef SANITIZER_WINDOWS64
static void WriteJumpInstruction(...) { ... }
#else
static void WriteIndirectJumpInstruction(...) { ... }
#endif

Otherwise you'll trigger warnings on Linux.

46 ↗(On Diff #59434)

Why NOLINT?

302 ↗(On Diff #59434)

Can you write this in terms of constants that you've computed conditionally above, like you do for the trampoline?

319 ↗(On Diff #59434)

What is the linter complaining about here? Mash clang-format if it's line length.

wang0109 added inline comments.Jun 4 2016, 11:25 AM
lib/interception/interception_win.cc
46 ↗(On Diff #59434)

If I remember correctly, it's over 80-characters in width. I'll double check.

wang0109 updated this revision to Diff 59671.Jun 5 2016, 11:44 AM
wang0109 edited edge metadata.
  • update diff to remove some NOLINT and add kExtraPrevBytes
rnk accepted this revision.Jun 6 2016, 8:37 AM
rnk edited edge metadata.

Looks good, but there's some formatting issues around the VirtualProtect calls.

Etienne, do you mind landing this patch? We should probably bug sabre@nondot.org for commit access at this point.

lib/interception/interception_win.cc
319–320 ↗(On Diff #59671)

I'd recommend getting clang-format: http://clang.llvm.org/docs/ClangFormat.html
It has editor integrations for VS, Vim, emacs, and BBEdit(?). I'm surprised Sublime isn't in that list... Anyway, it's really worth taking the time to get it wired up to some editor keypress to avoid discussing code formatting.

This revision is now accepted and ready to land.Jun 6 2016, 8:37 AM
etienneb edited edge metadata.Jun 6 2016, 8:50 AM

Etienne, do you mind landing this patch?

Ack. landing...

wang0109 updated this revision to Diff 59749.Jun 6 2016, 10:45 AM
wang0109 edited edge metadata.
  • update diff to use clang-format for VirtualProtect
This revision was automatically updated to reflect the committed changes.
pcc added a subscriber: pcc.Jun 6 2016, 11:48 AM
pcc added inline comments.
compiler-rt/trunk/lib/interception/interception_win.cc
267

Do we ever actually need to do this on Windows? It looks like most of the functions we are intercepting are C standard library functions which we could easily implement ourselves.

rnk added inline comments.Jun 6 2016, 12:09 PM
compiler-rt/trunk/lib/interception/interception_win.cc
267

Sure, but we'd have to remove all uses of REAL in asan_interceptors.cc and sanitizer_common_interceptors.inc, which would be a lot of work and would mean Windows has special rules for writing interceptors.

pcc added inline comments.Jun 6 2016, 12:32 PM
compiler-rt/trunk/lib/interception/interception_win.cc
267

Sure, but we'd have to remove all uses of REAL in asan_interceptors.cc and sanitizer_common_interceptors.inc

Not necessarily, we could define REAL as something like #define REAL(x) real_##x on Windows, and provide sanitizer_common implementations of whichever real_foo functions are needed on Windows.

would mean Windows has special rules for writing interceptors.

That's unfortunate, but it seems less fragile than trying to pattern match every possible instruction sequence here.