This is an archive of the discontinued LLVM Phabricator instance.

[sanitizers] [windows] Correctly override functions with backward jmps
ClosedPublic

Authored by zero9178 on Nov 10 2022, 8:07 AM.

Details

Summary

To reproduce: Download and run the latest Firefox ASAN build (https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.mozilla-central.latest.firefox.win64-asan-opt/artifacts/public/build/target.zip) on Windows 11 (version 10.0.22621 Build 22621); it will crash on launch. Note that this doesn't seem to crash on another Windows 11 VM I've tried, so I'm not sure how reproducible it is across machines, but it reproduces on my machine every time.

The problem seems to be that when overriding the memset function in OverrideFunctionWithRedirectJump(), the relative_offset is stored as a uptr. Per the Intel x64 instruction set reference (https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf - warning: large PDF), on page 646 the jmp instruction (specifically the near jump flavors that start with E9, which are the ones the OverrideFunctionWithRedirectJump() considers) treats the offset as a signed displacement. This causes an incorrect value to be stored for REAL(memset) which points to uninitialized memory, and a crash the next time that gets called.

The fix is to simply treat that offset as signed. I have also added a test case.

Bug: https://github.com/llvm/llvm-project/issues/58846

Diff Detail

Event Timeline

gregstoll created this revision.Nov 10 2022, 8:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2022, 8:07 AM
Herald added subscribers: Enna1, dberris. · View Herald Transcript
gregstoll requested review of this revision.Nov 10 2022, 8:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 10 2022, 8:07 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript

Attempt to fix linting

Another attempt to fix linting

gregstoll updated this revision to Diff 474614.Nov 10 2022, 2:35 PM

Another attempt at fixing linting

gregstoll retitled this revision from [compiler-rt] [windows] Correctly override functions with backward jmps to [sanitizers] [windows] Correctly override functions with backward jmps.Nov 10 2022, 6:08 PM
gregstoll edited the summary of this revision. (Show Details)
gregstoll added a reviewer: vitalybuka.

Overall, the patch does look reasonable - but it doesn't seem to work as is.

compiler-rt/lib/interception/interception_win.cpp
741

There's no i32 type defined here - there is s32 though.

compiler-rt/lib/interception/tests/interception_win_test.cpp
447

This is a typo here, there's no FunctionPrefixNode, it should be -None right?

Even with that fixed, the new test doesn't seem to pass for me. Can you verify - that the newly added test does fail before applying the functional fix, and that it passes afterwards?

Overall, the patch does look reasonable - but it doesn't seem to work as is.

Sorry about this! I've had trouble getting LLVM building on my machine, and was hoping the Phabricator builds would verify that the test passed. I'll work more on getting a local build going so I can address these issues.

FWIW, after updating to Windows 11 today I started experiencing the same crashes with ASAN builds, remembered this patch and applied it to my local clang build. Works without issues now with the patch applied.

FWIW, after updating to Windows 11 today I started experiencing the same crashes with ASAN builds, remembered this patch and applied it to my local clang build. Works without issues now with the patch applied.

Thanks for the data point! Are you able to look into running and fixing the tests so that they cover this issue (so that they fail before applying the functional change and succeed afterwards)?

FWIW, after updating to Windows 11 today I started experiencing the same crashes with ASAN builds, remembered this patch and applied it to my local clang build. Works without issues now with the patch applied.

Thanks for the data point! Are you able to look into running and fixing the tests so that they cover this issue (so that they fail before applying the functional change and succeed afterwards)?

Sure, I'll try to find some time for it soon.
@gregstoll would you mind if I commandeer the revision if I can fix the tests?

FWIW, after updating to Windows 11 today I started experiencing the same crashes with ASAN builds, remembered this patch and applied it to my local clang build. Works without issues now with the patch applied.

Thanks for the data point! Are you able to look into running and fixing the tests so that they cover this issue (so that they fail before applying the functional change and succeed afterwards)?

Sure, I'll try to find some time for it soon.
@gregstoll would you mind if I commandeer the revision if I can fix the tests?

Nope, go for it! I was hoping to get back to it sometime but that probably won't happen until the new year, and I'd really just rather have it fixed 🙂

zero9178 commandeered this revision.Dec 5 2022, 3:22 PM
zero9178 updated this revision to Diff 480262.
zero9178 added a reviewer: gregstoll.

Address review comments:

  • Make use of s32 and sptr
  • Adjust test x86 code to correctly jump to the beginning of the code
  • Fix test to correctly override the relative jump
gregstoll accepted this revision.Dec 5 2022, 5:33 PM

Thanks a bunch!

This revision is now accepted and ready to land.Dec 5 2022, 5:33 PM

@mstorsjo do you have any further comments on the state of the tests? I'd like to land this on Monday if possible

mstorsjo accepted this revision.Dec 7 2022, 2:20 AM

LGTM, this seems to run fine for me in x86_64 mode - I didn't build and run the tests in 32 bit mode though, but I presume that you did.