This is an archive of the discontinued LLVM Phabricator instance.

Use binary write mode in WriteToFile function to avoid appended \r characters on Windows
ClosedPublic

Authored by tuktuk on Mar 29 2019, 12:40 PM.

Details

Summary

When using libfuzzer on Windows, in the contents of a crash sample, bytes that can be mistaken for a \n are replaced by a \r\n sequence. As a consequence, crashes are not reproducible. This patch will open files in binary mode to fix this issue. The patch does not affect POSIX systems.

Patch by tuktuk

Diff Detail

Event Timeline

tuktuk created this revision.Mar 29 2019, 12:40 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 29 2019, 12:40 PM
Herald added subscribers: Restricted Project, llvm-commits. · View Herald Transcript
vitalybuka accepted this revision.Mar 29 2019, 12:46 PM
This revision is now accepted and ready to land.Mar 29 2019, 12:46 PM

Oh, can you add a test so it fail without the patch on Windows?

Oh, can you add a test so it fail without the patch on Windows?

Hello,

I am happy to provide a simple example but I do not know the code base enough to turn it into a test.

With the following fuzz target:

// fuzz_target.cc
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
  if (Data[0] == '\n') {
    __builtin_trap();
  }
  return 0;
}

The produced crash is not reproducible:

>"C:\Program Files\LLVM\bin\clang++.exe" -fsanitize=address,fuzzer 
fuzz_target.cc -o fuzzer.exe

> fuzzer.exe
SUMMARY: libFuzzer: deadly signal
Test unit written to ./crash-adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
Base64: Cg==

> fuzzer.exe crash-adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
Running: crash-adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
Executed crash-adc83b19e793491b1c6ea0fd8b46cd9f32e592fc in 2 ms
***
*** NOTE: fuzzing was not performed, you have only
***       executed the target code on a fixed set of inputs.
***

Indeed the contents of the produced file is "\r\n", not "\n".

The expected result is the one obtained with the following fuzz target:

// fuzz_target.cc
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
  if (Data[0] == 'a') {
    __builtin_trap();
  }
  return 0;
}

The produced crash is reproducible as expected:

> "C:\Program Files\LLVM\bin\clang++.exe" -fsanitize=address,fuzzer 
fuzz_target.cc -o fuzzer.exe

> fuzzer.exe
SUMMARY: libFuzzer: deadly signal
Test unit written to ./crash-86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
Base64: YQ==

>fuzzer.exe crash-86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
fuzzer: Running 1 inputs 1 time(s) each.
Running: crash-86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
==12580== ERROR: libFuzzer: deadly signal

Something like that but I would recommend something more generic

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
  if (Size < 1000000)
   return 0;

  // or any other simple hash
  Sum = Size;
  for (i = 0 to Size )
     Sum += data[i]

  if (Sum % 100 == 19)
      abort();

  return 0;
}

rm -rf SOME_UNIQUE_PREFIX*
run with -max_len=2000000 -artifact_prefix=SOME_UNIQUE_PREFIX
it should crash quickly

retry on SOME_UNIQUE_PREFIX*
so if any character in input is altered by Write/Read it should stop being reproducible.

There is a problem with that generic solution which is that nothing in it forces libfuzzer to generate multiple byte values and in my experiments the data I get is always very poorly distributed, so in practice it is not that generic.
I tried a fuzz target that would force libfuzzer to have at least one instance of every possible byte value, but libfuzzer wouldn't manage to generate appropriate data.
However I can confirm that the following fuzz target can be used to differentiate between pre-patch and patched versions following the (non-)reproducibility test we talked about:

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

// Tests preservation of crashing inputs when writing to disk.
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    static bool Found[256];

    // Make sure Data is quite long

    if (Size < 1000) {
        return 0;
    }

    // Make sure Data has at least one '\n' byte value

    // Checking that Data has all possible byte values would be more satisfying
    // but libfuzzer doesn't seem powerful enough for this right now

    for (size_t i = 0; i < 256; ++i) {
        Found[i] = false;
    }
    for (size_t i = 0; i < Size; ++i) {
        Found[Data[i]] = true;
    }
    if (!Found['\n']) {
        return 0;
    }

    // Crash upon some hash value from Size and Data

    // Reproducibility will only happen if hash value is the same

    size_t Sum = Size;
    for (size_t i = 0; i < Size; ++i) {
        Sum += Data[i];
    }
    if (Sum % 100 == 19) {
        __builtin_trap();
    }

    return 0;
}

I spent some time trying to understand llvm lit syntax but I couldn't produce a satisfying test file that I would be able to run from my Windows machine. Could you do this part? Thanks a lot.

Can you try the test on Windows?

tuktuk added a comment.Apr 1 2019, 1:53 PM

Looks good to me! Here are the results without / with patch.

D:\llvm-project\compiler-rt> python ..\build\Release\bin\llvm-lit.py --param build_config=x64 --param build_mode=Release --param llvm_site_config=../build/test/lit.site.cfg test\fuzzer\reload.test
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:32: note: lsan feature unavailable
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:39: note: msan feature unavailable
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:57: note: linux feature unavailable
-- Testing: 1 tests, single process --
FAIL: libFuzzer :: reload.test (1 of 1)
Testing Time: 25.05s
********************
Failing Tests (1):
    libFuzzer :: reload.test

  Unexpected Failures: 1

D:\llvm-project\compiler-rt> python ..\build\Release\bin\llvm-lit.py --param build_config=x64 --param build_mode=Release --param llvm_site_config=../build/test/lit.site.cfg test\fuzzer\reload.test
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:32: note: lsan feature unavailable
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:39: note: msan feature unavailable
llvm-lit.py: D:/llvm-project/compiler-rt/test/fuzzer/lit.cfg:57: note: linux feature unavailable
-- Testing: 1 tests, single process --
PASS: libFuzzer :: reload.test (1 of 1)
Testing Time: 9.71s
  Expected Passes    : 1

Execution time is always similar with multiple tries, I am not sure why it takes longer to fail without patch than to succeed with patch.

vitalybuka edited the summary of this revision. (Show Details)Apr 5 2019, 12:58 PM

If you like you can update your name in the summary, I can't see it in profile, and I will submit the patch

tuktuk added a comment.Apr 5 2019, 1:12 PM

It's good as it is, you can submit the patch. Thanks for your time!

This revision was automatically updated to reflect the committed changes.