This is an archive of the discontinued LLVM Phabricator instance.

Initial changes for porting libFuzzer to Windows.
AbandonedPublic

Authored by mpividori on Nov 21 2016, 9:59 PM.

Details

Reviewers
kcc
zturner
rnk
Summary

+ Divide the IO functions's implementation in 2 different files for differences between POSIX and Windows API.
+ Implement external functions for Windows using weak symbols with alias.
+ Implement utils functions for Windows, using vectored exception handlers as equivalent of POSIX signals. Implementation of timers is incomplete.
+ Implement memmem, missing in Windows.

Diff Detail

Repository
rL LLVM

Event Timeline

mpividori updated this revision to Diff 78833.Nov 21 2016, 9:59 PM
mpividori retitled this revision from to Initial changes for porting libFuzzer to Windows..
mpividori updated this object.
mpividori added a reviewer: zturner.
mpividori set the repository for this revision to rL LLVM.
mpividori added a subscriber: llvm-commits.

+kcc for general stuff
+rnk for the exception handling stuff.

In the future, when you generate patches, you should pass -U 999999 to git. This way it will generate full context for the diff.

lib/Fuzzer/CMakeLists.txt
12

I mentioned below that we should not have the preprocessor definitions use to conditionally compile code or not, but instead use CMake. Here is how you would do it. You might want to skip thsi comment for now and come back to it after you read the rest of the comments.

set(FUZZER_SOURCES
    FuzzerCrossOver.cpp
    FuzzerDriver.cpp
    FuzzerIO.cpp
    FuzzerLoop.cpp
    FuzzerMutate.cpp
    FuzzerSHA1.cpp
    FuzzerTracePC.cpp
    FuzzerTraceState.cpp
    FuzzerUtil.cpp
)

if (CMAKE_SYSTEM_NAME MATCHES "Windows")
    list(APPEND FUZZER_SOURCES
        FuzzerUtilWindows.cpp
        FuzzerIOWindows.cpp
        FuzzerExtFunctionsWindows.cpp
    )    
else()
    list(APPEND FUZZER_SOURCES
        FuzzerUtilPosix.cpp
        FuzzerIOPosix.cpp
    )

    if (CMAKE_SYSTEM_NAME MATCHES "Linux")
        FuzzerUtilLinux.cpp
        FuzzerExtFunctionsLinux.cpp
    elseif(CMAKE_SYSTEM_NAME MATCHES "Darwin")
        FuzzerUtilDarwin.cpp
        FuzzerExtFunctionsApple.cpp
    endif()
endif()

add_library(LLVMFuzzerNoMainObjects OBJECT ${FUZZER_SOURCES})

This way you can remove all the #if <foo> from the individual source files.

lib/Fuzzer/FuzzerDefs.h
31

I think you will need to add

#define LIBFUZZER_POSIX 0
#define LIBFUZZER_LINUX 0
#define LIBFUZZER_APPLE 0

similarly, in the previous 2 branches of the #if, you will need to #define LIBFUZZER_WINDOWS 0

lib/Fuzzer/FuzzerExtFunctionsWindows.cpp
13

Can you remove this and do it at the CMake level instead? It seems that the other 2 files were already working this way, but maybe we should clean it all up. I'll have more specific comments later on in the CMakeLists.txt file.

lib/Fuzzer/FuzzerIOPosix.cpp
14

Same thing here, this file should not even be sent to the compiler if it's not a posix system. More comments later in CMakeLists.txt

lib/Fuzzer/FuzzerIOWindows.cpp
30

Check the code in llvm/lib/Support/Windows/Path.inc for the status() function. There are some subtleties that you should be aware of that this code doesn't handle. You can probably just rip the code straight from there, changing W to A so we don't have to worry about widening the paths.

58–65

This looks wrong to me. We should be expecting a path that ends with either a slash or that doesn't end with a slash. It looks like you're trying to check here "is this the root directory". But I'm not sure why that matters. For example, we should translate all of the following:

D:\ -> D:\*
D:\foo -> D:\foo\*
d:\foo\ -> D:\foo\*

And I also think we should be asserting that the path is not empty, because that wouldn't make any sense.

So I think the correct check should be something like:

assert(!Path.empty());
if (Path.back() != '\\')
  Path.push_back('\\');

Path.push_back('*');

Let me know if I'm misunderstood though.

69

Some Windows api functions return nullptr if it fails, and some return INVALID_HANDLE_VALUE. You need to always check which one is returned to indicate an error. This one returns INVALID_HANDLE_VALUE, so the check here is incorrect.

85–95

The same comment here applies as above. We shouldn't be opening a handle to the file unless we really need to, because that can lead to sharing violations if someone else is using the file. If you move the code for IsFile() (which I commented on above) into a separate function called Stat() or something that returns the file type, then you could call that function both here and in IsFile().

114–134

It doesn't seem good to me that we're copying all these functions and only making very slight modifications. I would suggest handling it like this:

  1. In FuzzerIO.h, make the following declarations:
    • char getSeparator();
    • FILE *OpenFile(int Fd, const char *Mode);
    • int DuplicateFile(int fd);
    • int CloseFile(int fd);
  1. Also in FuzzerIO.h, copy DirPlusFile, DupAndCloseStderr, and CloseStdout to the common location, and use these functions.
  1. Implement the above functions in FileIOPosix.cpp and FileIOWindows.cpp as one-liners. For example, on Windows, you have something like this:
char getSeparator() { return `\\`; }
FILE *OpenFile(int Fd, const char *Mode) { return _fdopen(Fd, Mode); }

You could also do it with #defines, but this way seems safer.

136

Printf code is exactly the same in both places, so this function should just be declared and implemented in FuzzerIO.h and FuzzerIO.cpp, there is no reason to copy it.

lib/Fuzzer/FuzzerUtilWindows.cpp
157–165

Is there any reason we can't use this implementation on all platforms? The Posix implementation seems unnecessarily complicated, when this will work just fine.

180–188

In the future we should change this to use CreateProcessA, but for now this is probably fine.

rnk added inline comments.Nov 22 2016, 11:02 AM
lib/Fuzzer/CMakeLists.txt
12

FWIW, the sanitizer code takes the opposite approach, specifically to simplify build logic.

lib/Fuzzer/FuzzerUtilWindows.cpp
33

I think it would be better to draw this interface boundary at a higher level. Something where we pass in the handle_* flags, install a single exception handler, have one switch, and let it dispatch based on whether it's supposed to catch that type of exception. Then the cross-platform API would be something like InitializeSignalHandlers().

This would require hosting the Flags code out of FuzzerDriver.cpp, so we should ask @kcc about it.

149

For example, this handler is completely redundant with the SigInt handler. Only one will actually be called.

mpividori added inline comments.Nov 22 2016, 12:44 PM
lib/Fuzzer/CMakeLists.txt
12

Thanks! Yes, I completely agree.
I continued with the current approach in the implementation and left it as a future modification, because I wanted to know your opinion. I see that there is a bash script: "build.sh" that simply compiles all .cpp files and generates the LLVM library. If we remove the macros guards, then that script will not work.
I guess the purpose of that script is to make libFuzzer independent of the rest of the LLVM repository.
Would you agree if I simply remove that script? Also, I can modify it to ignore files ending in "Windows".

lib/Fuzzer/FuzzerDefs.h
31

Hi,
I omitted them to simplify the code. (when not defined, the macros assume value 0).
If you think it is clearer, I can include them.

lib/Fuzzer/FuzzerExtFunctionsWindows.cpp
13

Yes, I completely agree.

lib/Fuzzer/FuzzerIOWindows.cpp
30

Hi,
Thanks. Yes, I based my implementation in the implementation of status() / getStatus() from the Support Library.
What I haven't included is the function isReservedName, because I assumed that reserved names would return GetFileType() different to FILE_TYPE_DISK (probably FILE_TYPE_CHAR ?). Do you think I should include check for reserved names? The rest of the code is similar to the implementation of status().

58–65

Hi,
I took that code from directory_iterator_construct(..) implementation in llvm/lib/Support/Windows/Path.inc. I don't know why it check for : character, I thought this was a special case to be considered...
Ok, I agree with the code you proposed. It makes more sense for me too.

69

Thanks,
Ok. I took that code from directory_iterator_construct(..) implementation in llvm/lib/Support/Windows/Path.inc, so I think I should update that implementation too :).

85–95

Hi. I am not sure I understand properly. The status() implementation in llvm/lib/Support/Windows/Path.inc, opens a handle to get the file type, so I think it would be the same that current implementation. As far as I understand, it is not possible to know the file type without opening a handle. Am I missing something?
Thanks.

114–134

Thanks, I completely agree.

136

Yes, it is repeated because I included all functions working with the global variable OutputFile in the same .cpp file where it is defined.
I can declare OutputFile as extern variable in FuzzerDefs.h and move Printf code to FuzzerIO.cpp.

lib/Fuzzer/FuzzerUtilWindows.cpp
157–165

Yes, I think we can. I will update the code.
I could add this as the first option. If it fails (returns 0), then consider the commands npos/sysctl. Or directly simplify the code to only use: std::thread::hardware_concurrency().

zturner added inline comments.Nov 22 2016, 12:57 PM
lib/Fuzzer/CMakeLists.txt
12

Let's wait and see what kcc has to say on this.

lib/Fuzzer/FuzzerDefs.h
31

As long as we're consistent it's fine. If we're going to remove the 0 values, then remove them from the other cases too.

lib/Fuzzer/FuzzerIOWindows.cpp
30

Hmm. The function I'm looking at first calls GetFileAttributes(), and then only if it is a reparse point (kinda like a symlink) does it open the handle. This allows it to "follow the symlink" to get info about the target. But the advantage of that approach is that GetFileAttributes() doesn't actually open the handle -- which can fail in certain scenarios -- unless it has to.

Are we looking at the same function? The one I'm looking at is lib/Support/Windows/Path.inc line 464.

69

Interesting, yea that looks like a bug in the original code. Please submit that in a separate patch though since it's unrelated to libfuzzer.

85–95

You are right, I looked over that function again. But I still think we can do better. I think the status function there is intended to be more general. It retrieves more information than just whether or not it is a regular file or directory. For example, if you call GetFileAttributesA() and the output parameter is equal to FILE_ATTRIBUTE_NORMAL, you can return true immediately. Only if it contains FILE_ATTRIBUTE_REPARSE_POINT do you need to open the handle and do the more complicated aproach.

This will allow it to work in more scenarios, as opening a handle can fail in strange situations.

136

Maybe just have Printf take the file as first argument? So that instead of printf it's more like fprintf?

mpividori added inline comments.Nov 22 2016, 1:10 PM
lib/Fuzzer/FuzzerUtilWindows.cpp
33

Hi, thanks for your comment.
I don't see the real advantage of that. Do you see any problem in registering many exception handlers with AddVectoredExceptionHandler()?
Is it only about the design?
Thanks

149

I am not sure I understand what you mean. The handlers TERMHandler and INTHandler return a boolean value TRUE or FALSE, to know if they have handled the signal. We can register many handlers.

mpividori added inline comments.Nov 22 2016, 1:38 PM
lib/Fuzzer/FuzzerDefs.h
31

Yes, I agree.

lib/Fuzzer/FuzzerIOWindows.cpp
30

Hi, in fact the: if (attr & FILE_ATTRIBUTE_REPARSE_POINT) in lib/Support/Windows/Path.inc line 482, doesn't make sense. It does exactly the same when it is a reparse point than when it isn't .....
If it is not a reparse point, it also uses CreateFileView to open a handle...
I think there are some errors in that code, or the if statement is redundant...

zturner added inline comments.Nov 22 2016, 3:00 PM
lib/Fuzzer/FuzzerIOWindows.cpp
30

Hmm, you are right. That's some bad code :)

Anyway, for just returning true / false, it's slightly better to only open the file if it's a reparse point. If it's not a reparse point, you can just return attr == FILE_ATTRIBUTE_NORMAL and it will work.

mpividori updated this revision to Diff 79097.Nov 23 2016, 8:54 AM
mpividori edited edge metadata.

+ Update code to include proposed changes. In particular IsFile() and ListFilesInDirRecursive for Windows.
+ Split the declarations in FuzzerDefs.h into the files: FuzzerIO.h, FuzzerUtil.h and FuzzerSHA1.h.
+ Improve lexicographical order of system headers in #includes (a detail).

I updated the code to include proposed changes.

kcc edited edge metadata.Nov 29 2016, 11:39 AM

First comment, w/o looking in much detail: consider the complexity of the code review process is N^2 (N=size of the patch)
please try to split the change into as many smaller independent changes as you can.

@kcc Ok. Thanks for your comment. I will split this diff in smaller changes.

mpividori abandoned this revision.Nov 29 2016, 5:28 PM