+ Divide the IO functions's implementation in 2 different files for differences between POSIX and Windows API.
+ Implement external functions for Windows using weak symbols with alias.
+ Implement utils functions for Windows, using vectored exception handlers as equivalent of POSIX signals. Implementation of timers is incomplete.
+ Implement memmem, missing in Windows.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
+kcc for general stuff
+rnk for the exception handling stuff.
In the future, when you generate patches, you should pass -U 999999 to git. This way it will generate full context for the diff.
lib/Fuzzer/CMakeLists.txt | ||
---|---|---|
12 | I mentioned below that we should not have the preprocessor definitions use to conditionally compile code or not, but instead use CMake. Here is how you would do it. You might want to skip thsi comment for now and come back to it after you read the rest of the comments. set(FUZZER_SOURCES FuzzerCrossOver.cpp FuzzerDriver.cpp FuzzerIO.cpp FuzzerLoop.cpp FuzzerMutate.cpp FuzzerSHA1.cpp FuzzerTracePC.cpp FuzzerTraceState.cpp FuzzerUtil.cpp ) if (CMAKE_SYSTEM_NAME MATCHES "Windows") list(APPEND FUZZER_SOURCES FuzzerUtilWindows.cpp FuzzerIOWindows.cpp FuzzerExtFunctionsWindows.cpp ) else() list(APPEND FUZZER_SOURCES FuzzerUtilPosix.cpp FuzzerIOPosix.cpp ) if (CMAKE_SYSTEM_NAME MATCHES "Linux") FuzzerUtilLinux.cpp FuzzerExtFunctionsLinux.cpp elseif(CMAKE_SYSTEM_NAME MATCHES "Darwin") FuzzerUtilDarwin.cpp FuzzerExtFunctionsApple.cpp endif() endif() add_library(LLVMFuzzerNoMainObjects OBJECT ${FUZZER_SOURCES}) This way you can remove all the #if <foo> from the individual source files. | |
lib/Fuzzer/FuzzerDefs.h | ||
31 | I think you will need to add #define LIBFUZZER_POSIX 0 #define LIBFUZZER_LINUX 0 #define LIBFUZZER_APPLE 0 similarly, in the previous 2 branches of the #if, you will need to #define LIBFUZZER_WINDOWS 0 | |
lib/Fuzzer/FuzzerExtFunctionsWindows.cpp | ||
13 | Can you remove this and do it at the CMake level instead? It seems that the other 2 files were already working this way, but maybe we should clean it all up. I'll have more specific comments later on in the CMakeLists.txt file. | |
lib/Fuzzer/FuzzerIOPosix.cpp | ||
14 | Same thing here, this file should not even be sent to the compiler if it's not a posix system. More comments later in CMakeLists.txt | |
lib/Fuzzer/FuzzerIOWindows.cpp | ||
30 | Check the code in llvm/lib/Support/Windows/Path.inc for the status() function. There are some subtleties that you should be aware of that this code doesn't handle. You can probably just rip the code straight from there, changing W to A so we don't have to worry about widening the paths. | |
58–65 | This looks wrong to me. We should be expecting a path that ends with either a slash or that doesn't end with a slash. It looks like you're trying to check here "is this the root directory". But I'm not sure why that matters. For example, we should translate all of the following: D:\ -> D:\* D:\foo -> D:\foo\* d:\foo\ -> D:\foo\* And I also think we should be asserting that the path is not empty, because that wouldn't make any sense. So I think the correct check should be something like: assert(!Path.empty()); if (Path.back() != '\\') Path.push_back('\\'); Path.push_back('*'); Let me know if I'm misunderstood though. | |
69 | Some Windows api functions return nullptr if it fails, and some return INVALID_HANDLE_VALUE. You need to always check which one is returned to indicate an error. This one returns INVALID_HANDLE_VALUE, so the check here is incorrect. | |
85–95 | The same comment here applies as above. We shouldn't be opening a handle to the file unless we really need to, because that can lead to sharing violations if someone else is using the file. If you move the code for IsFile() (which I commented on above) into a separate function called Stat() or something that returns the file type, then you could call that function both here and in IsFile(). | |
114–134 | It doesn't seem good to me that we're copying all these functions and only making very slight modifications. I would suggest handling it like this:
char getSeparator() { return `\\`; } FILE *OpenFile(int Fd, const char *Mode) { return _fdopen(Fd, Mode); } You could also do it with #defines, but this way seems safer. | |
136 | Printf code is exactly the same in both places, so this function should just be declared and implemented in FuzzerIO.h and FuzzerIO.cpp, there is no reason to copy it. | |
lib/Fuzzer/FuzzerUtilWindows.cpp | ||
157–165 | Is there any reason we can't use this implementation on all platforms? The Posix implementation seems unnecessarily complicated, when this will work just fine. | |
180–188 | In the future we should change this to use CreateProcessA, but for now this is probably fine. |
lib/Fuzzer/CMakeLists.txt | ||
---|---|---|
12 | FWIW, the sanitizer code takes the opposite approach, specifically to simplify build logic. | |
lib/Fuzzer/FuzzerUtilWindows.cpp | ||
33 | I think it would be better to draw this interface boundary at a higher level. Something where we pass in the handle_* flags, install a single exception handler, have one switch, and let it dispatch based on whether it's supposed to catch that type of exception. Then the cross-platform API would be something like InitializeSignalHandlers(). This would require hosting the Flags code out of FuzzerDriver.cpp, so we should ask @kcc about it. | |
149 | For example, this handler is completely redundant with the SigInt handler. Only one will actually be called. |
lib/Fuzzer/CMakeLists.txt | ||
---|---|---|
12 | Thanks! Yes, I completely agree. | |
lib/Fuzzer/FuzzerDefs.h | ||
31 | Hi, | |
lib/Fuzzer/FuzzerExtFunctionsWindows.cpp | ||
13 | Yes, I completely agree. | |
lib/Fuzzer/FuzzerIOWindows.cpp | ||
30 | Hi, | |
58–65 | Hi, | |
69 | Thanks, | |
85–95 | Hi. I am not sure I understand properly. The status() implementation in llvm/lib/Support/Windows/Path.inc, opens a handle to get the file type, so I think it would be the same that current implementation. As far as I understand, it is not possible to know the file type without opening a handle. Am I missing something? | |
114–134 | Thanks, I completely agree. | |
136 | Yes, it is repeated because I included all functions working with the global variable OutputFile in the same .cpp file where it is defined. | |
lib/Fuzzer/FuzzerUtilWindows.cpp | ||
157–165 | Yes, I think we can. I will update the code. |
lib/Fuzzer/CMakeLists.txt | ||
---|---|---|
12 | Let's wait and see what kcc has to say on this. | |
lib/Fuzzer/FuzzerDefs.h | ||
31 | As long as we're consistent it's fine. If we're going to remove the 0 values, then remove them from the other cases too. | |
lib/Fuzzer/FuzzerIOWindows.cpp | ||
30 | Hmm. The function I'm looking at first calls GetFileAttributes(), and then only if it is a reparse point (kinda like a symlink) does it open the handle. This allows it to "follow the symlink" to get info about the target. But the advantage of that approach is that GetFileAttributes() doesn't actually open the handle -- which can fail in certain scenarios -- unless it has to. Are we looking at the same function? The one I'm looking at is lib/Support/Windows/Path.inc line 464. | |
69 | Interesting, yea that looks like a bug in the original code. Please submit that in a separate patch though since it's unrelated to libfuzzer. | |
85–95 | You are right, I looked over that function again. But I still think we can do better. I think the status function there is intended to be more general. It retrieves more information than just whether or not it is a regular file or directory. For example, if you call GetFileAttributesA() and the output parameter is equal to FILE_ATTRIBUTE_NORMAL, you can return true immediately. Only if it contains FILE_ATTRIBUTE_REPARSE_POINT do you need to open the handle and do the more complicated aproach. This will allow it to work in more scenarios, as opening a handle can fail in strange situations. | |
136 | Maybe just have Printf take the file as first argument? So that instead of printf it's more like fprintf? |
lib/Fuzzer/FuzzerUtilWindows.cpp | ||
---|---|---|
33 | Hi, thanks for your comment. | |
149 | I am not sure I understand what you mean. The handlers TERMHandler and INTHandler return a boolean value TRUE or FALSE, to know if they have handled the signal. We can register many handlers. |
lib/Fuzzer/FuzzerDefs.h | ||
---|---|---|
31 | Yes, I agree. | |
lib/Fuzzer/FuzzerIOWindows.cpp | ||
30 | Hi, in fact the: if (attr & FILE_ATTRIBUTE_REPARSE_POINT) in lib/Support/Windows/Path.inc line 482, doesn't make sense. It does exactly the same when it is a reparse point than when it isn't ..... |
lib/Fuzzer/FuzzerIOWindows.cpp | ||
---|---|---|
30 | Hmm, you are right. That's some bad code :) Anyway, for just returning true / false, it's slightly better to only open the file if it's a reparse point. If it's not a reparse point, you can just return attr == FILE_ATTRIBUTE_NORMAL and it will work. |
+ Update code to include proposed changes. In particular IsFile() and ListFilesInDirRecursive for Windows.
+ Split the declarations in FuzzerDefs.h into the files: FuzzerIO.h, FuzzerUtil.h and FuzzerSHA1.h.
+ Improve lexicographical order of system headers in #includes (a detail).
First comment, w/o looking in much detail: consider the complexity of the code review process is N^2 (N=size of the patch)
please try to split the change into as many smaller independent changes as you can.
I mentioned below that we should not have the preprocessor definitions use to conditionally compile code or not, but instead use CMake. Here is how you would do it. You might want to skip thsi comment for now and come back to it after you read the rest of the comments.
This way you can remove all the #if <foo> from the individual source files.