This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Diff 19 - Independent implementation of DirName for Windows.
ClosedPublic

Authored by mpividori on Dec 8 2016, 9:37 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

mpividori updated this revision to Diff 80774.Dec 8 2016, 9:37 AM
mpividori retitled this revision from to [libFuzzer] Diff 19 - Automatically link with Windows library..
mpividori updated this object.
mpividori added reviewers: zturner, amccarth.
mpividori set the repository for this revision to rL LLVM.
mpividori added a subscriber: llvm-commits.
zturner edited edge metadata.Dec 8 2016, 10:09 AM

Is libfuzzer using any shell api functions? If so, I kind of think it shouldn't. What Win32 API requires linking against this?

amccarth accepted this revision.Dec 8 2016, 10:09 AM
amccarth edited edge metadata.

This is fine by me. Some projects have a preference to do stuff like this in the build files rather than the code. I don't know if the Fuzzer folks have a preference.

I'll approve this to keep you from being blocked. It's easy to change later if necessary.

This revision is now accepted and ready to land.Dec 8 2016, 10:09 AM

@zach: It's for PathRemoveFileSpecA.

Hi, I need that library because I am using the function PathRemoveFileSpec(). I couldn't find other option.
I use this pragma so when users link against libFuzzer, they don't have to explicitly add the Shlwapi.lib library.
Similar approaches in simplifying the user interface were considered when implementing other parts of the code. For example, in "FuzzerExtFunctions*" we use different implementation for Linux and Apple to avoid requiring extra flags when using libFuzzer.

@zturner Yes, we can implement it. I considered that before, but as there are many cases to be considered: relative paths vs full paths, using UNC or disk designators, etc, I thought it would be better to use Win Api instead of implementing it from scratch. I didn't realize that it is implemented in the Support library. I will read that code.

mpividori updated this revision to Diff 80792.Dec 8 2016, 11:27 AM
mpividori edited edge metadata.

@zturner @amccarth
This is an option, using regex. It is simpler than the implementation in Support library, and probably more correct (I don't have much experience with Windows), but maybe not very efficient.
If we don't use regex, as Support library does, this will require many more lines of code. What do you think?
Thanks.

It's based on the documentation for Naming Files, Paths, and Namespaces in: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

I appreciate the effort here, but now it looks like we have something complicated enough to need testing (and fuzzing!). I'm not sure that's the best use of Marcos's time.

I would prefer using the LLVM support library or the API from Shlwapi. Both of those are tested and don't require re-inventing the wheel. If we can't do either of those, I'll take a closer look at this.

kcc added a subscriber: kcc.Dec 8 2016, 12:18 PM

remember that libFuzzer should not have *any* dependencies other than what's provided by the system by default,
i.e. LLVM support lib is out of discussion here.

kcc added a comment.Dec 8 2016, 12:28 PM

We may cook something with std::string

mpividori updated this revision to Diff 80838.EditedDec 8 2016, 3:42 PM

Hi, I implemented it from scratch. I tested with this cases and it works fine:

DirName("\\?\D:\SomeFile.txt") == "\\?\D:\"
DirName("\\?\D:\SomeDir\SomeFile.txt") == "\\?\D:\SomeDir"
DirName("\\?\UNC\Server\Share\") == "\\?\UNC\Server\Share\"
DirName("\\?\UNC\Server\Share\SomeFile.txt") == "\\?\UNC\Server\Share\"
DirName("SomeFile.txt") == "."
DirName("SomeDir\SomeFile.txt") == ".\SomeDir"
DirName("C:") == "C:"
DirName("C:SomeFile.txt") == "C:"
DirName("C:SomeDir\SomeFile.txt") == "C:SomeDir"
DirName("C:\") == "C:\"
DirName("C:\SomeFile.txt") == "C:\"
DirName("C:\SomeDir\SomeFile.txt") == "C:\SomeDir"
DirName("\") == "\"
DirName("\SomeFile.txt") == "\"
DirName("\SomeDir\SomeFile.txt") == "\SomeDir"
DirName("\\Server\Share\") == "\\Server\Share\"
DirName("\\Server\Share\SomeFile.txt") == "\\Server\Share\"

Can you do a quick sanity check that we return similar output on Posix and Windows (i.e. run all the above tests through Linux's DirName function and verify that they're the same)? Obviously slashes will be different, and things like \\?\ syntax and UNC syntax won't apply. But for example, but the postconditions of the function should be the same on Posix and Windows, such as if you pass a simple filenmae like foo.txt to the function, then your example shows we return ., therefore Posix should return . as well otherwise someone might rely on it. Similarly, for foo\bar\file.txt, if we return foo\bar, then I would expect /foo/bar/file.txt on Posix would return /foo/bar without a trailing slash. (This is another argument for not relying on the Windows function in shlwapi I suppose, since we should try to match the postconditions as closely as possible.

@zturner
I have checked this tests, and they output the same (Posix version and Windows):

WixDirName("") == "."
PosixDirName("") == "."

WinDirName("NoSlash") == "."
PosixDirName("NoSlash") == "."

WinDirName("\") == "\"
PosixDirName("/") == "/"

WinDirName("\SomeFile") == "\"
PosixDirName("/SomeFile") == "/"

WinDirName("\SomeDir\") == "\"
PosixDirName("/SomeDir/") == "/"

WinDirName("\SomeDir\SomeFile") == "\SomeDir"
PosixDirName("/SomeDir/SomeFile") == "/SomeDir"

I had imagined the algorithm starting from the end and looking backwards for the first separator. I can't find anything wrong, so it seems fine, but did you consider starting from the end and going backwards and decide it was more difficult and/or wouldn't work?

lib/Fuzzer/FuzzerIOWindows.cpp
149 ↗(On Diff #80838)

Can we use a better name than In? I'm not sure what it's supposed to represent. Is this more like the starting offset? If so call it Offset perhaps.

190 ↗(On Diff #80838)

What is Ad supposed to be short for? If it means something (Address?), I would just write it out as a full word.

203 ↗(On Diff #80838)

should Ref be a const reference?

mpividori updated this revision to Diff 80906.Dec 9 2016, 9:21 AM

@zturner I updated the code to include your suggestions.
The algorithm can't simply start from the end and look backwards for the first separator, because we need to preserve the prefix that represent the root location. We shouldn't remove anything there. In Windows we have many different options, like: \\Server\Share\ , \ , C: , C:\ , \\?\C:\ , \\?\UNC\Server\Share\
So, I need to start analyzing which kind of path we received, and then I can remove the separator in the rest of the path, if it exists.
Thanks.

zturner accepted this revision.Dec 9 2016, 9:27 AM
zturner edited edge metadata.

LGTM.

Please update the title and description of this patch to reflect the new direction.

lib/Fuzzer/FuzzerIOWindows.cpp
261 ↗(On Diff #80906)

Nit: In these comments, I would say separator instead of '\'. It removes both slashes and backslashes, and '\' looks like a bad char literal.

mpividori updated this revision to Diff 80922.Dec 9 2016, 11:28 AM
mpividori retitled this revision from [libFuzzer] Diff 19 - Automatically link with Windows library. to [libFuzzer] Diff 19 - Independent implementation of DirName for Windows..
mpividori edited edge metadata.

Done. Thanks.

This revision was automatically updated to reflect the committed changes.