This is an archive of the discontinued LLVM Phabricator instance.

Filesystem/Windows: fix inconsistency in readNativeFileSlice API
ClosedPublic

Authored by labath on Aug 16 2019, 6:09 AM.

Details

Summary

The windows version implementation of readNativeFileSlice, was trying to
match the POSIX behavior of not treating EOF as an error, but it was
only handling the case of reading from a pipe. Attempting to read past
the end of a regular file returns a slightly different error code, which
needs to be handled too. This patch adds ERROR_HANDLE_EOF to the list of
error codes to be treated as an end of file, and adds some unit tests
for the API.

This issue was found while attempting to land D66224, which caused a bunch of
lldb tests to start failing on windows.

Diff Detail

Repository
rL LLVM

Event Timeline

labath created this revision.Aug 16 2019, 6:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 16 2019, 6:09 AM
Herald added a subscriber: kristina. · View Herald Transcript
labath updated this revision to Diff 215580.Aug 16 2019, 6:20 AM
  • change the test memset to a different value to properly test the zeroing aspect of the API
aganea added inline comments.Aug 16 2019, 8:13 AM
lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

Zachary also handles ERROR_OPERATION_ABORTED in lldb/source/Host/windows/ConnectionGenericFileWindows.cpp which corresponds to the user hitting Ctrl-C.

Also small nit not related to your change: we can't discriminate between a "full success" and a "eof success" from outside of this API, which could cause an extra loop to execute below? (and possibly trigger a different return code in readNativeFileSlice)

rnk added inline comments.Aug 16 2019, 11:46 AM
lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

Handling ctrl C seems reasonable to me, but it's not unit testable, so if you want to do it separately and manually test it, that's fine with me too.

I guess the concern with the loop for slice reading is that successive calls to ReadFile may not return the same error code. Right now, the loop is basically dead code since no test reads a slice of more than 4GB of data so far as I know, so it's hard to know what happens. Prior to my changes, I think the loop structure was exactly the same, though: it called ::read until it got back EOF or zero bytes read. We could check what read does in the CRT to see what we should be doing.

aganea added inline comments.Aug 17 2019, 10:54 AM
lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

CRT ::read is actually a wrapper around ReadFile. It returns 0 if it's at EOF or if GetLastError() == ERROR_BROKEN_PIPE; -1 if there's any other error; or the actual bytes read. Apparently, the doc states that ReadFile should return TRUE and bytesRead == 0 if reaching the end of file, and ERROR_HANDLE_EOF is reserved only for async/overlapped IO which we don't use here (setting Overlapped.Offset and Overlapped.OffsetHigh is not considered as overlapped IO), so I'm a bit surprised by its usage in this patch, but it's not the first time the behavior is different from what the doc states.
@labath Pavel, on which version/build of Windows are you testing? I'll give a try on mine to see if the return value from ReadFile is the same.

aganea added inline comments.Aug 17 2019, 12:42 PM
lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

Ah I see, the first read past the end of the file returns "Success" but with less bytesRead than asked for. And then any subsequent call with an offset larger that the length of file would return ERROR_HANDLE_EOF. At the expense of an extra trip to kernel space, which could be slow, considering recent mitigations and CFG (unrelated to this patch, but it'd worth the mention):

aganea accepted this revision.Aug 19 2019, 5:33 AM

LGTM!

lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

Sorry for the spam. On a afterthought I think we should leave ERROR_OPERATION_ABORTED as an error. Otherwise the input stream will be broken and we might see an (unrelated) error later down the line. It maybe makes sense for LLDB.

As for my last post, ReadFile can be used in a different ways for files and pipes if we want to be optimal; but at this point we don't know so we should leave it the way it is.

This revision is now accepted and ready to land.Aug 19 2019, 5:33 AM
labath marked an inline comment as done.Aug 19 2019, 8:30 AM
labath added inline comments.
lib/Support/Windows/Path.inc
1233 ↗(On Diff #215580)

No worries. Thanks for investigating this and for the pointers. I'm not very familiar with windows, so this was actually a very interesting and enlightening read for me.

This revision was automatically updated to reflect the committed changes.