This is an archive of the discontinued LLVM Phabricator instance.

File::Read does not handle short reads
Needs RevisionPublic

Authored by EugeneBi on Oct 17 2016, 5:56 PM.



When loading huge file from network share, the pread may succeed but read smaller amount bytes than it was requested. If this happens we need to issue another pread until our request is satisfied or pread returns 0 indicating end of file.

Diff Detail

Event Timeline

EugeneBi updated this revision to Diff 74938.Oct 17 2016, 5:56 PM
EugeneBi retitled this revision from to File::Read does not handle short reads.
EugeneBi updated this object.
EugeneBi added reviewers: zturner, clayborg.
EugeneBi set the repository for this revision to rL LLVM.
EugeneBi added a project: Restricted Project.
zturner added inline comments.Oct 17 2016, 8:57 PM

I know you are just fixing an existing bug, but since you're in here, the existing code really could use some love. There are two Read() methods, one which takes an offset and one which doesn't. They both are written completely differently though, even though the former might as well just call the other one with an offset of 0. Furthermore, there is an entirely separate code path when MAX_READ_SIZE is defined, even though it should be exactly the same algorithm, but reading fewer bytes on each iteration.

I won't force you to do it, but if you have some spare cycles it would be really great if this could get cleaned up. Ideally the code would look like this:

  1. Read (void *buf, size_t &num_bytes) would be a 1 line function. return Read(buf, num_bytes, 0);
  2. At the point where MAX_READ_SIZE is defined for apple platforms, have an #else branch which defines it for other platforms equal to SIZE_T_MAX.
  3. Delete the #if defined(MAX_READ_SIZE) branch entirely.
  4. Have the main branch only read std::min(MAX_READ_SIZE, bytes_left) bytes at a time until it's done.

This still leaves the question of what to do with the windows branch since it doesn't have pread. If you're familiar with Windows you can re-write this branch yourself, or I'm willing to do it for you otherwise.

What do you think?

clayborg requested changes to this revision.Oct 18 2016, 9:31 AM
clayborg edited edge metadata.

We can probably actually get rid of the function that takes the offset.

Error File::Read (void *buf, size_t &num_bytes, off_t &offset);

The only clients are internal from within File.cpp, or clients where they pass in zero as the offset so they can actually use the normal read function:

Error File::Read (void *buf, size_t &num_bytes);

Then we just need to make sure the above function works correctly.

This revision now requires changes to proceed.Oct 18 2016, 9:31 AM

Then it will keep Windows happy using just the normal "read" instead of "pread".

I'll take a look

I took a look.

The no-offset Read that Greg mentions suffers from the same bug, i.e. it better be fixed too. Now, what Greg says - prove that we always call with offset zero and eliminate the other two methods - makes total sense to me. Well, since I did not run into the first Read, it means that there are callers that do not use it but use the combination of the other two. So, doing the merge that Greg suggests involves fixing all current callers, and this is a bit deeper than I am comfortable to undertake.

Sorry, I do not feel that I can do it.