Previously all FILE objects were fully buffered, this patch adds line
buffering and unbuffered output, as well as applying them to stdout and
stderr.
Details
- Reviewers
sivachandra lntue - Commits
- rG6ce490e5a617: [libc] add buffering to FILE writes
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
libc/src/__support/File/file.cpp | ||
---|---|---|
34 | These additional conditionals are making the code even more complicated. Instead, we should do this: rename the current write_unlocked method to write_unlocked_fully_buffered and make it private. Next, add two new methods like this: public: size_t write_unlocked(const void *data, size_t len) { if (bufmode == _IOLBF) { return write_unlocked_line_buffered(data, len); } else if (bufmode == _IOFBF) { return write_unlocked_fully_buffered(data, len); } else { size_t written_size =platform_write(this, data, len); // Error handling if platform_write fails. platform_flush(this); // Error handling if platform_flush fails return len; } } private: size_t write_unlocked_line_buffered(const void *data, size_t len) { // Instead of looking for the first new line char, look for the last new line char. char *last = last_newline(data, len); if (last == nullptr) { // If no new line char is found, just do normal buffering. return write_unlocked_fully_buffered(data, len); } size_t written_size; if (pos == 0) { // If there is no content in the buffer, write out the new data. written_size = platform_write(data, last - data /* size including the new line char */); // Perform error handling } else { // First write out the existing content, and then write out the new data. platform_write(buf, pos); // Perform error handling pos = 0; written_size = platform_write(data, last - data); // Perform error handling. } platform_flush(); // Perform error handling if (written_size == len) return len; // Remaining data should be buffered. written_size = write_unlocked_fully_buffered(data + written_size, len - written_size); // Perform error handling return len; } | |
libc/src/__support/File/file.h | ||
62 | We shouldn't need this, and may be even not do this. Add the macros _IOFBF etc. to stdio.h and use them directly. | |
libc/src/__support/File/linux_file.cpp | ||
161 | Shouldn't the default buffering mode be _IOFBF? | |
173–174 | We don't need these anymore - just make the size arg 0 and buffer arg nullptr in the constructor call below. | |
libc/test/src/__support/File/file_test.cpp | ||
110 | Why did you change this to calloc? |
rewrite to be closer to the original
libc/src/__support/File/file.cpp | ||
---|---|---|
34 | I don't think splitting line buffered and fully buffered writes is the correct design. The reason being that their logic is almost identical, only varying in the spot where they flush the buffer. With this idea in mind, I've rewritten the original function to support both types of buffering in a manner closer to your original design. I've also kept unbuffered writes in this function, since it seems odd to split just that function off on its own if all the other write modes are together. |
libc/src/__support/File/file.cpp | ||
---|---|---|
34 | The concept of a flush_point is promising but in the current state of this patch, I am not yet convinced. | |
68 | While the concept of flush_point seems promising, I think there is a mix up of a xfer_point and a flush_point in the code below. You want to transfer and clear the libc buffer when it is full. But, under _IOLBF, you want to actually perform the flush operation. May be I am missing it, but I don't see any call to platform_flush after this line. | |
74 | It took me some time to understand why this return is OK. Which probably means we need to refactor/rename/comment. | |
75 | If the if has braces, else should also have braces. | |
82–83 | Should this be: size_t remaining = len - write_size; | |
86 | Should remaining be updated inside this if block? Couple of points:
| |
91 | Is an else missing here? | |
93 | This condition can be true if flush_point - write_size is less than remaining? |
address comments
libc/src/__support/File/file.cpp | ||
---|---|---|
68 | I've added a call to flush at the end. | |
82–83 | in this case remaining represents the number of characters left after the flush point. This is useful, since we want to know if the characters after the flush point will fit into the buffer, and if they will we want to write everything to output directly. | |
86 | no, since it's only tracking the number of characters after the flush point, and from that the number of characters written should be logical after each step. This isn't only true in a special case of line buffered mode, but it is related to that. This handles the case where the flush point is after the end of the buffer, as well as the cases where the remainder after the flush point doesn't fit into the buffer. The flush point can only be after the end of the buffer if there's a newline, because otherwise the flush point is equal to the end of the buffer. It should be able to be true if there's no newline in the input, that's a good catch. I've added an extra condition. | |
91 | no, but there used to be one here. If this were only fully buffered then writing remaining to the buffer and writing from data to the output directly would never happen together, but in line buffered mode in the case where the newline is after the end of the buffer, but the characters after the newline fit into the buffer, we might need to do both. |
While this is probably OK logically, it is way harder to reason than before. I am not yet convinced that this idea of flush_point and mixing line buffering and fully buffering is better.
I gave it one pass and will need at least one more to actually understand everything. I will do it once more and get back to you.
The idea of "split point" makes it almost equivalent to the split approach I proposed. Taking a step back and looking at it objectively, consider that once a file is in a buffering mode, it is in that mode for its life time. Likewise, FBF is a mode by itself but LBF is FBF plus something else. To reflect these conceptual notions in code, we should use the approach of building LBF over FBF. That will keep FBF and LBF implementations simple and focused. Quantitatively also, the current state of this patch demonstrates that mixing makes it more complicated than it was previously.
I was wrong in expecting that concepts like flush point would make the implementation simple so I apologize for the misleading feedback I have given earlier. To move forward, we will do this:
- Use the split approach of building LBF over FBF as I have suggested earlier.
- Fix any bugs in FBF implementation separately.
- If and when some optimizations are required, we will optimize based on the data about what exactly the problem is.
split the buffering modes into different functions, as well as cleaning up each implementation to be simpler in its specific case.
Good to go after the comments addressed.
libc/src/__support/File/file.cpp | ||
---|---|---|
117 | We will keep the implementation simple for now like this: fbf(primary.data(), primary.size()); // Error check flush_unlocked(); // A call to flush and not to platform_flush() fbf(remaining.data(), remaining.size()); // Error check It should incur the same number of platform_* calls but might lead to a few additional copies. Iff required, we can optimize FBF to eliminate the unecessary copies. Infact, what we really want is to replace the first FBF operation by an NBF operation. We can do all that iff required. I don't think we have flush_unlocked, so you will have to add it. | |
libc/test/src/__support/File/file_test.cpp | ||
153 | It will be cool to actually create an FBF file also and show the difference in pos comparisons (like those on line 170 and 174). | |
397 | _IOFBF? | |
411 | Use some buffering mode. Or, test for all modes. |
address comments
libc/src/__support/File/file.cpp | ||
---|---|---|
117 | I did add NBF as well as flush_unlocked, and now we're down to three platform writes maximum (one for the buffer, one for the section up to the newline, and one for the remainder). |
libc/src/__support/File/file.cpp | ||
---|---|---|
52 | I have the flush outside of the unbuffered write so that in the fully buffered function (line 63) we can use the unbuffered write when the data is too large for the buffer. This means that in the base write_unlocked function and in write_unlocked_lbf we have to flush separately, but I think that's okay. |
libc/src/__support/File/file.cpp | ||
---|---|---|
52 | Ah, I missed that! |
We shouldn't need this, and may be even not do this. Add the macros _IOFBF etc. to stdio.h and use them directly.