- User Since
- Dec 15 2015, 10:55 AM (135 w, 2 d)
Tue, Jul 17
Mon, Jul 9
Thu, Jun 28
Wed, Jun 27
simplified as suggested by @pcc (thanks!)
Jun 14 2018
Jun 13 2018
finds a lot of places where we are putting location information in diagnostics. Are you planning to convert those to use the new API/formatting, too?
We decided against shipping the feature that depends on this. Given that, I'd rather not have this code. Abandoning the diff.
Thanks for your suggestions, sbc. Those would be good to implement if we wanted the feature. However, we decided against the feature that depends on this, so I'll be abandoning this diff.
After checking with ruiu, it seems we are all on the side of not adding this. Abandoning.
Jun 12 2018
I still don't see a compelling reason to do this to be honest.
Ensure reproducer is also written when calling exitLld().
Alright, sounds like we are in agreement about the code. @pcc, can you still change the wording from "bug" to something else (like "observed behavior")?
Jun 11 2018
Thanks for the fix! A couple of suggestions:
Jun 8 2018
Alright then, LGTM.
I haven't accepted the patch yet because I want to make sure we do end up removing the RequiresNullTerminator flag. You indicated that you don't want to do that in this patch. That's ok with me as long as you do plan to do it in a soon-to-come follow-up patch. If you don't plan to write that follow-up patch, I would like you to include the new function and rewrite the calls in this patch.
Thanks for doing this! I did some of the ground work for this in the COFF linker, but never got around to flipping the default. I too like the idea of having a separate function for the NUL-terminated case. It seems we all want that, so how do we get there? Are you planning to follow this up with a patch that removes the RequiresNullTerminator flag and rewrites calls that use it to use the new function instead?
Jun 6 2018
Thanks, @pcc . Out of curiosity, how did you build chrome (I guess I'm mostly interested in args.gn and the command you used to build). I've been trying this on my machine and it can literally go for weeks without hitting the problem.
Jun 5 2018
I am investigating https://bugs.chromium.org/p/chromium/issues/detail?id=786127, which is a failure that does not reproduce reliably. I could use /linkrepro to save all link inputs always, but that would use up a lot of disk space. Running the command again with /linkrepro will almost certainly not reproduce the failure. Having a way to only save the reproducer when an error occurs would help here, and also seems a generally useful feature to have.
May 9 2018
We landed D46527 instead.
Bruce's blog post (https://randomascii.wordpress.com/2018/02/25/compiler-bug-linker-bug-windows-kernel-bug/) does seem to suggest that this issue should only ever be hit when files are written through memory maps. If that is the case, this change should not make a difference. I'll shelve it for now and see if I can think of something better. Thanks!
May 8 2018
It's a speculative fix. I'm not getting the error locally and it doesn't consistently reproduce anywhere I know of, which makes it hard to think of a way to ascertain that it actually fixed the problem. It's just that I noticed that a number of files we write when doing ThinLTO are written through raw_fd_ostream, and we then occasionally see bytes being read as 0 when they shouldn't be, which is a problem we fixed elsewhere by calling FlushFileBuffers. If we land this and the problem persists I think we should revert it (for the reasons zturner mentioned), but given the similarity to the problem we worked around with D42925, I would like to give this a shot and see if it helps.
My hope is that this will help with https://crbug.com/786127. I haven't seen the undefined symbol errors in local builds for several weeks now, but they still happen on the bots.
Digging a bit deeper into this, I don't understand how this breaks. The crash indicates that we're running off the end of the file while reading from the long file name table while looking for a newline that isn't there. However, we seem to always create archives with the long file name table before any member that references it, and both GNU ar and llvm-ar will reject archives that are not structured that way. This means that there must be at least one member after the long file name table. Since the record at the beginning of a member ends in a newline (see ArchiveWriter::printRestOfMemberHeader), this means there must always be a newline between the end of the long file name table and the end of the file. For archives we write, every long file name including the last is followed by a newline (see ArchiveWriter::addToStringTable), so the newline we find should always be in memory we can read and always be at the end of the long file name we're looking at.
Thanks, hans. I spent all of yesterday trying to cook up a test case for this. Taking into account the code in MemoryBuffer, I created a file of 20480 (4096 * 5) bytes to get it mmapped and a multiple of the page size, but it did not reproduce the problem. Just in case, I also tried a file not a multiple of the page size, and a small file. No luck. I tried just building the targets that were failing on the bot, but they built just fine locally. I tried using the exact Clang and Chromium revisions that the bot used, but, again, things worked fine locally. I tried the revisions I used a week ago, when I was able to get a repro, but no luck with that either. I also installed afl to try and fuzz my way to something that exercises the failure mode. I'll run that today and see if that gets me something we can use as a test case. At least we have the fix in now.
Accepted so that you can land the fix if still needed, per my comment at https://bugs.chromium.org/p/chromium/issues/detail?id=840260#c9
Apr 28 2018
A test case for this would be an archive with a long file name at the end of the string table. It may have to be a specific size to tickle the problem I'm trying to fix here, to prevent MemoryBuffer or the OS from padding the archive with 0 bytes, which would avoid the problem.
Apr 27 2018
Apr 25 2018
Apr 24 2018
Anyone able to take a look at this?
@ruiu, did you get a chance to look at this?
Apr 20 2018
Fair enough. Changed it to only do the memcpy if the ArrayRef is not
empty, and otherwise leave the function alone.
Apr 19 2018
@ruiu, how about this?
Apr 18 2018
Apr 17 2018
Apr 12 2018
Apr 9 2018
Ah, my bad. Thanks for clarifying, @zturner.
Doesn't shutil.rmtree handle this?
Anyone able to take a look at this?
slight reword of comment
Thanks for looking, @dsanders. Does this comment make things clearer?
Apr 6 2018
Apr 5 2018
Apr 4 2018
Apr 3 2018
Thanks, ruiu. Added the newlines before committing.
Apr 2 2018
Mar 26 2018
This now generates a components_unittests.exe that is the same size as without this change that works correctly. I also verified that the data race is gone on COFF. I will land this to give the bots a chance to chew on it, and then fix the same data race in the ELF linker.