- User Since
- Apr 18 2013, 12:16 AM (313 w, 6 d)
Nice. Even if the probability of a failure for each relocation is very low, we have huge number of relocations. Apparently the latter overwhelms the former!
Thank you for doing this.
Mon, Apr 22
It is a good idea to add a test to catch this though.
Sun, Apr 21
OK, then moving just below
Thu, Apr 18
I'm curious what kind of application you want to use this option for.
I believe that your benchmark is inappropriate and does not correctly capture correct performance characteristics of PLT entries of different sizes.
Although I understand that reporting an error is sometimes useful, are you sure that that's the behavior that users want? I wonder if we would end up adding --really-allow-undefined in the future to override this new behavior.
By the way I think your finding that this struct can be 12 bytes long instead of 16 bytes long is pretty interesting. Previously, we found that making this struct as small as possible does matter in terms of performance perhaps due to memory locality. So, shaving off 4 bytes from this might make a noticeable difference in speed. Do you want to try? (If not I'll do that sometime in the future.)
No, the actual issue is that a non-empty PT_LOAD segment that starts with an empty symbol only section does not have a file offset that is congruent modulo the page size. As a result, the output is not a valid ELF.
Please commit this first.
Wed, Apr 17
Changes to the linker script processor oftentimes has subtle implications and in order to review such change I have to stop and think. But for this patch I took too much time. Apologies for the delay.
IIUC, you are fixing an issue that a segment whose size is 0 doesn't have a file offset that is congruent modulo page size. Is this correct?
It doesn't race. Since OutputOffHi is not a bitfield, it takes a different memory location. Concurrent accesses to different memory locations don't race. I still don't understand the split of OutputOff. If you pack other members adjacent to OutputOffHi and read them, it may race with the write of OutputOffHi.
Committed as r358638.
Sorry, I though that I replied to this thread.
InputFile abstracts the concept of a file containing symbols. In each parse() function, we parse a symbol table and then insert symbols to the symbol table. parse() is part of a symbol resolution process step.
The point of my last comment was that, with the new layout, accesses to OutputOffHi don't race with accesses to Live or Hash, as they are now in different non-integer-bitmap struct members.
LGTM, please commit.
Tue, Apr 16
There should not be a fundamental reason for parallelForEach to not be reentrant, so we eventually want to fix it so that we can call the function without worrying the nested condition. For the time being, we should be able to detect the error condition more easily by adding the assertions.
The root cause of the problem here is that the API doesn't have enough rationale to choose a complex format than a simpler one, even though it was set just a few years ago and there's no historical burden to do so. I was repeatedly told that it is too late to change the ABI but that wasn't convincing (that's why the same topic came up many times in the thread) because no one said how much it is too late -- what thing is already done and if we redo it how long does it likely to take, etc. So, Carlos, thank you for giving some examples of the stuff that's already done to some programs.
This change seems fine as long as it doesn't cause any performance regression. Did you run lld with this change to see if the increase of hash collision doesn't have a negative impact?
This change seems correct, although it is a bit tricky and perhaps unnecessary to move . backwards in many linker scripts.
If you worry about the cost of assigning a boolean value to the thread-local variable, I'd think that's really cheap and negligible. This function is not called too casually, because distributing jobs to threads waiting in the thread pool is not cheap. Compared to that, these assignments are virtually nothing.
I will keep this patch for a day to give sbc100 to take a look.
Sorry, I missed this one.
Mon, Apr 15
- 80 columns
- address review comments
This patch naturally changes the order how strings are placed in a merged string section. In addition to that, with this change, strings equal to or shorter than 4 bytes long are not merged. This is why.
Sun, Apr 14
I don't think you have to call std::stable_sort. Something like this should suffice.
This patches changes the order how input files are parsed. The other approach is to parse input files normally and then sort the list of object files so that crtend.o is at the end of the input list. I wonder if the latter is simpler. What do you think?
Yeah, if recent versions of clang and gcc are happy with the code, I wouldn't add the extra braces as Fangrui suggested. Thank you for your effort to fix warnings, though. It's important to keep our code warning-free.
Thu, Apr 11
- address review comment
Wed, Apr 10
- remove dead code
I will do that.
Overall looking good.
Tue, Apr 9
I'd write a comment explaining why we are handling debug sections in a special way, but that can be done later. Please submit. Thank you for doing this!