Previously, we parsed relocations into a Reloc struct when reading the input,
but that's unnecessary. Handling them at output time is closer to what
lld-ELF is doing, and should make future parallelization work easier.
Depends on D79114.
Hmm. I thought part of the reason the original prototype parsed them early was to handle subsections via symbols (where you may need to adjust relocations based on subsection splitting). Do you have a sense of how this would play with that?
If I understand our current setup, we're parsing the relocations early but only resolving them (as in writing out their target) at the end, so that end part should still represent opportunities for parallelism, I think. (We should also figure out exactly what it is that COFF and ELF parallelize, and check if our design can handle parallelizing those as well.)
I'm still a bit hazy on what subsections and their relocations will end up looking like, but I don't see why we might need to adjust relocations at load time instead of at output time. From what I understand, we basically need to figure out which subsection a given relocation's address points into. Presumably subsections will keep track of their original address ranges in the input file, so we can do the relocation -> subsection mapping at output time too. Well, I should probably have a look at how ld64 handles subsections...
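To make the relocation -> subsection mapping concrete, here is a rough sketch of the lookup, assuming each subsection records its original offset and size within the input section and that the list is kept sorted by offset. The `Subsection` struct and `findSubsection` are hypothetical names for illustration, not anything that exists in lld today.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not lld's actual data structure.
struct Subsection {
  uint32_t inputOffset; // start offset within the original input section
  uint32_t size;        // length in bytes
};

// Returns the index of the subsection containing `addr`, or -1 if none does.
// `subsecs` must be sorted by inputOffset and must not overlap.
int findSubsection(const std::vector<Subsection> &subsecs, uint32_t addr) {
  assert(std::is_sorted(subsecs.begin(), subsecs.end(),
                        [](const Subsection &a, const Subsection &b) {
                          return a.inputOffset < b.inputOffset;
                        }));
  // Find the first subsection that starts strictly after addr.
  auto it = std::upper_bound(
      subsecs.begin(), subsecs.end(), addr,
      [](uint32_t a, const Subsection &s) { return a < s.inputOffset; });
  if (it == subsecs.begin())
    return -1; // addr lies before the first subsection
  --it;        // candidate: last subsection starting at or before addr
  if (addr - it->inputOffset >= it->size)
    return -1; // addr falls in a gap after the candidate
  return static_cast<int>(it - subsecs.begin());
}
```

Each lookup is a binary search over the subsections of one section, so the same function could in principle be called either at load time or at output time.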
I *think* what ld64 does is to translate the raw relocation structures into "fixups" that target specific atoms / subsections. But it's still not clear to me that we can't do the relocation -> subsection mapping at output time. Moreover, given the current state of the implementation, I don't think having a separate Reloc struct is super useful -- all we're really doing is a 1:1 copy of various field values, plus the symbol/section resolution, and the latter can definitely be done at output time.
The architecture may need to be different here IMHO because of subsections. Don't forget that you need to map relocations onto subsections in order to implement gc-sections, and depending on the number of subsections you have per section, that could get expensive without an intermediate data structure. On top of that you'd still need the O(M log N) at output time. To me it seemed better to pay the O(M log N) up front once and avoid the cost at gc-sections time.
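As a rough illustration of the trade-off, here is a sketch of a gc-sections style mark phase that works on relocations already resolved to subsection indices, so the marking loop does no address lookups of its own. `ResolvedReloc`, `SubsecNode`, and `markLive` are hypothetical names invented for this example, and the sketch assumes the resolution step has already run once at load time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical intermediate form: a relocation whose target has already
// been resolved to a subsection index (the up-front lookup cost is paid
// before this structure is built).
struct ResolvedReloc {
  size_t targetSubsec;
};

// Hypothetical subsection node carrying its outgoing resolved relocations.
struct SubsecNode {
  std::vector<ResolvedReloc> relocs;
  bool live = false;
};

// Marks every subsection reachable from `roots` by following relocations.
// No binary search happens here; each edge is a direct index.
void markLive(std::vector<SubsecNode> &subsecs,
              const std::vector<size_t> &roots) {
  std::vector<size_t> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    size_t i = worklist.back();
    worklist.pop_back();
    assert(i < subsecs.size());
    if (subsecs[i].live)
      continue; // already visited
    subsecs[i].live = true;
    for (const ResolvedReloc &r : subsecs[i].relocs)
      if (!subsecs[r.targetSubsec].live)
        worklist.push_back(r.targetSubsec);
  }
}
```

With raw relocations instead, each edge followed in the loop would need its own address -> subsection search, which is the repeated cost being argued against here.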