Page MenuHomePhabricator

[WebAssembly] Optimise relocation iteration to remove n^2 loop. NFC.
AbandonedPublic

Authored by ncw on Jan 17 2018, 6:17 AM.

Details

Reviewers
sbc100
ruiu
Summary

Addresses issue: https://github.com/WebAssembly/tool-conventions/issues/32

Previously, for each function/segment we iterated over every relocation to find the relevant ones, which is an n^2 operation. Now, we just make a single pass.

Event Timeline

ncw created this revision.Jan 17 2018, 6:17 AM
ncw updated this revision to Diff 130168.Jan 17 2018, 6:24 AM

Updated:

  • Remove copy for already-sorted input
ncw added a comment.Jan 22 2018, 10:30 AM

This should be OK to merge (assuming it passes review :) ), doesn't change anything functionally.

Another n^2 loop I've noticed in LLD is the one where the segment is looked up from the data symbol, which is done by iterating over all the segments in order. That will be addressed of course by the symtab changes, which will add a direct data-symbol-to-segment association.

ruiu added a subscriber: ruiu.Jan 22 2018, 3:42 PM

Somewhat orthogonal to this patch, but one of the reasons why lld is so fast is that it doesn't only avoid expensive algorithms such as O(n^2) or O(log n) but also avoids even hash table lookups, though it's O(1). Loading something through two-levels of pointer indirection is very fast, but obtaining the same information through a hash table can be very expensive, as the amount of data the linker has to deal with is huge. Maybe I should write a document about that somewhere in the lld repository, but please keep in mind that that's one of the design principles.

ncw updated this revision to Diff 135634.Feb 23 2018, 6:30 AM
ncw retitled this revision from [WebAssembly] Optimise relocation iteration to remove n^2 loop (LLD) to [WebAssembly] Optimise relocation iteration to remove n^2 loop. NFC..
ncw edited the summary of this revision. (Show Details)
ncw added a reviewer: ruiu.

Rebased.

Any objections to this?

(NB. I have LLVM commit access now, so can merge patches myself once reviewed.)

ruiu added inline comments.Feb 23 2018, 12:09 PM
wasm/InputFiles.cpp
195

I read this part of code recently and found the current code does something very weird. It copies relocations for no obvious reason. Instead of copying relocations and attach it to a Chunk, you could directly consume mmap'ed relocation section. So, I think "copy" or "set" relocation should be eliminated, instead of making it efficient.

ncw added inline comments.Feb 23 2018, 2:15 PM
wasm/InputFiles.cpp
195

I agree - in fact, I think that's exactly what I've done! I've renamed the methods from "copy" to "set" because the copy is eliminated - see the std::vector in InputChunk that's been replaced with an ArrayRef that points directly into the WasmObjectFile (OK, so WasmObjectFile does a copy to unserialise the relocations, so it's not quite a direct pointer into the input file... but it's closer).

I think that with this change, there are no longer any copies of relocations in Wasm-LLD. All we're doing here is giving each InputChunk a single pointer to that chunk's relocs in the InputFile's array.

ruiu added a comment.Feb 23 2018, 2:21 PM

Ah, sorry. If that's the case, I misunderstood this patch. But why do you have to explicitly set relocations to chunks? Chunks knows which file they are created from, and Files have pointers to relocations, so Chunks doesn't have to be told what relocations they have, no?

ncw added a comment.Feb 23 2018, 4:29 PM

Ah, sorry. If that's the case, I misunderstood this patch. But why do you have to explicitly set relocations to chunks? Chunks knows which file they are created from, and Files have pointers to relocations, so Chunks doesn't have to be told what relocations they have, no?

That's all true... the chunks already know the file, and the file already knows the relocations - but the file stores a big list of all the relocations (in one array). They are sorted by chunk, but when we write out the (non-discarded) chunks we have to know which relocations go with which chunk.

Alternatives:

  • Don't remember the association between chunks and relocs. Do a binary search for every chunk, to find the relocs that are relevant to it. Could make LLD slower.
  • Change WasmObjectFile so the reloc list in WasmObjectFile is per-chunk, not per-file; not really any better than this commit, it just pushes the same code into a different class.
  • Change the actual Wasm file format, so we store relocs per-chunk on disk, rather than per-file (Sam was suggesting this one actually). Could result in marginally higher memory usage? I'm not too bothered by the current format; relocs in one list sorted by chunk seems a reasonable file format.
ruiu added a comment.Feb 23 2018, 5:51 PM

Hmm, I think I'm confused. IIUC, a wasm object file has single .data and .code section, and there is one relocation section for each section.

A "section" in wasm is actually not an atomic unit of inclusion/exclusion. It can be split into multiple "chunks", and "chunks" is a unit of garbage collection.

But why does that two-level abstraction needed? Why can't everything just a section?

ncw added a comment.Feb 26 2018, 10:00 AM

Hmm, I think I'm confused. IIUC, a wasm object file has single .data and .code section, and there is one relocation section for each section.

A "section" in wasm is actually not an atomic unit of inclusion/exclusion. It can be split into multiple "chunks", and "chunks" is a unit of garbage collection.

Yes - I think that's right.

But why does that two-level abstraction needed? Why can't everything just a section?

I don't follow what your alternative suggestion is ("why can't everything just a section"). Possibly one of these:

  1. WasmObjectFile should implement llvm::object::ObjectFile to return an llvm::object::SectionRef for each chunk. Then there would be a one-to-one mapping between the LLVM "sections" and the LLD "chunks". You don't have any opinion on how Wasm files are encoded on disk - you're simply asking for the code I've written to essentially be pushed down into WasmObjectFile, so that LLD doesn't have to handle lists of relocations.
  2. Or, maybe you are concerned about the Wasm file encoding on disk - you want to change that so that relocations are stored per chunk/function. You don't care about what WasmObjectFile's sections iterate over, as long as each input chunk (function etc) has a list of relocations associated with it by WasmObjectFile, so that LLD doesn't have to track that. I think this is what Sam was suggesting on GitHub.
  3. Or maybe you're worried about both? The file format and the section abstraction are both wrong to you.

I don't have a strong opinion - I mean functionally they all end up doing the same thing! It's just a matter of shuffling around/renaming some code, to make future maintenance less confusing.

I'll go with whatever other people want - this PR is just a quick win that gets rid of an unoptimised loop, while avoiding changing the file format, and without redefining what "section" currently means in the context of Wasm. With those constraints, I don't think there's that much that could be done differently...

I suggest we move to storing the relocations on per chunk basis. There is already a bug open to do this: https://github.com/WebAssembly/tool-conventions/issues/32. I also don't it makes much sense to try to fix this performance issue in the mean time.

ruiu added a comment.Feb 27 2018, 12:13 PM

This may be a silly question, but what is the distinction of Chunk and Section? Chunk seems like a section in a section, so I wonder why wasm object files is designed that way. In ELF and COFF, everything is a section, and section is a atomic unit of linking.

This may be a silly question, but what is the distinction of Chunk and Section? Chunk seems like a section in a section, so I wonder why wasm object files is designed that way. In ELF and COFF, everything is a section, and section is a atomic unit of linking.

In a wasm binary all code is in a single section (WASM_SEC_CODE) and all data is defined in single section (WASM_SEC_DATA). Within the code section there are N functions and within the data section there are N data *segments*. Things are not as homogeneous as in ELF were everything is just bytes and virtual addresses. We use the term "chunk" there as an abstraction that linker can use to handle both functions bodies and data segements, even though in the file format these are no really the same thing.

We could have gone a different route for the intermediate object and allowed multiple code sections, but then none of the existing wasm tools would load the file (i.e. they would no longer we valid wasm files) and I'm not sure this would have made the linker any simpler, since we would still not be able to match ELF in its ability to blindly concatenate everything.

ruiu added a comment.Feb 27 2018, 12:56 PM

We could have gone a different route for the intermediate object and allowed multiple code sections, but then none of the existing wasm tools would load the file (i.e. they would no longer we valid wasm files) and I'm not sure this would have made the linker any simpler, since we would still not be able to match ELF in its ability to blindly concatenate everything.

I don't think I agree with this point. Object files and executable files are different kind of files, so the requirement that executables must have one text and one data section doesn't have to be applied to object files. If you apply that, then you would have to come up with a way to workaround -- and that's the "chunk". I'm not particularly interested in making wasm as much similar as ELF because they are naturally different, but "chunk" is beyond that -- it just feels more complicated than necessary. I strongly believe that everything can be (and should be) sections in object files.

We could have gone a different route for the intermediate object and allowed multiple code sections, but then none of the existing wasm tools would load the file (i.e. they would no longer we valid wasm files) and I'm not sure this would have made the linker any simpler, since we would still not be able to match ELF in its ability to blindly concatenate everything.

I don't think I agree with this point. Object files and executable files are different kind of files, so the requirement that executables must have one text and one data section doesn't have to be applied to object files. If you apply that, then you would have to come up with a way to workaround -- and that's the "chunk". I'm not particularly interested in making wasm as much similar as ELF because they are naturally different, but "chunk" is beyond that -- it just feels more complicated than necessary. I strongly believe that everything can be (and should be) sections in object files.

I understand that executables and objects are different and we are not striving to make the object files actually run-able, but we tried to make them into at least valid wasm files (with extra metadata sections).

If you feel strongly, perhaps we should have a longer discussion about that. An issue in https://github.com/WebAssembly/tool-conventions/ is probably the best place for that. I guess we could still revisit this.

ncw added a comment.Mar 1 2018, 8:01 AM

We can't change WebAssembly - it's too late for that. What seems to be the disagreement here is that the Wasm spec has chosen the word "section" to describe a different concept to what "section" means in ELF. We're stuck with that!

Rui's suggesting changing the Wasm object format, so that each function gets its own Wasm "section". But that's not going to make the format any better, the thing that's called a "section" in Wasm isn't designed for holding individual functions. Just because it's called a "section" in the Wasm spec, has got nothing to do with whether it means the same thing to the linker that an ELF "section" does.

I think we just have to accept that the ELF and Wasm specs have chosen to use terminology differently. That's not even a defect in those specs, the Wasm meaning of "section" is perfectly in line with how it's used in other file formats to represent a portion of the data in the file.

To resolve this, I think it's just a question of how terminology and naming should be used in LLD. We have total freedom to change the LLD class names, so if Rui really wants LLD to operate on "sections" then it can, just with some renaming! But, we can't change the Wasm format itself without a good reason, since there are downstream projects already using these files.

Non-exhaustive options:

  1. Status quo. Accept the difference, and use "chunk" where the ELF linker uses "section", and use "section" in the Wasm code with the same meaning it has in the Wasm spec.
  2. Rename the Wasm code to avoid the contentious "section" terminology entirely. Come up with a different name for Wasm "sections" so no-one is tempted into thinking it's supposed to be the unit of linking like it is in ELF.
  3. Rename the Wasm code to use the terminology "section" to match the ELF linker (ie rename InputChunk -> InputSection), with a comment explaining that InputSection doesn't represent the thing that's called a "section" in the Wasm spec.

TLDR:

I agree that everything in the linker should be "sections" - and that's already how it works, it's just that that abstraction is called a "chunk" currently in the Wasm code, because "section" is taken to mean something else in Wasm. Surely it already is what you want it to be, apart from the fact it's called "chunk". The fact that Wasm has something called a "section" doesn't mean that's what Wasm object files should be using for storing each function. We can do whatever we want with the terminology in the code to fix the situation, but the file format itself isn't the problem.

ruiu added a comment.Mar 1 2018, 10:50 AM

Maybe you are right. I didn't have time to take a look and understand the wasm file format (that's why I didn't send any comments to https://github.com/WebAssembly/tool-conventions yet), but by reading wasm/InputChunks.h and wasm/InputFiles.cpp again, there's no two-level containers in the file format. What I found a bit confusing is the name of "InputSegment" -- that sounds like it is some base class, but it is actually a leaf class for data segment.

To be clear, I'm not suggesting to make wasm look simliar to ELF. I'm of course not complaining that wasm is different from ELF. All file formats should have equal rights in lld. :) When I say something seems weird, that's just that. I don't think wasm needs to be similar to ELF because the two file formats are naturally different for a good reason. That's why you guys have invented a new file format in the first place.

If "section" is used for other stuff in wasm, we shouldn't use the same terminology for something else to avoid confusion. I think this is more like a problem of lack of documentation or comments in wasm lld. Maybe we should expand the file comment of InputChunk.h to describe what Chunk is and how a file consists with chunks. I'll try to do that.

I also think that we should rename InputSegment InputData. A function is called InputFunction, so it is I believe a better name.

ruiu added a comment.Mar 1 2018, 10:54 AM

As to the spec, how much is it fixed at the moment? You said that it's too late, but IIRC, Sam is trying to emit relocation per chunk, instead of emitting a single relocation per file. That's not a file format change?

Maybe you are right. I didn't have time to take a look and understand the wasm file format (that's why I didn't send any comments to https://github.com/WebAssembly/tool-conventions yet), but by reading wasm/InputChunks.h and wasm/InputFiles.cpp again, there's no two-level containers in the file format. What I found a bit confusing is the name of "InputSegment" -- that sounds like it is some base class, but it is actually a leaf class for data segment.

To be clear, I'm not suggesting to make wasm look simliar to ELF. I'm of course not complaining that wasm is different from ELF. All file formats should have equal rights in lld. :) When I say something seems weird, that's just that. I don't think wasm needs to be similar to ELF because the two file formats are naturally different for a good reason. That's why you guys have invented a new file format in the first place.

If "section" is used for other stuff in wasm, we shouldn't use the same terminology for something else to avoid confusion. I think this is more like a problem of lack of documentation or comments in wasm lld. Maybe we should expand the file comment of InputChunk.h to describe what Chunk is and how a file consists with chunks. I'll try to do that.

I also think that we should rename InputSegment InputData. A function is called InputFunction, so it is I believe a better name.

The term "data segment" in the wasm world has a specific meaning. It refers to the things in the data section. The data section is made up of segments. I know this is directly counter to the meaning in the ELF world but it make some sense. Each wasm data segment is a contiguous indivisible run of of bytes. InputSegment already has a comment specifying why its called what it is, and why it might be counter intuitive to someone coming from ELF.

I think the name InputSegment makes sense when this is understood, but I don't really might what we call it.

ncw added a comment.Mar 2 2018, 10:46 AM

Thanks for looking into improving the comments! :)

As to the spec, how much is it fixed at the moment? You said that it's too late, but IIRC, Sam is trying to emit relocation per chunk, instead of emitting a single relocation per file. That's not a file format change?

The Wasm core sections are very fixed, they're used by browsers (Firefox, Chrome, Edge, Safari...) and that boat has sailed. So that includes the way functions and globals are encoded. The bits specific to object files (like relocations) are under our control, and we can still change those. And of course, object files don't have to use a FUNCTION section for functions - but not doing so would be counterintuitive. The FUNCTION section we have no control over, that's set in stone.

In D42176#1025688, @ncw wrote:

Thanks for looking into improving the comments! :)

As to the spec, how much is it fixed at the moment? You said that it's too late, but IIRC, Sam is trying to emit relocation per chunk, instead of emitting a single relocation per file. That's not a file format change?

The Wasm core sections are very fixed, they're used by browsers (Firefox, Chrome, Edge, Safari...) and that boat has sailed. So that includes the way functions and globals are encoded. The bits specific to object files (like relocations) are under our control, and we can still change those. And of course, object files don't have to use a FUNCTION section for functions - but not doing so would be counterintuitive. The FUNCTION section we have no control over, that's set in stone.

While you are correct that we can't change the executable format, the object file format, as you say, is under our control, so I think think should rely on the "we can't change it" argument too strongly.

We could, for example, have an object format the contains one code section per function and one data section per segment. I'm not suggesting we do that.. but its conceivable. It would even be pretty easy to have existing tools such as wabt support this (by parsing multiple code and data sections as if they were one).

ncw added a comment.Mar 2 2018, 1:15 PM

We could, for example, have an object format the contains one code section per function and one data section per segment. I'm not suggesting we do that.. but its conceivable. It would even be pretty easy to have existing tools such as wabt support this (by parsing multiple code and data sections as if they were one).

It wouldn't be a valid Wasm file if it had more than one CODE or DATA section. We'd have to use a custom section if we wanted to do that; it would be an abuse to reuse existing official Wasm sections in a way that doesn't the specified syntax. You're right it's possible, my objection is that it just feels odd to use the Wasm container but not use its way of enclosing functions/data, since those are the core things that we're trying to work with.

ncw abandoned this revision.Aug 22 2018, 1:47 AM

Abandoned in favour of https://reviews.llvm.org/D51063