This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Add __data_end linker-synthetic symbol
ClosedPublic

Authored by sbc100 on Feb 2 2018, 1:55 PM.

Details

Summary

This is similar to _end (See https://linux.die.net/man/3/edata for more) but
using our own unique name since our use cases will most likely be different
and we want to keep our options open WRT to memory layout.

This change will allow is to remove the DataSize from the
linking metadata section which is currently being used by emscripten
to derive the end of the data.

Event Timeline

sbc100 created this revision.Feb 2 2018, 1:55 PM
sbc100 updated this revision to Diff 132680.Feb 2 2018, 2:02 PM
  • rebase

Is there a use for _edata? I'm curious how applications would use it differently from _end.

Is there a way for applications to determine where the data starts? Traditionally that might be something like _etext, but of course that doesn't apply to wasm. Could they get it from __bss_start and/or __heap_base? What's the relationship between those symbols?

sbc100 added a comment.Feb 2 2018, 3:00 PM

Is there a use for _edata? I'm curious how applications would use it differently from _end.

Is there a way for applications to determine where the data starts? Traditionally that might be something like _etext, but of course that doesn't apply to wasm. Could they get it from __bss_start and/or __heap_base? What's the relationship between those symbols?

Maybe nobody will need _edata. I added it mostly for completeness. The current DataSize we have in the linker meta section will be replaced by _end, and that is needed by emscripten.

There is no way to determine where the data starts, no. We could put _etext there, but it seems a little strange to have _etext.

I think bss_start will normally be the same as _edata on wasm (modulo alignment of the first bss symbol I guess?. I just realized I forgot to add bss_start here.

__heap_base follows data + bss + stack.

I'm nervous about adding all these symbols. I like ELF features when they make sense, but for some of these symbols here, it's difficult to think of even hypothetical valid use cases.

Adding these features could make it harder for us to change how we lay out memory in the future. For example, if we want to move where the stack is, or do things with the memory of dynamic libraries present at startup time, it may be harder if we have to consider whether someone might be using _end or other things and depending on memory having a particular layout.

sbc100 added a comment.Feb 2 2018, 4:01 PM

I'm nervous about adding all these symbols. I like ELF features when they make sense, but for some of these symbols here, it's difficult to think of even hypothetical valid use cases.

Adding these features could make it harder for us to change how we lay out memory in the future. For example, if we want to move where the stack is, or do things with the memory of dynamic libraries present at startup time, it may be harder if we have to consider whether someone might be using _end or other things and depending on memory having a particular layout.

Well, we have a user that needs to know where the application data ends. Emscripten is currently using the "linking" metadata section and the DataSize for this. This change move that from the custom field in the custom section to regulare symbol which as several advantages:

  1. Its more in line with the existing stardards.
  2. I mean we don't need to write the "linking" metadata when we write out executable (executable shouldn't needs such a thing).
  3. I means emscriptne doesn't need to know about the linking metadata.

I'm happy to only add _end for now, and we could rename it to __wasm_data_end if you prefer something more explicit. I just liked the simplicity of _end since it is already a thing in ELF.

sbc100 added a comment.Feb 2 2018, 4:04 PM

This change will allow the following cleanup: https://reviews.llvm.org/D42869

I do like the idea of having a linker-generated symbol instead of metadata for _end (or _wasm_end or whatever) and it's something that is unlikely to be broken in the future. So I think we should do that. But I do agree that specifying all these things would limit our ability to improve the layout in the future, especially wrt the order of the static/heap/stack bits. Since we have more constraints on our memory space than ELF (e.g. expanding only at the top instead of mmap), we might want to retain that flexibility, or even make things more dynamic or configurable, which would probably argue against matching the behavior of ELF here. We could consider having separate start/end markers for particular sections instead of having implicit ordering, if there's a use case for that.

sbc100 updated this revision to Diff 132703.Feb 2 2018, 4:25 PM
  • remove bss start
sbc100 retitled this revision from [WebAssembly] Add _edata, _end and other link-sythentic symbols to [WebAssembly] Add _edata and _end link-synthetic symbols.Feb 2 2018, 4:26 PM
sbc100 edited the summary of this revision. (Show Details)

I mostly agree. _end/__wasm_end seem reasonable for now. My preference would be to wait on _edata and __bss_start until we know how those might be used.

sbc100 updated this revision to Diff 132709.Feb 2 2018, 5:01 PM
  • only define a single new symbol
sbc100 added a comment.Feb 2 2018, 5:46 PM

I ended up calling it __data_start in line with the existing __heap_base

sbc100 retitled this revision from [WebAssembly] Add _edata and _end link-synthetic symbols to [WebAssembly] Add __data_start linker-synthetic symbols.Feb 2 2018, 5:53 PM
sbc100 edited the summary of this revision. (Show Details)
sbc100 retitled this revision from [WebAssembly] Add __data_start linker-synthetic symbols to [WebAssembly] Add __data_start linker-synthetic symbol.
ncw accepted this revision.Feb 3 2018, 3:56 AM

Great, looks useful. (And the previous diff as well, moving the symbols into WasmSyms)

But it's going to require yet another tedious rebase of the symbol table patches (there'll be a conflict in every test file). I can't keep my patches continually up to date, I'd like to do just one more rebase - tell me what day you want to merge and I'll do it then.

This revision is now accepted and ready to land.Feb 3 2018, 3:56 AM
sbc100 retitled this revision from [WebAssembly] Add __data_start linker-synthetic symbol to [WebAssembly] Add __data_end linker-synthetic symbol.Feb 6 2018, 7:05 PM
This revision was automatically updated to reflect the committed changes.
quirinpa added a subscriber: quirinpa.EditedFeb 7 2019, 1:15 PM

I'm a big noob but I'm trying to port Linux to wasm. Call me crazy. Anyway, Linux expects all those symbols you speak about. So it'd be great to have them.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2019, 1:15 PM

@quinripa IIUC this patch was committed a year ago, so these symbols should be working?

@quinripa IIUC this patch was committed a year ago, so these symbols should be working?

Yes indeed. A shame they removed ELF support thou.