This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Add support for the event section
ClosedPublic

Authored by aheejin on Nov 5 2018, 2:56 AM.

Details

Summary

This adds support for the 'event section' specified in the exception
handling proposal. (This was named 'exception section' first, but later
renamed to 'event section' to take possibilities of other kinds of
events into consideration. But currently we only store exception info in
this section.)
Event section spec
PR for updating WebAssembly Object File Linking doc

The event section is added between the global section and the export
section. This is for ease of validation per request of the V8 team.

This patch:

  • Creates the event symbol type, which is a weak symbol
  • Makes throw instruction take the event symbol __cpp_exception
  • Adds relocation support for events
  • Adds WasmObjectWriter / WasmObjectFile (Reader) support
  • Adds obj2yaml / yaml2obj support
  • Adds .eventtype printing support

Diff Detail

Repository
rL LLVM

Event Timeline

aheejin created this revision.Nov 5 2018, 2:56 AM
aheejin added a comment.EditedNov 5 2018, 3:09 AM
  • I tried to add support for .eventtype printing and parsing in the same vein as D54012 and D53842, but ended up only adding printing support for now because of PR39557. (This is talking about adding '.functype' parsing support, but adding '.eventtype' parsing support is the same because it also uses WasmSignature).
  • Currently test/Object/wasm-string-outside-section.test fails with this patch. This test case is testing a malformed custom section, and I have to update the test case binary to include the new event section to maintain the same failure behavior (otherwise this fails before it reaches the custom section). This test was added in D50387 and I don't know how this test binary was generated. Asked the author about it.
aheejin edited the summary of this revision. (Show Details)Nov 5 2018, 3:10 AM
aheejin edited the summary of this revision. (Show Details)Nov 5 2018, 3:32 AM
aardappel added inline comments.Nov 5 2018, 9:40 AM
include/llvm/BinaryFormat/Wasm.h
72 ↗(On Diff #172557)

Some comments as to what these refer to, esp the uint32_t types?

include/llvm/MC/MCSymbolWasm.h
27 ↗(On Diff #172557)

nicer to have an "invalid" member of WasmEventType ?

lib/CodeGen/AsmPrinter/WasmException.cpp
22 ↗(On Diff #172557)

why do we use a label for this?

lib/Target/WebAssembly/MCTargetDesc/WebAssemblyTargetStreamer.cpp
121 ↗(On Diff #172557)

We should maybe decide to make all signatures look the same, so they can parsed equally (both by the assembler and a human :)
We should merge .param and .return? Should that have a symbol name much like .eventtype? Use void instead of just omitting .return?

sbc100 added inline comments.Nov 5 2018, 10:05 AM
include/llvm/BinaryFormat/Wasm.h
189 ↗(On Diff #172557)

These integer values are defined by the spec I think, which means you can't just change them. I imagine you will have to use 13, or what even number the spec decides on, but specify that it must come between section 6 and section 7.

Of course this will make the section parsing a code a little complex but I don't see any way around this.

lib/MC/WasmObjectWriter.cpp
62 ↗(On Diff #172557)

Would WasmSignature be a better name?

dschuff added inline comments.Nov 5 2018, 10:17 AM
include/llvm/BinaryFormat/Wasm.h
189 ↗(On Diff #172557)

I think these enums correspond to section IDs in the spec (https://webassembly.github.io/spec/core/binary/modules.html) and can't be changed. I guess Event should just get number 12, even though it goes here in the ordering? We should probably add that to the spec proposal too.

include/llvm/CodeGen/WasmEHFuncInfo.h
24 ↗(On Diff #172557)

I don't see uses of these tags. Is the idea that these would be the tags used by LLVM to throw/catch? I guess we'd want to use event symbols for this? Something like catch __cpp_exception where cpp_exception is an event symbol that gets a relocation to point to the event section entry with the right signature?

aheejin updated this revision to Diff 172725.Nov 6 2018, 2:20 AM
aheejin marked 2 inline comments as done.
  • Changed event section number to 12
  • Deleted yaml2wasm error checking routine that insists section codes should be monotonically increasing
  • Changed the term 'label' to 'symbol' in WasmException::endModule()

Thank you for comments.

include/llvm/BinaryFormat/Wasm.h
72 ↗(On Diff #172557)

This follows the spec here. The spec says the attribute data type to be varuint32. These uint32_t values will be later LEB-encoded in WasmObjectWriter.

As for comments, the spec says "The attribute of the event", which is not very descriptive either, so I didn't add any. Do you have other suggestions on what to add?

189 ↗(On Diff #172557)

Changed the event section code to 12 and restored all other section codes. The corresponding spec update is in this PR.

include/llvm/CodeGen/WasmEHFuncInfo.h
24 ↗(On Diff #172557)

They are not really tags but just enum values used in WebAssemblyTargetLowering::LowerINTRINSIC_VOID down there. As you can see there, the code is something like

case Intrinsic::wasm_throw: {
  int Tag = cast<ConstantSDNode>(Op.getOperand(2).getNode())->getZExtValue();
  switch (Tag) {
  case CPP_EXCEPTION: { // <---- It's used here
    const TargetLowering &TLI = DAG.getTargetLoweringInfo();
    MVT PtrVT = TLI.getPointerTy(DAG.getDataLayout());
    const char *SymName = MF.createExternalSymbolName("__cpp_exception");
    SDValue SymNode =
        DAG.getNode(WebAssemblyISD::Wrapper, DL, PtrVT,
                    DAG.getTargetExternalSymbol(
                        SymName, PtrVT, WebAssemblyII::MO_SYMBOL_EVENT));
    return DAG.getNode(WebAssemblyISD::THROW, DL,
                       MVT::Other, // outchain type
                       {
                           Op.getOperand(0), // inchain
                           SymNode,          // exception symbol
                           Op.getOperand(3)  // thrown value
                       });
  }
  default:
    llvm_unreachable("Invalid tag!");
  }

So depending on the immediate enum operand of throw intrinsic, we create a different event symbol. C_LONGJMP is just a reserved value to use when we add setjmp/longjmp handling.

include/llvm/MC/MCSymbolWasm.h
27 ↗(On Diff #172557)

How? It is a struct in include/llvm/BinaryFormat/Wasm.h:

struct WasmEventType {
  uint32_t Attribute;     
  uint32_t SigIndex;                                  
};

And there are many other similar struct in that file.

lib/CodeGen/AsmPrinter/WasmException.cpp
22 ↗(On Diff #172557)

Maybe the term 'label' was misleading. It's __cpp_exception event symbol that's used in throw instructions, and to make the symbol defined, it had to be emitted somewhere in the module. I'll change the term to just 'symbol', which I guess sounds less confusing.

lib/MC/WasmObjectWriter.cpp
62 ↗(On Diff #172557)

Wouldn't it be confused with [[ https://github.com/llvm-mirror/llvm/blob/5bc1446f3bee779c5a15a0256169bc7623682121/include/llvm/BinaryFormat/Wasm.h#L275-L285 | wasm::WasmSignature ]]? But to think about it they are kind of similar and there is even a TODO there saying consider using it directly instead...

And there are many vectors called Types in this file that store this type of objects, so I'm kind of torn. What do you think?

lib/Target/WebAssembly/MCTargetDesc/WebAssemblyTargetStreamer.cpp
121 ↗(On Diff #172557)

Opened an issue to discuss this.

aheejin updated this revision to Diff 172896.Nov 6 2018, 7:38 PM
  • Handle event section name printing in llvm-readobj
  • Currently test/Object/wasm-string-outside-section.test fails with this patch. This test case is testing a malformed custom section, and I have to update the test case binary to include the new event section to maintain the same failure behavior (otherwise this fails before it reaches the custom section). This test was added in D50387 and I don't know how this test binary was generated. Asked the author about it.

This is now passing. This failed because I incorrectly changed the section codes in my first patch.

aheejin updated this revision to Diff 173010.Nov 7 2018, 1:01 PM
  • Remove check lines irrelevant to event section
aheejin updated this revision to Diff 173013.Nov 7 2018, 1:11 PM
  • Comment fix
dschuff added inline comments.Nov 7 2018, 1:20 PM
lib/CodeGen/AsmPrinter/WasmException.cpp
29 ↗(On Diff #172896)

What is the linkage for this symbol? Is it 'external' in the linkage sense (in which case we have to worry about multiple definitions) or undefined ( in which case where should it be defined?)

lib/MC/WasmObjectWriter.cpp
62 ↗(On Diff #172557)

WasmSignature would definitely be a better name, and this does represent the same concept as wasm::WasmSignature (hence why I put the TODO to merge them). I think WasmType might be confused with e.g. ValueType. Since WasmSignature is taken we could call this... hm... I don't know what would be a good but not-ugly name. We could call it something ugly like EncodedSignature or FunctionOrEventType to make it clearer that this should be fixed; or namespace it like ObjectFile::WasmSignature to make it really clear that this is the same thing as WasmSignature.

lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp
102 ↗(On Diff #172896)

I guess this answers my question above.
We could have this symbol weakly defined in throwing/catching object files, and then strongly defined in libcxxabi? Or we could use a COMDAT?

sbc100 added inline comments.Nov 7 2018, 2:45 PM
test/ObjectYAML/wasm/event_section.yaml
30 ↗(On Diff #172896)

Would it be worth reducing this test case to something more minimal? I think you could remove most of this if you like

sbc100 added inline comments.Nov 7 2018, 2:50 PM
lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp
102 ↗(On Diff #172896)

Why would need a strong definition in libcxxabi? For that matter there is no syntax for defining it in code so I'm not sure how we would do that (assembly I guess?)

dschuff added inline comments.Nov 7 2018, 3:11 PM
lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp
102 ↗(On Diff #172896)

Sorry I was thinking of extern_weak. Then you would need a definition. But yeah weakly-defined should work here. And yeah, there's no way to define an event-type symbol in C, but assembly syntax should support it.

aheejin updated this revision to Diff 173062.Nov 7 2018, 3:46 PM
aheejin marked an inline comment as done.
  • Rename WasmType to WasmSignature and other variables too
  • Delete a TODO comment
  • Minimize a test case
include/llvm/CodeGen/WasmEHFuncInfo.h
24 ↗(On Diff #172557)

Changed the name to EventTag because this is also gonna be used by other instructions like if_except.

lib/CodeGen/AsmPrinter/WasmException.cpp
29 ↗(On Diff #172896)

As you saw below, its linkage is 'weak' and 'external' so that if there are multiple compilation units, it will be resolved in the linker. And this is the place where it gets defined. This is why it is emitted here, in which way each compilation module gets a (weak) definition.

lib/MC/WasmObjectWriter.cpp
62 ↗(On Diff #172557)

This is not in wasm:: namespace, so we can actually use WasmSignature. (There are also other classes both defined here and wasm:: namespace in [[ https://github.com/llvm-mirror/llvm/blob/master/include/llvm/BinaryFormat/Wasm.h | include/llvm/BinaryFormat/Wasm.h ]], such as WasmFunction)
Changed the class name to WasmSignature and other local/parameter names to match it.

lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp
102 ↗(On Diff #172896)

We don't need a strongly defined symbol because when there are multiple weak symbols the linker can choose any one of them. Currently event symbols are defined as weak. And it is emitted in WasmException::endModule() in lib/CodeGen/AsmPrinter/WasmException.cpp in this CL, which serves as its definition. The proof this is defined is, in test/MC/WebAssembly/event-symbol.ll test, the event symbol's flags are shown as BINDING_WEAK but not UNDEFINED. And if it is undefined this should show up in the import section, which it does not.

I don't know how COMDAT is different from a weak symbol in this case, so if there's advantage using that instead please let me know.

aheejin marked 2 inline comments as done.Nov 7 2018, 3:52 PM
aheejin added a comment.EditedNov 7 2018, 5:02 PM

As suggested by people in the meeting, I checked the multi-value proposal to see if the current way of specifying event types should be more like that of multi-values. Turns out the way block_type can be specified in the multi-value spec is rather complicated; there are three ways to do this:

blocktype ::= 0x40       => [] -> []
           |  t:valtype  => [] -> [t]
           |  ft:typeidx => ft

I'm not sure if three-way representation is ideal because the first two actually can be represented as the third one (function type index), if we assume the return types to be void as we did with the event types here. But I guess we can't merge this to the third one now because the first two ways in block_type for the MVP spec have been already standardized.

So anyway, what I confirmed is block_type is gonna be extended to be able to take a type index as well, so I guess we should keep the current event section implementation (and thus the spec), which shares type indices with functions in the type section.

aardappel added inline comments.Nov 9 2018, 9:07 AM
include/llvm/BinaryFormat/Wasm.h
72 ↗(On Diff #172557)

So maybe it is all described in the spec, I still think comments locally in the source code can help when navigating this code. uint32_t Attribute is so generic, it is meaningless. Any 1-line summary of what the spec says about it would be welcome.

include/llvm/MC/MCSymbolWasm.h
27 ↗(On Diff #172557)

Ah sorry, I thought it was an enum. Still, having a bool just to indicate if another variable is set is not great, I guess there is no "default" value for WasmEventType ?

aheejin updated this revision to Diff 173444.Nov 9 2018, 2:18 PM
aheejin marked an inline comment as done.
  • Add a comment to 'WasmEventType::Attribute' field
aheejin added inline comments.Nov 9 2018, 2:19 PM
include/llvm/MC/MCSymbolWasm.h
27 ↗(On Diff #172557)

Yeah it's not great, but I'm not sure how we can work around it. GlobalType and GlobalTypeSet are the same case. If we make it a pointer then we can tell it's not set if it's a null pointer, but then it also creates another ownership problem because MCSymbols are not deleted by MCContext, which was described in D52580.

aardappel accepted this revision.Nov 9 2018, 2:23 PM
This revision is now accepted and ready to land.Nov 9 2018, 2:23 PM
aheejin updated this revision to Diff 173446.Nov 9 2018, 2:26 PM
  • Add missing 'only' to comment
aheejin updated this revision to Diff 173624.Nov 12 2018, 1:00 AM
  • Use llvm::Optional to represent may-or-maynot present values
aheejin added inline comments.Nov 12 2018, 1:03 AM
include/llvm/MC/MCSymbolWasm.h
74 ↗(On Diff #173624)

@aardappel I changed the types of GlobalType and EventType to llvm::Optional and now we can remove those GlobalTypeSet and EventTypeSet. Thanks for the suggestion.

aardappel added inline comments.Nov 12 2018, 8:47 AM
include/llvm/MC/MCSymbolWasm.h
74 ↗(On Diff #173624)

awesome.. that's a solution I hadn't thought of.

sbc100 accepted this revision.Nov 12 2018, 9:52 AM
sbc100 added inline comments.
include/llvm/MC/MCSymbolWasm.h
74 ↗(On Diff #173624)

lgtm, as long as making these Optional doesn't require the destructor being called. IIRC we use of have a non-POD type in MCWasmSymbol before and we failed with msan because of that. Do Optional POD types like this allocate? I assume not.

tools/yaml2obj/yaml2wasm.cpp
482 ↗(On Diff #173624)

I feel like the section ordering constraint should still be enforces somewhere, presumably when parsing the yaml into the in-memory sections. But we also need to enforce it when parsing the binaries in WasmObjectFile and I'm OK with that being done in a later change.

aheejin added inline comments.Nov 12 2018, 1:57 PM
include/llvm/MC/MCSymbolWasm.h
74 ↗(On Diff #173624)

That's my understanding. llvm::Optional, like std::optional, uses placement new and does not allocate memory in heap.

tools/yaml2obj/yaml2wasm.cpp
482 ↗(On Diff #173624)

How can we enforce it when reading file? The event section appears between global section and export section in the section order now but it has section code 0xC. In both binary and yaml files, the event section occurs before the export section, so the assumption of monotonically increasing section ordering is not valid anymore.

sbc100 added inline comments.Nov 12 2018, 5:00 PM
tools/yaml2obj/yaml2wasm.cpp
482 ↗(On Diff #173624)

Exactly, but we will need to find way to validate the order none the less. We want to be able to reject invalid files after all. I'm not saying that should be part of this CL though.

aheejin updated this revision to Diff 173816.Nov 13 2018, 12:16 AM
  • Change the order of structs
sbc100 added inline comments.Nov 13 2018, 8:34 AM
include/llvm/BinaryFormat/Wasm.h
80 ↗(On Diff #173816)

Should this be called Kind or Type? Or is it more of a bitfield?

If its Type then perhaps that SigIndex should be in union:

e.g.

struct WasmEventType {
  uint32_t Kind
  union {
    ExceptionSigIndex;
  };
};

still lgtm in any case as this stuff can be iterated on.

dschuff accepted this revision.Nov 13 2018, 10:24 AM

LGTM too. I realized I don't really like (or perhaps I just don't understand) the term "Attribute" where it's used in the spec, but it's fine for this code to match that.

aheejin edited the summary of this revision. (Show Details)Nov 13 2018, 6:48 PM
This revision was automatically updated to reflect the committed changes.
aheejin marked an inline comment as done.Nov 25 2018, 2:58 AM
aheejin added inline comments.
include/llvm/BinaryFormat/Wasm.h
80 ↗(On Diff #173816)

Oh I missed this comment when landing this. For Attribute, I tried to follow the spec. If we change this to a bitfield, would the memory savings be big enough to be meaningful?

And why should SigIndex be within a union? By the way, I think SigIndex is better be outside of WasmEventType anyway, so I opened D54873.