This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Make signature index not a part of EventType
AbandonedPublic

Authored by aheejin on Nov 25 2018, 12:38 AM.

Details

Reviewers
dschuff
sbc100
Summary

Previously EventType contained SigIndex as its member based on the
spec, but signature indieces can change after linking, which makes
EventType not a constant property of an event. This patch takes
SigIndex out of EventType and treat it similarly with function
types and makes SigIndex a property that any type can optionally
possess.

This is not a spec change; this only affects yaml tests and
binary file results do not change after this patch.

Companion to D54874,

Diff Detail

Event Timeline

aheejin created this revision.Nov 25 2018, 12:38 AM
sbc100 added inline comments.Nov 27 2018, 2:51 PM
include/llvm/BinaryFormat/Wasm.h
88

My initial reaction is that we should leave this as is.

The linker shouldn't be modifying the input EventTypes but instead creating a new set of output eventtypes based on the input ones. At least that how it makes sense to me. I will take a look at the lld side change and see if I can make some suggestions there.

aheejin marked an inline comment as done.Nov 27 2018, 3:32 PM
aheejin added inline comments.
include/llvm/BinaryFormat/Wasm.h
88

You mean we should create separate set of everything, like WasmEventType, WasmGlobalType, ... and copy all data from the old structures to the new structures in the linker to resolve this? Wouldn't that be too much duplication? We already have separate WasmEventType and WasmGlobalType, ... in WasmObjectWriter side (lib/MC). I would want to avoid adding another set of all these, just to resolve signature index. And if we only add WasmEventType for the output type in the linker, that would look inconsistent as well.

And the original reason SigIndex was a part of WasmEventType was just the spec has a table structured that way, which is not very important reason. (This CL does not change the binary spec itself)

And the other thing is, taking SigIndex out of a certain type (here WasmEventType) might be more extensible way, for when we have multi-value globals. In that case every type can have an optional SigIndex seems a easier way to handle things.

sbc100 added inline comments.Nov 27 2018, 4:26 PM
include/llvm/BinaryFormat/Wasm.h
88

I think so yes. If you look at the InputGlobal class in lld today it looks like already makes copy of the InputGlobal. I imagine the InputEvent class would have the WasmEvent in it. These object do not need to be the exact objects that get written to the output. It happens that for InputGlobal we do end up calling writeGlobal with that as the input, but for events you could do something like this:

e.g.

for (const InputEvent *E : InputEvent)  {
  WasmEvent NewEv = *E->Event;
  // Replace the input sig index with the output sig index.
  NewEv->Type->SigIndex = lookupType(E->Type>SigIndex);
   writeEvent(OS, NewEv); 
}
92

I also think it nice if you can get the signature from the WasmEvent stuct. Rather than separating the events from their signatures. Seems to fit better with the binary representation.

dschuff added inline comments.Nov 27 2018, 4:40 PM
include/llvm/BinaryFormat/Wasm.h
92

I don't know that I have an opinion about whether we have a duplication like in InputGlobal, but I do think that however we handle signatures for functions should be the same as what we do for events. Globals are different because they have a wasm value type instead of a signature, but @aheejin pointed out that with the multi-value proposal they will also have a signature in the same index space. So the way we handle the signatures should be uniform (which I guess is the point of this CL...)

I think part of the confusion here is that this file has some code that is intended to directly model the binary format (which is why WasmEventType had the SigIndex field in the first place) and some that's intended to be useful at a slightly higher level for MC and the linker. e.g. why we have WasmSignature* in some places and signature indices in others.

dschuff added inline comments.Nov 27 2018, 4:44 PM
include/llvm/BinaryFormat/Wasm.h
85

Now that this structure no longer directly reflects the binary format, it might make sense to just remove it since there's no need for a 1-element structure. I still think that FooKind -style nomenclature like that used elsewhere in LLVM is more readable (since this models "kinds" of events, one of which is exception). I actually think I'd like to rename attribute to kind in the spec itself, but that's obviously a separate thing.

aheejin marked an inline comment as done.Nov 27 2018, 6:12 PM
aheejin added inline comments.
include/llvm/BinaryFormat/Wasm.h
88

I'm not sure which option you are suggesting.

  1. Create a new set of WasmGlobal, WasmEvent and other similar structures in lld side
  2. Create only a new WasmEvent class in lld side (which I think is inconsistent)
  3. Use the current [[ https://github.com/llvm-mirror/llvm/blob/0833f2640ef8874438edb516785b58faae6acd38/include/llvm/BinaryFormat/Wasm.h#L91-L95 | wasm::WasmEvent ]] defined in include/BinaryFormat/Wasm.h, but just create a new instance of this

I thought you were suggesting either 1 or 2 and now you sound like 3.

And currently, the InputGlobal class and its constructor looks like this:

class InputGlobal {                                                              
public:                                                                          
  InputGlobal(const WasmGlobal &G, ObjFile *F)                                   
      : File(F), Global(G), Live(!Config->GcSections) {}
  ...                                                                                                                                                  
  WasmGlobal Global;                                                             
  ...                                    
};

By the way, is the member variable WasmGlobal Global not being a reference but a value, so that we get to copy WasmGlobal in the constructor, intentional? I'm wondering if that was intentionally by value to allow modification in cases like this.

sbc100 added inline comments.Nov 28 2018, 8:42 AM
include/llvm/BinaryFormat/Wasm.h
88

I think I'm advocating for 3, which I believe matches what we do with Globals and Imports, etc today.

The fact that the WasmGlobal is, I think, so that createSyntheticSymbols() can pass a stack allocated new global and have it copied.

In the case of Event I think its reasonable to use a const reference to the Event in the input object here. The idea is that this member represents the item in the input file, so it cannot be written as-is to the output, a new event based on the input event is what should be written. The input event won't have the correct indexes etc within it.

Remember that for Globals and Events we normally/currently have exactly 1 of each in the link. (stack_pointer and cpp_exception). So there is really no need to try to avoid copying them at this point.

dschuff added inline comments.Nov 28 2018, 10:59 AM
include/llvm/BinaryFormat/Wasm.h
88

I think in the future we may have more globals (even aside from the special known ones such as stack pointer, memory_base, table_base): there may be references once we have support for reference types, and there may be languages other than C that either use LLVM or want to produce LLVM-compatible object files. So I think it makes sense for the linker and object file code to have the capability of treating them in a generic way. I guess that doesn't necessarily mean we can't copy them (e.g. unlike functions, but like events, they will be small) but we should at least consider how we want to handle linking for them even if we don't use it in clang for now.

sbc100 added inline comments.Nov 28 2018, 11:09 AM
include/llvm/BinaryFormat/Wasm.h
88

I agree. Actually we don't copy them right now in order to gain mutability, but simply to handle being passed as stack allocated synthetic global. It would be easy enough to change the ownership model here. I don't think it relates to this change actually.

aheejin marked an inline comment as done.Nov 28 2018, 1:07 PM
aheejin added inline comments.
include/llvm/BinaryFormat/Wasm.h
88

@sbc100 In the initial code example you suggested:

for (const InputEvent *E : InputEvent)  {
  WasmEvent NewEv = *E->Event;
  // Replace the input sig index with the output sig index.
  NewEv->Type->SigIndex = lookupType(E->Type>SigIndex);
   writeEvent(OS, NewEv); 
}

InputEvent can't be const anymore because now we are modifying WasmEvent inside it. And also existing InputGlobal::getType returns const WasmGlobalType&, which InputEvent mimics in D54875. If we want to change something inside InputEvent after it is created, this method can't be const anymore. That's fine with me, but InputEvent::getType() returns WasmEventType& while InputGlobal::getType() returns const WasmGlobalType& seems inconsistent.

And the reason I asked why those member variables are by value and not reference is not really relevant to whether we should take out SigIndex out of a certain type (e.g. EventType). I agree they are separate things. I still think generalizing SigIndex so that every type can optionally have one is more extensible in face of multi-value globals.

sbc100 added inline comments.Nov 28 2018, 1:31 PM
include/llvm/BinaryFormat/Wasm.h
88

Sorry I don't understand. The above code should not be modifying the InputEvent. It modifies NewEv which is on the stack right? NewEv is the output event where InputEvent is the input event which we should not change. Maybe I made a mistake in my code?

aheejin marked an inline comment as done.Nov 28 2018, 1:46 PM
aheejin added inline comments.
include/llvm/BinaryFormat/Wasm.h
88

I think I was confused too. To sum up,

  1. In case we store WasmEvent member variable as a value like WasmGlobal:

The code should look like this

WasmEvent NewEv = E->Event;

In this case it is copying, so we can modify NewEv.
or this

WasmEvent *NewEv = &E->Event;

In this case, even if it compiles, we are semantically modifying something that's within a const InputEvent.

  1. In case we store WasmEvent member variable as a const reference, as you suggested above (which is inconsistent with WasmGlobal):

In this case we should copy it like this, otherwise it wouldn't compile:

WasmEvent NewEv = E->Event;

Anyway, do you think taking out SigIndex out of WasmEventType is still not a good idea?

aheejin added a comment.EditedNov 30 2018, 2:37 PM

Ping. So the decision points are

  1. Whether SigIndex should be in or out of WasmEventType?
    1. If SigIndex should be out of WasmEventType, should there even be WasmEventType anymore? Because then there's only Attribute left.
  2. It seems to be people don't like the term Attribute. Should we change it to Kind or something? We can change the spec text, because it's not set in stone or anything.

Ping. So the decision points are

  1. Whether SigIndex should be in or out of WasmEventType?
    1. If SigIndex should be out of WasmEventType, should there even be WasmEventType anymore? Because then there's only Attribute left.
  2. It seems to be people don't like the term Attribute. Should we change it to Kind or something? We can change the spec text, because it's not set in stone or anything.

Seems like we should keep the two member WasmEventType. We need this for specifying the events themselves, and also for importing events. It seems like this matches WasmGlobalType,

Attrbiute does seem like a strange name, but maybe there is reason that was chosen in spec currently?

aheejin added a comment.EditedNov 30 2018, 2:55 PM

Ping. So the decision points are

  1. Whether SigIndex should be in or out of WasmEventType?
    1. If SigIndex should be out of WasmEventType, should there even be WasmEventType anymore? Because then there's only Attribute left.
  2. It seems to be people don't like the term Attribute. Should we change it to Kind or something? We can change the spec text, because it's not set in stone or anything.

Seems like we should keep the two member WasmEventType. We need this for specifying the events themselves, and also for importing events. It seems like this matches WasmGlobalType,

I'm not really sure what this means. Do you mean we should keep SigIndex as a part of WasmEventType and drop this CL? And why do we need this for specifying events themselves and for importing events? I don't think I understand what 'this' is actually. And why would it match WasmGlobalType? And do you also suggest, when we get to add multi-value globals later, we should insert SigIndex as a part of WasmGlobalType too?

Attrbiute does seem like a strange name, but maybe there is reason that was chosen in spec currently?

I don't know. Karl chose it, and apparently I can't ask him the reason now... I'm fine with the name, but I heard some complaints about it. If it was not you maybe it was @dschuff. Not that I want to change that myself, but I was just suggesting that it's possible to change the spec text also because it's not fixed yet.

sbc100 added a comment.Dec 3 2018, 9:38 AM

Ping. So the decision points are

  1. Whether SigIndex should be in or out of WasmEventType?
    1. If SigIndex should be out of WasmEventType, should there even be WasmEventType anymore? Because then there's only Attribute left.
  2. It seems to be people don't like the term Attribute. Should we change it to Kind or something? We can change the spec text, because it's not set in stone or anything.

Seems like we should keep the two member WasmEventType. We need this for specifying the events themselves, and also for importing events. It seems like this matches WasmGlobalType,

I'm not really sure what this means. Do you mean we should keep SigIndex as a part of WasmEventType and drop this CL?

Yes

And why do we need this for specifying events themselves and for importing events?

I mean event imports need to specify their "type" and non-imported events also need to specify their "type". And in both cases the type needs to include the signature. So I think we should have a unified "eventType" that includes the signature index.

I don't think I understand what 'this' is actually. And why would it match WasmGlobalType? And do you also suggest, when we get to add multi-value globals later, we should insert SigIndex as a part of WasmGlobalType too?

Attrbiute does seem like a strange name, but maybe there is reason that was chosen in spec currently?

I don't know. Karl chose it, and apparently I can't ask him the reason now... I'm fine with the name, but I heard some complaints about it. If it was not you maybe it was @dschuff. Not that I want to change that myself, but I was just suggesting that it's possible to change the spec text also because it's not fixed yet.

aheejin added a comment.EditedDec 3 2018, 2:50 PM

And why do we need this for specifying events themselves and for importing events?

I mean event imports need to specify their "type" and non-imported events also need to specify their "type". And in both cases the type needs to include the signature. So I think we should have a unified "eventType" that includes the signature index.

I can drop this CL, but I still don't understand what you meant by this part. Currently imported exception's sigindex is handled fine. The below is obj2yaml's result if I don't define the __cpp_exception symbol:

- Type:            IMPORT                                                      
  Imports:                                                                     
    ...                                    
    - Module:          env                                                     
      Field:           __cpp_exception                                         
      Kind:            EVENT                                                   
      SigIndex:        1                                                       
      EventAttribute:  0

As you see in WasmImport definition in this CL, SigIndex is now out of union and every type can set it optionally, so there's no problem for events having sigindex there.

struct WasmImport {                                                              
  StringRef Module;                                                              
  StringRef Field;                                                               
  uint8_t Kind;                                                                  
  uint32_t SigIndex;                                                             
  union {                                                                        
    WasmGlobalType Global;                                                       
    WasmTable Table;                                                             
    WasmLimits Memory;                                                           
    WasmEventType Event;                                                         
  };                                                                             
};
sbc100 added a comment.Dec 3 2018, 2:52 PM

And why do we need this for specifying events themselves and for importing events?

I mean event imports need to specify their "type" and non-imported events also need to specify their "type". And in both cases the type needs to include the signature. So I think we should have a unified "eventType" that includes the signature index.

I can drop this CL, but I still don't understand what you meant by this part. Currently imported exception's sigindex is handled fine. The below is obj2yaml's result if I don't define the __cpp_exception symbol:

- Type:            IMPORT                                                      
  Imports:                                                                     
    ...                                    
    - Module:          env                                                     
      Field:           __cpp_exception                                         
      Kind:            EVENT                                                   
      SigIndex:        1                                                       
      EventAttribute:  0

As you see in WasmImport definition in this CL, SigIndex is now out of union and every type can set it optionally, so there's no problem for events having sigindex there.

struct WasmImport {                                                              
  StringRef Module;                                                              
  StringRef Field;                                                               
  uint8_t Kind;                                                                  
  uint32_t SigIndex;                                                             
  union {                                                                        
    WasmGlobalType Global;                                                       
    WasmTable Table;                                                             
    WasmLimits Memory;                                                           
    WasmEventType Event;                                                         
  };                                                                             
};

But what about non-imported events? They need signature too right?

Yes in that case it's represented like sigindex of functions.

- Type:            FUNCTION                                                    
  FunctionTypes:   [ 0, 0 ]                                                    
- Type:            EVENT                                                       
  EventTypes:      [ 1 ]                   // here                                        
  Events:                                                 
    - Index:           0                                                       
      Attribute:       0
sbc100 added a comment.Dec 3 2018, 3:02 PM

Yes in that case it's represented like sigindex of functions.

- Type:            FUNCTION                                                    
  FunctionTypes:   [ 0, 0 ]                                                    
- Type:            EVENT                                                       
  EventTypes:      [ 1 ]                   // here                                        
  Events:                                                 
    - Index:           0                                                       
      Attribute:       0

So I guess my point is that its cleaner to keep the signature as part of the type so it can be specified the same in imports as in non-imports.
And the original reason for this CL, which was to make the signature mutable for inputs to lld is no longer an issues, so I don't see any advantage to landing this CL now. But I could still be missing something?

aheejin abandoned this revision.Dec 3 2018, 3:33 PM

No, I don't feel strong about this either. I just wanted to say this CL does not lose any sigindex info both in import and non-import cases, because you sounded like otherwise.