This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Support `annotate` clang attributes for marking functions.
ClosedPublic

Authored by brendandahl on May 17 2023, 10:45 AM.

Details

Summary

Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.

A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.

Diff Detail

Event Timeline

brendandahl created this revision.May 17 2023, 10:45 AM
Herald added a project: Restricted Project. · View Herald Transcript
brendandahl requested review of this revision.May 17 2023, 10:45 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 17 2023, 10:45 AM

This change looks really nice. I like the new relocation type, I think we would have had to add that sooner or later anyway.

My only major concern is making this attribute available outside of emscripten (i.e. wasm32-unknown-emscripten). It seems like we should maybe called it em_async or something like that? And make it illegal on other targets?

@dschuff WDYT?

clang/include/clang/Basic/Attr.td
1984

Should we call this em_async or emscripten_async since this is an emscripten-specific attribute?

llvm/lib/MC/MCExpr.cpp
364

Maybe this on single line to match the local convention.

666

Was this whole block indented? Maybe limit this to just a single line change?

Nice! Just some drive-by nitpicking, sorry 😅

lld/test/wasm/async.ll
11 ↗(On Diff #523110)
21–22 ↗(On Diff #523110)
lld/test/wasm/merge-async-section.ll
27
39–40
llvm/test/CodeGen/WebAssembly/async.ll
10
14–15
llvm/test/MC/WebAssembly/async.s
11–12

The intention looks little weird.. Is that what llc emits?

brendandahl marked 5 inline comments as done.

Review comments.

clang/include/clang/Basic/Attr.td
1984

I wouldn't say this is emscripten specific. You could use this with clang+binaryen without emscripten.

llvm/lib/MC/MCExpr.cpp
666

This was changed by git-clang-format HEAD~1 . I can revert though and only change that line.

llvm/test/MC/WebAssembly/async.s
11–12

Yeah, I wasn't really sure how to format, but that's what llc was doing. Open to changing it.

sbc100 added inline comments.May 24 2023, 3:19 PM
clang/include/clang/Basic/Attr.td
1984

Maybe we can at least say that its web-specific? Would it make any sense for non-web platforms?

If you are using clang + binaryen + web its hard to imagine a non-emscripten triple being used.. but maybe?

lld/test/wasm/async.ll
18 ↗(On Diff #525348)

For lld tests I've been trying for a while now to move away from .ll and just use .s format.

The only reason we have any .ll files at all in this directory is because for a long time we didn't have a working .s format.

llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
561

Perhaps call this AsyncFuncs or AsyncFuncNames?

567

No need for curlys around one-liners in llvm

llvm/test/MC/WebAssembly/async.s
21

Indentation look wrong here.. Index, Offset and Name should align with Type I think?

This is missing all the usual Sema tests that the attribute only applies to functions, takes no arguments, only works in web assembly targets, etc. Also, the changes should come with a release note so users know about the new attribute.

clang/include/clang/Basic/AttrDocs.td
5608–5612

This could be my ignorance of web assembly showing, but the docs don't really help me understand when you'd want to use this attribute. Perhaps a link to what JSPI is and a code example would help a little bit? Or is this more of a low-level implementation detail kind of attribute where folks already know the domain?

clang/lib/CodeGen/TargetInfo.cpp
898
clang/lib/Sema/SemaDeclAttr.cpp
7637–7640

This should be handled automatically for you by the common attribute handling code, so I think you can remove this.

7645

This diagnostic doesn't make sense to me -- how does this attribute relate to ifuncs or aliases? Why should users be prohibited from writing the attribute on a definition?

brendandahl marked 12 inline comments as done.May 31 2023, 3:40 PM
brendandahl added inline comments.
llvm/test/MC/WebAssembly/async.s
11–12

Actually, it does look like something got messed up. I'll fix.

brendandahl added inline comments.May 31 2023, 3:40 PM
clang/lib/Sema/SemaDeclAttr.cpp
7645

This was from a previous version, it's not needed anymore. I'll remove.

brendandahl marked an inline comment as done.

Address comments.

Couple of CFE comments, otherwise LGTM.

clang/docs/ReleaseNotes.rst
559

We require underscores to spell it in code, so I figure we should do it here too.

clang/lib/Sema/SemaDeclAttr.cpp
7638

We should probably document that this forces the function to be emitted as well.

brendandahl added inline comments.Jun 1 2023, 2:59 PM
clang/lib/CodeGen/TargetInfo.cpp
892

Whoops, I changed the wrong one here. I'll fix that.

This change looks really nice. I like the new relocation type, I think we would have had to add that sooner or later anyway.

My only major concern is making this attribute available outside of emscripten (i.e. wasm32-unknown-emscripten). It seems like we should maybe called it em_async or something like that? And make it illegal on other targets?

@dschuff WDYT?

Hm, this is interesting because in the long term we plan to have stack switching in wasm, which could allow for similar async behavior that JSPI has, and could be useful in non-web systems. But that's a ways off. The file format we are generating with this CL will be used in emscripten sooner (and we may want to try to stabilize it some point, possibly before pure wasm stack switching is usable in non-web systems).
So overall I kind of feel like I could go either way on this. Curious if @sunfish has had any thoughts about async outside of emscripten.

Hm, this is interesting because in the long term we plan to have stack switching in wasm, which could allow for similar async behavior that JSPI has, and could be useful in non-web systems. But that's a ways off. The file format we are generating with this CL will be used in emscripten sooner (and we may want to try to stabilize it some point, possibly before pure wasm stack switching is usable in non-web systems).
So overall I kind of feel like I could go either way on this. Curious if @sunfish has had any thoughts about async outside of emscripten.

When we do have stack switching, would you anticipate still using this wasm_async attribute, or would we switch to something else at that point?

Review comments. Add tombstone for unused functions.

brendandahl marked 2 inline comments as done.Jun 6 2023, 1:40 PM
brendandahl added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
7638

I've removed this since we won't actually always want these functions preserved. I've changed the linker to now put a tombstone in the async section instead. I also added a new test for this.

brendandahl marked an inline comment as done.Jun 6 2023, 1:45 PM

Hm, this is interesting because in the long term we plan to have stack switching in wasm, which could allow for similar async behavior that JSPI has, and could be useful in non-web systems. But that's a ways off. The file format we are generating with this CL will be used in emscripten sooner (and we may want to try to stabilize it some point, possibly before pure wasm stack switching is usable in non-web systems).
So overall I kind of feel like I could go either way on this. Curious if @sunfish has had any thoughts about async outside of emscripten.

When we do have stack switching, would you anticipate still using this wasm_async attribute, or would we switch to something else at that point?

It's unclear to me at the moment how we're going to implement full stack switching with emscripten/llvm. I'm not attached to the wasm_async name, so we could be more specific and use wasm_jspi or something else?

wasm_jspi works for me.

sbc100 added a comment.Jun 6 2023, 3:03 PM

wasm_jspi works for me.

The problem with that is that we will also likely want to use this attribute for asyncify (which is the current async approach we have today) as well as JSPI (which is hopefully coming soon)

wasm_jspi works for me.

The problem with that is that we will also likely want to use this attribute for asyncify (which is the current async approach we have today) as well as JSPI (which is hopefully coming soon)

Ah, I see. Would it be accurate to say that asyncify is acting as a kind of a polyfill for JSPI in this situation, such that wasm_jspi is actually correct?

Alternatively, what would you think about wasm_external_async?

sbc100 added a comment.Jun 6 2023, 5:10 PM

wasm_jspi works for me.

The problem with that is that we will also likely want to use this attribute for asyncify (which is the current async approach we have today) as well as JSPI (which is hopefully coming soon)

Ah, I see. Would it be accurate to say that asyncify is acting as a kind of a polyfill for JSPI in this situation, such that wasm_jspi is actually correct?

Sure, that seems reasonable to see that way. I'm ok with that name if others are.

Alternatively, what would you think about wasm_external_async?

Aside from the suggested changes, the Clang bits LGTM

clang/lib/Sema/SemaDeclAttr.cpp
7636–7639

This can be removed entirely (see edit below).

8748

After some feedback, a few people have indicated they'd like to do something similar to this, but for their own attributes. I've changed to a more generic attribute that allows arbitrary strings. Now a custom section will be created for each unique attribute.

brendandahl retitled this revision from [WebAssembly] Add a new `wasm_async` clang attribute for marking async functions. to Add a new `wasm_custom` clang attribute for marking functions..Jun 13 2023, 11:35 AM
brendandahl edited the summary of this revision. (Show Details)

Fix few remaining issues.

backend code looks pretty good. I'll defer to the clang experts for the clang part.

lld/test/wasm/custom-undefine.s
17 ↗(On Diff #531014)

I don't fully understand how this test is different from custom-attr.s. Bar is still defined, so the only thing I can see that's different is that it's not called. Is it relying on the linker GCing bar so that it's not defined? is the result different when bar is really undefined?

lld/test/wasm/merge-custom-attr-section.ll
1 ↗(On Diff #531014)

wow I didn't know this utility existed 🤯

7 ↗(On Diff #531014)

oh also: please add the new relocation type to the document at https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md

sbc100 accepted this revision.Jun 13 2023, 3:29 PM

Nice!

lld/test/wasm/merge-custom-attr-section.ll
47 ↗(On Diff #531014)

Is there some reason we can't use assembly for this test?

lld/wasm/InputChunks.cpp
526

Can we switch the logic around so that the default return 0; is that last line of the function. Then your new use case could be added right before that line.

e.g.

if (name.startswith(".debug_"))
   return UINT64_C(-1)
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
568

Are these two lines needed? i.e. does the subscript operator automatically create an empty element when its first used?

This revision is now accepted and ready to land.Jun 13 2023, 3:29 PM

Would it make sense to call this "jspi" in practice, rather than "async"? Even though this isn't clang's decision per se anymore, it seems like the reasoning earlier in this review would still apply.

Would it make sense to call this "jspi" in practice, rather than "async"? Even though this isn't clang's decision per se anymore, it seems like the reasoning earlier in this review would still apply.

Did you mean to comment on the old PR? This new PR doesn't propose either "jspi" or "async", but adds that ability to define custom attributes.. since that was deemed more flexible and forward thinking. The decision as to which name emscripten will use for what can then be part of different discussion/PR we hope.

brendandahl marked 4 inline comments as done.Jun 14 2023, 9:39 AM
brendandahl added inline comments.
lld/test/wasm/custom-undefine.s
17 ↗(On Diff #531014)

Undefined is a bad name. I'll change to unreferenced or something. This tests when the function isn't referenced a tombstone will be placed in the custom section (e.g. the payload below has a FFFFFFFF).

lld/test/wasm/merge-custom-attr-section.ll
47 ↗(On Diff #531014)

Nope, I rewrote the other two, but missed this one. I'll fix.

brendandahl marked an inline comment as done.

Review comments.

Did you mean to comment on the old PR? This new PR doesn't propose either "jspi" or "async", but adds that ability to define custom attributes.. since that was deemed more flexible and forward thinking. The decision as to which name emscripten will use for what can then be part of different discussion/PR we hope.

Ah, I think I somehow was looking at the old PR, and thought there was a test with custom_section.func_attr.async in it.

This patch looks ok, though my initial impression is that if we're going to have a fully general-purpose mechanism like this, we might want to plan ahead for attributes that have parameters. We don't need to implement it now though; if there's a reasonable way to evolve in that direction, then this LGTM.

@aaron.ballman or @erichkeane Did you want to re-review after that latest changes (more generic attribute) or are things good to go?

aaron.ballman added inline comments.Jun 21 2023, 5:10 AM
clang/include/clang/Basic/AttrDocs.td
5608–5612

Based on the documentation here, I'm wondering why the annotate attribute doesn't suffice? That attribute lets you specify custom information to associate with a declaration that then gets lowered such that passes can do whatever they want with the info, which seems to be a more generalized version of what this attribute is.

(FWIW, I'm back to having basically no idea when you'd use this attribute or what it would be used for, so my thoughts above might make no sense.)

brendandahl added inline comments.Jun 21 2023, 3:39 PM
clang/include/clang/Basic/AttrDocs.td
5608–5612

I was considering that, but it would require more machinery in the wasm backend to store all the attribute values in the output. For now we only really need a flag associated with function. I think if we find more uses for the annotations in the future we could replace wasm_custom with it.

aaron.ballman added inline comments.Jun 22 2023, 5:23 AM
clang/include/clang/Basic/AttrDocs.td
5608–5612

I was considering that, but it would require more machinery in the wasm backend to store all the attribute values in the output. For now we only really need a flag associated with function. I think if we find more uses for the annotations in the future we could replace wasm_custom with it.

More machinery in the backend is preferred to exposing a new attribute that is this general-purpose; the backend is what needs this functionality, the frontend basically does nothing with it. (I'm assuming this is an implementation detail attribute and not something you expect users to write. If I'm wrong about that, please let me know.)

aaron.ballman requested changes to this revision.Jun 22 2023, 5:24 AM

Marking as requested changes so it's clear there's more worth discussing, so we don't accidentally land this.

This revision now requires changes to proceed.Jun 22 2023, 5:24 AM
sbc100 added inline comments.Jun 22 2023, 8:17 AM
clang/include/clang/Basic/AttrDocs.td
5608–5612

Based on the documentation here, I'm wondering why the annotate attribute doesn't suffice?

I believe that was the original solution that was considered but Benden was told this was perhaps not an appropriate use of "annotate". Brenden can you elaborate on why? IIRC it was something like "the semantics of the program should not depend on annotate attributes but the attribute being added in this case is required for the program to run correctly".. is that about right?

FWIW, I think using the existing annotate attribute would make a lot of sense... if we can get away with it.

5608–5612

(I'm assuming this is an implementation detail attribute and not something you expect users to write. If I'm wrong about that, please let me know.)

IIUC user would indeed be specifying these attributes. Kind of like they do for "visibility" today. The initial attribute that motivated this change is "async" which tells the host runtime that a certain function import or export behaves in an async manor (See https://v8.dev/blog/jspi for more details).

sbc100 added inline comments.Jun 22 2023, 8:23 AM
clang/include/clang/Basic/AttrDocs.td
5608–5612

I was considering that, but it would require more machinery in the wasm backend to store all the attribute values in the output. For now we only really need a flag associated with function. I think if we find more uses for the annotations in the future we could replace wasm_custom with it.

More machinery in the backend is preferred to exposing a new attribute that is this general-purpose; the backend is what needs this functionality, the frontend basically does nothing with it. (I'm assuming this is an implementation detail attribute and not something you expect users to write. If I'm wrong about that, please let me know.)

I think perhaps there are two alternative implementations being proposed here, and I want to make sure we don't confuse them:

  1. Use the existing annotate attribute.
  2. Make an even more complex custom attribute that hold key=value pairs rather than the current CL which proposes simply boolean custom attributes (e.g. key=true).

I think we would be happy with (1) if this usage is within scope.

IIUC the thing that would take a lot more work in the backend (and more complex custom section design in the binary format) is (2). I don't think we need (2) today and we should probably wait until we have a use case before designed a more complex version of this attribute.

brendandahl added inline comments.Jun 22 2023, 9:50 AM
clang/include/clang/Basic/AttrDocs.td
5608–5612

(1) could also be more complex since annotate takes a name and optional additional arguments e.g. __attribute__((annotate("x", "arg1", "arg2"))). I suppose we could just handle the case where there's a name for now and effectively do what this patch is currently doing.

5608–5612

It's been awhile, but I think I originally didn't go with annotate since 1) we thought it shouldn't affect compilation (in practice that doesn't seem to be true anymore) 2) It was more complex (e.g. args mentioned above). I'll give using annotate with a single attribute a try.

Use the annotate attribute to generate custom sections.

Marking as requested changes so it's clear there's more worth discussing, so we don't accidentally land this.

I've switched to using annotate now. Let me know if there's anything else.

Marking as requested changes so it's clear there's more worth discussing, so we don't accidentally land this.

I've switched to using annotate now. Let me know if there's anything else.

I guess the CL title and description now need updating? Nice to see we no longer need any clang changes here!

brendandahl retitled this revision from Add a new `wasm_custom` clang attribute for marking functions. to [WebAssembly] Support `annotate` clang attributes for marking functions..Jun 27 2023, 3:32 PM
brendandahl edited the summary of this revision. (Show Details)
brendandahl added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
583–596

This currently supports arguments, but it's somewhat brittle since the arguments aren't escaped. I'm fine with dropping this and only supporting name, but I think this could be useful.

sbc100 added inline comments.Jun 27 2023, 3:53 PM
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
583–596

I'd suggest going for the simplified version.. and just ignore any annotations that have arguments. My rational here is that we shouldn't add stuff that we don't actually need yet, and also that the code annotation proposal might supersede this mechanism in the future.

llvm/test/MC/WebAssembly/func-attr.s
21 ↗(On Diff #535126)

Should these be llvm.func_attr.custom0 perhaps?

I also wonder if its makes sense to preserve all of these. I worry that it could result in production code that contains these extra sections. Can we figure out way to drop them by default on only preserve them in the cases we care about?

How about maybe a linker flag? Something like --preserve-attributes=async.. that emcc could inject?

Remove annotate arguments. Change name from func_attr to llvm.func_attr.

brendandahl marked an inline comment as not done.Jun 27 2023, 4:40 PM
brendandahl added inline comments.
llvm/test/MC/WebAssembly/func-attr.s
21 ↗(On Diff #535126)

I'm planning to have binaryen remove all of these by default, since they're only needed as hints for binaryen.

aaron.ballman resigned from this revision.Jun 28 2023, 5:58 AM

Marking as requested changes so it's clear there's more worth discussing, so we don't accidentally land this.

I've switched to using annotate now. Let me know if there's anything else.

I guess the CL title and description now need updating? Nice to see we no longer need any clang changes here!

FWIW, I like this approach! I'm going to resign as a reviewer so that my "requested changes" marking no longer matters, but I'll still be subscribed to the review if folks need me to weigh in on anything for some reason.

This revision is now accepted and ready to land.Jun 28 2023, 5:58 AM
sbc100 accepted this revision.Jul 10 2023, 1:08 PM

lgtm. Can you update the change description now that we decided to not support values?

(Ironically I think have a use for values in pipeline now)

Rebase on main.

This revision was landed with ongoing or failed builds.Jul 11 2023, 3:18 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
566

"StringMap iteration order, however, is not guaranteed to be deterministic, so any uses which require that should instead use a std::map." https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h

I'll fix this in one minute.