This is an archive of the discontinued LLVM Phabricator instance.

IR+AArch64: add `swiftasync` function parameter attribute
ClosedPublic

Authored by t.p.northover on Jan 20 2021, 6:22 AM.

Details

Reviewers
thegameg
jroelofs
Summary

Swift is getting some new asynchronous execution abilities which means functions will want to pass special execution contexts along in a way that is preserved by unaware functions much like swiftself. So this adds a parallel attribute to support that, which will map to a callee saved-register. That accounts for the fairly straightforward IR changes here.

Also, when this kind of argument is present, much of the control flow will be via guaranteed tail-calls which since they destroy the stack frame don't preserve backtrace information well. So when one of these contexts is present it will extend the usual (on AArch64) [FP, LR] frame record by putting the context just before it: [Ctx, FP, LR] (before rather than afterwards so that the same layout can be used on x86 where callq pushes the return address).

Finally, so that system tools and other things capturing stack traces can detect this kind of frame, the in-memory FP gets bit 60 set to 1.

On arm64e, this context will be signed in an address-discriminated way.

Diff Detail

Event Timeline

t.p.northover created this revision.Jan 20 2021, 6:22 AM
t.p.northover requested review of this revision.Jan 20 2021, 6:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2021, 6:22 AM

Just a few quick, knee-jerk-style, questions. My apologies if these are too simplistic or vague.

  • Do you happen to have a pointer to some more documentation about this new Swift feature?
  • I wonder if any features from other languages could also map to this IR/backend implementation? E.g. I haven't looked at how C++ co-routines are implemented, but would they be able to reuse this feature (assuming they would need a somewhat similar feature in mid- and back-end).
  • As this seems to be introducing new ABI - do you have an idea of what the most appropriate ABI specification is this would be best documented in? Do I guess correctly that this is not a Darwin-only feature? If this is really swift-specific (i.e. it does not seem to be a good trade-off to have this mid/back-end feature support similar language features for multiple front-ends), I guess this is Swift-specific ABI. Do I guess correctly that there isn't really a specification for Swift-specific ABI - it's purely defined by its primary implementation?
  • I just had one very detailed nit-pick comment on the documentation of frame layout - see specific comment.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
49–54

Would it be useful to update the documentation here on the frame record for AArch64 to cover "So when one of these contexts is present it will extend the usual (on AArch64) [FP, LR] frame record by putting the context just before it: [Ctx, FP, LR]"?

Thanks for the questions. I'm quite fuzzy on the Swift side of this I'm afraid, so take my replies with a pinch of salt.

Do you happen to have a pointer to some more documentation about this new Swift feature?

Most of the ABI-specific things are unfortunately on internal documents. The Swift proposals exist and I'll put links here, but kind of understandably they focus on the implications for Swift itself. But with that in mind, there's:

https://github.com/DougGregor/swift-evolution/blob/async-await/proposals/nnnn-async-await.md
https://github.com/DougGregor/swift-evolution/blob/structured-concurrency/proposals/nnnn-structured-concurrency.md
https://github.com/DougGregor/swift-evolution/blob/actors/proposals/nnnn-actors.md

I think that third one about Actors is probably most significant here, as far as I understand the idea is that the context will store the current Actor, the current Task, and an asynchronous backtrace structure. I'm afraid I don't yet know what form that will take, even from our own pages.

I wonder if any features from other languages could also map to this IR/backend implementation? E.g. I haven't looked at how C++ co-routines are implemented, but would they be able to reuse this feature (assuming they would need a somewhat similar feature in mid- and back-end).

I think it's generic enough that other languages could use it. The proposed context prefix appears to be:

struct SwiftAsyncContext {
  struct SwiftAsyncContext *callerFrame;
  void *callerReturnAddress;
};

File off the name Swift and I think any front-end could construct such a structure. Unfortunately, I think it would be an ABI break for C++ at this stage; an old caller would pass rubbish into a function trying to store this context and that's not good.

As this seems to be introducing new ABI - do you have an idea of what the most appropriate ABI specification is this would be best documented in?

I think we should document the reusable bits, so probably up to that struct above and state that what follows those two pointers is up to the language.

Do I guess correctly that this is not a Darwin-only feature?

That's right. I think it's mostly a quality-of-life feature (so it's not critical for me to get this done on AArch32 right now, for example), but Swift runs on Linux & Windows too, and I'd hope they can use it (even if the default system tools
don't catch up for a while, or ever). I haven't done extensive testing on them yet though.

If this is really swift-specific (i.e. it does not seem to be a good trade-off to have this mid/back-end feature support similar language features for multiple front-ends), I guess this is Swift-specific ABI. Do I guess correctly that there isn't really a specification for Swift-specific ABI - it's purely defined by its primary implementation?

I think that's right.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
49–54

Yep, I'll do that.

  1. Applied Kristof's FrameLowering suggestion.
  2. Added IR-level tests for the attribute I'd forgotten about.
varungandhi-apple added inline comments.
llvm/include/llvm/Bitcode/LLVMBitCodes.h
659

This is going to hit a merge conflict or a miscompile; on main, 73 is already taken by ATTR_KIND_NO_PROFILE.

t.p.northover added inline comments.Jan 28 2021, 1:20 AM
llvm/include/llvm/Bitcode/LLVMBitCodes.h
659

Yep, for this kind of thing the last step is always going to be rebasing and refreshing magic constants. If I've done my job right, the regression tests should be enough to flag it even if I forget.

Just a rebase really (minor comment & doc changes). Also ping.

jroelofs added inline comments.
llvm/docs/LangRef.rst
11863

I think the ^s need to be the same length as the text above it, otherwise we'll get sphinx warnings.

11882

by "the function" do you mean the caller?

llvm/include/llvm/CodeGen/TargetFrameLowering.h
165

Are callers expected to know whether the implementation of this API sets {Min,Max}CSFrameIndex? If there's a way to derive them, or set them to some default value, I think that would make this easier to hold.

Maybe it's better to implement the other one in terms of this one, to force them to be correctly populated even if they're not used?

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
705

The immediate probably needs an explanation. IIUC this is the discriminator?

710

If it's supposed to be zeroing X17, wouldn't MOVZ be better?

The comment above suggests it should be doing a mov x17, x22 though, and neither behavior matches what the code is doing here.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1089

do you mean #0x1000_1000_0000_0000?

1705

I'm having trouble making sense of this immediate. Comment above says it clears the MSB, but this looks like it does more than that.

2796

sink the declaration of FrameIdx to its definition.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
138

This is for that pacdb sequence right? Might be worth a comment here showing what the 5 instructions are.

thegameg accepted this revision.May 11 2021, 10:23 AM
thegameg added a subscriber: thegameg.

Flags/attributes/calling conventions look very straightforward.

FrameLowering is always getting harder and harder to review but I think your approach with the StoreSwiftAsyncContext makes it much easier! Thanks!

This LGTM (with the rest of the comments addressed).

llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
641

I'm having trouble seeing what this is checking for. I understand we want to fallback to DWARF like you explained in llvm/test/CodeGen/AArch64/swift-async-unwind.ll, but I'm not sure I get the way CurOffset is used to check for that.

This revision is now accepted and ready to land.May 11 2021, 10:23 AM
t.p.northover marked 4 inline comments as done.May 12 2021, 1:40 AM

Thanks for the comments. I've implemented a lot of them and explained the reasoning behind what I did for some. I'll upload a new patch.

llvm/include/llvm/CodeGen/TargetFrameLowering.h
165

There's just one caller of this function, in PrologEpilogInserter.cpp, and it calls this new version. Targets that want to set the new variables will override this one, targets that don't will override the other one.

If the target doesn't set them, PrologEpilogInserter supplies defaults (essentially saying no registers count as callee-saved) and I believe that changes some details like spill order.

I'm not sure if there's a point to that, just explaining some of the background really.

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
710

On AArch64, mov xD, xS is actually just an alias for the instruction fully described as orr xD, xzr, xS, lsl #0 which is this what we're doing here (mov x17, CtxReg). I think that matches the comment.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1089

No, it's just that one bit. The encoding for AArch64 bitwise immediates is weird (it allows weird repeating patterns etc).

1705

Same as above, weird encoding for immediates. Hence the comment, really.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
138

This switch is notorious for getting out of date when expansion changes. But the biggest problem isn't people not knowing what a pseudo might expand to; it's forgetting the switch exists at all.

A comment here is just one more thing to go stale rather than a useful thing you might act on (IMO).

llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
641

Until now, this generator assumed that the .cfi directives it saw always started at an offset just after where FP was stored and counted up by 8 each time.

With the gap for the new context (and possibly padding) this is no longer true so we have to check that assumption and bail to DWARF if it's violated. This seemed like the least fragile way to do that, even though the specific hole we're looking for could be checked more easily.

t.p.northover added inline comments.May 12 2021, 1:45 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2796

Promise I have done this, I just forgot to save the file before committing when I uploaded the new diff.

jroelofs accepted this revision.May 12 2021, 8:48 AM

LGTM

llvm/include/llvm/CodeGen/TargetFrameLowering.h
165

ok, sounds good.

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
710

ack

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1089

ack

1705

ack

2796

ack

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
138

ack

t.p.northover closed this revision.May 14 2021, 3:44 AM

Thanks Jon. Committed as ea0eec69f16e