This is an archive of the discontinued LLVM Phabricator instance.

IR/AArch64/X86: add "swifttailcc" calling convention.
ClosedPublic

Authored by t.p.northover on Jan 26 2021, 6:51 AM.

Details

Reviewers
paquette
Summary

Swift's new concurrency features are going to require guaranteed tail calls so that they don't consume excessive amounts of stack space. This would normally mean tailcc, but there are also Swift-specific ABI desires that don't naturally go along with "tailcc" so this adds another calling convention that's the combination of swiftcc and tailcc.

Since the changes to support tailcc in AArch64 were virtually identical to supporting swifttailcc I decided it would be silly not to do that at the same time. I can split them up again for commit if people would prefer.

Diff Detail

Event Timeline

t.p.northover created this revision.Jan 26 2021, 6:51 AM
t.p.northover requested review of this revision.Jan 26 2021, 6:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2021, 6:51 AM

Could you add a short entry on swifttailcc to LangRef.rst?

Of course, should have spotted that one.

llvm/docs/CodeGenerator.rst
2066–2077

I think this section needs to be updated to also mention swifttailcc.

t.p.northover marked an inline comment as done.
  • Updated documentation as requested.
  • Changed CallSeq_start AArch64 operand so we don't reserve the stack space twice.

Spotted a couple of issues over the last week:

  • We sometimes restored SP to the very beginning and then subtracted it back again to "reallocate" argument storage on AArch64. If there's no red-zone this leaves arguments vulnerable to an interrupt.
  • Added new keyword to Emacs & Vim highlighters

In this update:

  • Switched swiftasync and swiftself parameters to caller-save because otherwise they'd be clobbered by the epilogue in a tail call.

Added support for i386 (for the watch simulator). Mostly came for free but some slight issues around swiftself and CSRs.

jroelofs added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
196–203

measured in bytes?

321

@compnerd might know?

1754

What if the pop size is larger than the immediate field in the ldp is able to encode?

llvm/test/CodeGen/ARM/swifttailcc.ll
3 ↗(On Diff #335488)

Is the swifttail attribute implicit here?

Seems like there should be tighter CHECKs for what is expected around the vadd.f32.

paquette added inline comments.
llvm/docs/CodeGenerator.rst
2071–2072

Is tailcc supported on AArch64 yet? If not, this is somewhat misleading.

paquette added inline comments.May 11 2021, 10:04 AM
llvm/docs/CodeGenerator.rst
2071–2072

oh, duh, this patch adds that

paquette added inline comments.May 11 2021, 10:13 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
196–203

Should be a doxygen comment?

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
57

This should be a Doxygen comment?

llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
1018

Why did this change?

t.p.northover marked 2 inline comments as done.May 12 2021, 4:53 AM

Thanks for the comments, I'll upload a new patch.

llvm/docs/CodeGenerator.rst
2071–2072

Yes. I debated whether it was closely related enough to include, but it just seemed silly to be updating exactly the places that would need to know about tailcc for it to work and not do it at the same time.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
196–203

Done because it's harmless either way, but does doxygen care about static functions?

321

Possibly, but that's a whole other topic. Just making sure we fail noisily.

1754

The function is probably better named tryConvert..., if it can't fold it'll insert a normal sub sp, sp, #imm.

But you're right I think there is a risk of overflow with the arg area possibly being large, and there's not actually an overflow check in that function. I'll add one.

llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
1018

Minor bugfix. The immediate tells stack allocation how much space it should reserve for this call's stack-based arguments, but because it's a tail call its stack space is actually allocated independently at the other end of the frame (before even the callee-saved registers get pushed).

The practical effect of this is that with NumBytes the function was allocating space in both places, using more stack than is necessary.

llvm/test/CodeGen/ARM/swifttailcc.ll
3 ↗(On Diff #335488)

I have no idea what this file is here for. It should probably be part of the ARM implementation if anything, but even there it seems useless. I'll remove it from this review.

paquette added inline comments.May 12 2021, 9:24 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
196–203
321

I feel like this should actually be something heavier than an assert? If someone builds without asserts and tries to use this on Windows, it'd be unfortunate if they ended up with something totally broken.

2387

The assert in the original code checks that ByteOffset % 16 == 0, why did that change here?

llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
1018

Ah, I see. Makes sense.

t.p.northover marked an inline comment as done.May 14 2021, 3:10 AM
t.p.northover added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2387

If you're asking why the direction of the test changed, it's because the assert was after ByteOffset had been incremented by +/-8 in this block.

The reason the check's needed at all now is that there are multiple potential sources of misalignment (odd # of CSRs, and the new context store), and they might or might not combine to realign the stack again.

This revision is now accepted and ready to land.May 14 2021, 9:30 AM
t.p.northover closed this revision.May 17 2021, 2:48 AM

Thanks. Committed as 82a0e808bb2c.