This is an archive of the discontinued LLVM Phabricator instance.

[ARM64EC 10/?] Add support for lowering indirect calls.
Needs ReviewPublic

Authored by efriedma on Jun 1 2022, 12:43 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Part of initial Arm64EC patchset.

On Arm64EC targets, all indirect calls must first check whether the callee is ARM64 or x64 code. If the callee is ARM64 code, the pointer is directly called; if the callee is x64, it goes through a function to translate the calling convention from ARM64 to x64.

If CFG is enabled, this check is integrated with the CFG check: the call also checks whether the pointer points to a valid target.

The translation of calling conventions is done late; as far as I can tell, the way clang normally translates function types to LLVM IR leaves enough information to do this correctly. (If it turns out some case doesn't work, we should be able to bridge the gap using attributes.) I was originally thinking about generating thunks in clang, but there are complications: we'd need to staple the thunk to the call using a bundle, and calls generated by the optimizer wouldn't have access to whatever code clang used.

Diff Detail

Event Timeline

efriedma created this revision.Jun 1 2022, 12:43 PM
efriedma requested review of this revision.Jun 1 2022, 12:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 12:43 PM
dpaoliello added inline comments.
llvm/lib/Target/AArch64/Arm64ECCallLowering.cpp
1 ↗(On Diff #433498)

Most other files in this directory are prefixed with AArch64, should we do the same with this one?

72–73 ↗(On Diff #433498)

I didn't think that a FunctionPass was allowed to modify the list of functions - aren't we currently iterating through the list of function at this point (and so a realloc would invalidate the iterator)?

73 ↗(On Diff #433498)

There is a specific way that the names for these thunks are calculated - I'm working on getting this documented.

73 ↗(On Diff #433498)

Once we have the correct way to generate the names for thunks we may want to change the linkage to WeakAnyLinkage, since any thunk with the same name should be doing the same transformation to the parameters.

78 ↗(On Diff #433498)

I would think that we'd need to copy anything that affects the calling convention.

140 ↗(On Diff #433498)

Again, probably only things that affect the calling convention...

141–142 ↗(On Diff #433498)

The way that I was going to try to share thunks is by creating a map of FunctionType->thunk Function. This also solves the issue of modifying the list of functions in a module while iterating over it: you can create the Function while iterating WITHOUT adding it to the module, and then iterate the map afterwards and add each thunk Function to the module.

226 ↗(On Diff #433498)

My recommendation here is to change this into a ModulePass. You can then iterate over all blocks in all functions in the module and build a map of FunctionType->thunk, WITHOUT inserting those thunk Functions into the module (thus not invalidating the iterator).

Once you've completed this, iterate over the map and insert the thunks into the module.

I've made a "suggested edit" which is my work-in-progress for doing this, but please keep in mind that that 1) I haven't implemented the second step yet, 2) I'm not sure that FunctionType can be used as a key and 3) the way I'm dealing with CFG is by keeping the existing pass, and then fixing up the second argument in this pass to point to the thunk.

242–250 ↗(On Diff #433498)

Would isDeclaration work?

251–256 ↗(On Diff #433498)

This doesn't seem right: you need to rewrite the indirect calls AND generate thunks for any call to something that may be an x64 function.

Also, I'm pretty sure that getCalledFunction returns null for indirect calls.

llvm/test/CodeGen/AArch64/arm64ec-cfg.ll
26–35

Yeah, this isn't right: indirect calls should be transformed like this, but direct calls shouldn't be changed. It's the linker's responsibility to rewrite direct calls (it will lookup the thunk by name).

efriedma added inline comments.Jun 2 2022, 1:19 PM
llvm/lib/Target/AArch64/Arm64ECCallLowering.cpp
1 ↗(On Diff #433498)

What's the resulting name, though? AArch64Arm64ECCallLowering.cpp?

73 ↗(On Diff #433498)

I assume you mean LinkOnceODRLinkage?

As a first cut, I think we should just use internal linkage, and not worry about using compatible mangling.

78 ↗(On Diff #433498)

Right.

That said, I think for Windows x64/Arm64 ABI we don't normally use very many ABI attributes; I'm not sure we actually need to look at anything other than sret. Maybe inreg? Maybe also Swift ABI attributes, but I'm not going to worry about those for now.

I guess we could copy signext/zeroext/byval if we see them, somehow?

226 ↗(On Diff #433498)

We can't use the FunctionType as a key, no. At minimum, sret markings affect the calling convention.

Technically, we can add functions to the module while we're iterating over the function list, since it's a linked list, but it would probably be clearer to avoid that.

242–250 ↗(On Diff #433498)

I think we also need to account for weak definitions, but we could do something like that, sure.

Ideally, we could distinguish between functions in the current library, vs. functions which are in other libraries. For example, if the user is using -fvisibility. But I guess it's not a big deal if we emit extra thunks.

251–256 ↗(On Diff #433498)

Ideally, we do something like that, yes... but emitting the thunks is useless until we have some way to emit the hybmp tables.

llvm/test/CodeGen/AArch64/arm64ec-cfg.ll
26–35

I'm not sure what isn't right here, specifically?

In this testcase, all the calls are indirect. And we can't call __os_arm64x_check_icall directly, I think; __os_arm64x_check_icall is the address of the function, not the function itself.

(I don't think there's any rule that says we *can't* transform direct calls this way, but probably we shouldn't.)

dpaoliello added inline comments.Jun 2 2022, 1:45 PM
llvm/lib/Target/AArch64/Arm64ECCallLowering.cpp
1 ↗(On Diff #433498)

Yeah, I'd go with AArch64Arm64ECCallLowering.cpp

78 ↗(On Diff #433498)

I think the only thing that we really need is byval.

It doesn't look like the x64 calling convention uses sret, so I'm not sure that it's needed.

signext and zeroext should have been taken care of by the caller (we don't need to redo that work in the thunk).

x64 calling convention also uses nest and swift, but I don't think we need them for now.

242–250 ↗(On Diff #433498)

Extra thunks should be dropped by the linker, so we should err on the side of more thunks: isDefinition is nice since it is one of the few cases where we know that no thunk is needed. We can always refine this later to reduce the number of useless thunks, but if we don't generate enough then we can't link.

llvm/test/CodeGen/AArch64/arm64ec-cfg.ll
26–35

Ah, sorry, I misread the code. Yeah, this is correct - although it looks like we should add some direct-call cases then :)

efriedma updated this revision to Diff 435001.Jun 7 2022, 5:21 PM

Update with a couple minor fixes.

bcl5980 added inline comments.
llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
223

For now MSVC(19.32.31329/19.33.31517) use a different function name __os_arm64x_dispatch_icall_cfg, __os_arm64x_dispatch_icall.
But the document is still using __os_arm64x_check_icall_cfg, __os_arm64x_check_icall
Not sure which one Microsoft will use finally.

efriedma added inline comments.Jun 28 2022, 12:21 PM
llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
223

I assume that, for ABI reasons, the MIcrosoft libraries will continue to expose both versions. So it probably doesn't really matter which one we use.

I assume the semantics match the CFG functions on other targets; see llvm/lib/Transforms/CFGuard/CFGuard.cpp .