Page MenuHomePhabricator

AArch64: support arm64_32, an ILP32 slice for watchOS.
Needs ReviewPublic

Authored by t.p.northover on Apr 29 2019, 6:36 AM.

Details

Summary

This is the main CodeGen patch to support the arm64_32 watchOS ABI on the LLVM side.

FastISel is mostly disabled for now since it would generate incorrect code for ILP32. Disabling it also allows for a more incremental approach to implementing that side since we have sensible fallbacks in place.

Diff Detail

Event Timeline

t.p.northover created this revision.Apr 29 2019, 6:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2019, 6:36 AM
Herald added subscribers: dang, jfb, arphaman and 8 others. · View Herald Transcript

Looks like there's no reviewers set here. @t.p.northover who should review this?

Looks like there's no reviewers set here. @t.p.northover who should review this?

Anyone reasonably active in the AArch64 backend. The primary review location is still the llvm-commits mailing list, so I don't tend to bother adding specific reviewers unless there's a completely obvious candidate.

Gerolf added a subscriber: Gerolf.

+ reviewers to move this forward hopefully

Rebased patch.

efriedma added inline comments.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9804

Does this have any practical effect for other targets?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3767–3768

What's changing here? Does it make sense to add any comments?

llvm/test/CodeGen/AArch64/tail-call.ll
4–5

What are you trying to do here?

llvm/test/CodeGen/AArch64/win64_vararg.ll
272

There isn't any obvious reason for this test to change?

Thanks for upstreaming this, Tim.

At a high level, my main care about here is that what is upstreamed here doesn't conflict with also adding support at some point in the future for the AArch64 ILP32 ABI - for which a beta quality specification from Arm exists.

I browsed through the code changes. Without being a deep expert in a lot of the areas touched, I just shared a few thoughts in the few places where I saw some change that I couldn't easily come up with a plausible explanation for.

I'm assuming that there is nothing in here that would prevent adding support for non-Darwin AArch64 ILP32 later?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1196–1197

I assume this patch introduces support for the Darwin-specific arm64_32 ILP32 ABI.
That makes me wonder why there is a need to define/create the "getTheAArch64_32Target()". Doesn't "getTheARM64_32Target()" suffice to enable support for the Darwin-specific arm64_32 ILP32 ABI?

llvm/lib/Target/AArch64/AArch64ISelLowering.h
264–266

I found it surprising that this method doesn't loop for whether ILP32 is targeted or not to decide whether the pointer type is a 32 or 64 bit integer.
Would it be worthwhile to add a small comment here why the integer pointer type is always 64bits?
Maybe I'm missing something completely trivial though...

t.p.northover marked 6 inline comments as done.Jul 8 2019, 3:09 AM

At a high level, my main care about here is that what is upstreamed here doesn't conflict with also adding support at some point in the future for the AArch64 ILP32 ABI - for which a beta quality specification from Arm exists.

I think most of the changes should be reusable.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9804

I don't think so. It's a correctness issue so if other targets were affected then they'd start seeing vregs without definitions and things.

I believe the driving arm64_32 feature is the fact that pointers are extended beyond 32-bits by the caller, which leads to some weird (but valid) DAG nodes trying to communicate that fact. Other targets only assertzext from invalid types.

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1196–1197

I think it would be enough for production purposes, but the extra lines allow someone to write triple strings starting with aarch64_32 in tests and so on if they want to.

I still have a long-term goal to switch Clang over to using aarch64 as the canonical name in LLVM (aarch64-apple-ios14.0 etc). I was prevented when I tried years ago because ld64 wasn't ready, but I think I fixed that and I should try again some time.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3767–3768

We use the ability to lookup whether a register has already been added (use at line 3801 in this diff). If it has then we're trying to combine two parts of (say) a [2 x i32] into a single register for compatibility with armv7k IR and need to do the bitwise arithmetic to make that happen.

I don't know that a comment here would work (it would either be a historic note, or pre-empting what comes later). I'll try to do something to call it out unobtrusively at the use-point.

llvm/lib/Target/AArch64/AArch64ISelLowering.h
264–266

A comment would definitely be worthwhile. This is the function that enables the large change I upstreamed earlier so that pointer types in the DAG can remain 64-bits (to exploit addressing-modes available) and get truncated to 32-bits in memory.

llvm/test/CodeGen/AArch64/tail-call.ll
4–5

These functions are using the arrays purely to consume register space during a call, but after this change [8 x i32] only uses x0-x3 (in full). It's only an IR-level change (i.e. it doesn't affect C or C++ ABI), but I'll limit it to the arm64_32 target when I update the diff.

llvm/test/CodeGen/AArch64/win64_vararg.ll
272

I think it's because of the std::map change you called out above. It implicitly sorts the list of registers that get copied, perturbing the DAG and scheduling.

I think I'll switch it back to SmallVector and use std::find_if to handle the (rare) ARM compatibility instead. It ought to be faster in the common case and won't have this side-effect.

  1. Switched back to SmallVector to track registers used, reducing code perturbation. I decided to add a SmallSet too for queries instead of using std::find_if directly because searches actually happen regardless.
  2. Disabled special handling for [N x i32] except on arm64_32 MachO.
  3. Noticed a bug in 2 above, where we allocated 2N registers anyway and fixed it.
  4. Added getPointerType comment.
t.p.northover marked an inline comment as done.Jul 8 2019, 5:29 AM
t.p.northover added inline comments.
llvm/test/CodeGen/AArch64/win64_vararg.ll
272

Well, as you can see that accounted for a lot of the differences, but the fmovs still get reordered w.r.t. the store. I have no idea why this is: the DAG is identical and I'm reasonably sure it's harmless so I blame gremlins.

AlexDenisov added inline comments.
llvm/lib/Target/AArch64/AArch64Subtarget.h
440

When I compile LLVM with this patch applied I'm getting the error:

AArch64Subtarget.h:430:34: error: only virtual member functions can be marked 'override'
  bool addrSinkUsingGEPs() const override {
                                 ^~~~~~~~~

Removing the override keyword fixes it, but I'm curious where it comes from? I cannot see any usage of this method across the code base.

Ah sorry, it was part of an NFC change in a different commit. I've rolled it into this diff; it's splitting off GEP sinking from the useAA callback since they're not really related.

aemerson added inline comments.Thu, Aug 22, 12:48 AM
llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
115

Comment here explaining why we need this? I'm guessing for passing arrays of i32s & compatibility with armv7k?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3428

Why does this happen at all?

3767–3768

Looks like we still need some form of documentation here at RegUsed use?

llvm/test/CodeGen/AArch64/arm64-collect-loh.ll
163

Can you add a comment here explaining that inbounds is needed for arm64_32 to produce the same code.

llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll
22

It's odd that this test changed?

llvm/test/CodeGen/AArch64/arm64_32-neon.ll
6

Is anything really expected to change with NEON & arm64_32?