Page MenuHomePhabricator

[AArch64][BFloat] basic AArch64 bfloat support
ClosedPublic

Authored by stuij on May 11 2020, 4:32 AM.

Details

Summary

This patch adds the bfloat type to the AArch64 backend:

  • adds it as part of the FPR16 register class
  • adds bfloat calling conventions
  • as f16 is now not the only FPR16 type anymore, we need to constrain a number of instruction patterns using FPR16Op to help out the TableGen type inferrer

This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

The bfloat type, and its properties are specified in the Arm Architecture
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Diff Detail

Event Timeline

stuij created this revision.May 11 2020, 4:32 AM
stuij added a reviewer: ab.May 11 2020, 6:48 AM
fpetrogalli requested changes to this revision.May 15 2020, 9:18 AM

Hi @stuij ,

thank you for working on this!

Is there a way we can test the calling convention with your changes? A test that shows that passing or returning bfloat (scalar or vector) via IR functions is mapping to the correct registers will be good to have.

Grazie,

Francesci

llvm/lib/Target/AArch64/AArch64RegisterInfo.td
428

nit: remove me

This revision now requires changes to proceed.May 15 2020, 9:18 AM
LukeGeeson added a comment.EditedMay 15 2020, 10:11 AM

Hi @stuij ,

thank you for working on this!

Is there a way we can test the calling convention with your changes? A test that shows that passing or returning bfloat (scalar or vector) via IR functions is mapping to the correct registers will be good to have.

Grazie,

Francesci

Hi Francesci,
I implemented this in the C type patch, up to the Bfloat IR level only so far, this patch adds the AArch64 codegen for those things and it'd be worth having a backend patch too
https://reviews.llvm.org/D76077

specifically clang/test/CodeGen/arm-bf16-params-returns.c

stuij added a comment.May 16 2020, 1:51 PM

@fpetrogalli: Yes, good point. It was on my todo. I had tested these changes with such a snippet, so there's no good reason I didn't add it in the first place.

stuij added a comment.May 20 2020, 7:37 AM

Having a think, I think the calling convention test would be more meaningful if we have a bit more bfloat lowering support. Which we have in the bfloat lowering follow-up patch: https://reviews.llvm.org/D79712

As it is, for this patch we can test we pass a bfloat through the function, which gives us a ret, which I feel isn't too meaningful. But if we for example load it from a pointer, we can check that the arguments are put in the correct register.

stuij added a comment.May 20 2020, 7:39 AM

Note that these argument passing tests discussed above aren't present in the follow-up patch at the moment. I'm currently implementing them.

Having a think, I think the calling convention test would be more meaningful if we have a bit more bfloat lowering support. Which we have in the bfloat lowering follow-up patch: https://reviews.llvm.org/D79712

Yep - I'd make sure that the codegen patch is in before testing the CC. In fact, for the codegen patch you mention, I don't think you need CC at all:

void test_ldst(bfloat * A, bfloat *B) {
%1 = load bfloat, bfloat* A
void store(bfloat %1, bfloat *b)
}

Then, when the codegen is available, you can test the calling convention.

As it is, for this patch we can test we pass a bfloat through the function, which gives us a ret, which I feel isn't too meaningful. But if we for example load it from a pointer, we can check that the arguments are put in the correct register.

I think that the tests for the calling convention should look something like the following:

bfloat callee() {
   // check that h7 is not preserved
   // check that h22 and h23 are preserved (didn't check the actual calling convention, just guessing here, to give an idea)

   //some assembly code that uses some of the registers that are not preserved across the call, and some that are call preserved
   // e.g. :
  call void asm sideeffect "nop", "~{h7},~{h22},~{h23}"() nounwind
}

bfloat caller (bfloat *%A) {
// check that the registers preserved across the call are preserved by the caller, by for example clobbering h0 between the load and the call, and making sure that the clobbered h0 is stored before being written with the value of A
    %0 = load bfloat from A
   %1 = call bfloat @callee(bfloat %A); 
  ret bfloat %1;
}
stuij added a comment.May 21 2020, 2:00 AM

Hiya Francesco, thanks for the thorough comment :)

Having a think, I think the calling convention test would be more meaningful if we have a bit more bfloat lowering support. Which we have in the bfloat lowering follow-up patch: https://reviews.llvm.org/D79712

Yep - I'd make sure that the codegen patch is in before testing the CC. In fact, for the codegen patch you mention, I don't think you need CC at all:

I'm not talking about testing CC in general (tbh I'm not convinced we need extensive CC testing for every single type). I'm talking about servicing your request of a test for passing or returning bfloat:

Is there a way we can test the calling convention with your changes? A test that shows that passing or returning bfloat (scalar or vector) via IR functions is mapping to the correct registers will be good to have.

As you mention, a sensible test would show we're passing bfloat in a sensible register, but this patch doesn't supply support for lowering of for example load. The follow-up patch does support this. However this patch implements passing bfloat as an argument, so this patch needs to be applied before we can apply any other patches.

My suggestion is to commit this patch as-is, and put a meaningful argument-passing test (which I've already implemented) in the above mentioned follow-up patch.

fpetrogalli accepted this revision.May 21 2020, 8:58 AM

Hi @stuij

approving this, thank you for the explanation on the way you want to do testing.

My preferred way would be to extract the ld/st from the code gen patch, do the calling convention patch and test it with the lds/st codegen, then add those codegen tests that will make use of the CC for things like passing argument to operations like add, sub and so on.

If you are confident that your code here is correct because of the codegen patch test, I am happy for you to submit it without tests, followed by the codegen tests. Please make sure that you submit a third patch that tests the calling conventions using inline asm that clobbers registers.

So, LGTM! :)

Francesco

This revision is now accepted and ready to land.May 21 2020, 8:58 AM
stuij edited the summary of this revision. (Show Details)May 22 2020, 3:12 AM
stuij updated this revision to Diff 266523.May 27 2020, 6:55 AM

update commit text

This revision was automatically updated to reflect the committed changes.
c-rhodes added inline comments.Fri, Jun 26, 6:17 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
135

Shouldn't this and the types below be predicated on Subtarget->hasBF16()?

We've been fixing up cases in SVE for bfloat intrinsics where we missed predicating intrinsics / patterns on +bf16. I fixed this for the sizeless bfloat types added here in D82494 and it revealed the places we'd forgot to add the guard.

stuij marked an inline comment as done.Mon, Jul 6, 7:06 AM
stuij added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
135

Sorry, I missed this comment. Yes, you're right, we should clean this up.