- User Since
- Apr 11 2019, 6:15 AM (9 w, 2 d)
Fri, Jun 14
Thu, Jun 13
LGTM except for a few typos in comments.
D62669 includes a few VPT predicable MVE instructions to test the assembly side of this, is that enough to add some MIR tests for this?
LGTM, just one minor nit.
Mon, Jun 10
Fri, Jun 7
LGTM with one change.
Thu, Jun 6
Wed, Jun 5
Fair enough, I don't think we currently try to diagnose any other invalid combinations of features. LGTM.
Tue, Jun 4
There is still a lot of code here not tested, but I think it will all be easier to test once we have some more actual instruction selection, so LGTM.
Mon, Jun 3
Fri, May 31
This still needs tests adding.
Thu, May 30
Could these be done without having to move to the GPR register file and back? The v8.2A FP16 extension added the VINS and VMOVX instructions which move between the top and bottom halves of half of S registers, which look ideal for this.
LGTM with one nit.
May 15 2019
It's hard to put exact numbers on this, especially when we are trading off vague concepts like "the complexity of the LLVM codebase", but I'd like to see enough benchmark results that we can be confident that this is making an overall improvement, and that we're not being misled by a few outliers. That might come from running a greater variety of benchmarks until we collect enough data, or it might come from improvements to the patch itself.
My concern isn't really the size of the change, it's with the variance. With only four benchmarks which change, and only by a small amount each, then if you happened to run one more or one fewer benchmark you could come to a completely different conclusion.
I'm not sure those benchmark results are good enough to justify this: there are only 4 changes, and the ups and downs almost balance each other out. Have you investigated the regressions to see what this makes worse, and if there's a way we could avoid them?
May 3 2019
Looks like a nice tidy-up, just a few style issues.
May 2 2019
The instruction exists on A-class targets. It's just that on most A-class targets, we run in userspace, and userspace isn't allowed to disable interrupts, for obvious reasons.
Apr 29 2019
Apr 25 2019
Thanks for fixing the order of the patterns, it's much easier to check for completeness now.
Apr 18 2019
Apr 17 2019
For the auto-upgrader, these limited FPUs only exist for microcontrollers, where I doubt any users are keeping bitcode files around for a long time, so I'd be fine with not doing this. I've had a skim through the auto-upgrader code, and I don't see any other target-specific attributes which are handled there, though there are some target-specific intrinsics.
Apr 16 2019
Why do we need to treat this differently to NEON? As far as I know the calling convention is the same.
Can any of this be tested yet, or are we still missing some patches?
I think it would be worth a test in this patch, especially since D60706 only tests v8.1M, not v8.2A which also introduces this instruction.
It looks like we are still missing a lot of the conversions between f16 vectors and other vector types, I think it would be better to add them all at once, so it's easier to check we haven't messed any.
Can this be tested now, or does that depend on one of the other patches?
Apr 15 2019
Clang tests should just cover the C->IR translation, and not depend on the LLVM backends. This should instead be an IR->asm test in the LLVM repository.
Apr 11 2019
The document you linked in the LLVM change (https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values) says that small POD types are returned directly in X0 or X0 and X1, but this looks like it will always return them indirectly. I think we also need to check the size of the type, and fall back the the plain C ABI for small types.