Support vfclass, vfmerge, vfrec7, vfrsqrt7, vfsqrt instructions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
One of the ten commits that landed here (https://github.com/llvm/llvm-project/compare/a3bfddbb6a27...59d5b8c27b43), this being one of them, slowed down check-clang by over 20%: https://bugs.llvm.org/show_bug.cgi?id=49962
Since recovering test runtime regressions is very difficult, it's probably a good idea to revert this first and then reland in smaller chunks to see which of the commits causes the problem (?)
The fact that we're running the optimization pipeline in these tests might be to blame. Might also be that riscv_vector.h is currently about 71000 lines which probably isn't quick to parse.
The fact that we're running the optimization pipeline in these tests might be to blame. Might also be that riscv_vector.h is currently about 71000 lines which probably isn't quick to parse.
I found removing all overloaded api definition functions in riscv_vector.h could save the time more than 100% (run llvm-lit with one file vloxei.c, cost 3.57s-> 1.56s)
I'm not sure how to improve the riscv_vector.h, is it possible to generate a PCH for riscv_vector.h?
I hacked the RISCVVEmitter.cpp file to emit a riscv_vector_testing.h that wraps all the definitions with macros like RVV_TEST_vadd and RVV_TEST_vadd_mask (I used the IRName as it was easier for the experiment).
The testing header looks like this
#if defined(RVV_TEST_vadd) __rvv_overloaded vint8m1_t vadd(vint8m1_t op0, vint8m1_t op1, size_t op2){ return vadd_vv_i8m1(op0, op1, op2); } #endif // RVV_TEST_vadd #if defined(RVV_TEST_vadd_mask) __rvv_overloaded vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t op3, size_t op4){ return vadd_vv_i8m1_m(op0, op1, op2, op3, op4); } #endif // RVV_TEST_vadd_mask
In a release build (in a machine that is not very fast) looks like this:
$ ./bin/llvm-lit --time-test -sv ../llvm-src/clang/test/CodeGen/RISCV/rvv-intrinsics/vadd.c ../llvm-src/clang/test/CodeGen/RISCV/rvv-intrinsics/vadd_fast.c llvm-lit: /home/rferrer/fio/upstream/llvm-src/llvm/utils/lit/lit/llvm/config.py:428: note: using clang: /home/rferrer/fio/upstream/llvm-build-release/bin/clang Slowest Tests: -------------------------------------------------------------------------- 8.91s: Clang :: CodeGen/RISCV/rvv-intrinsics/vadd.c 2.11s: Clang :: CodeGen/RISCV/rvv-intrinsics/vadd_fast.c
Where vadd.c is the current file and vadd_fast.c is the same as vadd.c with this change
--- vadd.c 2021-04-15 06:51:30.554996630 +0000 +++ vadd_fast.c 2021-04-15 07:00:38.684074973 +0000 @@ -8,7 +8,9 @@ // RUN: -Werror -Wall -o - %s -S >/dev/null 2>&1 | FileCheck --check-prefix=ASM --allow-empty %s // ASM-NOT: warning -#include <riscv_vector.h> +#define RVV_TEST_vadd +#define RVV_TEST_vadd_mask +#include <riscv_vector_testing.h> // CHECK-RV32-LABEL: @test_vadd_vv_i8mf8( // CHECK-RV32-NEXT: entry:
This seems to suggest that making sure the tested header contains only the necessary will significantly speed up the testing.
We would want to make sure riscv_vector.h and riscv_vector_testing.h are equivalent when all the macros are enabled. I guess such a test is feasible if we preprocess the testing header with all the RVV_TEST_nnn macros enabled and then we do some form of textual comparison (which will likely have to ignore whitespace). This way we can test riscv_vector.h using its proxy riscv_vector_testing.h.
I've left a PoC of my approach here https://reviews.llvm.org/D100529 in case we can use that
Was there any more progress on this? If not, let's revert for now to not permanently slow down tests by over 20%.
Great, let's wait and see for a few days to see if it lands, and if so how much that recovers then.
Do you mean revert https://github.com/llvm/llvm-project/compare/a3bfddbb6a27...59d5b8c27b43?
Yes.
Please wait a few days.
Those commits add almost total 300k line from all tests.
I think maybe reverting is not a good idea if we have a good progress on this slow down issue.
Please wait a few days.
Those commits add almost total 300k line from all tests.
I think maybe reverting is not a good idea if we have a good progress on this slow down issue.
Yes, if we can fix the regression in a few days, then all is good.
I think maybe reverting is not a good idea if we have a good progress on this slow down issue.
Yes, if we can fix the regression in a few days, then all is good.
Another ping here. check-clang is down to 5 min now, but it's still a 11% regression (used to be 4.5 min, went to 5.5 min, now 5 min).
https://reviews.llvm.org/D100611 has not landed yet. Is the 5 min number with or without that patch?
Without. I'm just looking at my bot cycle times. Ok, let's wait for that to play out :)
@thakis
https://reviews.llvm.org/D100611 had landed, could you check your bot to see the cycle time?
No effect, or even worse. We're back at 5:30 cycle time and we'v been in that state for over 2 weeks now. I think it's time to revert back to green, so this slowdown doesn't become permanent.
I believe we had reductions at builds 7601 and 8175.
It looks like there may have been an increase between builds 7979 and 7991 the build failed for a while there. I suspect "e951b04 [AArch64][SVE] Regression test all ACLE tests with C++" as it added a quite a few RUN lines.
I think we also caused an increase at build 8202 when these 4 commits went in
bd32c2d [RISCV] Implement the vwcvt{u}.x.x.v/vncvt.x.x.w builtin.
645c5f2 [RISCV] Implement the pseudo compare builtin.
bfb3fca [RISCV] Implement the vfabs.v/vfneg.v builtin.
4b24341 [RISCV] Implement the vmmv.m/vmnot.m builtin.
Any disagreement about reverting riscv builtin changes until there's some way to test them without increasing check-clang time by 20%?
We're preparing a patch to remove to stop testing both rv32 and rv64 on every test. That should reduce the time by half. What is an acceptable number?
I have removed rv32 test cases for vector intrinsics in https://github.com/llvm/llvm-project/commit/b358a2be52480c008a0789c78f5df1bb2053bfd8.
! In D99741#2721669, @craig.topper wrote:
We're preparing a patch to remove to stop testing both rv32 and rv64 on every test. That should reduce the time by half. What is an acceptable number?
You tell me. I'd say well below 1%?
I wanted to prepare a graph that shows check-clang time over the last year – it stayed basically constant and the jumped by 20% (now 10% now that half the tests are disabled). It'd look dramatic, but I haven't had time to make a graph. I'm sure you can imagine it though :) (Also, I feel the burden of proof shouldn't be on people pointing out regressions.)
@rnk suggested putting these tests behind some kind of REQUIRES: expensive-checks thing that's off by default, instead of reverting everything, until perf is sorted out. That sounds like a great idea to me – wdyt?
I just uploaded D101700 which takes lit on clang/test/CodeGen/RISCV/rvv* from ~8 seconds to ~5.5 seconds as measured on a 48 core server I use at work. Not sure how that translates to a less capable machine with less cores, slower disk, etc.
For comparison, here are some times for running lit on subsets of tests on this machine before the patch. Mostly looked at clang/test/CodeGen as I was curious to see how close we are to other targets.
clang/test/ - 68.75 seconds clang/test/CodeGen - 24.59 seconds clang/test/CodeGen/X86 - 2.68 seconds clang/test/CodeGen/aarch64* - 12.33 seconds clang/test/CodeGen/arm* - 2.04 seconds
We're also considering D100529 to further reduce the RISCV test time. Another option we've talked about is concatenating some of the tests into a single file to cut down the number of times we parse riscv_vector.h.
At the time the check-clang time increase was first raised, lit on clang/test/CodeGen/RISCV/rvv* was 22.57 seconds on my machine. clang/test/CodeGen/ was 34.15 seconds.