The SVE builtins tests rely on optimisations to remove clutter from
the resulting IR that is not relevant to the tests. However, with
the increasing number of target intrinsic combines the clang tests
are moving further away from verifying what is relevant to clang.
During early SVE (or rather scalable vector) bringup, we chose to
mitigate bugs by minimising our usage of LLVM IR instructions then
later implemented the combines to lower the calls to generic IR once
scalable vector support had matured. With the mitigations no longer
required and the combines mostly trivial I have moved the logic into
CGBuiltins, which allows the existing tests to remain unchanged once
they stop using instcombine.
The optimisations include:
- Using shifts in place of multiplies by power-of-two values.
- Don't emit getelementptrs when offset is zero.
- Use IR based vector splats rather than calls to dup_x.
- Use IR based vector selects rather than calls to sel.
- Use i64 based indices for insertelement.
The test changes are the result of "sed -i -e 's/instcombine,//'",
with the exception of acle_sve_dupq.c which required regeneration
due to its previous reliance on a zext->tunc->zext combine.
The following tests still rely on instcombine because they require
changes beyond CGBuiltin.cpp:
CodeGen/aarch64-sve-intrinsics/acle_sve_clasta.c CodeGen/aarch64-sve-intrinsics/acle_sve_clastb.c CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c CodeGen/aarch64-sve-intrinsics/acle_sve_dup-bfloat.c CodeGen/aarch64-sve-intrinsics/acle_sve_dup.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1-bfloat.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sb.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sh.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1sw.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1ub.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uh.c CodeGen/aarch64-sve-intrinsics/acle_sve_ld1uw.c CodeGen/aarch64-sve-intrinsics/acle_sve_len-bfloat.c CodeGen/aarch64-sve-intrinsics/acle_sve_len.c CodeGen/aarch64-sve-intrinsics/acle_sve_rdffr.c CodeGen/aarch64-sve-intrinsics/acle_sve_sel-bfloat.c CodeGen/aarch64-sve-intrinsics/acle_sve_sel.c CodeGen/aarch64-sve-intrinsics/acle_sve_st1-bfloat.c CodeGen/aarch64-sve-intrinsics/acle_sve_st1.c CodeGen/aarch64-sve-intrinsics/acle_sve_st1b.c CodeGen/aarch64-sve-intrinsics/acle_sve_st1h.c CodeGen/aarch64-sve-intrinsics/acle_sve_st1w.c
Tests within aarch64-sve2-intrinsics don't use opt but instead use
-O1 to cleanup their output. These tests remain unchanged and will
be visited by a later patch.
Depends on D140983
Given this seems a frequent idiom is it worth putting this into a helper routine? i.e. something like
where