The existing tests are checking back-end generated assembly. Instead, we want to check front-end generated IR.
Details
- Reviewers
- rengolin, echristo, cfe-commits
- Commits
- rGd162b5c8c479: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
- rC256822: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
- rL256822: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
Diff Detail
- Repository
- rL LLVM
Event Timeline
Please remove the asm tests here. As I stated in the original review thread there's no reason for them to be here.
Thanks.
-eric
One inline comment, thanks!
-eric
Inline comment on test/CodeGen/aarch64-v8.1a-neon-intrinsics.c, line 4 (On Diff #42035):
Why do you need to enable the optimizers?
Inline comment on test/CodeGen/aarch64-v8.1a-neon-intrinsics.c, line 4 (On Diff #42035):
Our intention with these tests is to check that we are generating a sequence of {v/s}qrdmulh, {v/s}q{add/sub}{s}, shufflevector, {insert/extract}element IR instructions. Using -O1 promotes memory to registers and combines instructions, and therefore reduces the amount of surrounding IR that we need to check.
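For readers unfamiliar with the operations named above, here is a minimal scalar sketch in C of what a sqrdmulh followed by a sqadd computes on a single 16-bit lane. The function names (`sat16`, `sqrdmulh_lane`, `sqadd_lane`, `model_vqrdmlah_lane`) are hypothetical and chosen only for illustration. The sketch models the two-step IR sequence discussed here, not the hardware instruction, which may differ in corner cases.

```c
#include <stdint.h>

/* Clamp a wide intermediate value to the int16_t range (saturation). */
static int16_t sat16(int64_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Saturating rounding doubling multiply, high half, on one 16-bit lane:
   sat((2*a*b + 2^15) / 2^16), with the division rounding toward
   negative infinity. */
static int16_t sqrdmulh_lane(int16_t a, int16_t b) {
    int64_t rounded = 2 * (int64_t)a * (int64_t)b + 32768;
    /* Floor division written out by hand so that the result does not
       depend on how the compiler shifts negative values. */
    int64_t q = rounded / 65536;
    if (rounded % 65536 < 0) q -= 1;
    return sat16(q);
}

/* Saturating add on one 16-bit lane. */
static int16_t sqadd_lane(int16_t a, int16_t b) {
    return sat16((int64_t)a + (int64_t)b);
}

/* Hypothetical model of the sequence under test: the accumulator is
   added, with saturation, to the sqrdmulh of the two other operands. */
static int16_t model_vqrdmlah_lane(int16_t acc, int16_t x, int16_t y) {
    return sqadd_lane(acc, sqrdmulh_lane(x, y));
}
```

Because each step saturates separately, the order of the two calls is part of what the IR test has to pin down.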
Should be pretty easy to either use CHECK-DAG or pick out the particular instructions you want to check here. Otherwise you're just checking how the optimizer runs. That, in particular, also sounds like a good backend check.
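To make the distinction between ordered checks and CHECK-DAG-style checks concrete, here is a toy model in C. It is only a sketch under simplifying assumptions: patterns are plain substrings, and the order-independent matcher ignores overlap and scoping rules. It is not how FileCheck is implemented, and the names `match_in_order` and `match_any_order` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Ordered matching (toy model of consecutive CHECK lines): each pattern
   must be found after the end of the previous pattern's match. */
static bool match_in_order(const char *text, const char *const *pats, size_t n) {
    const char *pos = text;
    for (size_t i = 0; i < n; ++i) {
        const char *hit = strstr(pos, pats[i]);
        if (hit == NULL) return false;
        pos = hit + strlen(pats[i]);
    }
    return true;
}

/* Order-independent matching (toy model of a CHECK-DAG group): each
   pattern must occur somewhere in the text, in any order. */
static bool match_any_order(const char *text, const char *const *pats, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (strstr(text, pats[i]) == NULL) return false;
    }
    return true;
}
```

With an input where `call sqrdmulh` appears before `call sqadd`, listing the patterns in the reverse order fails the ordered matcher but passes the order-independent one. Neither matcher says anything about whether the result of the first call actually feeds the second.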
Hi Eric,
The main optimization I feel is useful is mem2reg. Without it, to check properly that the right values go to the right operands of the intrinsic calls, I have to write FileCheck matchers that match stores and their corresponding loads, plus bitcasts. This not only looks more obfuscated than matching the mem2reg output, but is also less resilient to changes in the way clang generates code.
The generated IR for each intrinsic is around 50 lines. I can just pick out the particular instructions I want to check, as you suggested, but they won't be connected by the flow of values. In my opinion such a test will be less valuable.
I can do this both ways but my preferred way is to run the bare minimum of optimization to de-cruft the output and make the test robust and readable. If you feel however that you don't want the optimizers run I will make a best effort at writing a test that doesn't use them.
I understand the conflicting priorities here for sure. You'd like a test that's as minimal as possible, without having to depend on external (to clang) libraries here. I really would appreciate it if you'd make the test not rely on mem2reg etc so we can be sure that clang's code generation is the thing tested here and not the optimizer. Making sure that the unoptimized output reduces properly would be a great opt test for the backend though.
Thanks!
-eric