- handle the case with extra shift
- use two short(16-bits) store for 4 byte store when one word store is not allowed
Details
- Reviewers
spatel craig.topper frasercrmck RKSimon
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
@spatel
Please help to review my code.
In the testing(https://reviews.llvm.org/harbormaster/build/217328/), there are some failures but I can't reproduce them locally. Any suggestion is welcome.
Thanks,
Baoshan
Adding more potential reviewers. It might help if the baseline tests are committed, so we can see the asm difference for each test. Is it the alignment or something else that prevents combining these stores currently?
I agree that pre-committed test cases would help here.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
7757 | It'd be nice if we could be cleverer about this somehow as this is 90% copy/paste from below. Seems like we only use Offset as a key into a map in the following loop. Can we post-process that map instead? | |
7911 | Don't need to pre-declare these variables uninitialized like this. | |
llvm/test/CodeGen/RISCV/store-combine.ll | ||
3 | We normally check RV32 and RV64 in the same test. I think this change benefits RV64 too, right? | |
80 | Do we need all of this metadata for the test to work? I presume we don't need TBAA since the stores trivially don't alias. |
Oh and I wouldn't worry about those failing tests: those races tests are/were flaky. They may well resolve themselves then you upload the next version of this diff.
llvm/test/CodeGen/RISCV/store-combine.ll | ||
---|---|---|
5 | For a test like this, it's helpful to put a human-readable description of what behaviour it's trying to demonstrate. |
The .ll test case is derived from this C file:
struct ab { char a; char b; char c; char d; } __attribute__ ((aligned (2))); void store0(struct ab *_ab, unsigned v) { _ab->a = v&0xff; _ab->b = (v>>8)&0xff; _ab->c = (v>>16)&0xff; _ab->d = (v>>24)&0xff; } void store1(struct ab *_ab, unsigned v) { _ab->a = v&0xff; _ab->b = (v>>8)&0xff; } void store2(struct ab *_ab, unsigned v) { _ab->c = (v>>16)&0xff; _ab->d = (v>>24)&0xff; } void store3(struct ab *_ab, unsigned v) { _ab->c = (v)&0xff; _ab->d = (v>>8)&0xff; }
Before my change, the generated asm code is like this:
.text .attribute 4, 16 .attribute 5, "rv32i2p0_m2p0_a2p0_c2p0" .file "t.c" .globl store0 .p2align 1 .type store0,@function store0: sb a1, 0(a0) srli a2, a1, 8 sb a2, 1(a0) srli a2, a1, 16 sb a2, 2(a0) srli a1, a1, 24 sb a1, 3(a0) ret .Lfunc_end0: .size store0, .Lfunc_end0-store0 .globl store1 .p2align 1 .type store1,@function store1: sh a1, 0(a0) ret .Lfunc_end1: .size store1, .Lfunc_end1-store1 .globl store2 .p2align 1 .type store2,@function store2: srli a2, a1, 16 sb a2, 2(a0) srli a1, a1, 24 sb a1, 3(a0) ret .Lfunc_end2: .size store2, .Lfunc_end2-store2 .globl store3 .p2align 1 .type store3,@function store3: sh a1, 2(a0) ret .Lfunc_end3: .size store3, .Lfunc_end3-store3 .ident "clang version 15.0.0 (git@172.30.80.200:/opt/repos/fortirisc_llvm.git efb383266d04c70b8adf4ffb3f4872b36bc4653f)" .section ".note.GNU-stack","",@progbits .addrsig
With my change the genereated asm code is like this:
.text .attribute 4, 16 .attribute 5, "rv32i2p0_m2p0_a2p0_c2p0" .file "t.c" .globl store0 .p2align 1 .type store0,@function store0: srli a2, a1, 16 sh a1, 0(a0) sh a2, 16(a0) ret .Lfunc_end0: .size store0, .Lfunc_end0-store0 .globl store1 .p2align 1 .type store1,@function store1: sh a1, 0(a0) ret .Lfunc_end1: .size store1, .Lfunc_end1-store1 .globl store2 .p2align 1 .type store2,@function store2: srli a1, a1, 16 sh a1, 2(a0) ret .Lfunc_end2: .size store2, .Lfunc_end2-store2 .globl store3 .p2align 1 .type store3,@function store3: sh a1, 2(a0) ret .Lfunc_end3: .size store3, .Lfunc_end3-store3 .ident "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 59bb18ddfec5e5d91079b163a535cb3a6879093e)" .section ".note.GNU-stack","",@progbits .addrsig
The difference are at function store0 and store2. For store0, it was failed to use instruction 'sw' as the alignment is 2, my change detect such case and use 2 'sh' for it; for store2, it was failed to use 'sh' because the extra shift(16), my change make it working for this case too.
llvm/test/CodeGen/RISCV/store-combine.ll | ||
---|---|---|
80 | I am not sure how to remove TBAA(!7). |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
7828 | Run clang-format, the formatting for much of your new code is wrong | |
llvm/test/CodeGen/RISCV/store-combine.ll | ||
44 | Do we need any of these attributes? | |
80 | But are any of them needed? | |
140 | This is entirely useless | |
141 | local_unnamed_addr #0 likely aren't needed? (and #0 doesn't exist) |
llvm/test/CodeGen/RISCV/store-combine.ll | ||
---|---|---|
80 | Yes, they are not needed. I fixed them. |
It'd be nice if we could be cleverer about this somehow as this is 90% copy/paste from below. Seems like we only use Offset as a key into a map in the following loop. Can we post-process that map instead?