This is an archive of the discontinued LLVM Phabricator instance.

apply two more cases for store combine
Needs ReviewPublic

Authored by BaoshanPang on Jan 29 2022, 6:57 PM.

Details

Summary
  1. handle the case with extra shift
  2. use two short(16-bits) store for 4 byte store when one word store is not allowed

Diff Detail

Event Timeline

BaoshanPang created this revision.Jan 29 2022, 6:57 PM
BaoshanPang requested review of this revision.Jan 29 2022, 6:57 PM
This comment was removed by BaoshanPang.
This comment was removed by BaoshanPang.

@spatel
Please help to review my code.

In the testing(https://reviews.llvm.org/harbormaster/build/217328/), there are some failures but I can't reproduce them locally. Any suggestion is welcome.

Thanks,
Baoshan

Adding more potential reviewers. It might help if the baseline tests are committed, so we can see the asm difference for each test. Is it the alignment or something else that prevents combining these stores currently?

I agree that pre-committed test cases would help here.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7757

It'd be nice if we could be cleverer about this somehow as this is 90% copy/paste from below. Seems like we only use Offset as a key into a map in the following loop. Can we post-process that map instead?

7911

Don't need to pre-declare these variables uninitialized like this.

llvm/test/CodeGen/RISCV/store-combine.ll
3

We normally check RV32 and RV64 in the same test. I think this change benefits RV64 too, right?

80

Do we need all of this metadata for the test to work? I presume we don't need TBAA since the stores trivially don't alias.

Oh and I wouldn't worry about those failing tests: those races tests are/were flaky. They may well resolve themselves then you upload the next version of this diff.

asb added inline comments.Feb 16 2022, 6:34 AM
llvm/test/CodeGen/RISCV/store-combine.ll
5

For a test like this, it's helpful to put a human-readable description of what behaviour it's trying to demonstrate.

Adding more potential reviewers. It might help if the baseline tests are committed, so we can see the asm difference for each test. Is it the alignment or something else that prevents combining these stores currently?

The .ll test case is derived from this C file:

struct ab {
    char a;
    char b;
    char c;
    char d;
} __attribute__ ((aligned (2)));

void store0(struct ab *_ab, unsigned v)
{
    _ab->a = v&0xff;
    _ab->b = (v>>8)&0xff;
    _ab->c = (v>>16)&0xff;
    _ab->d = (v>>24)&0xff;
}
void store1(struct ab *_ab, unsigned v)
{
    _ab->a = v&0xff;
    _ab->b = (v>>8)&0xff;
}
void store2(struct ab *_ab, unsigned v)
{
    _ab->c = (v>>16)&0xff;
    _ab->d = (v>>24)&0xff;
}
void store3(struct ab *_ab, unsigned v)
{
    _ab->c = (v)&0xff;
    _ab->d = (v>>8)&0xff;
}

Before my change, the generated asm code is like this:

    .text
    .attribute  4, 16
    .attribute  5, "rv32i2p0_m2p0_a2p0_c2p0"
    .file   "t.c"
    .globl  store0
    .p2align    1
    .type   store0,@function
store0:
    sb  a1, 0(a0)
    srli    a2, a1, 8
    sb  a2, 1(a0)
    srli    a2, a1, 16
    sb  a2, 2(a0)
    srli    a1, a1, 24
    sb  a1, 3(a0)
    ret
.Lfunc_end0:
    .size   store0, .Lfunc_end0-store0

    .globl  store1
    .p2align    1
    .type   store1,@function
store1:
    sh  a1, 0(a0)
    ret
.Lfunc_end1:
    .size   store1, .Lfunc_end1-store1

    .globl  store2
    .p2align    1
    .type   store2,@function
store2:
    srli    a2, a1, 16
    sb  a2, 2(a0)
    srli    a1, a1, 24
    sb  a1, 3(a0)
    ret
.Lfunc_end2:
    .size   store2, .Lfunc_end2-store2

    .globl  store3
    .p2align    1
    .type   store3,@function
store3:
    sh  a1, 2(a0)
    ret
.Lfunc_end3:
    .size   store3, .Lfunc_end3-store3

    .ident  "clang version 15.0.0 (git@172.30.80.200:/opt/repos/fortirisc_llvm.git efb383266d04c70b8adf4ffb3f4872b36bc4653f)"
    .section    ".note.GNU-stack","",@progbits
    .addrsig

With my change the genereated asm code is like this:

    .text
    .attribute  4, 16
    .attribute  5, "rv32i2p0_m2p0_a2p0_c2p0"
    .file   "t.c"
    .globl  store0
    .p2align    1
    .type   store0,@function
store0:
    srli    a2, a1, 16
    sh  a1, 0(a0)
    sh  a2, 16(a0)
    ret
.Lfunc_end0:
    .size   store0, .Lfunc_end0-store0

    .globl  store1
    .p2align    1
    .type   store1,@function
store1:
    sh  a1, 0(a0)
    ret
.Lfunc_end1:
    .size   store1, .Lfunc_end1-store1

    .globl  store2
    .p2align    1
    .type   store2,@function
store2:
    srli    a1, a1, 16
    sh  a1, 2(a0)
    ret
.Lfunc_end2:
    .size   store2, .Lfunc_end2-store2

    .globl  store3
    .p2align    1
    .type   store3,@function
store3:
    sh  a1, 2(a0)
    ret
.Lfunc_end3:
    .size   store3, .Lfunc_end3-store3

    .ident  "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 59bb18ddfec5e5d91079b163a535cb3a6879093e)"
    .section    ".note.GNU-stack","",@progbits
    .addrsig

The difference are at function store0 and store2. For store0, it was failed to use instruction 'sw' as the alignment is 2, my change detect such case and use 2 'sh' for it; for store2, it was failed to use 'sh' because the extra shift(16), my change make it working for this case too.

Fix issues found by code review

BaoshanPang marked 5 inline comments as done.Feb 19 2022, 6:41 PM
BaoshanPang added inline comments.
llvm/test/CodeGen/RISCV/store-combine.ll
80

I am not sure how to remove TBAA(!7).
!7 is referenced by !6, i6 is referenced by all other variables. If I remove !7, it seems I will need to remove all other variables.

BaoshanPang marked an inline comment as done.

update

Tests lost?

Yeal, I made a mistake with last update.

jrtc27 added inline comments.Feb 26 2022, 5:00 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
7828

Run clang-format, the formatting for much of your new code is wrong

llvm/test/CodeGen/RISCV/store-combine.ll
44

Do we need any of these attributes?

80

But are any of them needed?

140

This is entirely useless

141

local_unnamed_addr #0 likely aren't needed? (and #0 doesn't exist)

BaoshanPang marked 4 inline comments as done.Mar 5 2022, 4:12 PM
BaoshanPang added inline comments.
llvm/test/CodeGen/RISCV/store-combine.ll
80

Yes, they are not needed. I fixed them.

Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2022, 4:12 PM
BaoshanPang marked 2 inline comments as done.Mar 5 2022, 4:13 PM