This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fold ADDIs into load/stores with nonzero offsets
ClosedPublic

Authored by luismarques on May 10 2020, 11:21 AM.

Details

Summary

We can often fold an ADDI into the offset of load/store instructions:

(load (addi base, off1), off2) -> (load base, off1+off2)
(store val, (addi base, off1), off2) -> (store val, base, off1+off2)

This is possible when the off1+off2 continues to fit the 12-bit immediate. We remove the previous restriction where we would never fold the ADDIs if the load/stores had nonzero offsets. We now do the fold the the resulting constant still fits a 12-bit immediate, or if off1 is a variable's address and we know based on that variable's alignment that off1+offs2 won't overflow. The first case doesn't seem to currently be exercised by the backend, but the code change is simple and easy to reason about, and handling it specially was actually making the code and the surrounding comments harder to understand.

Diff Detail

Event Timeline

luismarques created this revision.May 10 2020, 11:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2020, 11:21 AM

This is looking like a good improvement, thanks @luismarques!

I just have one question about the ConstantSDNode part of the change.

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
220–238

I don't understand where this code is being tested. Can you point to a testcase that changes because of this change?

luismarques marked an inline comment as done.May 11 2020, 3:39 AM
luismarques added inline comments.
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
220–238

That's the situation I mention in the review summary. That case doesn't seem to currently be generated by LLVM, so I couldn't generate a test that would be impacted by it. I can try to look harder or see if e.g. a MIR test could cover that case. Since the code was reasonably straightforward and the alternative (skipping on 0) was actually being harder to document in the code comments I kept the optimization, since it should be reasonably clear to reason about it being correct or not, despite the lack of specific test.

asb added a comment.May 12 2020, 12:19 AM

Could you rebase please? This isn't applying cleanly for me on current master.

Rebase. Now also handles the case of ConstantPoolSDNode.

lenary added inline comments.May 15 2020, 5:28 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
220–238

I though I had come up with two testcases for this, but I see I haven't, because LLVM never makes a selectiondag the equivalent of:

    t0: i32 = LUI <ConstAHi>
  t1: i32 = ADDI t0, <ConstALo>
t2: i32 = LW t1, 0
t3: i32 = LW t1, 4

instead it creates:

    t0: i32 = LUI <ConstAHi>
  t1: i32 = ADDI t0, <ConstALo>
t2: i32 = LW t1, 0
    t3: i32 = LUI <ConstBHi>
  t4: i32 = ADDI <ConstBLo>
t5: i32 = LW t4, 0

This means you still need to do this fold, but the case where the offset to the LW not being zero is not exercised by LLVM.

In any case, if you want it, here are two testcases, which. show you cannot skip the fold in the ConstantSDNode case, but also that your additions to the ConstantSDNode are not tested:

define i64 @load_const_ok() nounwind {
entry:
  %0 = load i64, i64* inttoptr (i32 2040 to i64*)
  ret i64 %0
}

define i64 @load_cost_overflow() nounwind {
entry:
  %0 = load i64, i64* inttoptr (i64 2044 to i64*)
  ret i64 %0
}

I'm not sure we test this kind of folding anywhere else, so I think we should add this to the testcases you add for this PR, even though there are no changes to the generated assembly after your patch.

efriedma added inline comments.
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
231

FYI, D80368 removes GlobalValue::getAlignment(). Value::getPointerAlignment() is the suggested replacement.

Handle removal of GlobalValue::getAlignment.

luismarques marked an inline comment as done.Jun 24 2020, 10:10 AM
luismarques added inline comments.
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
231

Thanks for the heads-up. I guess in this case we want GlobalObject::getAlignment()? If I understand correctly, that's the one that gives the alignment of the variable itself, which is what we need to rely upon to obtain the margin of safety before the 12-bit immediate overflow can occur.

efriedma added inline comments.Jun 24 2020, 11:25 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
231

GlobalObject::getAlignment() is deprecated in favor of GlobalObject::getAlign().

Using GlobalObject::getAlign() isn't wrong, but it's probably not the best approach. In this context, probably the biggest issue is that it can return None; you can handle that conservatively, but handling it correctly is sort of tricky. getPointerAlignment() wraps up all the necessary logic in a single call.

Use getPointerAlignment instead.

lenary accepted this revision.Jul 1 2020, 7:15 AM

LGTM!

This revision is now accepted and ready to land.Jul 1 2020, 7:15 AM
This revision was automatically updated to reflect the committed changes.