This is an archive of the discontinued LLVM Phabricator instance.

[RISCV][CodeGenPrepare] Select the optimal base offset for GEPs with large offset
AbandonedPublic

Authored by dtcxzyw on May 18 2023, 4:41 AM.

Details

Summary

This patch splits the list of sorted distinct GEPs into groups and selects the optimal base offset for each group to ensure that each offset can fit into simm12 on RISC-V.
It also tries to align offsets when compressed load/store instructions are available.

It is an alternative to D150665.
Fixes https://github.com/llvm/llvm-project/issues/62734.

Diff Detail

Event Timeline

dtcxzyw created this revision.May 18 2023, 4:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2023, 4:42 AM
dtcxzyw requested review of this revision.May 18 2023, 4:42 AM
asb added a comment.May 18 2023, 11:05 AM

I'm getting compile-time assertion errors with this on various inputs from the GCC torture suite (Value.cpp:1242 "An asserting value handle still pointed to this value!"). e.g. pr60822.c for lp64 rv64gc, O1.

dtcxzyw updated this revision to Diff 523647.May 18 2023, 8:31 PM
dtcxzyw added a reviewer: asb.

+ Rebase
+ Fix assertion failures

luke added inline comments.May 19 2023, 7:02 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
16455

Just a heads up, using the source/result element types of GEPs to infer the type of memory access is not always accurate, I've encountered this myself in https://reviews.llvm.org/D149889

It's also mentioned in CodeGenPrepare.cpp:

// The result type of the GEP might not be the type of the memory
// access.

I don't think this should block this patch, I just wanted to point it out. The code here is consistent with how CodeGenPrepare is inferring it.

16469

You can use Subtarget.getXLen()

I think we still resolved part of this problem instead of all of it. The reason is for

define void @foo(ptr %dest) {
  %p1 = getelementptr i8, ptr %dest, i32 2048
  store i8 1, ptr %p1
  %p2 = getelementptr i8, ptr %dest, i32 2049
  store i8 1, ptr %p2
  %p3 = getelementptr i8, ptr %dest, i32 2050
  store i8 1, ptr %p3
  %p4 = getelementptr i8, ptr %dest, i32 2051
  store i8 1, ptr %p4
  ret void
}

the extra addi a1, a0, 1 is produced by 2048 can not be represented in Int<12>. And this patch just resolved this kind of case. Consider that if the offset is 4194305, the addi still exists since the instruction selector selects it to

lui	a1, 1024
addiw	a1, a1, 1

If you want to solve all of these kinds of cases, you need to repeat all the logic that selecting a large constant. So maybe handle it after instruction selection is a better way?

dtcxzyw updated this revision to Diff 526314.May 27 2023, 11:24 PM
  • Rebase
  • Fix assertion failures due to dangling value handle references
  • Add a test from @StephenFan, which tries to find an offset divisible by 4096 to use lui + add.
dtcxzyw marked 2 inline comments as done.May 27 2023, 11:38 PM

I think we still resolved part of this problem instead of all of it. The reason is for

define void @foo(ptr %dest) {
  %p1 = getelementptr i8, ptr %dest, i32 2048
  store i8 1, ptr %p1
  %p2 = getelementptr i8, ptr %dest, i32 2049
  store i8 1, ptr %p2
  %p3 = getelementptr i8, ptr %dest, i32 2050
  store i8 1, ptr %p3
  %p4 = getelementptr i8, ptr %dest, i32 2051
  store i8 1, ptr %p4
  ret void
}

the extra addi a1, a0, 1 is produced by 2048 can not be represented in Int<12>. And this patch just resolved this kind of case. Consider that if the offset is 4194305, the addi still exists since the instruction selector selects it to

lui	a1, 1024
addiw	a1, a1, 1

If you want to solve all of these kinds of cases, you need to repeat all the logic that selecting a large constant. So maybe handle it after instruction selection is a better way?

This case has been fixed by choosing an offset divisible by 4096. If we handle it after ISel, we will retrieve the whole 'add-imm' chain for each load/store instruction. As @craig.topper commented in D150665, it defeats the point of the common base optimization.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
16455

Thank you for your suggestion. I will fix this after D150583 is ready.

dtcxzyw marked an inline comment as done.May 27 2023, 11:38 PM

Polite ping :)

skatkov added inline comments.Jun 6 2023, 7:24 PM
llvm/lib/CodeGen/CodeGenPrepare.cpp
6091

If the only reason of introducing F argument to get entry block, why not using
GEP->getFunction()->getEntryBlock();
?

dtcxzyw updated this revision to Diff 534403.Jun 25 2023, 6:58 PM
  • Rebase
  • Address comments
  • use Instruction::getAccessType to infer the type of memory access
dtcxzyw marked an inline comment as done.Jun 25 2023, 6:58 PM
dtcxzyw abandoned this revision.Mon, Nov 20, 7:38 PM