Download Raw Diff

Details

Reviewers

Commits

rG2ccb2dbc8d04: [RISCV] Don't fold RISCVISD::VMV_V_X_VL series node and scalar load to vector…

Summary

We try to fold RISCVISD::VMV_V_X_VL series node + scalar load -> vector load. But if scalar load is indexed load (load update form), it's not profitable to fold because load update node can't be removed after fold.

Diff Detail

Event Timeline

zixuan-wu created this revision.Jun 5 2023, 7:42 PM

Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2023, 7:42 PM

Herald added subscribers: jobnoorman, luke, VincentWu and 28 others. · View Herald Transcript

zixuan-wu requested review of this revision.Jun 5 2023, 7:42 PM

Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2023, 7:42 PM

Herald added subscribers: llvm-commits, • pcwang-thead, eopXD, MaskRay. · View Herald Transcript

zixuan-wu edited the summary of this revision. (Show Details)Jun 5 2023, 7:43 PM

zixuan-wu edited the summary of this revision. (Show Details)

Harbormaster completed remote builds in B236790: Diff 528665.Jun 5 2023, 7:43 PM

When does a load and update node occur on RISC-v?

In D152222#4397974, @craig.topper wrote:

When does a load and update node occur on RISC-v?

It's actually less frequent because the standard extension does not have load update instruction. But XTHeadFMemIdx extension has load update instruction, so it appears in downstream.

In D152222#4397976, @zixuan-wu wrote:

In D152222#4397974, @craig.topper wrote:

When does a load and update node occur on RISC-v?

It's actually less frequent because the standard extension does not have load update instruction. But XTHeadFMemIdx extension has load update instruction, so it appears in downstream.

Should we even do the transform for an indexed load? Won’t we lose the update part of the load update?

In D152222#4397984, @craig.topper wrote:

In D152222#4397976, @zixuan-wu wrote:

In D152222#4397974, @craig.topper wrote:

When does a load and update node occur on RISC-v?

It's actually less frequent because the standard extension does not have load update instruction. But XTHeadFMemIdx extension has load update instruction, so it appears in downstream.

Should we even do the transform for an indexed load? Won’t we lose the update part of the load update?

It's not about the replacement of load update node, it's about VMV_V_X_VL node replacement. It just uses the data part of load update result. As the chain replacement, it just upgrades the chain from load update node to new vector load node because the new vector load node inserted after load update node in sequence order.

In D152222#4397994, @zixuan-wu wrote:

In D152222#4397984, @craig.topper wrote:

In D152222#4397976, @zixuan-wu wrote:

In D152222#4397974, @craig.topper wrote:

When does a load and update node occur on RISC-v?

It's actually less frequent because the standard extension does not have load update instruction. But XTHeadFMemIdx extension has load update instruction, so it appears in downstream.

Should we even do the transform for an indexed load? Won’t we lose the update part of the load update?

It's not about the replacement of load update node, it's about VMV_V_X_VL node replacement. It just uses the data part of load update result. As the chain replacement, it just upgrades the chain from load update node to new vector load node because the new vector load node inserted after load update node in sequence order.

It’s not legal to replace the chain result of the original node wasn’t deleted. We would need to insert a TokenFactor to collect the chains from the new and old load and replace all uses of the old load’s chain with the TokenFactor result. Otherwise the dependencies aren’t maintained.

It’s also not legal to duplicate a volatile load. I suspect we didn’t check that.

I think we should disable the load update case in isLegalToFold or isProfitableToFold.

It’s not legal to replace the chain result of the original node wasn’t deleted. We would need to insert a TokenFactor to collect the chains from the new and old load and replace all uses of the old load’s chain with the TokenFactor result.

OK, using TokenFactor is more formal, and I agree with it. But it's another issue to enhance the code, we still need think about load update load.

It’s also not legal to duplicate a volatile load. I suspect we didn’t check that.

I aggree.

I think we should disable the load update case in isLegalToFold or isProfitableToFold.

Why is load update not profitable or legal? I don't see any difference with normal load.

In D152222#4398167, @zixuan-wu wrote:

It’s not legal to replace the chain result of the original node wasn’t deleted. We would need to insert a TokenFactor to collect the chains from the new and old load and replace all uses of the old load’s chain with the TokenFactor result.

OK, using TokenFactor is more formal, and I agree with it. But it's another issue to enhance the code, we still need think about load update load.

It’s also not legal to duplicate a volatile load. I suspect we didn’t check that.

I aggree.

I think we should disable the load update case in isLegalToFold or isProfitableToFold.

Why is load update not profitable or legal? I don't see any difference with normal load.

We only fold normal loads if it removes the scalar load. So that's different than the load update which won't be removed.

It's not legal if we don't add the TokenFactor and fix the volatile issue. Making it not legal/profitable avoids those issues.

Ok I looked closer at the code. Neither isProfitableToFold or isLegalToFold are implemented in this file. I think we could add an override of isProfitableToFold, but might not be worth it. We can reject load update this code near where we call isProfitableToFold/isLegalToFold.

In D152222#4398201, @craig.topper wrote:

Ok I looked closer at the code. Neither isProfitableToFold or isLegalToFold are implemented in this file. I think we could add an override of isProfitableToFold, but might not be worth it. We can reject load update this code near where we call isProfitableToFold/isLegalToFold.

Yes, and I also think it makes sense more because rejecting load update is not common logic to add to isProfitableToFold/isLegalToFold.

We only fold normal loads if it removes the scalar load. So that's different than the load update which won't be removed.

I got.
You mean the scalar load update node can't be folded so that there are still two load(one scalar and one vector) result. Instead, it should be one vector load.

In D152222#4398226, @zixuan-wu wrote:

We only fold normal loads if it removes the scalar load. So that's different than the load update which won't be removed.

I got.
You mean the scalar load update node can't be folded so that there are still two load(one scalar and one vector) result. Instead, it should be one vector load.

Correct.

HasVendorXTHeadFMemIdx is in upstream, so we should be able to write a test for this?

zixuan-wu updated this revision to Diff 528719.Jun 6 2023, 12:15 AM

zixuan-wu retitled this revision from [RISCV] Fix the num of chain SDNode introduced in 9e0f9f113248093e737c4cf5450f0a3c2bcd90ba to [RISCV] Don't fold RISCVISD::VMV_V_X_VL series node and scalar load to vector load when scalar load is update load.

zixuan-wu edited the summary of this revision. (Show Details)

In D152222#4398334, @craig.topper wrote:

HasVendorXTHeadFMemIdx is in upstream, so we should be able to write a test for this?

Theoretically, it can construct one case with XTHeadFMemIdx extension, but it only appears in downstream codebase. I am trying to reproduce it without downstream code.

zixuan-wu updated this revision to Diff 528738.Jun 6 2023, 1:15 AM

This comment has been deleted.

llvm/test/CodeGen/RISCV/rvv/fold-scalar-load-crash.ll
11	Why is there no instructions here? All of them are optimized out?

Harbormaster completed remote builds in B236851: Diff 528738.Jun 6 2023, 1:38 AM

zixuan-wu added inline comments.Jun 6 2023, 1:45 AM

llvm/test/CodeGen/RISCV/rvv/fold-scalar-load-crash.ll
11	it's interesting. It's removed by dead-mi-elimination pass. But this is truly a case reduced from big original source to reproduce the compiling assert in DAG selection stage :)

• pcwang-thead added inline comments.Jun 6 2023, 2:04 AM

llvm/test/CodeGen/RISCV/rvv/fold-scalar-load-crash.ll

How about:

define i32 @test(i32 %size, ptr %add.ptr, i64 %const) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %add.ptr1 = getelementptr i8, ptr %add.ptr, i32 -1
  %add.ptr2 = getelementptr i8, ptr %add.ptr1, i32 %size
  %0 = load i8, ptr %add.ptr1, align 1
  %1 = load i8, ptr %add.ptr2, align 1
  %2 = insertelement <8 x i8> poison, i8 %0, i64 0
  %3 = insertelement <8 x i8> %2, i8 0, i64 %const
  %4 = insertelement <8 x i8> %3, i8 %1, i64 0
  %5 = icmp ult <8 x i8> %4, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
  %6 = bitcast <8 x i1> %5 to i8
  %7 = zext i8 %6 to i32
  %cond = icmp eq i32 %7, 0
  br i1 %cond, label %if.then381, label %for.body

if.then381:                                       ; preds = %for.body
  ret i32 0
}

zixuan-wu added inline comments.Jun 6 2023, 2:18 AM