This is an archive of the discontinued LLVM Phabricator instance.

llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
183	This was already generated. If you use i32 loads/stores with @g_4 then both RV32 and RV64 see codegen differences. Using an i64 on RV32 gets a TokenFactor to glue together the splitting of the illegal i64 store into two legal i32 stores and so gives you a root node that's not one of the nodes you want to optimise.
194–195	Without this patch these two lines were:
	addi a0, a0, %lo(g_8)
	addi a1, a1, 1
	sd a1, 0(a0)
i.e. the %lo wasn't folded into the store's immediate because the store was the root node.
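The effect described in this comment can be modelled without LLVM. The following is a minimal sketch under stated assumptions: ToyNode, ToyHandle, and runPeephole are hypothetical names invented for illustration, and the use counts are chosen by hand to mirror the situation in the thread. A store that is the DAG root has no users, so a use_empty() check skips it unless something else, here a dummy handle, holds a reference to it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DAG node; this is NOT the real llvm::SDNode.
struct ToyNode {
  std::string Name;
  int NumUses;        // how many other nodes (or handles) reference this node
  bool Folded = false;
  ToyNode(std::string N, int Uses) : Name(std::move(N)), NumUses(Uses) {}
  bool use_empty() const { return NumUses == 0; }
};

// Hypothetical RAII handle that counts as one extra use of a node,
// loosely modelled on the role HandleSDNode plays in the patch.
struct ToyHandle {
  ToyNode &N;
  explicit ToyHandle(ToyNode &Node) : N(Node) { ++N.NumUses; }
  ~ToyHandle() { --N.NumUses; }
};

// Walk the nodes in reverse, skip anything without uses, and "fold" stores.
// Returns the number of nodes that were folded.
int runPeephole(std::vector<ToyNode> &Nodes) {
  int Changed = 0;
  for (auto It = Nodes.rbegin(); It != Nodes.rend(); ++It) {
    if (It->use_empty())
      continue; // looks dead, so the peephole never sees it
    if (It->Name == "store") {
      It->Folded = true;
      ++Changed;
    }
  }
  return Changed;
}

// The store is the root and has no users: it is skipped.
int foldWithoutHandle() {
  std::vector<ToyNode> Nodes = {{"lui", 1}, {"addi", 1}, {"store", 0}};
  return runPeephole(Nodes);
}

// A dummy handle gives the root one use, so the store is visited.
int foldWithHandle() {
  std::vector<ToyNode> Nodes = {{"lui", 1}, {"addi", 1}, {"store", 0}};
  ToyHandle Dummy(Nodes.back());
  return runPeephole(Nodes);
}
```

Calling foldWithoutHandle() reports no folded nodes, while foldWithHandle() reports one. The only difference between the two is the extra use contributed by the handle.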

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

jrtc27 added inline comments. Feb 16 2022, 6:08 PM

llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
198	Unless you feed the IR through opt, this isn't actually needed, you can just have entry's contents be if.then. The key thing is the ret isn't in the same basic block as the store as that would otherwise be the root for everything with a chain. Maybe it's a good idea to keep the dummy compare though so it's safe against optimisation silently folding the ret back into the basic block, as passing this through opt as it stands does nothing.

In D119934#3328022, @craig.topper wrote:

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

Harbormaster completed remote builds in B150117: Diff 409461. Feb 16 2022, 6:20 PM

In D119934#3328030, @jrtc27 wrote:

In D119934#3328022, @craig.topper wrote:

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

With the current post-processing on X86 I don't think you could get a failure. None of the opcodes that are being looked for have chain outputs, so they can't be the root.

In D119934#3328037, @craig.topper wrote:

In D119934#3328030, @jrtc27 wrote:

In D119934#3328022, @craig.topper wrote:

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

With the current post-processing on X86 I don't think you could get a failure. None of the opcodes that are being looked for have chain outputs, so they can't be the root.

Ok, so a possible bug waiting to happen but probably not currently an issue.

update test case

Harbormaster completed remote builds in B150121: Diff 409469. Feb 16 2022, 7:33 PM

ping...

LGTM

This revision is now accepted and ready to land. Feb 24 2022, 7:16 PM

This revision was landed with ongoing or failed builds. Feb 25 2022, 4:38 AM

Closed by commit rG865fe131f87c: [RISCV] Fix a mistake in PostprocessISelDAG (authored by Luhaocong, committed by benshi001).

This revision was automatically updated to reflect the committed changes.

benshi001 added a commit: rG865fe131f87c: [RISCV] Fix a mistake in PostprocessISelDAG.

Revision Contents

Path	Size
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp	3 lines
llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll	31 lines

Diff 411380

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

  for (SelectionDAG::allnodes_iterator I = CurDAG->allnodes_begin(),

      // Now that we did that, the node is dead. Increment the iterator to the
      // next node to process, then delete N.
      ++I;
      CurDAG->DeleteNode(N);
    }
  }

  void RISCVDAGToDAGISel::PostprocessISelDAG() {
+   HandleSDNode Dummy(CurDAG->getRoot());
    SelectionDAG::allnodes_iterator Position = CurDAG->allnodes_end();

    bool MadeChange = false;
    while (Position != CurDAG->allnodes_begin()) {
      SDNode *N = &*--Position;
      // Skip dead nodes and any non-machine opcodes.
      if (N->use_empty() || !N->isMachineOpcode())
        continue;

      MadeChange |= doPeepholeSExtW(N);
      MadeChange |= doPeepholeLoadStoreADDI(N);
      MadeChange |= doPeepholeMaskedRVV(N);
    }

+   CurDAG->setRoot(Dummy.getValue());
+
    if (MadeChange)
      CurDAG->RemoveDeadNodes();
  }

  static SDNode *selectImmWithConstantPool(SelectionDAG *CurDAG, const SDLoc &DL,
                                           const MVT VT, int64_t Imm,
                                           const RISCVSubtarget &Subtarget) {
    assert(VT == MVT::i64 && "Expecting MVT::i64");
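The loop in PostprocessISelDAG walks the node list from the end toward the beginning by pre-decrementing an iterator and then dereferencing it. As a rough, standalone illustration of that traversal pattern only, the sketch below uses std::list<int> as a stand-in for the DAG's node list; visitBackwards is a hypothetical helper and nothing here is LLVM API.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Visit list elements from last to first, mirroring the shape of the loop
// in the patch. std::list<int> is an assumed stand-in for the node list.
std::vector<int> visitBackwards(std::list<int> &Items) {
  std::vector<int> Order;
  std::list<int>::iterator Position = Items.end();
  while (Position != Items.begin()) {
    // Step back first, then take the address of the element now pointed at.
    int *N = &*--Position;
    Order.push_back(*N);
  }
  return Order;
}
```

Starting from end() and decrementing before each dereference means the past-the-end position is never dereferenced, and the loop stops once the first element has been visited.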

llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll

  ; RV64I-NEXT: lui a0, %hi(g_8)
  ; RV64I-NEXT: sd zero, %lo(g_8)(a0)
  ; RV64I-NEXT: ret
  entry:
    store i64 0, i64* @g_8
    ret void
  }

+ ; Check if we can fold ADDI into the offset of store instructions,
+ ; when store instructions is the root node in DAG.
+
+ @g_4_i32 = global i32 0, align 4
+
+ define dso_local void @inc_g_i32() nounwind {
+ ; RV32I-LABEL: inc_g_i32:
+ ; RV32I: # %bb.0: # %entry
+ ; RV32I-NEXT: lui a0, %hi(g_4_i32)
+ ; RV32I-NEXT: lw a1, %lo(g_4_i32)(a0)
+ ; RV32I-NEXT: addi a1, a1, 1
+ ; RV32I-NEXT: sw a1, %lo(g_4_i32)(a0)
+ ; RV32I-NEXT: ret
+ ;
+ ; RV64I-LABEL: inc_g_i32:

jrtc27 (inline comment): This was already generated. If you use i32 loads/stores with @g_4 then both RV32 and RV64 see codegen differences. Using an i64 on RV32 gets a TokenFactor to glue together the splitting of the illegal i64 store into two legal i32 stores and so gives you a root node that's not one of the nodes you want to optimise.

+ ; RV64I: # %bb.0: # %entry
+ ; RV64I-NEXT: lui a0, %hi(g_4_i32)
+ ; RV64I-NEXT: lw a1, %lo(g_4_i32)(a0)
+ ; RV64I-NEXT: addiw a1, a1, 1
+ ; RV64I-NEXT: sw a1, %lo(g_4_i32)(a0)
+ ; RV64I-NEXT: ret
+ entry:
+   %0 = load i32, i32* @g_4_i32
+   %inc = add i32 %0, 1
+   store i32 %inc, i32* @g_4_i32
+   br label %if.end
+

jrtc27 (inline comment): Without this patch these two lines were:
    addi a0, a0, %lo(g_8)
    addi a1, a1, 1
    sd a1, 0(a0)
i.e. the %lo wasn't folded into the store's immediate due to the store being the root node.

+ if.end:
+   ret void
+ }

jrtc27 (inline comment): Unless you feed the IR through opt, this isn't actually needed, you can just have entry's contents be if.then. The key thing is the ret isn't in the same basic block as the store as that would otherwise be the root for everything with a chain. Maybe it's a good idea to keep the dummy compare though so it's safe against optimisation silently folding the ret back into the basic block, as passing this through opt as it stands does nothing.

  ; Check for folds in accesses to the second element of an i64 array.

  @ga_8 = dso_local local_unnamed_addr global [2 x i64] zeroinitializer, align 8
  @ga_16 = dso_local local_unnamed_addr global [2 x i64] zeroinitializer, align 16

  define dso_local i64 @load_ga_8() nounwind {
  ; RV32I-LABEL: load_ga_8:
  ; RV32I: # %bb.0: # %entry

[RISCV] Fix a mistake in PostprocessISelDAG (Closed, Public)
