This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fix a mistake in PostprocessISelDAG
ClosedPublic

Authored by Luhaocong on Feb 16 2022, 4:52 AM.

Details

Summary

With the condition N->use_empty() , the root node of DAG always misses peephole optimization.
A dummy node is needed here.

Diff Detail

Event Timeline

Luhaocong created this revision.Feb 16 2022, 4:52 AM
Luhaocong requested review of this revision.Feb 16 2022, 4:52 AM

Do you have test case?

Luhaocong updated this revision to Diff 409461.Feb 16 2022, 5:46 PM
Luhaocong retitled this revision from [RISCV][NFC] Fix a mistake in PostprocessISelDAG to [RISCV] Fix a mistake in PostprocessISelDAG.

upload a test case

Do you have test case?

store i64 %inc, i64* @g_8 is the root node of CurDAG,with N->use_empty() is true

jrtc27 added inline comments.Feb 16 2022, 5:56 PM
llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
183 ↗(On Diff #409461)

This was already generated. If you use i32 loads/stores with @g_4 then both RV32 and RV64 see codegen differences. Using an i64 on RV32 gets a TokenFactor to glue together the splitting of the illegal i64 store into two legal i32 stores and so gives you a root node that's not one of the nodes you want to optimise.

194–195 ↗(On Diff #409461)

Without this patch these two lines were:

addi    a0, a0, %lo(g_8)
addi    a1, a1, 1
sd      a1, 0(a0)

i.e. the %lo wasn't folded into the store's immediate due to the store being the root node

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

jrtc27 added inline comments.Feb 16 2022, 6:08 PM
llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
198 ↗(On Diff #409461)

Unless you feed the IR through opt, this isn't actually needed, you can just have entry's contents be if.then. The key thing is the ret isn't in the same basic block as the store as that would otherwise be the root for everything with a chain. Maybe it's a good idea to keep the dummy compare though so it's safe against optimisation silently folding the ret back into the basic block, as passing this through opt as it stands does nothing.

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

With the current post-processing on X86 I don't thin you could get a failure. None of the opcodes that are being looked for have chain outputs so they can't be the root.

I suppose we inherited this bug from PowerPC. @nemanjai maybe you want to fix this for PowerPC?

X86 is the same from the looks of it. The only other implementation of PostprocessISelDAG is AMDGPU which does things a bit differently and doesn't seem to have an equivalent "check for uses".

With the current post-processing on X86 I don't thin you could get a failure. None of the opcodes that are being looked for have chain outputs so they can't be the root.

Ok, so a possible bug waiting to happen but probably not currently an issue.

Luhaocong updated this revision to Diff 409469.Feb 16 2022, 6:49 PM

update test case

This revision is now accepted and ready to land.Feb 24 2022, 7:16 PM
This revision was landed with ongoing or failed builds.Feb 25 2022, 4:38 AM
This revision was automatically updated to reflect the committed changes.