This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][NFC] Preserve PDTWrapperPass in UnifyDivergentExitNodes
Changes Planned · Public

Authored by gandhi21299 on Apr 30 2023, 8:51 PM.

Event Timeline

gandhi21299 created this revision. Apr 30 2023, 8:51 PM
Herald added a project: Restricted Project. Apr 30 2023, 8:51 PM
gandhi21299 requested review of this revision. Apr 30 2023, 8:51 PM
gandhi21299 added a project: Restricted Project.
foad added a comment. May 2 2023, 2:57 AM

Should probably fix the other FIXME from 7c8b8063b66c7b936d41a0c4069c506669e13115 at the same time.

> Should probably fix the other FIXME from 7c8b8063b66c7b936d41a0c4069c506669e13115 at the same time.

Assertion at llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:290 fails: "SimplifyCFG is not yet capable of maintaining validity of a PostDomTree, so don't ask for it."

Internal CI passed.

arsenm accepted this revision. May 2 2023, 4:51 PM
This revision is now accepted and ready to land. May 2 2023, 4:51 PM
foad added a comment. May 3 2023, 12:31 AM

> Should probably fix the other FIXME from 7c8b8063b66c7b936d41a0c4069c506669e13115 at the same time.

> Assertion at llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:290 fails: "SimplifyCFG is not yet capable of maintaining validity of a PostDomTree, so don't ask for it."

So we're using SimplifyCFG which is not capable of preserving PDT, but somehow we get lucky and don't actually hit any cases where SimplifyCFG invalidates the PDT?

uabelho added a subscriber: uabelho. May 3 2023, 4:38 AM

Hi,

With this patch, many lit tests fail in a build with EXPENSIVE_CHECKS enabled.

Failed Tests (51):
  LLVM :: CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
  LLVM :: CodeGen/AMDGPU/amdpal_scratch_mergedshader.ll
  LLVM :: CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll
  LLVM :: CodeGen/AMDGPU/branch-condition-and.ll
  LLVM :: CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
  LLVM :: CodeGen/AMDGPU/branch-relaxation.ll
  LLVM :: CodeGen/AMDGPU/build-vector-insert-elt-infloop.ll
  LLVM :: CodeGen/AMDGPU/cf-loop-on-constant.ll
  LLVM :: CodeGen/AMDGPU/control-flow-optnone.ll
  LLVM :: CodeGen/AMDGPU/dagcombine-lshr-and-cmp.ll
  LLVM :: CodeGen/AMDGPU/divergence-at-use.ll
  LLVM :: CodeGen/AMDGPU/infinite-loop.ll
  LLVM :: CodeGen/AMDGPU/insert-delay-alu-bug.ll
  LLVM :: CodeGen/AMDGPU/kill-infinite-loop.ll
  LLVM :: CodeGen/AMDGPU/loop-live-out-copy-undef-subrange.ll
  LLVM :: CodeGen/AMDGPU/mdt-preserving-crash.ll
  LLVM :: CodeGen/AMDGPU/mixed-wave32-wave64.ll
  LLVM :: CodeGen/AMDGPU/move-to-valu-worklist.ll
  LLVM :: CodeGen/AMDGPU/multi-divergent-exit-region.ll
  LLVM :: CodeGen/AMDGPU/nested-loop-conditions.ll
  LLVM :: CodeGen/AMDGPU/operand-folding.ll
  LLVM :: CodeGen/AMDGPU/optimize-negated-cond.ll
  LLVM :: CodeGen/AMDGPU/ret_jump.ll
  LLVM :: CodeGen/AMDGPU/salu-to-valu.ll
  LLVM :: CodeGen/AMDGPU/sdwa-peephole.ll
  LLVM :: CodeGen/AMDGPU/si-annotate-cf-noloop.ll
  LLVM :: CodeGen/AMDGPU/si-annotate-cf-unreachable.ll
  LLVM :: CodeGen/AMDGPU/si-annotate-cf.ll
  LLVM :: CodeGen/AMDGPU/si-annotate-cfg-loop-assert.ll
  LLVM :: CodeGen/AMDGPU/si-annotate-nested-control-flows.ll
  LLVM :: CodeGen/AMDGPU/si-lower-control-flow-unreachable-block.ll
  LLVM :: CodeGen/AMDGPU/si-scheduler.ll
  LLVM :: CodeGen/AMDGPU/si-unify-exit-multiple-unreachables.ll
  LLVM :: CodeGen/AMDGPU/si-unify-exit-return-unreachable.ll
  LLVM :: CodeGen/AMDGPU/skip-if-dead.ll
  LLVM :: CodeGen/AMDGPU/switch-default-block-unreachable.ll
  LLVM :: CodeGen/AMDGPU/tuple-allocation-failure.ll
  LLVM :: CodeGen/AMDGPU/uniform-cfg.ll
  LLVM :: CodeGen/AMDGPU/unigine-liveness-crash.ll
  LLVM :: CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll
  LLVM :: CodeGen/AMDGPU/update-phi.ll
  LLVM :: CodeGen/AMDGPU/valu-i1.ll
  LLVM :: CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.ll
  LLVM :: CodeGen/AMDGPU/vgpr-descriptor-waterfall-loop-idom-update.ll
  LLVM :: CodeGen/AMDGPU/vgpr-liverange-ir.ll
  LLVM :: CodeGen/AMDGPU/vgpr-spill-placement-issue61083.ll
  LLVM :: CodeGen/AMDGPU/wave32.ll
  LLVM :: CodeGen/AMDGPU/wmma_modifiers.ll
  LLVM :: Transforms/LoopStrengthReduce/AMDGPU/different-addrspace-crash.ll
  LLVM :: Transforms/LoopStrengthReduce/AMDGPU/lsr-void-inseltpoison.ll
  LLVM :: Transforms/LoopStrengthReduce/AMDGPU/lsr-void.ll

It can also be reproduced by adding -verify-dom-info.

E.g.

build-all/bin/llc -march=amdgcn < test/Transforms/LoopStrengthReduce/AMDGPU/lsr-void.ll -verify-dom-info

fails with

=============================--------------------------------
Inorder PostDominator Tree: DFSNumbers invalid: 0 slow queries.
  [1]  <<exit node>> {4294967295,4294967295} [0]
    [2] %for.body {4294967295,4294967295} [1]
      [3] %entry {4294967295,4294967295} [2]
Roots: %for.body

	Freshly computed tree:
=============================--------------------------------
Inorder PostDominator Tree: DFSNumbers invalid: 0 slow queries.
  [1]  <<exit node>> {4294967295,4294967295} [0]
    [2] %DummyReturnBlock {4294967295,4294967295} [1]
      [3] %for.body {4294967295,4294967295} [2]
        [4] %entry {4294967295,4294967295} [3]
Roots: %DummyReturnBlock
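
The mismatch in the two dumps above can be reproduced on paper with a toy model. The following is a minimal sketch, not LLVM code: it assumes the pass gave the infinite loop in for.body an extra edge to a new DummyReturnBlock (as the freshly computed tree suggests), and it takes the root of an infinite loop as an explicit input instead of discovering it. The function name and the `roots` parameter are hypothetical.

```python
EXIT = "<<exit node>>"  # virtual exit, named as in the dumps above


def immediate_post_dominators(succs, roots=()):
    """Map each block to its immediate post-dominator.

    succs: dict of block name -> list of successor names.
    roots: extra blocks to connect to the virtual exit. Assumption: this
           stands in for how a root gets picked for an infinite loop.
    Every block must be able to reach an exit or one of the roots.
    """
    blocks = list(succs)
    everything = set(blocks) | {EXIT}
    # Start from "everything post-dominates everything" and shrink.
    pdom = {b: set(everything) for b in blocks}
    pdom[EXIT] = {EXIT}

    changed = True
    while changed:
        changed = False
        for b in blocks:
            outs = list(succs[b])
            # Blocks without successors, and the given roots, reach the exit.
            if b in roots or not outs:
                outs.append(EXIT)
            new = set.intersection(*(pdom[s] for s in outs)) | {b}
            if new != pdom[b]:
                pdom[b] = new
                changed = True

    # The immediate post-dominator of b is the strict post-dominator whose
    # own post-dominator set equals all strict post-dominators of b.
    ipdom = {}
    for b in blocks:
        strict = pdom[b] - {b}
        ipdom[b] = next(s for s in strict if pdom[s] == strict)
    return ipdom


# CFG as the preserved (stale) tree sees it: for.body loops forever.
cfg_before = {"entry": ["for.body"], "for.body": ["for.body"]}
# CFG after the assumed rewrite: the loop can also leave to DummyReturnBlock.
cfg_after = {
    "entry": ["for.body"],
    "for.body": ["for.body", "DummyReturnBlock"],
    "DummyReturnBlock": [],
}

stale = immediate_post_dominators(cfg_before, roots={"for.body"})
fresh = immediate_post_dominators(cfg_after)
print(stale)
print(fresh)
```

In this model the stale map hangs for.body directly under the exit node, while the fresh map places DummyReturnBlock between them. That is the same difference the verifier reports, which suggests the tree was marked as preserved without being updated for the new block.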

gandhi21299 reopened this revision. May 3 2023, 8:37 AM
This revision is now accepted and ready to land. May 3 2023, 8:37 AM
gandhi21299 planned changes to this revision. May 3 2023, 1:18 PM