
Tue, Feb 23

alex-t accepted D97218: [AMDGPU] Set threshold for regbanks reassign pass.

LGTM

Tue, Feb 23, 3:39 AM · Restricted Project

Dec 28 2020

alex-t committed rG644da789e364: [AMDGPU] Split edge to make si_if dominate end_cf (authored by alex-t).
[AMDGPU] Split edge to make si_if dominate end_cf
Dec 28 2020, 6:22 AM
alex-t closed D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Dec 28 2020, 6:22 AM · Restricted Project

Dec 24 2020

alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

The testcase is split into 2 separate functions. LLC run line added.

Dec 24 2020, 1:29 AM · Restricted Project
alex-t added inline comments to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Dec 24 2020, 12:35 AM · Restricted Project
alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

There should be both tests, not one larger function that hits both

Dec 24 2020, 12:08 AM · Restricted Project

Dec 23 2020

alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

New test created to cover both failed cases:

  • end_cf in a block that has yet another predecessor besides the one defining the exec mask
  • given the pattern above, a visited node does not always denote a loop; it does only when we have a backedge, i.e. the block's successor dominates the block.
Dec 23 2020, 4:44 AM · Restricted Project

Dec 18 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Needs a testcase for the second broken case

Dec 18 2020, 12:54 PM · Restricted Project

Dec 17 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

There was a bug in SIAnnotateControlFlow. A visited node does not necessarily mean a loop. It may be a CF join instead.
Added a check that the Term successor is visited and dominates Term's parent.
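
A minimal sketch of the distinction described above, assuming a toy CFG stored as adjacency lists rather than the real LLVM data structures. The names `CFG`, `computeDominators` and `isBackEdge` are hypothetical and are not taken from the patch. The point it illustrates: an edge into an already visited block is a loop back edge only if the target dominates the source; otherwise it may simply be a control-flow join.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B; block 0 is the entry.
// All blocks are assumed to be reachable from the entry.
using CFG = std::vector<std::vector<std::size_t>>;

// Dom[B][D] is true when block D dominates block B. Computed with the
// simple iterative data-flow formulation, which is fine for small graphs.
std::vector<std::vector<bool>> computeDominators(const CFG &Succs) {
  const std::size_t N = Succs.size();
  assert(N > 0 && "CFG must contain an entry block");
  std::vector<std::vector<std::size_t>> Preds(N);
  for (std::size_t B = 0; B < N; ++B)
    for (std::size_t S : Succs[B])
      Preds[S].push_back(B);

  // Start from "every block dominates every block", except for the entry,
  // which is dominated only by itself.
  std::vector<std::vector<bool>> Dom(N, std::vector<bool>(N, true));
  Dom[0].assign(N, false);
  Dom[0][0] = true;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t B = 1; B < N; ++B) {
      // Intersect the dominator sets of all predecessors, then add B itself.
      std::vector<bool> New(N, true);
      for (std::size_t P : Preds[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// An edge From -> To is a back edge (and so denotes a loop) only when To
// dominates From. A merely "visited" To may just be a control-flow join.
bool isBackEdge(const CFG &Succs, std::size_t From, std::size_t To) {
  const auto Dom = computeDominators(Succs);
  return Dom[From][To];
}
```

In a diamond (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) the join block 3 is reached twice but dominates neither predecessor, so neither edge is a back edge. In a loop (0 -> 1 -> 2 -> 1) the edge 2 -> 1 is one, because block 1 dominates block 2.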

Dec 17 2020, 8:29 AM · Restricted Project
alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Bug in SIAnnotateControlFlow fixed: simple CF join should not be treated as loop

Dec 17 2020, 8:26 AM · Restricted Project

Dec 16 2020

alex-t committed rG35ec3ff76dee: Disable Jump Threading for the targets with divergent control flow (authored by alex-t).
Disable Jump Threading for the targets with divergent control flow
Dec 16 2020, 3:46 PM
alex-t closed D93302: Disable Jump Threading for the targets with divergent control flow.
Dec 16 2020, 3:45 PM · Restricted Project
alex-t updated the summary of D93302: Disable Jump Threading for the targets with divergent control flow.
Dec 16 2020, 2:07 PM · Restricted Project
alex-t updated the diff for D93302: Disable Jump Threading for the targets with divergent control flow.

REQUIRES x86 && amdgpu clause added in test

Dec 16 2020, 2:00 PM · Restricted Project
alex-t updated the diff for D93302: Disable Jump Threading for the targets with divergent control flow.

Test added that ensures the optimization is disabled for targets with divergent CF and enabled otherwise.

Dec 16 2020, 9:55 AM · Restricted Project

Dec 15 2020

alex-t added a reviewer for D93302: Disable Jump Threading for the targets with divergent control flow: rampitec.
Dec 15 2020, 8:18 AM · Restricted Project
alex-t requested review of D93302: Disable Jump Threading for the targets with divergent control flow.
Dec 15 2020, 8:17 AM · Restricted Project

Dec 12 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Could you share the original testcase then? I only have that reduced one attached to the Jira ticket.
And it works for it.

Dec 12 2020, 1:48 AM · Restricted Project

Nov 26 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

ping

Nov 26 2020, 9:15 AM · Restricted Project

Nov 24 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Ping

Nov 24 2020, 8:58 AM · Restricted Project

Nov 23 2020

alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.

@alex-t are you still planning to work on this? Or has it been (partly or wholly) superseded by
@piotr's rG0045786f146e78afee49eee053dc29ebc842fee1?

Nov 23 2020, 8:02 AM · Restricted Project
alex-t added inline comments to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Nov 23 2020, 7:43 AM · Restricted Project
alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

dyn_cast changed to cast

Nov 23 2020, 7:43 AM · Restricted Project
alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.
Nov 23 2020, 7:35 AM · Restricted Project

Nov 19 2020

alex-t added a comment to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Ping!

Nov 19 2020, 5:07 AM · Restricted Project

Nov 16 2020

alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

The odd lines removed from the test

Nov 16 2020, 8:44 AM · Restricted Project
alex-t added inline comments to D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Nov 16 2020, 8:41 AM · Restricted Project
alex-t updated the summary of D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Nov 16 2020, 5:47 AM · Restricted Project
alex-t added a comment to D86833: [AMDGPU] AMDGPUAAResult::pointsToConstantMemory should not use the default MaxLookup (i.e., 6) to limit getUnderlyingObject.

I would agree with Stas here.
If you can identify the patterns that require a lookup deeper than 6 levels, you can probably formulate the exact threshold.
And adding tests for such patterns would make it clear.

Nov 16 2020, 4:41 AM · Restricted Project
alex-t added a comment to D88485: [SDag][AMDGPU] Maintain DAG divergence through instruction selection.

It is not clear to me why we need to query divergence information for a MachineSDNode.
After instruction selection is done, we should have all the instructions selected correctly as VALU vs SALU, based on the information that is available at the selection stage.
Thus, we can use the isDivergent bit value set for the MachineSDNode in case we need to recompute or update divergence information after selection.
So, instead of adding machine opcodes to isSDNodeSourceOfDivergence, it is better to mark those opcodes right away as they are selected.

Nov 16 2020, 4:34 AM · Restricted Project

Nov 13 2020

alex-t updated the diff for D91435: [AMDGPU] Split edge to make si_if dominate end_cf.

Test added

Nov 13 2020, 9:13 AM · Restricted Project
alex-t requested review of D91435: [AMDGPU] Split edge to make si_if dominate end_cf.
Nov 13 2020, 8:59 AM · Restricted Project

Oct 30 2020

alex-t committed rGa4f7e4264cfc: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the… (authored by alex-t).
[AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the…
Oct 30 2020, 4:46 AM
alex-t closed D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.
Oct 30 2020, 4:46 AM · Restricted Project
alex-t added inline comments to D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.
Oct 30 2020, 4:10 AM · Restricted Project
alex-t updated the diff for D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.

Typo in assert message corrected.

Oct 30 2020, 4:09 AM · Restricted Project
alex-t added a comment to D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.

The patch has passed PSDB.

Oct 30 2020, 4:06 AM · Restricted Project

Oct 29 2020

alex-t updated the diff for D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.

assert expression fixed.

Oct 29 2020, 5:00 AM · Restricted Project

Oct 28 2020

alex-t added a comment to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

A new review has been opened to address the current improvements: https://reviews.llvm.org/D90314

Oct 28 2020, 9:05 AM · Restricted Project
alex-t requested review of D90314: [AMDGPU] SILowerControlFlow::removeMBBifRedundant. Refactoring plus fix for the null MBB pointer in MF->splice.
Oct 28 2020, 9:04 AM · Restricted Project

Oct 27 2020

alex-t added a comment to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

BTW, new change has successfully passed ePSDB

Oct 27 2020, 10:06 AM · Restricted Project
alex-t added a comment to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

It seems to remove the test?

Oct 27 2020, 9:08 AM · Restricted Project

Oct 26 2020

alex-t added a reviewer for D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough: foad.
Oct 26 2020, 10:33 AM · Restricted Project

Oct 23 2020

alex-t updated the diff for D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

This change addresses the refactoring advised by foad. It also contains the fix for the case when getNextNode is null because the successor block is the last one in the MachineFunction.
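
A rough illustration of why the "last block" case needs care, using a `std::list` as a hypothetical stand-in for the block layout. This is not the code from the patch; `Layout` and `moveDirectlyAfter` are names made up for the sketch. It shows one way to avoid a null "next node": express the insertion point as an iterator, so that "after the last block" is simply the valid end position.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical stand-in for a function's block layout: block ids in order.
using Layout = std::list<int>;

// Move block `Id` so that it directly follows block `After` in the layout.
// The insertion point is an iterator, so when `After` is the last block it
// is end(), which splice accepts; no nullable "next node" pointer is used.
void moveDirectlyAfter(Layout &L, int After, int Id) {
  auto AfterIt = std::find(L.begin(), L.end(), After);
  auto IdIt = std::find(L.begin(), L.end(), Id);
  assert(AfterIt != L.end() && IdIt != L.end() && After != Id &&
         "both blocks must exist and be distinct");
  L.splice(std::next(AfterIt), L, IdIt);
}
```

With the layout {0, 1, 2, 3}, moving block 1 after block 3 (the last block) gives {0, 2, 3, 1} without any special casing.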

Oct 23 2020, 2:18 PM · Restricted Project
alex-t reopened D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

Reopened to address refactoring and bugfixing

Oct 23 2020, 2:15 PM · Restricted Project
alex-t added a comment to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

This broke 14 GL_NV_shader_atomic_int64 piglit tests (on Navi 14), e.g. tests/spec/nv_shader_atomic_int64/execution/ssbo-atomicAdd-int.shader_test:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  llvm::PointerIntPair<llvm::ilist_node_base<true>*, 1u, unsigned int, llvm::PointerLikeTypeTraits<llvm::ilist_node_base<true>*>, llvm::PointerIntPairInfo<llvm::ilist_node_base<true>*, 1u, llvm::PointerLikeTypeTraits<llvm::ilist_node_base<true>*> > >::getPointer (this=0x0) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/PointerIntPair.h:59
59	  PointerTy getPointer() const { return Info::getPointer(Value); }
[Current thread is 1 (Thread 0x7f590effd700 (LWP 825448))]
(gdb) bt
#0  llvm::PointerIntPair<llvm::ilist_node_base<true>*, 1u, unsigned int, llvm::PointerLikeTypeTraits<llvm::ilist_node_base<true>*>, llvm::PointerIntPairInfo<llvm::ilist_node_base<true>*, 1u, llvm::PointerLikeTypeTraits<llvm::ilist_node_base<true>*> > >::getPointer (this=0x0) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/PointerIntPair.h:59
#1  llvm::ilist_node_base<true>::getPrev (this=0x0) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist_node_base.h:42
#2  llvm::ilist_base<true>::transferBeforeImpl (Next=..., First=..., Last=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist_base.h:69
#3  llvm::ilist_base<true>::transferBefore<llvm::ilist_node_impl<llvm::ilist_detail::node_options<llvm::MachineBasicBlock, true, false, void> > > (Next=..., First=..., Last=...)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist_base.h:86
#4  llvm::simple_ilist<llvm::MachineBasicBlock>::splice (this=<optimized out>, I=..., First=..., Last=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/simple_ilist.h:249
#5  llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock> >::transfer (this=<optimized out>, position=..., L2=..., first=..., last=...)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist.h:293
#6  llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock> >::splice (this=<optimized out>, where=..., L2=..., first=...)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist.h:336
#7  llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock> >::splice (this=<optimized out>, where=..., L2=..., N=0x7f590014b8a8)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/ADT/ilist.h:345
#8  llvm::MachineFunction::splice (this=<optimized out>, InsertPt=..., MBB=0x7f590014b8a8) at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/CodeGen/MachineFunction.h:754
#9  (anonymous namespace)::SILowerControlFlow::removeMBBifRedundant (this=0x7f5900034fe0, MBB=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp:731
#10 (anonymous namespace)::SILowerControlFlow::optimizeEndCf (this=0x7f5900034fe0) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp:627
#11 (anonymous namespace)::SILowerControlFlow::runOnMachineFunction (this=<optimized out>, MF=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp:822
#12 0x00007f591856b8dc in llvm::MachineFunctionPass::runOnFunction (this=0x7f5900034fe0, F=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73
#13 0x00007f59183357f6 in llvm::FPPassManager::runOnFunction (this=<optimized out>, F=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1519
#14 0x00007f591945ac24 in (anonymous namespace)::CGPassManager::RunPassOnSCC (this=<optimized out>, P=0x7f5900039220, CurSCC=..., CG=..., CallGraphUpToDate=<optimized out>, DevirtualizedCall=<optimized out>)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:178
#15 (anonymous namespace)::CGPassManager::RunAllPassesOnSCC (this=<optimized out>, CurSCC=..., CG=..., DevirtualizedCall=<optimized out>) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:476
#16 (anonymous namespace)::CGPassManager::runOnModule (this=<optimized out>, M=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:541
#17 0x00007f5918335f26 in (anonymous namespace)::MPPassManager::runOnModule (this=<optimized out>, M=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1634
#18 llvm::legacy::PassManagerImpl::run (this=0x7f5900013980, M=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:615
#19 0x00007f591833be8e in llvm::legacy::PassManager::run (this=this@entry=0x7f5900013968, M=...) at /home/daenzer/src/llvm-git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1761
#20 0x00007f591b57979f in ac_compile_module_to_elf (p=p@entry=0x7f5900013910, module=<optimized out>, pelf_buffer=pelf_buffer@entry=0x5629c6cc29e0, pelf_size=pelf_size@entry=0x5629c6cc29e8)
    at /home/daenzer/src/llvm-git/llvm-project/llvm/include/llvm/IR/Module.h:906
#21 0x00007f591b4c1844 in si_compile_llvm (sscreen=sscreen@entry=0x5629c662a400, binary=binary@entry=0x5629c6cc29e0, conf=conf@entry=0x5629c6cc29f8, compiler=compiler@entry=0x5629c662acb0, ac=ac@entry=0x7f590effb500, 
    debug=debug@entry=0x5629c6cc2360, stage=MESA_SHADER_COMPUTE, name=0x7f591a7ccc24 "Compute Shader", less_optimized=false) at ../src/gallium/drivers/radeonsi/si_shader_llvm.c:104
#22 0x00007f591b4b8aa1 in si_llvm_compile_shader (sscreen=sscreen@entry=0x5629c662a400, compiler=compiler@entry=0x5629c662acb0, shader=shader@entry=0x5629c6cc2920, debug=debug@entry=0x5629c6cc2360, nir=<optimized out>, 
    nir@entry=0x5629c6ce2040, free_nir=<optimized out>) at ../src/gallium/drivers/radeonsi/si_shader.c:1591
#23 0x00007f591b4b9e9f in si_compile_shader (sscreen=0x5629c662a400, compiler=0x5629c662acb0, shader=<optimized out>, debug=0x5629c6cc2360) at ../src/gallium/drivers/radeonsi/si_shader.c:1871
#24 0x00007f591b4badf7 in si_create_shader_variant (sscreen=sscreen@entry=0x5629c662a400, compiler=compiler@entry=0x5629c662acb0, shader=shader@entry=0x5629c6cc2920, debug=debug@entry=0x5629c6cc2360)
    at ../src/gallium/drivers/radeonsi/si_shader.c:2405
#25 0x00007f591b491711 in si_create_compute_state_async (job=job@entry=0x5629c6cc2330, thread_index=thread_index@entry=0) at ../src/gallium/drivers/radeonsi/si_compute.c:185
#26 0x00007f591af59fb1 in util_queue_thread_func (input=input@entry=0x5629c662b9c0) at ../src/util/u_queue.c:308
#27 0x00007f591af59b18 in impl_thrd_routine (p=<optimized out>) at ../include/c11/threads_posix.h:87
#28 0x00007f591c46aea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#29 0x00007f591d071d4f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Oct 23 2020, 7:09 AM · Restricted Project

Oct 15 2020

alex-t added inline comments to D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.
Oct 15 2020, 1:24 PM · Restricted Project
alex-t committed rG42ed38812008: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB… (authored by alex-t).
[AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB…
Oct 15 2020, 1:22 PM
alex-t closed D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.
Oct 15 2020, 1:22 PM · Restricted Project
alex-t updated the diff for D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

minor bugfix

Oct 15 2020, 3:34 AM · Restricted Project

Oct 14 2020

alex-t updated the diff for D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.

Changed according to the reviewer's request.

Oct 14 2020, 11:08 AM · Restricted Project
alex-t requested review of D89397: [AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough.
Oct 14 2020, 8:39 AM · Restricted Project

Sep 22 2020

alex-t added a comment to D87882: [AMDGPU] Fix merging m0 inits.

Doesn't loop block self dominate?

Sep 22 2020, 12:31 PM · Restricted Project

Sep 17 2020

alex-t committed rG0efbb70b719e: [AMDGPU] should expand ROTL i16 to shifts. (authored by alex-t).
[AMDGPU] should expand ROTL i16 to shifts.
Sep 17 2020, 7:35 AM
alex-t closed D87618: [AMDGPU] should expand ROTL i16 to shifts..
Sep 17 2020, 7:34 AM · Restricted Project

Sep 16 2020

alex-t updated the diff for D87618: [AMDGPU] should expand ROTL i16 to shifts..

tests moved to existing rotl/rotr tests

Sep 16 2020, 2:04 AM · Restricted Project

Sep 15 2020

alex-t updated the diff for D87618: [AMDGPU] should expand ROTL i16 to shifts..

Tests added. ROTR case added.
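
For readers unfamiliar with the term, here is a small sketch of what expanding a 16-bit rotate into shifts means. It uses the usual shift-and-or identity and is an illustration only, not the lowering code from the patch; `rotl16` and `rotr16` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left: bits shifted out at the top re-enter at the bottom.
// The amount is taken modulo 16, so a rotate by 0 or 16 is the identity.
std::uint16_t rotl16(std::uint16_t X, unsigned Amt) {
  const unsigned L = Amt & 15u;
  const unsigned R = (16u - L) & 15u;
  const std::uint32_t Wide = X; // widen first so the shifts cannot overflow
  return static_cast<std::uint16_t>((Wide << L) | (Wide >> R));
}

// Rotate right is the mirror image: swap the two shift directions.
std::uint16_t rotr16(std::uint16_t X, unsigned Amt) {
  const unsigned R = Amt & 15u;
  const unsigned L = (16u - R) & 15u;
  const std::uint32_t Wide = X;
  return static_cast<std::uint16_t>((Wide >> R) | (Wide << L));
}
```

For example, rotating 0x8001 left by one bit moves the top bit to the bottom and yields 0x0003.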

Sep 15 2020, 10:32 AM · Restricted Project

Sep 14 2020

alex-t requested review of D87618: [AMDGPU] should expand ROTL i16 to shifts..
Sep 14 2020, 9:55 AM · Restricted Project

Sep 8 2020

alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

The idea is:

For the block that is queried

  1. Look for its predecessors that can pass control through the S_EXECZ/EXECNZ
  2. If one is found, look for exec restoring code starting from the beginning of the block being queried.
  3. Since exec restoring code always belongs to the block prologue, search the prologue and, if it is not found, return false.
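
The three steps above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code under review: `Instr`, `Block`, their fields, and `isValidForSplit` are all hypothetical names, and the real check operates on machine basic blocks and instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction: only the two properties used below.
struct Instr {
  bool IsPrologue;   // belongs to the block prologue
  bool RestoresExec; // restores the exec mask
};

// Hypothetical, simplified basic block.
struct Block {
  std::vector<const Block *> Preds;
  bool EndsWithExecBranch = false; // ends in a branch on exec being zero/non-zero
  std::vector<Instr> Instrs;
};

// Returns false when a split in B should be disallowed under the proposal.
bool isValidForSplit(const Block &B) {
  // Step 1: can any predecessor pass control here through an exec branch?
  const bool EnteredViaExecBranch =
      std::any_of(B.Preds.begin(), B.Preds.end(),
                  [](const Block *P) { return P->EndsWithExecBranch; });
  if (!EnteredViaExecBranch)
    return true;

  // Steps 2 and 3: scan only the prologue for exec restoring code.
  for (const Instr &I : B.Instrs) {
    if (!I.IsPrologue)
      break;
    if (I.RestoresExec)
      return true;
  }
  return false;
}
```

A block with no exec-branching predecessor is always accepted; a block entered through such a branch is accepted only if its prologue restores exec.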

Considering your comment that exec == 0 does not matter, we'd rather search upwards until the immediate dominator block is encountered to check what we meet first - exec modify or exec restore. The problem here is that XOR can be both.

Thanks. More or less you want to disable split in an empty block preceded by c_branch_exec[n]z. I can understand why this would be a problem. In reality such block can be either empty or contain another branch, because it does not make any sense to pass a control to a block with vector instructions and have no active lanes.

But this leaves a couple of problems:

  1. It does not disallow a split in a block prologue before exec is restored. This creates exactly the same problem.
  2. It does not disallow a split even in an empty block where EXEC is not zero, but just wrong. The problem is not zero EXEC, the problem is wrong EXEC; zero is just one case of this.

BTW, even with all of this it is still OK to split an LI of SGPR. I'd say at the very least the callback needs to take the LI in question as well.

Sep 8 2020, 11:31 AM · Restricted Project

Sep 7 2020

alex-t committed rG2480a31e5d69: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block (authored by alex-t).
[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
Sep 7 2020, 9:38 AM
alex-t closed D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.
Sep 7 2020, 9:37 AM · Restricted Project
alex-t added a comment to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

Enhanced PSDB passed

Sep 7 2020, 8:57 AM · Restricted Project
alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

The idea is:

Sep 7 2020, 7:14 AM · Restricted Project

Sep 4 2020

alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

Also I still think that disabling a whole "endif" block is an overkill.

It is only disabled if S_OR_B64 exec, ... is in the middle of the block, which should never happen.

while (isBasicBlockPrologue(*J)) {
  if (IsExecRestore(&*J))
    return true;

assumes that if exec is restored in the block prologue it is valid

So practically it never happens and split is effectively only disabled in an empty block? I said it already: it does not matter that exec is zero, what matters is that it does not match. It does not matter that a block is empty as well, it is enough to split before s_or to hit the bug.

Sep 4 2020, 1:28 PM · Restricted Project
alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

Also I still think that disabling a whole "endif" block is an overkill.

Sep 4 2020, 11:30 AM · Restricted Project
alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

So, since we now have a sensible diff to discuss...
Why did I decide to disallow a split in any block that gets control with exec == 0 and has no restoring code in its prologue?
I just followed the example of the code that already does the same for the blocks with interference - a bit later below:

Sep 4 2020, 10:27 AM · Restricted Project
alex-t updated the diff for D87107: [AMDGPU] Target hook to apply target specific split constraint.

The correct diff is now uploaded

Sep 4 2020, 10:22 AM · Restricted Project
alex-t added a comment to D87107: [AMDGPU] Target hook to apply target specific split constraint.

Oops. In fact the diff above is not the one I intended to upload. The SIInstrInfo::IsValidForLISplit is complete nonsense. I probably deleted a part of the function by accident...
I'll get back and upload the working one.

Sep 4 2020, 9:15 AM · Restricted Project

Sep 3 2020

alex-t requested review of D87107: [AMDGPU] Target hook to apply target specific split constraint.
Sep 3 2020, 12:05 PM · Restricted Project

Sep 2 2020

alex-t updated the diff for D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

diff rebased to latest trunk

Sep 2 2020, 2:34 AM · Restricted Project

Sep 1 2020

alex-t updated the diff for D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

No redundant branches anymore

Sep 1 2020, 7:15 AM · Restricted Project

Aug 28 2020

alex-t added a comment to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

The only difference is that now this redundant branch is inserted by MachineBasicBlock::updateTerminator() as Matt suggested.

Aug 28 2020, 3:04 PM · Restricted Project
alex-t added inline comments to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.
Aug 28 2020, 3:02 PM · Restricted Project
alex-t updated the diff for D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

changed as requested by reviewer

Aug 28 2020, 1:56 PM · Restricted Project
alex-t added inline comments to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.
Aug 28 2020, 9:19 AM · Restricted Project
alex-t updated the diff for D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

Added MachineDominatorTree and MachineLoopInfo update after redundant block removal.

Aug 28 2020, 9:19 AM · Restricted Project

Aug 27 2020

alex-t added a comment to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

Should also add a few cases with other empty block situations, including with debug info.

Also should add an example where the original problem occurred

Aug 27 2020, 12:43 PM · Restricted Project
alex-t added inline comments to D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.
Aug 27 2020, 12:39 PM · Restricted Project
alex-t updated the diff for D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.

Changes as requested by reviewer.

Aug 27 2020, 12:37 PM · Restricted Project

Aug 26 2020

alex-t requested review of D86634: [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block.
Aug 26 2020, 8:58 AM · Restricted Project

Jun 26 2020

alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.

This small piece was missed from the change.

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5f1afdd7f10..7180e0a8d52 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -634,6 +634,9 @@ void SIInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
   }
Jun 26 2020, 8:44 AM · Restricted Project
alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.

This change broke thousands of piglit gpu profile tests with Mesa radeonsi on Navi 14.

Jun 26 2020, 1:36 AM · Restricted Project
alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.

As well as the failures, it also caused spurious debug output like:

Test case 'dEQP-VK.subgroups.arithmetic.framebuffer.subgroupmax_int_tess_eval'..
S_CMP_LG_U32 killed $sgpr2_sgpr3, 0, implicit-def $scc
S_CMP_LG_U32 killed $sgpr0_sgpr1, 0, implicit-def $scc
  Fail (Failed!)
Jun 26 2020, 1:04 AM · Restricted Project

Jun 25 2020

alex-t added a comment to D82194: [AMDGPU] Enable compare operations to be selected by divergence.

This change broke thousands of piglit gpu profile tests with Mesa radeonsi on Navi 14.

Jun 25 2020, 8:34 AM · Restricted Project

Jun 24 2020

alex-t committed rG521ac0b5cea0: [AMDGPU] Enable compare operations to be selected by divergence (authored by alex-t).
[AMDGPU] Enable compare operations to be selected by divergence
Jun 24 2020, 2:08 AM
alex-t closed D82194: [AMDGPU] Enable compare operations to be selected by divergence.
Jun 24 2020, 2:08 AM · Restricted Project
alex-t added inline comments to D82194: [AMDGPU] Enable compare operations to be selected by divergence.
Jun 24 2020, 1:02 AM · Restricted Project

Jun 23 2020

alex-t updated the diff for D82194: [AMDGPU] Enable compare operations to be selected by divergence.

udivrem.ll checks updated

Jun 23 2020, 5:49 AM · Restricted Project
alex-t added inline comments to D82248: AMDGPU: Don't ignore carry out user when expanding add_co_pseudo.
Jun 23 2020, 3:40 AM · Restricted Project
alex-t updated the diff for D82194: [AMDGPU] Enable compare operations to be selected by divergence.

Formatting fixed. test extract_vector_dynelt.ll changed.

Jun 23 2020, 2:04 AM · Restricted Project

Jun 20 2020

alex-t updated the diff for D82194: [AMDGPU] Enable compare operations to be selected by divergence.

Code changed according to the reviewer request

Jun 20 2020, 7:54 AM · Restricted Project

Jun 19 2020

alex-t created D82194: [AMDGPU] Enable compare operations to be selected by divergence.
Jun 19 2020, 8:05 AM · Restricted Project

May 28 2020

alex-t committed rGb726d071b4aa: [AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move… (authored by alex-t).
[AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move…
May 28 2020, 9:53 AM
alex-t closed D80434: [AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move immediate.
May 28 2020, 9:51 AM · Restricted Project

May 27 2020

alex-t committed rGeb1092ada32d: [AMDGPU] Fix for the lost CarryOut/CarryIn register operands in… (authored by alex-t).
[AMDGPU] Fix for the lost CarryOut/CarryIn register operands in…
May 27 2020, 1:04 PM
alex-t closed D80158: [AMDGPU] Fix for the lost CarryOut/CarryIn register operands in S_ADD/SUB_CO_PSEUDO..
May 27 2020, 1:04 PM · Restricted Project
alex-t added inline comments to D80434: [AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move immediate.
May 27 2020, 3:45 AM · Restricted Project
alex-t updated the diff for D80434: [AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move immediate.

test corrected

May 27 2020, 3:44 AM · Restricted Project
alex-t added a comment to D80158: [AMDGPU] Fix for the lost CarryOut/CarryIn register operands in S_ADD/SUB_CO_PSEUDO..

Ping again. Could you please take a look?

May 27 2020, 3:12 AM · Restricted Project

May 26 2020

alex-t reopened D70085: [AMDGPU] NFC target dependent requiresUniformRegister refactored out.
May 26 2020, 10:49 AM · Restricted Project