This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Don't combine DPP if DPP register is used more than once per instruction
ClosedPublic

Authored by vpykhtin on Jun 25 2020, 7:30 AM.

Diff Detail

Event Timeline

vpykhtin created this revision.Jun 25 2020, 7:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 25 2020, 7:30 AM
rampitec added inline comments.Jun 25 2020, 9:21 AM
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
364

I assume this exploits the fact we have no instructions which may have more than two uses and are dpp combinable at the same time? If so it deserves a comment.

vpykhtin marked an inline comment as done.Jun 25 2020, 9:44 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
364

I'm not sure what do you mean - DPP extension can only be applied to src0 operand.

addUse is called in the order of defuse chain traversal: defusechain_iterator::operator++ assumes that all per instruction uses comes continuosly - so I decided to assume the same.

rampitec added inline comments.Jun 25 2020, 10:25 AM
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
364

Hm... This looks like a hack to me.

vpykhtin marked an inline comment as done.Jun 25 2020, 10:45 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
364

It's a cornerstone for MachineRegisterInfo's use_instr_.. iterators, otherwise it wouldn't be cheap to walk.

rampitec accepted this revision.Jun 25 2020, 10:52 AM
This revision is now accepted and ready to land.Jun 25 2020, 10:52 AM
foad added inline comments.Jun 26 2020, 1:28 AM
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
539–541

It would be much simpler to add a check here that src0 and src1 are not the same register.

foad requested changes to this revision.Jun 26 2020, 1:28 AM
This revision now requires changes to proceed.Jun 26 2020, 1:28 AM
vpykhtin updated this revision to Diff 273625.Jun 26 2020, 2:07 AM

Indeed, thanks Jay, updated.

foad added inline comments.Jun 26 2020, 2:15 AM
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
521

This doesn't seem right. Use could be Src2 or some other operand, couldn't it?

vpykhtin marked an inline comment as done.Jun 26 2020, 2:24 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
521

I cound't find example for this, but ok, lets check this too.

vpykhtin updated this revision to Diff 273636.Jun 26 2020, 2:50 AM

Rebased, added check if use is Src0 or Src1

foad accepted this revision.Jun 26 2020, 3:27 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
522–525

This looks technically OK now. I still think it would be cleaner to structure the code as:

if (Use == Src0) {
  // do it
} else if (Use == Src1 && commutable && Src0 not identical to Src1) {
  // commute and do it
} else {
  // fail
}

But I'll leave that up to your judgement.

523–525

Could use MachineOperand::isIdenticalTo ?

This revision is now accepted and ready to land.Jun 26 2020, 3:27 AM
vpykhtin marked an inline comment as done.Jun 26 2020, 3:29 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
522–525

I think currently we have better debug messages.

vpykhtin updated this revision to Diff 275060.Jul 2 2020, 4:27 AM

Update before commit: used isIdenticalTo, rebased.

This revision was automatically updated to reflect the committed changes.