Page MenuHomePhabricator

[MachineCopyPropagation] Extend pass to do COPY source forwarding

Authored by gberry on Jan 8 2018, 1:15 PM.



This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.

This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).

The AMDGPU target is modified by this change to mark more opcodes as
hasExtraSrcRegAllocReq so that their operands will be marked as not
renamable, to avoid copy forwarding violating the constraint that only
one operand may use the constant bus.

This change is a continuation of the work started in D30751.

Diff Detail


Event Timeline

gberry created this revision.Jan 8 2018, 1:15 PM
gberry added a comment.Jan 8 2018, 1:17 PM

@tstellar could you specifically take a look at the AMDGPU backend changes?

mcrosier added inline comments.Jan 8 2018, 3:32 PM
20 ↗(On Diff #128974)

veg1 -> reg1

gberry marked an inline comment as done.Jan 16 2018, 11:51 AM


Hi Geoff,

Looks mostly good for the generic parts.

Comments below.


249 ↗(On Diff #128974)

You mean the source of Copy, right?

251 ↗(On Diff #128974)

It looks like this function is meant to run only on allocated code.
If that's the case, say it in the comment.

260 ↗(On Diff #128974)

Unless you expect CopySrcReg to always be a physical register, this check needs to be adapted.

262 ↗(On Diff #128974)

Nit: to reduce indentation
if (!UseI.isCopy())

return false;
288 ↗(On Diff #128974)

Just personal taste but I feel like a for loop would be more natural here.
for (const TargetRegisterClass *SuperRC = UseDstRC, TargetRegisterClass::sc_iterator SuperRCI = UseDstRC->getSuperClasses(), SuperRC; SuperRC = *SuperRCI++)

if (SuperRC->contains(CopySrcReg))
return true;

(Technically, I admit this has one more check, but I believe it doesn't matter for performance.)

346 ↗(On Diff #128974)

I expect reserved registers shouldn't be renamable.
Am I missing something?

edit: Oh you may introduce this after forwarding a reserved register, if I am reading the code correctly. Then, unless I am missing something again, we should be fine rewriting this one as well.

360 ↗(On Diff #128974)

Add a DEBUG message (and maybe an optimization remarks) for that case to see how often it happens and thus, if it is worth fixing.

Could be a follow-up patch.

373 ↗(On Diff #128974)

Add a debug print saying we're skipping because of debug counter.

386 ↗(On Diff #128974)

Hmm, I have mixed feeling about this.
Is renamable a property of the register or of the machine operand?

If this is the machine operand, then this particular use of the register will still be renamable.

What I am saying is depending on the semantic we give to renamable, this change makes sense or not.

gberry marked 5 inline comments as done.Jan 19 2018, 2:33 PM
gberry added inline comments.
251 ↗(On Diff #128974)

This pass is meant to only run on allocated code. There is already a NoVRegs pass property check and an assert in CopyPropagateBlock that checks for virtual registers. I'll add 'physical' to the comment.

260 ↗(On Diff #128974)

As noted above, I do expect it to always be a physical register. I can add asserts here, or perhaps rename the function to reduce confusion?

346 ↗(On Diff #128974)

I believe this is a redundant check with the isRenamable that is there from the previous version. I'll confirm by doing some testing with this changed to an assert before removing it.

edit: After trying this out, it appears that there are a few cases of targets generating renamable reserved registers. One specific case is X86 and float-stack registers. I'll try adding this check to the verifier to see if we can flush out all these cases before committing this change.

386 ↗(On Diff #128974)

If renamability is only a property of the opcode, then I don't think we need it, since we could always just check the opcode property.
My thinking is that it applies more to the individual physical register, and implies the value needs to be in that particular register, whether due to opcode constraints, ABI constraints, inline asm constrains, GC-interaction constraints, etc.

qcolombet added inline comments.Jan 19 2018, 2:45 PM
251 ↗(On Diff #128974)

I thought we were running before the rewriter, but I guess we don't need that anymore since we have the renamable flag.
Thanks for pointing that out.

260 ↗(On Diff #128974)

No given the pass itself doesn't expect vreg, we're fine.

346 ↗(On Diff #128974)

Sounds good

386 ↗(On Diff #128974)

In that case given we're really renaming the operand, I would think it should stay this way, but again, we could argue in both ways.

We should just agree what we're modeling with that flag and verify that in the machine verifier.

gberry updated this revision to Diff 131323.Jan 24 2018, 11:52 AM

Updates based on Quentin's comments

I've uploaded a new version that addresses most of the issues brought up by @qcolombet.

@arsenm could you take a look at the AMDGPU part of this change?

386 ↗(On Diff #128974)

I'm not sure exactly what you're suggesting we add to the machine verifier here.

D42449 adds the machine verifier check that no renamable operands are assigned to reserved registers

The changes to AMDGPU tests and .td files look good to me.

It would be nice for the backend to have a hook that can be queried when renamable is false (but obviously this current patch doesn't have to introduce that).

This revision is now accepted and ready to land.Jan 26 2018, 12:18 PM
This revision was automatically updated to reflect the committed changes.