This is an archive of the discontinued LLVM Phabricator instance.

Teach peephole optimizer to not emit sub-register defs
ClosedPublic

Authored by saghir on May 31 2021, 6:58 AM.

Details

Summary

Peephole optimizer should not be introducing sub-reg definitions
as they are illegal in machine SSA phase. This patch modifies
the optimizer to not emit sub-register definitions.

Diff Detail

Event Timeline

saghir created this revision.May 31 2021, 6:58 AM
saghir requested review of this revision.May 31 2021, 6:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2021, 6:58 AM

Does it trigger on any of the existing tests in Codegen/PowerPC/?

saghir updated this revision to Diff 348973.Jun 1 2021, 7:33 AM

Added test case.

saghir edited the summary of this revision. (Show Details)Jun 1 2021, 7:35 AM
saghir added reviewers: Restricted Project, nemanjai.

I think that the original author (Matthias Braun) no longer contributes to LLVM as I haven't seen any activity from him so I'll add some reviewers that may be familiar with this code.

nemanjai added inline comments.Jun 1 2021, 2:29 PM
llvm/lib/CodeGen/DetectDeadLanes.cpp
343–345

Can this be conditional so we don't change the behaviour of the common case?

if (unsigned SubIdx = Def.getSubReg())
  DefinedLanes = TRI->composeSubRegIndexLaneMask(SubIdx, DefinedLanes);
arsenm requested changes to this revision.Jun 1 2021, 2:31 PM

This assert is correct. Subregister defs in SSA are illegal, and this does run on SSA.

Also could use a mir test

This revision now requires changes to proceed.Jun 1 2021, 2:31 PM
arsenm added a comment.Jun 1 2021, 2:33 PM

This assert is correct. Subregister defs in SSA are illegal, and this does run on SSA.

Also could use a mir test

It's also a bug if you aren't seeing the verifier fail on this

I'm not sure this is the right fix.

From my understanding, REG_SEQUENCE, INSERT_SUBREG, EXTRACT_SUBREG, and PHI should never target a subregister. For COPY, we can target a subregister, but only if that register is a physical register. So this patch only affects COPY instructions.

Given that, I think we should be bailing out sooner; trying to understand subregister defs of physical registers seems tricky, and not really related to the purpose of this pass.

arsenm added a comment.Jun 1 2021, 2:50 PM

I'm not sure this is the right fix.

From my understanding, REG_SEQUENCE, INSERT_SUBREG, EXTRACT_SUBREG, and PHI should never target a subregister. For COPY, we can target a subregister, but only if that register is a physical register.

Physical registers do not have subregister indexes. TRI::getSubReg resolves the physreg with index to a different physreg value. It's invalid to combine the two

I'm not sure this is the right fix.

From my understanding, REG_SEQUENCE, INSERT_SUBREG, EXTRACT_SUBREG, and PHI should never target a subregister. For COPY, we can target a subregister, but only if that register is a physical register.

Physical registers do not have subregister indexes. TRI::getSubReg resolves the physreg with index to a different physreg value. It's invalid to combine the two

Oh, I see; thanks for the clarification.

It certainly does sound odd to have a subreg on a def of a virtual register in SSA. We'll see where this is coming from and why but just a quick look at the -print-after-all output reveals that it is the peephole optimizer that produces it. We'll look into whether this is a bug in the peephole optimizer or if this is some PPC callback called from the peephole optimizer that does this.

saghir updated this revision to Diff 353438.Jun 21 2021, 11:39 AM

Address comments:

  • Subreg defs are indeed illegal; taught peephole optimizer not to emit these.
  • Added mir test case.
saghir retitled this revision from DetectDeadLanes: Remove assert for subregister defs to Teach peephole optimizer to not emit sub-register defs.Jun 21 2021, 11:41 AM
saghir edited the summary of this revision. (Show Details)
saghir updated this revision to Diff 353472.Jun 21 2021, 1:24 PM

NFC change:
Changed the name of a variable.

arsenm added inline comments.Jun 21 2021, 5:00 PM
llvm/lib/CodeGen/PeepholeOptimizer.cpp
593 ↗(On Diff #353472)

I don't see why you need to special case the use opcode

612 ↗(On Diff #353472)

The SubIdx should not be passed through here?

llvm/test/CodeGen/PowerPC/peephole-subreg.mir
5–30 ↗(On Diff #353472)

IR section is not necessary

32 ↗(On Diff #353472)

Should use a relevant name and add a descriptive comment

35–42 ↗(On Diff #353472)

Registers section not necessary

saghir updated this revision to Diff 354420.Jun 24 2021, 7:26 PM

Simplified test case and removed the code that introduces subreg def.

saghir marked 3 inline comments as done.Jun 24 2021, 7:51 PM
saghir added inline comments.
llvm/lib/CodeGen/PeepholeOptimizer.cpp
593 ↗(On Diff #353472)

Thanks, you are right, it is not really needed. I have removed the check for the opcode. I have also refined the approach to handle the case for when UseSrcSubIdx is true.

612 ↗(On Diff #353472)

I think, we need to copy 32-bit subreg value from the 64-bit register. So we need to generate something like this:
%6:gprc_and_gprc_nor0 = COPY %1.sub_32:g8rc_and_g8rc_nox0

Removing SubIdx, leads to copying the complete 64-bit register and we end up generating:
%6:gprc_and_gprc_nor0 = COPY %1:g8rc_and_g8rc_nox0

which does not look right, unless I am missing something here.

llvm/test/CodeGen/PowerPC/peephole-subreg.mir
5–30 ↗(On Diff #353472)

I have simplified the IR to generate the test case. However, removing the IR section, gives me error for use of undefined global value '@strlen'.

saghir updated this revision to Diff 354423.Jun 24 2021, 7:51 PM
saghir marked an inline comment as done.

Added test case description.

saghir added inline comments.Jun 24 2021, 8:00 PM
llvm/test/CodeGen/PowerPC/peephole-subreg-def.mir
8–26 ↗(On Diff #354423)

Since I renamed the file, I am writing my response to the earlier comment, of removing the IR section, here for better visibility:

"I have simplified the IR to generate the test case. However, removing the IR section, gives me error for use of undefined global value @strlen."

Does it trigger on any of the existing tests in Codegen/PowerPC/?

Sorry for the late reply, I did not see any of the existing test cases trigger on the subreg definition that peephole opt was introducing. I observed it when using -ppc-track-subreg-liveness option for certain test cases.

arsenm added inline comments.Jun 25 2021, 6:12 AM
llvm/test/CodeGen/PowerPC/peephole-subreg-def.mir
8–26 ↗(On Diff #354423)

You can strip the strlen reference from the MIR, either by replacing with 0 or just deleting the calli. Peephole opt doesn't care about the call here

saghir updated this revision to Diff 354526.Jun 25 2021, 9:26 AM

Removed the IR section from the test case.

saghir marked an inline comment as done.Jun 25 2021, 9:27 AM
saghir added inline comments.
llvm/test/CodeGen/PowerPC/peephole-subreg-def.mir
8–26 ↗(On Diff #354423)

Thanks for your patience. I have been able to remove the IR section.

arsenm accepted this revision.Jun 25 2021, 10:09 AM
arsenm added inline comments.
llvm/lib/CodeGen/PeepholeOptimizer.cpp
600 ↗(On Diff #354526)

I don't see why this second copy is necessary since %3 and %6 have the same class, but I guess that's not a big deal

This revision is now accepted and ready to land.Jun 25 2021, 10:09 AM
This revision was landed with ongoing or failed builds.Jun 28 2021, 7:24 AM
This revision was automatically updated to reflect the committed changes.
saghir marked an inline comment as done.