Page MenuHomePhabricator

[PowerPC] Fix bugs in sign-/zero-extension elimination
AcceptedPublic

Authored by stefanp on Nov 28 2017, 6:20 AM.

Details

Summary

This patch fixes the following two bugs in PPCInstrInfo::isSignOrZeroExtended helper, which is used from sign-/zero-extension elimination in PPCMIPeephole pass.

  • Registers defined by load with update (e.g. LBZU) were identified as already sign or zero-extended. But it is true only for the first def (loaded value) and not for the second def (i.e. updated pointer).
  • Registers defined by ORIS/XORIS were identified as already sign-extended. But, it is not true for sign extension depending on the immediate (while it is ok for zero extension).

To handle the first case, the parameter for the helpers is changed from MachineInstr to a register number to distinguish first and second defs. Also, this patch moves the initialization of PPCMIPeepholePass to allow mir test case.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
nemanjai requested changes to this revision.Nov 29 2017, 3:58 AM

To be honest, this code seems even more counter-intuitive to me now. I initially thought that this should essentially be a known-bits calculation on virtual registers and suggested that. With these changes, the code even more closely resembles that, but in a very roundabout fashion. Seems like it would be much simpler to implement and follow this if we just built and maintained a map from virtual registers to known zero/sign bits. Then all we need to check in order to decide if we want to get rid of an EXTS[BHW] is whether the correct number of leading bits are known to be sign bits. Similarly, we'd know if it is safe to remove a CLRLDI/RLWINM/... if the correct number of leading bits are known to be zero. Furthermore, with this approach we wouldn't need to consider the depth that we look through.

Of course, we can leave that as a future simplification.

The reason I've requested changes is to get rid of the debugging code and change the names of the functions.

lib/Target/PowerPC/PPCInstrInfo.cpp
2164 ↗(On Diff #124562)

Similarly to isZeroExtendingOp(), this name no longer makes sense.

2215 ↗(On Diff #124562)

This name no longer makes sense. This is really a query for whether or not the virtual register contains a value that is zero-extended.

lib/Target/PowerPC/PPCMIPeephole.cpp
196 ↗(On Diff #124562)

This whole section needs to be cleaned up. Lines too long. Code commented-out. Pre-processor conditions. Presumably leftovers from debugging during development.

Actually, on second look, this injects trap instructions for debugging and should probably just be removed.

This revision now requires changes to proceed.Nov 29 2017, 3:58 AM
inouehrs updated this revision to Diff 124721.Nov 29 2017, 4:36 AM
inouehrs edited edge metadata.
  • addressed comments from @nemanjai
  • rebased to the latest code base and fixed merge conflict
inouehrs marked 3 inline comments as done.Nov 29 2017, 4:41 AM
inouehrs added inline comments.
lib/Target/PowerPC/PPCInstrInfo.cpp
2164 ↗(On Diff #124562)

I changed the method name. Do you have idea on an alternative name?

lib/Target/PowerPC/PPCMIPeephole.cpp
196 ↗(On Diff #124562)

Thank you so much for pointing this out. I removed debug code section.

inouehrs updated this revision to Diff 125162.Dec 1 2017, 9:23 AM
inouehrs marked 2 inline comments as done.
  • fixed a bug
hfinkel added inline comments.Dec 14 2017, 3:50 PM
lib/Target/PowerPC/PPCInstrInfo.h
306 ↗(On Diff #125162)

Comment out of date?

313 ↗(On Diff #125162)

Comment out of date.

jtony added a comment.Dec 20 2017, 7:29 AM

As you mentioned some registers defined by ORIS/ANDIS/XORIS were identified as already sign-extended. But, it is not true for sign extension depending on the immediate (while it is ok for zero extension). Can you provide some details (maybe in a comment) about how they depend on the immediate? This would be helpful for future maintainers.

lib/Target/PowerPC/PPCInstrInfo.cpp
2164 ↗(On Diff #125162)

We need to update this function name.

2201 ↗(On Diff #125162)

Can you put a comment here to explain why only when the the immediate is great or equal than 33 the value is sign-extended?

inouehrs updated this revision to Diff 130561.Jan 18 2018, 11:19 PM
  • rebased on the latest code
inouehrs updated this revision to Diff 132808.Feb 5 2018, 4:21 AM

update MIR test case to comply with the recent change in the sigil for MIR physical register

nemanjai added inline comments.Feb 5 2018, 9:57 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
3048 ↗(On Diff #132808)

I think the number of opcodes here is large enough to warrant storing them in a data structure with efficient searching such as a dense set. In addition to readability at the point where you're querying this, it can be very clear from the name what the data structure contains.

For example, if you have a container called OpcodesZExt32To64, it is very clear what it is for and any future code that needs such functionality can easily access that container rather than re-writing this whole list.

Of course, the same applies to the similar list for zero extending ops.

3249 ↗(On Diff #132808)

Seems a bit misleading to have a depth parameter that is not actually respected if we're looking through a chain of copies. Although of course, deep chains of copies shouldn't happen in practice.

3254 ↗(On Diff #132808)

Why does the register input matter for and-immediate?

3261 ↗(On Diff #132808)

This recursive call as well as the next also don't check the depth, right? I realize that these immediate-form binary logical ops won't be chained, but it's still good to add a comment explaining we don't need to limit the depth on such ops.

3317 ↗(On Diff #132808)

Comment with the assert.

lib/Target/PowerPC/PPCInstrInfo.h
343 ↗(On Diff #132808)

I still think this setup with separate queries for sign/zero/any extend seems unnecessarily complex. Seems like it would be clearer if we had an enum for the extension type (sign, zero, none) and we just have a single query that will return this. Therefore no need for Boolean parameters, no need for multiple functions, etc.

inouehrs updated this revision to Diff 132996.Feb 6 2018, 7:24 AM

addressed comments

inouehrs updated this revision to Diff 132997.Feb 6 2018, 7:27 AM
inouehrs edited the summary of this revision. (Show Details)
inouehrs marked 8 inline comments as done.Feb 6 2018, 7:34 AM
inouehrs added inline comments.
lib/Target/PowerPC/PPCInstrInfo.cpp
3249 ↗(On Diff #132808)

I changed the names of variable and constant to indicate they shows the depth of binary operators.

3254 ↗(On Diff #132808)

Good catch. I fixed.

3317 ↗(On Diff #132808)

I added comment, but I feel this assert is not necessary.

lib/Target/PowerPC/PPCInstrInfo.h
343 ↗(On Diff #132808)

I agree that the single query approach is cleaner. But I am afraid that it will increase the cost of compilation slightly since it always checks for both sign- and zero-extensions. Do you think the potential increase in compilation cost does not matter too much here?

nemanjai added inline comments.Feb 7 2018, 12:39 PM
lib/Target/PowerPC/PPCInstrInfo.h
343 ↗(On Diff #132808)

It seems to me that the checks are not costly except in chained operations cases (which we traverse the chain as it is). Or am I missing something. What is the worst case complexity of the computations for each?

inouehrs updated this revision to Diff 133414.Feb 8 2018, 6:46 AM
inouehrs marked 3 inline comments as done.
  • changed the return value of isSignOrZeroExtended to std::pair<bool, bool>
inouehrs marked an inline comment as done.Feb 8 2018, 6:48 AM
inouehrs added inline comments.
lib/Target/PowerPC/PPCInstrInfo.h
343 ↗(On Diff #132808)

Thanks. I modified the helper to return std::pair<bool, bool>.
Do you prefer an enum?

inouehrs updated this revision to Diff 142429.Apr 13 2018, 9:56 AM
inouehrs marked an inline comment as done.
  • rebased to the latest tree and ran tests again
  • fix formatting
inouehrs updated this revision to Diff 164397.Sep 7 2018, 6:09 AM
inouehrs removed a reviewer: jtony.
  • rebased to the latest source (fix conflicts and update unit tests)
amyk commandeered this revision.May 13 2019, 9:41 AM
amyk added a reviewer: inouehrs.
amyk added a subscriber: amyk.

Hi @inouehrs, I will be commandeering this revision since we are revisiting this patch.

@amyk I see. Please let me know you need some help.

amyk updated this revision to Diff 201651.May 28 2019, 6:33 AM
amyk added a subscriber: nemanjai.
  • Rebased to the latest source (fix conflicts, ran functional tests)
  • Fixed llvm/test/CodeGen/PowerPC/ppc64-P9-setb.ll to remove redundant sign extend instruction
  • Added the following check in PPCMIPeephole.cpp to only perform the LWZ->LWA transformation when the displacement is a multiple of 4
// The transformation from a zero-extending load to a sign-extending
// load is only legal when the displacement is a multiple of 4.
// If the displacement is not at least 4 byte aligned, don't perform
// the transformation.
if (SrcMI->getOperand(1).isGlobal() &&
    SrcMI->getOperand(1).getGlobal()->getAlignment() < 4)
  break;
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2019, 6:33 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
nemanjai accepted this revision.Jun 3 2019, 4:10 AM

LGTM.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
5316

The bitwise or seems like an odd choice here since IsSExt/IsZExt shouldn't possibly be true here and also I think logical or seems more natural for bool variables.
The change to logical or doesn't require another revision.

5328

These can probably be logical or's as well. Same with the below ones.

5369

Please remove this todo.

This revision is now accepted and ready to land.Jun 3 2019, 4:10 AM
stefanp commandeered this revision.Dec 2 2021, 6:56 AM
stefanp edited reviewers, added: amyk; removed: stefanp.

Taking over this patch to rebase it on top of current code and to see if it can be eventually checked in.

stefanp updated this revision to Diff 391319.Dec 2 2021, 7:02 AM

Rebased the patch to top of trunk.
A number of test cases had to be re-generated. This rebase shouldn't really
change what the patch does.

stefanp updated this revision to Diff 391321.Dec 2 2021, 7:07 AM

Fixed some formatting issues.

stefanp updated this revision to Diff 393264.Dec 9 2021, 1:06 PM

Found that POPCNTW instruction does not ensure that the upper 32 bits
are cleared. As a result I have removed it from the list.

stefanp updated this revision to Diff 394359.Dec 14 2021, 12:49 PM

Fixed a typo in the handling of the PPC::AND.

stefanp updated this revision to Diff 397308.Tue, Jan 4, 8:14 AM

Added more early breaks to make sure that we do not convert from a zero extending
load to a sign extending load with an invalid displacement.

amyk added inline comments.Fri, Jan 7, 7:11 AM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
5091–5092

Question: should we update this to say if the register and not machine instruction since it is changed to Reg for the parameter?
Or should it stay as machine instruction since you're checking for the different MachineInstr opcodes in this function?
The same question I have applies to definedByZeroExtendingOp().

5130
5316

If the change to logical or still applies, we should probably update these accordingly.

stefanp updated this revision to Diff 400676.Mon, Jan 17, 4:58 PM

Updated a couple of comments and changed the OR.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
5091–5092

I think it should probably say "machine instruction that defines this register" because that is what we are really interested in. We want to make sure that this register is already sign extended by the instruction that defines it. I'm going to re-write the comment to make it clear.

5316

Sure. I can change it to logical.