Page MenuHomePhabricator

[PowerPC] Fix bugs in sign-/zero-extension elimination
Needs ReviewPublic

Authored by stefanp on Nov 28 2017, 6:20 AM.



This patch fixes the following two bugs in PPCInstrInfo::isSignOrZeroExtended helper, which is used from sign-/zero-extension elimination in PPCMIPeephole pass.

  • Registers defined by load with update (e.g. LBZU) were identified as already sign or zero-extended. But it is true only for the first def (loaded value) and not for the second def (i.e. updated pointer).
  • Registers defined by ORIS/XORIS were identified as already sign-extended. But, it is not true for sign extension depending on the immediate (while it is ok for zero extension).

To handle the first case, the parameter for the helpers is changed from MachineInstr to a register number to distinguish first and second defs. Also, this patch moves the initialization of PPCMIPeepholePass to allow mir test case.

Diff Detail

Unit TestsFailed

60,060 msx64 debian > MLIR.Examples/standalone::test.toy
Script: -- : 'RUN: at line 1'; /usr/bin/cmake /var/lib/buildkite-agent/builds/llvm-project/mlir/examples/standalone -G "Ninja" -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang -DLLVM_ENABLE_LIBCXX=OFF -DMLIR_DIR=/var/lib/buildkite-agent/builds/llvm-project/build/lib/cmake/mlir ; /usr/bin/cmake --build . --target check-standalone | tee /var/lib/buildkite-agent/builds/llvm-project/build/tools/mlir/test/Examples/standalone/Output/test.toy.tmp | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/mlir/test/Examples/standalone/test.toy
60,040 msx64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::restore_stack.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=thread -Wall -m64 -msse4.2 -gline-tables-only -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/../ -std=c++11 -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/../ -O1 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/restore_stack.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/restore_stack.cpp.tmp && not /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/restore_stack.cpp.tmp 2>&1 | FileCheck /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/restore_stack.cpp

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
nemanjai added inline comments.Feb 5 2018, 9:57 AM
3048 ↗(On Diff #132808)

I think the number of opcodes here is large enough to warrant storing them in a data structure with efficient searching such as a dense set. In addition to readability at the point where you're querying this, it can be very clear from the name what the data structure contains.

For example, if you have a container called OpcodesZExt32To64, it is very clear what it is for and any future code that needs such functionality can easily access that container rather than re-writing this whole list.

Of course, the same applies to the similar list for zero extending ops.

3249 ↗(On Diff #132808)

Seems a bit misleading to have a depth parameter that is not actually respected if we're looking through a chain of copies. Although of course, deep chains of copies shouldn't happen in practice.

3254 ↗(On Diff #132808)

Why does the register input matter for and-immediate?

3261 ↗(On Diff #132808)

This recursive call as well as the next also don't check the depth, right? I realize that these immediate-form binary logical ops won't be chained, but it's still good to add a comment explaining we don't need to limit the depth on such ops.

3317 ↗(On Diff #132808)

Comment with the assert.

343 ↗(On Diff #132808)

I still think this setup with separate queries for sign/zero/any extend seems unnecessarily complex. Seems like it would be clearer if we had an enum for the extension type (sign, zero, none) and we just have a single query that will return this. Therefore no need for Boolean parameters, no need for multiple functions, etc.

inouehrs updated this revision to Diff 132996.Feb 6 2018, 7:24 AM

addressed comments

inouehrs updated this revision to Diff 132997.Feb 6 2018, 7:27 AM
inouehrs edited the summary of this revision. (Show Details)
inouehrs marked 8 inline comments as done.Feb 6 2018, 7:34 AM
inouehrs added inline comments.
3249 ↗(On Diff #132808)

I changed the names of variable and constant to indicate they shows the depth of binary operators.

3254 ↗(On Diff #132808)

Good catch. I fixed.

3317 ↗(On Diff #132808)

I added comment, but I feel this assert is not necessary.

343 ↗(On Diff #132808)

I agree that the single query approach is cleaner. But I am afraid that it will increase the cost of compilation slightly since it always checks for both sign- and zero-extensions. Do you think the potential increase in compilation cost does not matter too much here?

nemanjai added inline comments.Feb 7 2018, 12:39 PM
343 ↗(On Diff #132808)

It seems to me that the checks are not costly except in chained operations cases (which we traverse the chain as it is). Or am I missing something. What is the worst case complexity of the computations for each?

inouehrs updated this revision to Diff 133414.Feb 8 2018, 6:46 AM
inouehrs marked 3 inline comments as done.
  • changed the return value of isSignOrZeroExtended to std::pair<bool, bool>
inouehrs marked an inline comment as done.Feb 8 2018, 6:48 AM
inouehrs added inline comments.
343 ↗(On Diff #132808)

Thanks. I modified the helper to return std::pair<bool, bool>.
Do you prefer an enum?

inouehrs updated this revision to Diff 142429.Apr 13 2018, 9:56 AM
inouehrs marked an inline comment as done.
  • rebased to the latest tree and ran tests again
  • fix formatting
inouehrs updated this revision to Diff 164397.Sep 7 2018, 6:09 AM
inouehrs removed a reviewer: jtony.
  • rebased to the latest source (fix conflicts and update unit tests)
amyk commandeered this revision.May 13 2019, 9:41 AM
amyk added a reviewer: inouehrs.
amyk added a subscriber: amyk.

Hi @inouehrs, I will be commandeering this revision since we are revisiting this patch.

@amyk I see. Please let me know you need some help.

amyk updated this revision to Diff 201651.May 28 2019, 6:33 AM
amyk added a subscriber: nemanjai.
  • Rebased to the latest source (fix conflicts, ran functional tests)
  • Fixed llvm/test/CodeGen/PowerPC/ppc64-P9-setb.ll to remove redundant sign extend instruction
  • Added the following check in PPCMIPeephole.cpp to only perform the LWZ->LWA transformation when the displacement is a multiple of 4
// The transformation from a zero-extending load to a sign-extending
// load is only legal when the displacement is a multiple of 4.
// If the displacement is not at least 4 byte aligned, don't perform
// the transformation.
if (SrcMI->getOperand(1).isGlobal() &&
    SrcMI->getOperand(1).getGlobal()->getAlignment() < 4)
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2019, 6:33 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
nemanjai accepted this revision.Jun 3 2019, 4:10 AM



The bitwise or seems like an odd choice here since IsSExt/IsZExt shouldn't possibly be true here and also I think logical or seems more natural for bool variables.
The change to logical or doesn't require another revision.


These can probably be logical or's as well. Same with the below ones.


Please remove this todo.

This revision is now accepted and ready to land.Jun 3 2019, 4:10 AM
stefanp commandeered this revision.Dec 2 2021, 6:56 AM
stefanp edited reviewers, added: amyk; removed: stefanp.

Taking over this patch to rebase it on top of current code and to see if it can be eventually checked in.

stefanp updated this revision to Diff 391319.Dec 2 2021, 7:02 AM

Rebased the patch to top of trunk.
A number of test cases had to be re-generated. This rebase shouldn't really
change what the patch does.

stefanp updated this revision to Diff 391321.Dec 2 2021, 7:07 AM

Fixed some formatting issues.

stefanp updated this revision to Diff 393264.Dec 9 2021, 1:06 PM

Found that POPCNTW instruction does not ensure that the upper 32 bits
are cleared. As a result I have removed it from the list.

stefanp updated this revision to Diff 394359.Dec 14 2021, 12:49 PM

Fixed a typo in the handling of the PPC::AND.

stefanp updated this revision to Diff 397308.Jan 4 2022, 8:14 AM

Added more early breaks to make sure that we do not convert from a zero extending
load to a sign extending load with an invalid displacement.

amyk added inline comments.Jan 7 2022, 7:11 AM

Question: should we update this to say if the register and not machine instruction since it is changed to Reg for the parameter?
Or should it stay as machine instruction since you're checking for the different MachineInstr opcodes in this function?
The same question I have applies to definedByZeroExtendingOp().


If the change to logical or still applies, we should probably update these accordingly.

stefanp updated this revision to Diff 400676.Jan 17 2022, 4:58 PM

Updated a couple of comments and changed the OR.


I think it should probably say "machine instruction that defines this register" because that is what we are really interested in. We want to make sure that this register is already sign extended by the instruction that defines it. I'm going to re-write the comment to make it clear.


Sure. I can change it to logical.

stefanp updated this revision to Diff 409971.Feb 18 2022, 10:20 AM

Rebased this patch to the top of main and retested.

stefanp planned changes to this revision.Apr 11 2022, 1:59 PM

Need to rebase this patch again. Will request another review once I have done that.

Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2022, 1:59 PM
stefanp updated this revision to Diff 422500.Apr 13 2022, 6:30 AM

Rebased patch to top of trunk.

This revision is now accepted and ready to land.Apr 13 2022, 6:30 AM
stefanp requested review of this revision.Apr 13 2022, 6:34 AM

Fairly significant changes have been made since I took over this patch and I would prefer to have another set of eyes take a look.
I will request a new review.

Sorry, I had some comments that I haven't submitted because I had not gotten through the entire patch. I am not sure if they all still apply but thought I'd submit them before restarting the review.


I think all explanations of what sign-extending means should be removed and you should just note that this is a test explicitly for sign extension from 32 to 64 bits.


Rather than having tables of these opcodes in the C++ code, we should probably just add a couple of flags in our .td files for the two types of extension. Perhaps isSExt32To64, isZExt32To64. Then we can simply query that on the MI.

Of course, the ones below this set probably have to stay because they have further constraints.


Is there no DenseSet<>::count()?


This block ends up getting very deeply nested and hard to follow. Can we flip the conditions and do early exits? Something like:

if (SrcReg != PPC::X3)
  return ...
if previous instr is not the stack adjust or the one before is not a call...
  return ...

and so on.


This code might need to move up if we are flipping the previous condition.

// Return true if the register is sign-extended from 32 to 64 bits.
// Return true if the register is zero-extended from 32 to 64 bits.
stefanp updated this revision to Diff 422958.Apr 14 2022, 1:57 PM

Added two new flags ZExt32To64 and SExt32To64.
Marked instructions that zero extend or sign extend in the TD files with those
Fixed a few comments as well.
Tried to extract an if statement to reduce the indentation a little.

stefanp updated this revision to Diff 423390.Apr 18 2022, 7:22 AM

Rebased the patch to top of trunk again.

stefanp updated this revision to Diff 430795.Thu, May 19, 1:41 PM

Rebased to top of main branch.