Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[PowerPC] Prevent the optimizer from producing wide vector types in IR.

Authored by amyk on Nov 15 2021, 7:34 AM.



This patch prevents the optimizer from producing wide vectors in the IR, specifically the
MMA types (v256i1, v512i1). The idea is that on Power, we only want to be producing
these types only if the vector_pair and vector_quad types are used specifically.

To prevent the optimizer from producing these types in the IR, vectorCostAdjustmentFactor()
is updated to return an instruction cost factor or an invalid instruction cost if the current type is
that of an MMA type. An invalid instruction cost returned by this function signifies to other
cost computing functions to return the maximum instruction cost to inform the optimizer that
producing these types within the IR is expensive, and should not be produced in the first place.

This issue was first seen in the test case included within this patch.

Diff Detail

Event Timeline

amyk created this revision.Nov 15 2021, 7:34 AM
amyk requested review of this revision.Nov 15 2021, 7:34 AM
nemanjai added inline comments.Nov 15 2021, 9:24 AM

Can you please explain why we don't want to do this check in the above condition?


If we are exiting early, might as well do it before we do the ISD/Cost computations above.

lei added a comment.Nov 16 2021, 12:17 PM

Instead of adding:

if (isMMAType(Src))
    return InstructionCost::getMax();

to all the different cost calculating functions, is it possible to just add them the underlying functions?

For example I see this check was added multiple functions that also calls vectorCostAdjustment() or getMemoryOpCost().
I haven't tried this so not sure if it will work, but was thinking it would make sense to at least add this to vectorCostAdjustment()
since it's a vector type.

amyk updated this revision to Diff 388704.Nov 20 2021, 8:34 AM

Discussed on refactoring this approach with Nemanja and Lei outside of this review.
The suggestion was to instead, transform the vectorCostAdjustment() function to return a cost adjustment
factor to be multiplied by rather than the newly computed cost.

This enables us to place code to check for the MMA types within vectorCostAdjustment() and call the function
first in all of the instruction cost functions, while allowing us to exit early if we happen to find an MMA type.
Moreover, since many functions rely on vectorCostAdjustment() for cost computation anyway, it reduces the
chances of missing the opportunity to return the maximum instruction cost when an MMA type is found.
Essentially, we want to guarantee that every instruction cost function can return a maximum instruction cost for
the MMA types and this change can assist in this goal.

nemanjai accepted this revision.Nov 24 2021, 10:10 AM

LGTM aside from a name and return type change.


This should be returning unsigned. And the name should change to indicate that this is a factor.

This revision is now accepted and ready to land.Nov 24 2021, 10:10 AM
lei accepted this revision.Nov 24 2021, 11:41 AM

LGTM other then the comment about the unecessary code block.


I don't think this is needed....

amyk updated this revision to Diff 389815.Nov 25 2021, 8:51 AM

Update the patch to utilize an invalid InstructionCost, and return InstructionCost in vectorCostAdjustment. This was discussed with Nemanja and Lei outside of the Phabricator review.

nemanjai accepted this revision.Nov 25 2021, 8:55 AM

LGTM. Thanks for the update.


This comment is now out of date. Please update it on the commit.

amyk edited the summary of this revision. (Show Details)Nov 25 2021, 9:14 AM
amyk added inline comments.

Thanks Nemanja for the review. I'll update it on the commit.

lkail added a subscriber: lkail.EditedDec 8 2021, 11:03 PM

I happened to read the internal issue this patch trying to resolve, my question may be out of this patch, is LLVM back-end supposed to generate proper code for any legal LLVM IR? As I still see llc failed to compile attached LLVM IR with -mcpu=pwr10, but succeed with -mcpu=pwr9

A fact is that new types will be added to MVT, is back-end for any target responsible to handle all newly added MVTs(setting actions to custom, expand, promote and etc.)?

fhahn added a subscriber: fhahn.Jan 14 2022, 4:05 AM
fhahn added inline comments.

Is there a reason this test checks the full pipeline compare to just the SLP vectorizer pass? The changes look completely unrelated to PGO/profiling.

amyk added inline comments.

Thanks for taking a look at this. I originally found this test case under the full PGO pipeline. Then, I realized the SLP vectorizer was also an important culprit in reproducing the issue, but it appears to be a bit more complex than that. It seems like there are other passes that come into play when trying to reproduce this issue.

I've attempted to find a minimal set of passes and I've posted a patch to hopefully be a bit more specific with the pass pipeline:
I'd appreciate your opinion on the updated pass pipeline.