This is an archive of the discontinued LLVM Phabricator instance.

refactor / improve getScalarizationOverhead()
ClosedPublic

Authored by jonpa on Jan 23 2017, 4:36 AM.

Details

Summary

getScalarizationOverhead() was duplicated and found in three different places. It was also lacking in that the number of unique operands were not checked, so it could be that extract costs for two operands using same Value could be computed. It could also happen in LoopVectorizer that for e.g. an Add instruction, only extracts for one operand are accounted for (true for all arithmetic instructions).

This patch improves on this:

  • There is a single function definition in BasicTTIImpl, which is now public so that it can be called by TargetTransformInfo, which also gets a method with same name so that LoopVectorizer can access it.
  • Removed X86 duplicated method and also useless declarations of the method in AArch64 and ARM derivations.
  • Removed the LoopVectorizer duplicated method and changed so that TTI is queried instead. I assumed that the check for void Type is only relevant for the RetTy.
  • A new method getOperandsScalarizationOverhead(), that takes a list of operands and computes the cost of extract operations needed for the unique Values among them. Default implementation of getArithmeticInstrCost() now uses this if operands are provided. If not it keeps current behavior by assuming just one operand.
  • LoopVectorizer improved by utilizing the new TTI method and thereby removing one of its wrapper static functions it used before. It should now also get the right number of extracts accounted for in getInstructionCost() for arithmetic instructions.

Even in just LoopVectorize.cpp, there are more places to go over and see what would work best. I am however happy at this point to ask for feedback and suggestions. Does this seem to be going in the right direction?

Discussion on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-January/109382.html

Simon, apart from duplicate operands, you mentioned extracted immediates. Do you have any
example or reference of how to best treat that case?

Diff Detail

Event Timeline

jonpa created this revision.Jan 23 2017, 4:36 AM
jonpa updated this revision to Diff 85356.Jan 23 2017, 4:46 AM

Patch uploaded this time with all files - sorry.

hfinkel accepted this revision.Jan 25 2017, 4:49 PM

Please remove the assert and add a comment as noted below. Otherwise, LGTM.

include/llvm/CodeGen/BasicTTIImpl.h
307

I don't see the point in asserting here. If there are no arguments, then there's no cost, so you'll just return zero. That seems useful to avoid unnecessary special cases in potential callers.

366

Please add a comment here that, when no information on arguments is provided, as a heuristic, we add the cost associated with one argument.

This revision is now accepted and ready to land.Jan 25 2017, 4:49 PM
jonpa closed this revision.Jan 25 2017, 11:21 PM

Pushed as r293155. I updated patch per review, and I also removed a line-break in a comment near end of getArithmeticInstrCost().

@simon: Extracted immediates have not yet been handled, but it should be simple to do in getOperandsScalarizationOverhead(). I don't have a test case for this, so I would appreciate your help if you think it is needed.