Page MenuHomePhabricator

[ConstantFold][NFC] Compile time optimization for large vectors
ClosedPublic

Authored by ThomasRaoux on Mar 23 2020, 6:37 PM.

Details

Summary

Optimize the common case of splat vector constant. For large vector going through all the elements is expensive. For splat/broadcast cases we can skip going through all the elements.

Diff Detail

Event Timeline

ThomasRaoux created this revision.Mar 23 2020, 6:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2020, 6:37 PM
majnemer added inline comments.Mar 23 2020, 7:31 PM
llvm/include/llvm/IR/Constants.h
770

I'd move this into the initializer.

llvm/lib/Analysis/ValueTracking.cpp
179

Is it OK to set DemandedLHS bit 0 when DemandedElts[0] is false?

llvm/lib/IR/ConstantFold.cpp
69–72

Please add a comment for this.

892–897

Please add a comment here.

1388–1389

This doesn't depend on C1.

ThomasRaoux marked 8 inline comments as done.Mar 23 2020, 11:20 PM

Thanks David.

llvm/include/llvm/IR/Constants.h
770

Done.

llvm/lib/Analysis/ValueTracking.cpp
179

It is okay as long as DemandedElts has at least one bit set. With current code this function would never be called with an empty DemandedElts but I added an early check for this case to handle this case.

llvm/lib/IR/ConstantFold.cpp
69–72

Done.

892–897

Done.

1388–1389

Right, I pulled this part out of the if.

Please upload with full context.

ThomasRaoux marked 3 inline comments as done.

Please upload with full context.

Done, sorry about that.

kariddi added inline comments.Mar 24 2020, 1:55 PM
llvm/include/llvm/IR/Constants.h
770

This introduces a linear scan in the creation of the ConstantDataVector() class, I wonder if this might be wasteful if "isSplat()" never endsup to be called on a particular vector.

I wonder if moving this scan in "isSplat()" and caching the result might be better (as isSplat() already was an expensive method, while the construction of ConstantDataVector was not)

Thanks for the patch! Some speedup in this area was dearly needed!

majnemer added inline comments.Mar 24 2020, 2:37 PM
llvm/lib/Analysis/ValueTracking.cpp
177–184

I'd float this to the top before we create more APInts.

179

Is it OK that the loop bellow only sets DemandedLHS[0] if DemandedElts[i] is true for some multiple of NumElts while this logic will set it unconditionally?

ThomasRaoux marked 2 inline comments as done.Mar 24 2020, 3:37 PM
ThomasRaoux added inline comments.
llvm/lib/Analysis/ValueTracking.cpp
177–184

I believe we still need to set DemandedLHS and DemandedRHS to APInt::getNullValue(NumElts) so I cannot really move it up?

179

DemandedLHS[0] is set if for any DemandedElts[i] Shuf->getMaskValue(i) is 0 or NumElts.
In this case we know all the getMaskValue(i) are equal to 0 (since the shuffle mask is aggregateZero) so if any DemandedElts[i] is set it's associated Shuf->getMaskValue(i) will be 0 and DemandedLHS[0] will be set. Since we handle the case where all the bits of DemandedElts are 0 above we know at this point that at least one bit is set.

ThomasRaoux marked an inline comment as done.Mar 24 2020, 3:41 PM
ThomasRaoux added inline comments.
llvm/include/llvm/IR/Constants.h
770

Here is an alternative solution were we lazily set IsSplat. I assume it is going to be rare that we create a constant and not have any pass call isSplat on it but it could make sense to try to keep creation cheap. What do you think of this solution?

kariddi added inline comments.Mar 24 2020, 4:44 PM
llvm/include/llvm/IR/Constants.h
770

Exactly what I was thinking about thanks!

I think there are potentially cases where isSplat() is not called when a Constant is created. An example is serialization and deserialization of bitcode, where the bitcode is read into an LLVM representation is constructed and then it is just printed out (or viceversa).
Or very late legalization passes before instruction selection.

ThomasRaoux marked an inline comment as done.Mar 24 2020, 5:34 PM
ThomasRaoux added inline comments.
llvm/include/llvm/IR/Constants.h
770

Sounds good, let's leave it that way then.

majnemer added inline comments.Mar 24 2020, 5:48 PM
llvm/include/llvm/IR/Constants.h
771

You could save a byte by rolling your own optional. I think this is valuable because there may be many ConstantDataVector.

Maybe something like:

bool IsSplatSet : 1;
bool IsSplat : 1;
llvm/lib/Analysis/ValueTracking.cpp
177–184

I was just thinking of floating this bit up:

if (DemandedElts.isNullValue())
  return true;
llvm/lib/IR/ConstantFold.cpp
1389

Now that these are lazy, maybe only do C1->getSplatValue() if C2Splat is true.

llvm/lib/IR/Constants.cpp
2907

Formatting.

ThomasRaoux marked 7 inline comments as done.Mar 24 2020, 9:49 PM
ThomasRaoux added inline comments.
llvm/include/llvm/IR/Constants.h
771

Done.

llvm/lib/Analysis/ValueTracking.cpp
177–184

What I meant is that DemandedLHS DemandedRHS are out parameters and do need to be set before return true. So we cannot move this above the APInt above.

llvm/lib/IR/ConstantFold.cpp
1389

Done.

llvm/lib/IR/Constants.cpp
2907

Done.

majnemer accepted this revision.Mar 26 2020, 10:24 AM

LGTM

llvm/lib/IR/Instructions.cpp
1965

Instead of resize, I'd reserve + push_back. resize fills the vector with values you will overwrite anyway.

This revision is now accepted and ready to land.Mar 26 2020, 10:24 AM
ThomasRaoux marked 3 inline comments as done.
ThomasRaoux marked an inline comment as done.

Thanks David!

llvm/lib/IR/Instructions.cpp
1965

Done.

This revision was automatically updated to reflect the committed changes.