This is an archive of the discontinued LLVM Phabricator instance.

SelectionDAG: Reuse bigger sized constants in memset expansion.
ClosedPublic

Authored by MatzeB on Oct 11 2018, 4:02 PM.

Details

Summary

When implementing memset's today we often see this pattern:
$x0 = MOV 0xXYXYXYXYXYXYXYXY
store $x0, ...
$w1 = MOV 0xXYXYXYXY
store $w1, ...

We first create a 64bit constant in a 64bit register with all bytes the
same and then create a 32bit constant with all bytes the same in a 32bit
register. In many targets we could just access the lower part of the
64bit register instead.

  • Ideally this would be handled by the ConstantHoist pass but it runs too early when memset isn't expanded yet.
  • The memset expansion code already had this optimization implemented, however it didn't work as DAGCombiner would fold "trunc(bigconstant)" back to "smallconstant".
  • This patch makes the memset expansion mark the constant as Opaque and stop DAGCombiner from constant folding in this situation. (Similar to how ConstantHoisting marks things as Opaque to avoid folding ADD/SUB/etc.)

rdar://44153998

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Oct 11 2018, 4:02 PM
MatzeB edited the summary of this revision. (Show Details)Oct 11 2018, 4:03 PM
MatzeB updated this revision to Diff 169339.Oct 11 2018, 4:20 PM
  • drop accidental reformatting of existing code from the patch.
MatzeB added inline comments.Oct 11 2018, 4:22 PM
test/CodeGen/X86/pr38771.ll
8 ↗(On Diff #169339)

This code changes because ConstantHoisting touches the constants here, and the less aggressive behavior of trunc(OpaqueConstant)" somehow makes DAGCombiner realizes this whole expression being 0 now which it didn't do before. The testcase unfortunately doesn't have enough information to figure out how an equivalent tests would look like that wouldn't constant fold to zero. So this probably has to stay like this...

jfb accepted this revision.Oct 11 2018, 4:38 PM

Nice! LGTM, assuming folks are OK with the llvm.org/pr38771 change.

test/CodeGen/X86/pr38771.ll
8 ↗(On Diff #169339)

I expect @craig.topper can chime in on this one.

This revision is now accepted and ready to land.Oct 11 2018, 4:38 PM
craig.topper added inline comments.Oct 11 2018, 10:21 PM
test/CodeGen/X86/pr38771.ll
8 ↗(On Diff #169339)

I'm having trouble reducing to a better test case.

The existing test case creates an 'and' while legalizing a srl_parts with an opaque constant for a shift quantity. This failed an assert later in the same legalization. That shift was already sketchy because it used a shift amount that was larger than the bit.

I tried to just make an and with two opaque constant operands. Constant folding in visitAND in DAG combine obeys the opaqueness, but SimplifyDemandedBits called later in visitAND does not. So DAG combine simplifies it when it probably shouldn't.

I'm not very concerned about losing this test case. It's only verifying an assert. There was no code changes associated with it.

MatzeB added inline comments.Oct 23 2018, 4:20 PM
test/CodeGen/X86/pr38771.ll
8 ↗(On Diff #169339)

Thanks for looking into this! I guess I will go ahead and drop the testcase as part of the commit then.

This revision was automatically updated to reflect the committed changes.
llvm/trunk/test/CodeGen/AArch64/arm64-memset-inline.ll