This is an archive of the discontinued LLVM Phabricator instance.

[AssumeBundles] add cannonicalisation to the assume builder
ClosedPublic

Authored by Tyker on May 8 2020, 2:07 PM.

Details

Summary

this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.

cost of enable-knowledge-retention before the patch

Metric: inst_count
Program                                                      baseline      patched       diff
 test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test              118759192395  155068323810 30.6%
 test-suite :: CTMark/Bullet/bullet.test                      132173382943  172092586752 30.2%
 test-suite :: CTMark/7zip/7zip-benchmark.test                193010622254  201710179449  4.5%
 test-suite :: CTMark/kimwitu++/kc.test                       61391711480   64129872325   4.5%
 test-suite :: CTMark/ClamAV/clamscan.test                    80896366241   79491189655  -1.7%
 test-suite :: CTMark/sqlite3/sqlite3.test                    77983944971   77321823801  -0.8%
 test-suite :: CTMark/consumer-typeset/consumer-typeset.test  56565054641   56127618358  -0.8%
 test-suite :: CTMark/mafft/pairlocalalign.test               54476149303   54268277583  -0.4%
 test-suite :: CTMark/SPASS/SPASS.test                        65588484451   65405266465  -0.3%
 Geomean difference                                                                       6.6%

Metric: compile_time
Program                                                      baseline patched diff
 test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test              38.60    49.38  27.9%
 test-suite :: CTMark/Bullet/bullet.test                      43.90    54.25  23.6%
 test-suite :: CTMark/kimwitu++/kc.test                       22.88    24.20   5.8%
 test-suite :: CTMark/7zip/7zip-benchmark.test                66.93    70.27   5.0%
 test-suite :: CTMark/sqlite3/sqlite3.test                    24.70    24.47  -0.9%
 test-suite :: CTMark/ClamAV/clamscan.test                    24.12    23.91  -0.9%
 test-suite :: CTMark/mafft/pairlocalalign.test               15.91    15.84  -0.5%
 test-suite :: CTMark/consumer-typeset/consumer-typeset.test  17.09    17.10   0.1%
 test-suite :: CTMark/SPASS/SPASS.test                        21.81    21.82   0.1%
 Geomean difference                                                            6.2%

cost after this patch

Metric: inst_count
Program                                                      baseline      patched       diff
 test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test              118763840162  138467480074 16.6%
 test-suite :: CTMark/Bullet/bullet.test                      132174006965  136291227509  3.1%
 test-suite :: CTMark/kimwitu++/kc.test                       61393799157   62997635803   2.6%
 test-suite :: CTMark/7zip/7zip-benchmark.test                192990097336  197727495942  2.5%
 test-suite :: CTMark/ClamAV/clamscan.test                    80896861556   79484165212  -1.7%
 test-suite :: CTMark/sqlite3/sqlite3.test                    77990896082   77343953874  -0.8%
 test-suite :: CTMark/consumer-typeset/consumer-typeset.test  56564081236   56118858184  -0.8%
 test-suite :: CTMark/mafft/pairlocalalign.test               54474697759   54263079601  -0.4%
 test-suite :: CTMark/SPASS/SPASS.test                        65590532152   65405772667  -0.3%
 Geomean difference                                                                       2.2%

Metric: compile_time
Program                                                      baseline patched diff
 test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test              38.91    46.75  20.1%
 test-suite :: CTMark/7zip/7zip-benchmark.test                67.93    71.22   4.8%
 test-suite :: CTMark/Bullet/bullet.test                      44.90    46.55   3.7%
 test-suite :: CTMark/kimwitu++/kc.test                       23.58    24.28   3.0%
 test-suite :: CTMark/mafft/pairlocalalign.test               16.09    16.39   1.8%
 test-suite :: CTMark/ClamAV/clamscan.test                    24.63    24.19  -1.8%
 test-suite :: CTMark/sqlite3/sqlite3.test                    25.25    25.37   0.5%
 test-suite :: CTMark/consumer-typeset/consumer-typeset.test  17.28    17.33   0.2%
 test-suite :: CTMark/SPASS/SPASS.test                        22.36    22.39   0.1%
 Geomean difference                                                            3.4%

Diff Detail

Event Timeline

Tyker created this revision.May 8 2020, 2:07 PM
Herald added a project: Restricted Project. · View Herald Transcript

Nice. I guess we should explicitly look into tramp3d afterwards, 20% probably offers nice opportunities to reduce the cost ;)

Some comments and questions below.

llvm/lib/IR/Operator.cpp
46

The maximum alignment is MaximumAlignment in Value.h, please use that over the shifting stuff.

Can you explain the logic here in a comment at the function entry, please.

llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
85

It will return \p RK not an empty one, right?

117

The above functions can be static I guess. I'm a little worried that we add yet another "stripPointer" function here. I guess you cannot reuse the existing ones but maybe we should extend those instead.

Why do you strip this again:
RK.WasOn = V->stripInBoundsOffsets();
That seems wrong. V is the base we can use, stripping dynamic inbounds offsets will invalidate the deref value, right?

134

I'm not super happy about the last part of the check but it is not incorrect. Let's keep it for now.

Btw. We can also weed out (if we don't do it already) the trivial cases:
deref and nonnull for alloca, byval arguments, or certain global values for example.

llvm/test/Analysis/BasicAA/featuretest.ll
41

Can you explain why we drop nonnull but not deref here? I mean, it is good but I don't get how. (As mentioned above, we could also drop deref).

Tyker updated this revision to Diff 263464.May 12 2020, 10:09 AM
Tyker marked 5 inline comments as done.

addressed comments

llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
85

yes the comment was outdated.

117

The above functions can be static I guess.

it is in an inline namespace so it is already implicitly static.

I'm a little worried that we add yet another "stripPointer" function here. I guess you cannot reuse the
existing ones but maybe we should extend those instead.

Yes i agree that there are already implementations of this (which i partially ripped of) but they are not general enough for my use case because i need to update the alignment value during the stripping based on the instructions being stripped.

We should probably refactor the various striping and GetPointerBaseWithConstantOffset such that we have one generic implementation and all similar utilities are specialization of that generic implementation.

Why do you strip this again:
RK.WasOn = V->stripInBoundsOffsets();
That seems wrong. V is the base we can use, stripping dynamic inbounds offsets will invalidate the deref value, right?

yeah we can't assume this because dynamic offsets maybe negative.

134

i added a few rules to remove more useless knowledge to cover byval and globals.

I'm not super happy about the last part of the check but it is not incorrect

can you elaborate on why ? the idea is that salvageKnowledge is intended to be called on an instruction about to be removed. and usually when removing an instruction we keep going recursively. this changed makes sure we don't preserve knowledge in a way that is going to prevent from removing instructions recursively.
we will need to deal with droppable uses in RecursivelyDeleteTriviallyDeadInstructions, isInstructionTriviallyDead and others. but there are also a few custom implementations of those.

llvm/test/Analysis/BasicAA/featuretest.ll
41

nonnull and align is dropped because the canonicalizedKnowledge was able to canonicalize it through the inbounds gep of dynamic index and see that the underlying pointer is an alloca so it isn't worth preserving.

however i don't think it is possible to canonicalize a dereferencable through the inbounds gep of dynamic index since the index may be negative.

but i will add a GetUnderlyingObject to the check of whether the pointer is an alloca this way dereferencable will also be removed.

jdoerfert accepted this revision.May 12 2020, 10:59 AM

I left two comments. Could you verify the Value.h change does not impact compile time. Other than than, LGTM.

llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
134

can you elaborate on why ?

Maybe I am wrong but I thought there is no enforced connection between InsertBeforeInstruction and the knowledge (or the instruction from which we retain knowledge). We use it a certain way, true, but my concerns are with potential other usages. As I said, it is not wrong and what I fear might never happen.

136

If the kind is Deref you can ask the value for the known deref bytes with Value::getPointerDereferenceableBytes. For align it is Value::getPointerAlignment. Unsure if that is needed.

This revision is now accepted and ready to land.May 12 2020, 10:59 AM
This revision was automatically updated to reflect the committed changes.