This is an archive of the discontinued LLVM Phabricator instance.

[DSE,MSSA] Treat `store 0` after calloc as noop stores.
ClosedPublic

Authored by fhahn on Jun 19 2020, 9:48 AM.

Details

Summary

This patch extends storeIsNoop to also detect stores of 0 to an calloced
object. This basically ports the logic from legacy DSE to the MemorySSA
backed version.

It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3
LTO:

Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores

Program base patch2 diff
test-suite...CFP2000/177.mesa/177.mesa.test 1.00 15.00 1400.0%
test-suite...6/482.sphinx3/482.sphinx3.test 1.00 14.00 1300.0%
test-suite...lications/ClamAV/clamscan.test 2.00 28.00 1300.0%
test-suite...CFP2006/433.milc/433.milc.test 1.00 8.00 700.0%
test-suite...pplications/oggenc/oggenc.test 2.00 9.00 350.0%
test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test NaN 137.00 nan%
test-suite...libquantum/462.libquantum.test NaN 3.00 nan%
test-suite...6/464.h264ref/464.h264ref.test NaN 7.00 nan%
test-suite...decode/alacconvert-decode.test NaN 2.00 nan%
test-suite...encode/alacconvert-encode.test NaN 2.00 nan%
test-suite...ications/JM/ldecod/ldecod.test NaN 9.00 nan%
test-suite...ications/JM/lencod/lencod.test NaN 39.00 nan%
test-suite.../Applications/lemon/lemon.test NaN 2.00 nan%
test-suite...pplications/treecc/treecc.test NaN 4.00 nan%
test-suite...hmarks/McCat/08-main/main.test NaN 4.00 nan%
test-suite...nsumer-lame/consumer-lame.test NaN 3.00 nan%
test-suite.../Prolangs-C/bison/mybison.test NaN 1.00 nan%
test-suite...arks/mafft/pairlocalalign.test NaN 30.00 nan%

Diff Detail

Event Timeline

fhahn created this revision.Jun 19 2020, 9:48 AM
asbirlea added inline comments.Jun 19 2020, 1:20 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1949

One small note here: this query will not cache any result, because it's checking access to the DefLoc. It seems like the scenario of having: store value 0 + calloc as underlying object may be rare enough that it will not kick in often without benefit, but it's also possible to see a compile-time regression if the following pattern is found often.

calloc 
store val, loc
store 0, loc

In contrast, doing getClobbering(Def) will cache and the info could be used by this or subsequent passes. Would it make sense to make this check here instead?

llvm/test/Transforms/DeadStoreElimination/MSSA/calloc-store.ll
113

Negative test: add before ret and check it does not get eliminated:

%p.3.2 = getelementptr i8, i8* %p, i32 3
 store i8 0, i8* %p.3.2, align 1
fhahn updated this revision to Diff 272305.Jun 21 2020, 5:03 AM
fhahn marked 2 inline comments as done.

Use getClobberingMemoryAccess(Def) and add extra test as suggested, thanks!

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1949

Thanks for pointing that out, I updated the code to use getClobberingMemoryAcess(Def).

getClobberingMemoryAccess(Access, Loc) is also used in getDomMemoryDef and profiling indicates that this is where 50%+ of the time in DSE with MemorySSA is spent. getClobberingMemoryAcess(Def)` could be used for the check in the first iteration, but for subsequent ones we need to pass in the original location I think. Any thoughts/suggestions on how to improve things compile-time wise there would be greatly appreciated. But that's for separate changes :)

llvm/test/Transforms/DeadStoreElimination/MSSA/calloc-store.ll
113

Done, I've added it as a separate test. Currently only the last store is left over, but as it is a store of 0, it could also be eliminated. We currently don't do that, because of the order in which we perform the eliminations.

asbirlea added inline comments.Jun 22 2020, 8:12 AM
llvm/test/Transforms/DeadStoreElimination/MSSA/calloc-store.ll
113

I was suggesting a test where the store would intentionally not be eliminated (negative test), and this seems correct below. There is a store of 5, then the store of 0 should not be eliminated because it would overwrite the 5.
Add a check that the store of value 5 is also there?

If you mean the store is the last before ret, and it should be eliminated because of *that*, then add another method call afterwards to defeat it. This way, we'll have a unit test to check that DSE works correctly on the particular feature added in this patch, the pattern:

calloc
store non-zero value
store zero is not eliminated (i.e. clobbering correctly find the non-zero store, not the calloc)

Does this make sense?

fhahn updated this revision to Diff 272667.Jun 23 2020, 3:46 AM
fhahn marked an inline comment as done.

Add additional test with read between non-zero and zero store.

llvm/test/Transforms/DeadStoreElimination/MSSA/calloc-store.ll
113

I was suggesting a test where the store would intentionally not be eliminated (negative test), and this seems correct below. There is a store of 5, then the store of 0 should not be eliminated because it would overwrite the 5.
Add a check that the store of value 5 is also there?

I might be missing something, but store i8 5, i8* %p.3 in @test11 is eliminated because it gets overwritten by the last store store i8 0, i8* %p.3.2. The check lines ensure that it is gone.

If you mean the store is the last before ret, and it should be eliminated because of *that*, then add another method call afterwards to defeat it.

We are already returning the pointer returned from calloc, which should be enough to ensure that non-zero stores cannot be removed because they are not read at the end of the function I think.

This way, we'll have a unit test to check that DSE works correctly on the particular feature added in this patch, the pattern:

calloc
store non-zero value
store zero is not eliminated (i.e. clobbering correctly find the non-zero store, not the calloc)
Does this make sense?

IIUC in the scenario above (which is should be reflected in @test11) we could eliminate both stores. The non-zero value is overwritten by 0 before it is read or the function returns, so calloc would already ensure the location is zero.

I'll add a test case with a read between the non-zero and zero store. This way, neither can be eliminated.

Does that make sense?

fhahn updated this revision to Diff 272673.Jun 23 2020, 4:20 AM

Move a calloc test from simple-todo.ll to simple.ll

asbirlea accepted this revision.Jun 23 2020, 10:30 AM

lgtm

llvm/test/Transforms/DeadStoreElimination/MSSA/calloc-store.ll
113

Yes, that's perfect. You're right, I missed the need for a use inbetween, otherwise the 5 stored also gets eliminated. Thank you for adding this.

This revision is now accepted and ready to land.Jun 23 2020, 10:30 AM
This revision was automatically updated to reflect the committed changes.