Page MenuHomePhabricator

[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to.
ClosedPublic

Authored by spatel on Mar 7 2017, 10:07 AM.

Details

Summary

This fixes PR31777.

If both stores' values are ConstantInt, we merge the two stores
(shifting the smaller store appropriately) and replace the earlier (and
larger) store with an updated constant.

In the future we should also support vectors of integers. And maybe
float/double if we can.

Diff Detail

Repository
rL LLVM

Event Timeline

filcab created this revision.Mar 7 2017, 10:07 AM
filcab updated this revision to Diff 90879.Mar 7 2017, 10:21 AM

Added a test where we need to merge the same byte several times.

junbuml added inline comments.Mar 7 2017, 10:26 AM
test/Transforms/DeadStoreElimination/merge-stores.ll
6 ↗(On Diff #90879)

I think the current code should handle this.

filcab added inline comments.Mar 7 2017, 10:40 AM
test/Transforms/DeadStoreElimination/merge-stores.ll
6 ↗(On Diff #90879)

This specific test is about making sure we still do the current optimization (remove a store that's later shadowed by other stores). The code I added doesn't fire on this one (and it shouldn't). I can probably move it to the combine-partial-overwrites.ll test to make it consistent, though.

Some of the tests there got changed due to this patch.

spatel added a subscriber: spatel.Mar 7 2017, 1:54 PM
filcab updated this revision to Diff 91053.Mar 8 2017, 10:55 AM

Removed unneeded tests and disabled the new optimization in the old tests.

RKSimon added inline comments.Mar 28 2017, 10:45 AM
lib/Transforms/Scalar/DeadStoreElimination.cpp
301 ↗(On Diff #91053)

As a NFC pre-commit please can you update these to match the style guide (OW_Begin ..... OW_Unknown)

test/Transforms/DeadStoreElimination/merge-stores.ll
6 ↗(On Diff #90879)

Commit test with current codegen? Is it feasible to use update_test_checks.py here?

filcab updated this revision to Diff 93500.Mar 30 2017, 10:33 AM

Added a big endian version of the test
Did the OW_... change in trunk, rebases
Used update_test_checks.py for the tests. A new run won't change the output.

Filipe, did you take a look at the work @bryant is doing? (cc: @dberlin) He's implementing a full-fledged PDSE which I expect should cover also this case (partial stores).
Now, well, I'm not sure how far away is that from being ready (and probably this is something we want anyway in the meanwhile), but I think it's something you may want to look at independently.

filcab added a comment.Apr 3 2017, 7:33 AM

Filipe, did you take a look at the work @bryant is doing? (cc: @dberlin) He's implementing a full-fledged PDSE which I expect should cover also this case (partial stores).
Now, well, I'm not sure how far away is that from being ready (and probably this is something we want anyway in the meanwhile), but I think it's something you may want to look at independently.

Thanks! I've looked at D29866 and it seems like it *might* deal with partial store merging/elimination in the future. But it's in a TODO, for now. So I think we could go ahead with this and, when PDSE grows that functionality, then remove it from here (or even just replace DSE with PDSE and we're done :-) ).
Thank you,
Filipe

Filipe, did you take a look at the work @bryant is doing? (cc: @dberlin) He's implementing a full-fledged PDSE which I expect should cover also this case (partial stores).
Now, well, I'm not sure how far away is that from being ready (and probably this is something we want anyway in the meanwhile), but I think it's something you may want to look at independently.

Thanks! I've looked at D29866 and it seems like it *might* deal with partial store merging/elimination in the future. But it's in a TODO, for now. So I think we could go ahead with this and, when PDSE grows that functionality, then remove it from here (or even just replace DSE with PDSE and we're done :-) ).
Thank you,

Filipe

Just FYI, i expect we're going to add it prior to first commit.
It's not that hard.

(i don't have an opinion on this patch)

sanjoy added a subscriber: sanjoy.May 14 2017, 2:38 PM
sanjoy added inline comments.
lib/Transforms/Scalar/DeadStoreElimination.cpp
1192 ↗(On Diff #93500)

You also need to copy over the atomicity of the previous store (which can only be unordered).

sanjoy requested changes to this revision.May 14 2017, 2:38 PM
This revision now requires changes to proceed.May 14 2017, 2:38 PM
filcab updated this revision to Diff 99024.May 15 2017, 10:49 AM
filcab edited edge metadata.
filcab marked an inline comment as done.

Address Sanjoy's comment and add Synchscope and atomic settings from the original store.

RKSimon edited edge metadata.Jun 1 2017, 2:36 AM

What is happening with this patch? Is PDSE likely to land anytime soon (e.g. for the 5.000 release?) and support this case?

filcab added a comment.Jun 1 2017, 4:09 AM

@sanjoy: Are you ok with the changes I made?

Thank you,
Filipe

sanjoy requested changes to this revision.Jun 4 2017, 12:46 PM

Some comments inline.

lib/Transforms/Scalar/DeadStoreElimination.cpp
307 ↗(On Diff #99024)

Update doc.

446 ↗(On Diff #99024)

Can LaterOff - EarlierOff sign overflow (in artificial but legal cases)?

452 ↗(On Diff #99024)

Shouldn't this be FullEarlierWithPartialLater?

In a second change, I'd suggest changing the naming scheme to:

OW_Begin - OW_Prefix
OW_Complete - OW_Complete
OW_End - OW_Suffix
OW_PartialEarlierWithFullLater - OW_Partial
OW_Unknown - OW_Unknown

1178 ↗(On Diff #99024)

I'd call this LShiftAmount.

1196 ↗(On Diff #99024)

I'm not sure if copyMetadata is correct here. Someone could be using metadata to track "the value I'm storing here is positive" which may no longer be true after the rewrite. I think you should copy whitelisted metadata types (like !tbaa).

1200 ↗(On Diff #99024)

Why is this needed? store instructions should not have users (and a WeakTrackingVH on a store is most likely doing the wrong thing).

This revision now requires changes to proceed.Jun 4 2017, 12:46 PM
filcab updated this revision to Diff 104882.Jun 30 2017, 9:49 AM
filcab edited edge metadata.
filcab marked 4 inline comments as done.

Address Sanjoy's review comments

Hi @sanjoy. Can you take a look at the revised changes?

Thank you,
Filipe

lib/Transforms/Scalar/DeadStoreElimination.cpp
446 ↗(On Diff #99024)

Probably yes. Redoing the math.

1 - Make sure LaterOff is between EarlierOff and EarlierOff + Earlier.Size. We know that addition won't overflow (if it does, the bug is in GetPointerBaseWithConstantOffset), so we'll know we can subtract EarlierOff from LaterOff later without problems, since they're ordered that way and the difference will be smaller than Earlier.Size.

452 ↗(On Diff #99024)

Partial earlier store gets overwritten by a latter store (and the latter store doesn't write outside the "full" earlier store).

It's a weird way to phrase it, but I don't think changing it to your suggestion makes much sense (but I think I know what you mean).
I do think your renaming proposal for the whole enum makes sense, so I'll do that one.

1200 ↗(On Diff #99024)

Removed.

filcab updated this revision to Diff 108928.Jul 31 2017, 7:46 AM

Update for newer SynchScope API.
@sanjoy: Any thoughts?

Thank you,
Filipe

sanjoy accepted this revision.Jul 31 2017, 10:00 AM

LGTM

lib/Transforms/Scalar/DeadStoreElimination.cpp
446 ↗(On Diff #104882)

latter is correct, but I'd s/latter/later/ since that's what the source says.

1213 ↗(On Diff #108928)

s|////|//|

This revision is now accepted and ready to land.Jul 31 2017, 10:00 AM
This revision was automatically updated to reflect the committed changes.
filcab marked an inline comment as done.
rnk added a subscriber: rnk.Aug 4 2017, 1:15 PM

We're seeing null derefs on some code in Chromium after this patch:

ninja -v -d keeprsp -C out\clang_tot obj/third_party/angle/libANGLE/Caps.obj
ninja.exe -v -d keeprsp -C out\clang_tot obj/third_party/angle/libANGLE/Caps.obj -j 46 -l 48
ninja: Entering directory `out\clang_tot'
[1 processes, 1/1 @ 0.2/s : 4.732s ] ../../third_party/llvm-build/Release+Asserts/bin/clang-cl.exe /nologo /showIncludes  @obj/third_party/angle/libANGLE/Caps.obj.rsp /c ../../third_party/angle/src/libANGLE/Caps.cpp /Foobj/third_party/angle/libANGLE/Caps.obj /Fd"obj/third_party/angle/libANGLE_cc.pdb"
FAILED: obj/third_party/angle/libANGLE/Caps.obj
../../third_party/llvm-build/Release+Asserts/bin/clang-cl.exe /nologo /showIncludes  @obj/third_party/angle/libANGLE/Caps.obj.rsp /c ../../third_party/angle/src/libANGLE/Caps.cpp /Foobj/third_party/angle/libANGLE/Caps.obj /Fd"obj/third_party/angle/libANGLE_cc.pdb"
Assertion failed: Val && "isa<> used on a null pointer", file C:\src\chromium3\src\third_party\llvm\include\llvm/Support/Casting.h, line 106
Wrote crash dump file "c:\src\Temp\clang-cl.exe-d4113f.dmp"
#0 0x00007ff6f6338775 HandleAbort c:\src\chromium3\src\third_party\llvm\lib\support\windows\signals.inc:411:0
#1 0x00007ff6f7b99769 raise minkernel\crts\ucrt\src\appcrt\misc\signal.cpp:516:0
#2 0x00007ff6f7b86df8 abort minkernel\crts\ucrt\src\appcrt\startup\abort.cpp:71:0
#3 0x00007ff6f7b86532 common_assert_to_stderr<wchar_t> minkernel\crts\ucrt\src\appcrt\startup\assert.cpp:149:0
#4 0x00007ff6f7b865ce _wassert minkernel\crts\ucrt\src\appcrt\startup\assert.cpp:404:0
#5 0x00007ff6f5ed6de3 llvm::GEPOperator::hasAllZeroIndices(void)const  c:\src\chromium3\src\third_party\llvm\include\llvm\ir\operator.h:456:0
#6 0x00007ff6f5ed32fe `anonymous namespace'::stripPointerCastsAndOffsets<1> c:\src\chromium3\src\third_party\llvm\lib\ir\value.cpp:483:0
#7 0x00007ff6f61bd100 isOverwrite c:\src\chromium3\src\third_party\llvm\lib\transforms\scalar\deadstoreelimination.cpp:326:0
#8 0x00007ff6f61b9bd6 eliminateDeadStores c:\src\chromium3\src\third_party\llvm\lib\transforms\scalar\deadstoreelimination.cpp:1131:0
#9 0x00007ff6f61ba7d7 eliminateDeadStores c:\src\chromium3\src\third_party\llvm\lib\transforms\scalar\deadstoreelimination.cpp:1265:0
#10 0x00007ff6f61bf55c `anonymous namespace'::DSELegacyPass::runOnFunction c:\src\chromium3\src\third_party\llvm\lib\transforms\scalar\deadstoreelimination.cpp:1308:0
#11 0x00007ff6f5ee35fd llvm::FPPassManager::runOnFunction(class llvm::Function &) c:\src\chromium3\src\third_party\llvm\lib\ir\legacypassmanager.cpp:1514:0
#12 0x00007ff6f5bdae07 `anonymous namespace'::CGPassManager::RunPassOnSCC c:\src\chromium3\src\third_party\llvm\lib\analysis\callgraphsccpass.cpp:149:0
#13 0x00007ff6f5bdab93 `anonymous namespace'::CGPassManager::RunAllPassesOnSCC c:\src\chromium3\src\third_party\llvm\lib\analysis\callgraphsccpass.cpp:418:0
#14 0x00007ff6f5bdba59 `anonymous namespace'::CGPassManager::runOnModule c:\src\chromium3\src\third_party\llvm\lib\analysis\callgraphsccpass.cpp:474:0
#15 0x00007ff6f5ee3a3e `anonymous namespace'::MPPassManager::runOnModule c:\src\chromium3\src\third_party\llvm\lib\ir\legacypassmanager.cpp:1591:0
#16 0x00007ff6f5ee2ef1 llvm::legacy::PassManagerImpl::run(class llvm::Module &) c:\src\chromium3\src\third_party\llvm\lib\ir\legacypassmanager.cpp:1695:0
#17 0x00007ff6f653810c `anonymous namespace'::EmitAssemblyHelper::EmitAssembly c:\src\chromium3\src\third_party\llvm\tools\clang\lib\codegen\backendutil.cpp:788:0
#18 0x00007ff6f653a140 clang::EmitBackendOutput(class clang::DiagnosticsEngine &,class clang::HeaderSearchOptions const &,class clang::CodeGenOptions const &,class clang::TargetOptions const &,class clang::LangOptions const &,class llvm::DataLayout const &,class llvm::Module *,enum clang::BackendAction,class std::unique_ptr<class llvm::raw_pwrite_stream,struct std::default_delete<class llvm::raw_pwrite_stream> >) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\codegen\backendutil.cpp:1148:0
#19 0x00007ff6f7cc3a31 clang::BackendConsumer::HandleTranslationUnit(class clang::ASTContext &) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\codegen\codegenaction.cpp:265:0
#20 0x00007ff6f68c6cbc clang::MultiplexConsumer::HandleTranslationUnit(class clang::ASTContext &) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\frontend\multiplexconsumer.cpp:305:0
#21 0x00007ff6f6f224d6 clang::ParseAST(class clang::Sema &,bool,bool) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\parse\parseast.cpp:162:0
#22 0x00007ff6f688654d clang::ASTFrontendAction::ExecuteAction(void) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\frontend\frontendaction.cpp:1005:0
#23 0x00007ff6f7cc30fb clang::CodeGenAction::ExecuteAction(void) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\codegen\codegenaction.cpp:992:0
#24 0x00007ff6f68863db clang::FrontendAction::Execute(void) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\frontend\frontendaction.cpp:906:0
#25 0x00007ff6f6873b65 clang::CompilerInstance::ExecuteAction(class clang::FrontendAction &) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\frontend\compilerinstance.cpp:981:0
#26 0x00007ff6f68f7bfd clang::ExecuteCompilerInvocation(class clang::CompilerInstance *) c:\src\chromium3\src\third_party\llvm\tools\clang\lib\frontendtool\executecompilerinvocation.cpp:252:0
#27 0x00007ff6f4fc5ff4 cc1_main(class llvm::ArrayRef<char const *>,char const *,void *) c:\src\chromium3\src\third_party\llvm\tools\clang\tools\driver\cc1_main.cpp:221:0
#28 0x00007ff6f4fc1d40 ExecuteCC1Tool c:\src\chromium3\src\third_party\llvm\tools\clang\tools\driver\driver.cpp:313:0
#29 0x00007ff6f4fc3b3a main c:\src\chromium3\src\third_party\llvm\tools\clang\tools\driver\driver.cpp:387:0
#30 0x00007ff6f7b6349d __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283:0
#31 0x00007ff855298364 (C:\WINDOWS\System32\KERNEL32.DLL+0x8364)
#32 0x00007ff855e870d1 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x670d1)
clang-cl.exe: error: clang frontend command failed due to signal (use -v to see invocation

I'll try to get a repro, but heads up.

RKSimon reopened this revision.Aug 5 2017, 8:00 AM

Reopening as it was reverted at rL310123 due to PR34074

This revision is now accepted and ready to land.Aug 5 2017, 8:00 AM
RKSimon requested changes to this revision.Aug 5 2017, 8:00 AM

PR34074

This revision now requires changes to proceed.Aug 5 2017, 8:00 AM
spatel added inline comments.Sep 25 2017, 8:00 AM
llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp
1219–1221

I think @filcab is unable to get back to this patch currently and D29866 is not progressing.

I don't have a full understanding of DSE, but I'll propose a one-word fix to this patch (and an additional test case). It should avoid the bug that caused the patch to be reverted ( https://bugs.llvm.org/show_bug.cgi?id=34074 ).

This 'continue' should be a 'break'. We killed 'Inst' above, so 'InstDep' is no longer valid. Therefore, we can't carry on in the inner loop; we need to break back out to the outer loop and really 'start over' as the comment already says. :)

spatel commandeered this revision.Sep 25 2017, 9:11 AM
spatel added a reviewer: filcab.

Commandeering so I can post the updated patch here.

spatel updated this revision to Diff 116562.Sep 25 2017, 9:13 AM
spatel edited edge metadata.

Patch updated:

  1. Break from the inner loop after we successfully merge (kill) the earlier and later stores.
  2. Add test reduced from PR34074 that was crashing.
filcab edited edge metadata.Sep 25 2017, 9:55 AM

Patch updated:

  1. Break from the inner loop after we successfully merge (kill) the earlier and later stores.
  2. Add test reduced from PR34074 that was crashing.

Thanks for comandeering and fixing it, Sanjay!
Your fix looks good on my end.

davide removed a subscriber: davide.Sep 25 2017, 10:40 AM
RKSimon accepted this revision.Sep 26 2017, 7:16 AM

LGTM - cheers

This revision is now accepted and ready to land.Sep 26 2017, 7:16 AM
This revision was automatically updated to reflect the committed changes.