This is an archive of the discontinued LLVM Phabricator instance.

Eliminates dead store of an exisiting value
AcceptedPublic

Authored by dmccrevan on Oct 28 2020, 11:58 AM.

Details

Summary

Related to 16520

This change eliminates writes to variables where the value that is being written is already stored in the variable. This achieves the goal by looping through all memory defintions in the current state & each defintions uses. When there is a use where the write instruction is identical to the original instruction that created the memory definition, it will remove this redundant write.

For example,

void f() {
  x = 1;
  if foo() {
     x = 1;
     g();
  } else {
    h();
  }
}
void g();
void h();

The second x=1 will be eliminated since it is rewriting 1 to x. This pass will produce this:

void f() {
  x = 1;
  if foo() {
     g();
  } else {
    h();
  }
}
void g();
void h();

Diff Detail

Event Timeline

dmccrevan created this revision.Oct 28 2020, 11:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2020, 11:58 AM
dmccrevan requested review of this revision.Oct 28 2020, 11:58 AM
dmccrevan edited the summary of this revision. (Show Details)Oct 28 2020, 12:02 PM

Added test case for eliminateDeadStoresOfExisitingValues

When uploading a new diff, make sure that the patch you're uploading is against the git master, not the previously uploaded diff/etc

dmccrevan abandoned this revision.Oct 28 2020, 12:43 PM

When uploading a new diff, make sure that the patch you're uploading is against the git master, not the previously uploaded diff/etc

Sorry, yeah I just realized I did that when uploading the change with the test...is there a way to revert this? Or should I abandon the revision and open a new one?

dmccrevan reclaimed this revision.Oct 28 2020, 12:44 PM
dmccrevan updated this revision to Diff 301403.Oct 28 2020, 1:01 PM

Disregard the previous comment, fixed the revision.

fhahn added a comment.Oct 28 2020, 1:28 PM

Thanks for sharing the patch! Some initial comments inline. It would also be good to explain the approach in a comment and also in the patch description.

It would be interesting to know how often this triggers. For testing, you could try using a build of clang with the patch with https://www.llvm.org/docs/TestSuiteGuide.html

There's a TEST_SUITE_COLLECT_STATS which tells clang to collect statistics in the result.json file. The interesting one would probably be dse.NumFastStores

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2385 ↗(On Diff #301403)

Initially it would probably be best to start with just looking at the direct uses of the def and then go from there. One thing to watch out for is compile-time.

llvm/test/Transforms/DeadStoreElimination/MSSA/pr16520.ll
2

this test needs RUN & CHECK lines. You should be able to get some inspiration from other files in the directory.

dmccrevan added inline comments.Nov 2 2020, 9:50 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2385 ↗(On Diff #301403)

I've been trying to figure out how to get the uses of each def, but for some reason, I keep running into an error being thrown.

This is a snippet I am using to get the uses, but for some reason, getUser() is throwing an error.

MemoryAccess *DefAccess = dyn_cast<MemoryAccess>(Def);
if (isa<MemoryPhi>(DefAccess))
    continue;
for(Use &U : DefAccess->uses()) {
    MemoryAccess * UseAccess = cast<MemoryAccess>(U.getUser());

Any ideas how to get all of the uses from a Def?

fhahn added inline comments.Nov 2 2020, 9:54 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2385 ↗(On Diff #301403)

Any ideas how to get all of the uses from a Def?

That should be the way to do it.

What's the error?

One thing that looks suspicious is that you are using *DefAccess without checking if it is not null? (dyn_cast will return nullptr if the value does not have the requested type).

dmccrevan added inline comments.Nov 2 2020, 9:56 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2385 ↗(On Diff #301403)

Ah that must be why! I will try this, sorry about the late reply previously, I realized I never submitted my comment and it sat as a draft.

dmccrevan updated this revision to Diff 302880.Nov 4 2020, 10:01 AM
dmccrevan edited the summary of this revision. (Show Details)
dmccrevan edited the summary of this revision. (Show Details)

So I fixed up the implementation by removing the worklist & just looped through the uses. For some reason, I had to add a lot of NULL checks to ensure that each variable I used wasn't NULL, but I feel like there may be another way to do this using separate functions I may not be aware of.

I also changed up the test case file, but I am unsure if this works yet. I ran into a build issue when following the instructions for the test-suite. I am rebuilding my LLVM build again to see if there was some issue with my clang build, and I will see if this works. If not, I will post the error message if I cannot debug it myself. Sadly, building LLVM on my laptop takes a while since I do not have powerful hardware.

dmccrevan marked 2 inline comments as done.Nov 4 2020, 10:07 AM
fhahn added inline comments.Nov 4 2020, 2:11 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2395 ↗(On Diff #302880)

It is common to avoid compares with NULL/nullptr and instead use if (!DefAccess). Also, the condition can be combined with the one below?

2407 ↗(On Diff #302880)

Style guide suggest to not use {} for single line bodies.

2410 ↗(On Diff #302880)

You need to check if UseDefAccess is actually a use or def. It could be a MemoryPhi.

You could use something like MemoryAccess *UseDefAccess = dyn_cast_or_null<MemoryUseOrDef>(UseDef->getDefiningAccess())

Test case

define void @main(i32* %a, i32* %b, i1 %c) {
entry:
  br label %for.body

for.body:                                         ; preds = %if.end8, %entry
  br i1 %c, label %exit, label %loop.latch

loop.latch:                                          ; preds = %for.body
  store i32 0 , i32* %a
  store i32 1, i32* %b
  br label %for.body

exit:                                         ; preds = %for.body
  call void @foo()
  ret void
}

declare void @foo()
fhahn requested changes to this revision.Nov 4 2020, 4:58 PM

A few more comments inline. I think before checking the patch with Clang on C/C++ code, it would be good to make sure that all unit tests pass ( check-llvm-transforms-deadstoreelimination target)

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2393 ↗(On Diff #302880)

It is possible that memory defs in MemDefs get deleted. You need to skip defs in SkipStores.

2403 ↗(On Diff #302880)

If we change below to remove the UseInst, we need to use iterators with early increment here, to avoid iterator invalidation.

2406 ↗(On Diff #302880)

Why are we getting get defining access here? Shouldn't we just look at the uses?

2415 ↗(On Diff #302880)

This seems off, shouldn't we remove DefInst? Also, we need to make sure we only remove defs that do not also read memory I think.

This revision now requires changes to proceed.Nov 4 2020, 4:58 PM

So actually I just realized that my current implementation I have on this revision isn't correct. For some reason, I must've not noticed the test result is different.

I began to adapt my code to fix it, but I am running into an error. When I use the WorkList/Visited approach, this works and it seems to not increase the amount of time it loops through the code.

This implementation works:

SmallVector<MemoryAccess *, 4> WorkList;
SmallPtrSet<MemoryAccess *, 8> Visited;
auto PushMemUses = [&WorkList, &Visited](MemoryAccess *Acc) {
  if (!Visited.insert(Acc).second)
    return;
  for (Use &U : Acc->uses())
    WorkList.push_back(cast<MemoryAccess>(U.getUser()));
};
PushMemUses(Def);
for(auto x : WorkList) {
  if(MemoryDef *UseDef = dyn_cast_or_null<MemoryDef>(x)) {
    Instruction *UseInst = UseDef->getMemoryInst();
    if (!UseInst)
      continue;
    if (UseInst->isIdenticalTo(DefInst)) {
      deleteDeadInstruction(UseInst);
      MadeChange = true;
    }
  }
}

This implementation throws the error below

for (Use &U : DefAccess->uses()) {
  if(!U) 
    continue;
  MemoryAccess *UseAccess = dyn_cast_or_null<MemoryAccess>(U.getUser());
  if(!UseAccess)
    continue;
  if (MemoryDef *UseDef = dyn_cast_or_null<MemoryDef>(UseAccess)) {
    Instruction *UseInst = UseDef->getMemoryInst();
    if (!UseInst)
      continue;
    if (UseInst->isIdenticalTo(DefInst)) {
      deleteDeadInstruction(UseInst);
      MadeChange = true;
    }
  }
}
Stack dump:
0.	Program arguments: ./opt -dse -S -debug pr16520.ll 
1.	Running pass 'Function Pass Manager' on module 'pr16520.ll'.
2.	Running pass 'Dead Store Elimination' on function '@_Z1fbRb'
 #0 0x000000000366d017 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Support/Unix/Signals.inc:563:22
 #1 0x000000000366d0ce PrintStackTraceSignalHandler(void*) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Support/Unix/Signals.inc:630:1
 #2 0x000000000366b0ea llvm::sys::RunSignalHandlers() /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Support/Signals.cpp:71:20
 #3 0x000000000366ca6a SignalHandler(int) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Support/Unix/Signals.inc:405:1
 #4 0x00007fd893bd0a20 __restore_rt (/lib64/libpthread.so.0+0x13a20)
 #5 0x00000000008b1b4e llvm::Value::getValueID() const /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/IR/Value.h:529:12
 #6 0x000000000222aa83 llvm::MemoryAccess::classof(llvm::Value const*) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Analysis/MemorySSA.h:157:32
 #7 0x000000000223d163 llvm::isa_impl<llvm::MemoryAccess, llvm::User, void>::doit(llvm::User const&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:59:3
 #8 0x000000000223bcd2 llvm::isa_impl_cl<llvm::MemoryAccess, llvm::User const*>::doit(llvm::User const*) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:106:3
 #9 0x000000000223a649 llvm::isa_impl_wrap<llvm::MemoryAccess, llvm::User const*, llvm::User const*>::doit(llvm::User const* const&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:132:3
#10 0x0000000002237ef7 llvm::isa_impl_wrap<llvm::MemoryAccess, llvm::User* const, llvm::User const*>::doit(llvm::User* const&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:124:3
#11 0x0000000002235518 bool llvm::isa<llvm::MemoryAccess, llvm::User*>(llvm::User* const&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:144:1
#12 0x00000000032ab97d llvm::cast_retty<llvm::MemoryAccess, llvm::User*>::ret_type llvm::dyn_cast_or_null<llvm::MemoryAccess, llvm::User>(llvm::User*) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/include/llvm/Support/Casting.h:368:15
#13 0x00000000032a365a (anonymous namespace)::DSEState::eliminateDeadStoresOfExisitingValues() /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:2431:65
#14 0x00000000032a4e14 (anonymous namespace)::eliminateDeadStoresMemorySSA(llvm::Function&, llvm::AAResults&, llvm::MemorySSA&, llvm::DominatorTree&, llvm::PostDominatorTree&, llvm::TargetLibraryInfo const&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:2719:14
#15 0x00000000032a533b (anonymous namespace)::DSELegacyPass::runOnFunction(llvm::Function&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp:2795:45
#16 0x0000000002c28b1b llvm::FPPassManager::runOnFunction(llvm::Function&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1519:20
#17 0x0000000002c28d84 llvm::FPPassManager::runOnModule(llvm::Module&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1565:13
#18 0x0000000002c291ae (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1634:20
#19 0x0000000002c247ba llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:615:13
#20 0x0000000002c29995 llvm::legacy::PassManager::run(llvm::Module&) /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1762:1
#21 0x00000000008e9914 main /home/daniel/Projects/recurse/llvm-stuff/llvm-project/llvm/tools/opt/opt.cpp:973:15
#22 0x00007fd893683e0a __libc_start_main (/lib64/libc.so.6+0x27e0a)
#23 0x00000000008b144a _start /home/abuild/rpmbuild/BUILD/glibc-2.32/csu/../sysdeps/x86_64/start.S:122:0
fish: “./opt -dse -S -debug pr16520.ll” terminated by signal SIGSEGV (Address boundary error)

The line it is throwing the error on is MemoryAccess *UseAccess = dyn_cast_or_null<MemoryAccess>(U.getUser()); I tried debugging this, but got no luck, should I stick with the worklist idea? I think it doesn't break because of the visited set, but is this something that i could adapt to the other impl?

fhahn added a comment.Nov 6 2020, 12:59 PM

The line it is throwing the error on is MemoryAccess *UseAccess = dyn_cast_or_null<MemoryAccess>(U.getUser()); I tried debugging this, but got no luck, should I stick with the worklist idea? I think it doesn't break because of the visited set, but is this something that i could adapt to the other impl?

By deleting a memory access, you are modifying the use list you are iterating over, which can invalidate the iterator. Yo need to increment the iterator before removing the memory access, e.g.

for (auto UI = DefAccess->use_begin(), IE = DefAccess->use_end(); UI != IE;) {
  MemoryAccess *UseAccess = cast<MemoryAccess>((UI++)->getUser());
fhahn added a comment.Nov 24 2020, 1:07 PM

reverse ping. Any progress? :)

reverse ping. Any progress? :)

Sorry, the past couple of weeks have been quite busy for me (just started a new software engineering job ( first one! :-) ). Going to work on the testing this weekend!

So I tried to build the check-llvm-transforms-deadstoreelimination target, but it seems that 4 tests are failing:

********************
********************
Failed Tests (4):
  LLVM :: Transforms/DeadStoreElimination/MSSA/multiblock-loops.ll
  LLVM :: Transforms/DeadStoreElimination/MSSA/multiblock-memoryphis.ll
  LLVM :: Transforms/DeadStoreElimination/MSSA/multiblock-simple.ll
  LLVM :: Transforms/DeadStoreElimination/MSSA/simple.ll

I am not too sure how to interpret the test results, but I am attaching a file to this comment with the build output.

I also have my current implementation below incase you want to comment on anything you see to have issues

/// Eliminates writes to variables where the value that is being written
/// is already stored in the variable
bool eliminateDeadStoresOfExisitingValues() {
  bool MadeChange = false;
  LLVM_DEBUG(dbgs() << "Trying to eliminate MemoryDefs that write the "
                       "already exisiting value\n");
  for (int I = MemDefs.size() - 1; I >= 0; I--) {
    MemoryDef *Def = MemDefs[I];
    if (SkipStores.find(Def) != SkipStores.end() ||
        !isRemovable(Def->getMemoryInst()))
      continue;
    MemoryAccess *DefAccess = dyn_cast<MemoryAccess>(Def);
    if (!DefAccess || isa<MemoryPhi>(DefAccess))
      continue;
    Instruction *DefInst = cast<MemoryUseOrDef>(DefAccess)->getMemoryInst();
    if (!DefInst)
      continue;
    for (auto UI = DefAccess->use_begin(), IE = DefAccess->use_end();
         UI != IE;) {
      MemoryAccess *UseAccess = cast<MemoryAccess>((UI++)->getUser());
      if (MemoryDef *UseDef = dyn_cast_or_null<MemoryDef>(UseAccess)) {
        Instruction *UseInst = UseDef->getMemoryInst();
        if (!UseInst)
          continue;
        if (UseInst->isIdenticalTo(DefInst)) {
          deleteDeadInstruction(UseInst);
          MadeChange = true;
        }
      }
    }
  }
  return MadeChange;
}

xbolva00 added a subscriber: xbolva00.EditedDec 1 2020, 4:56 PM

Please use llvm/utils/update_test_checks.py to update them. Then please upload all changed files.

Please use llvm/utils/update_test_checks.py to update them. Then please upload all changed files.

I just tried to run this, but it is expecting a test pattern. Is there a specific pattern I should use? Both MSSA & check-llvm-transforms-deadstoreelimination didn't match anything

The script will update “check patterns” - then we can say if new changes are fine.

Got it, I ran it like ./utils/update_test_checks.py --opt=../build/bin/opt test/Transforms/DeadStoreElimination/MSSA/simple.ll for each failed test.

xbolva00 added a comment.EditedDec 1 2020, 5:36 PM

Right. Please update this patch now.

dmccrevan updated this revision to Diff 308826.Dec 1 2020, 5:47 PM

Nice!

llvm/test/Transforms/DeadStoreElimination/MSSA/simple.ll
507 ↗(On Diff #308826)

Would be good to update this comment somehow.

So I just confirmed that the new test case runs & passes, and every other test passed (except the 3 that "expectedly fail").

Is there anything else needed for this to be merged in?

Terminology check: the store being eliminated isn't so much dead,
it's that the stored value is identical to the currently-stored value,
i.e. the store is effectively a no-op.

Just to check, there is a test for the following:

int data;
data = 0; // keep
data = 0; // eliminate
data = 1; // keep
data = 0; // DON'T eliminate, since last stored value is 1.
data = 0; // eliminate
data = 1; // keep
data = 1; // eliminate

?

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2392–2393 ↗(On Diff #308826)
2403–2405 ↗(On Diff #308826)
fhahn added a comment.Dec 3 2020, 2:22 PM

Thanks for the update! I think it would be best to adjust the existing tests so that this optimization does not trigger, because they test different things. You can use some of those tests in a new file (something like MSSA/redundant-stores.ll), together with the one for PR16520.

Terminology check: the store being eliminated isn't so much dead,
it's that the stored value is identical to the currently-stored value,
i.e. the store is effectively a no-op.

Yep, it would be better to refer to redundant or no-op stores.

Just to check, there is a test for the following:

int data;
data = 0; // keep
data = 0; // eliminate
data = 1; // keep
data = 0; // DON'T eliminate, since last stored value is 1.
data = 0; // eliminate
data = 1; // keep
data = 1; // eliminate

?

I think in this particular case, the last store kills all other stores regardless of the value, so this transform wouldn't trigger. But we need more unit tests specifically for the transform in the patch.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2388 ↗(On Diff #308826)

Instead of talking about dead stores, I think using the term 'redundant' would be better.

2411 ↗(On Diff #308826)

Could you add a debug message and increment NumRedundantStores similar to the code around line 2682?

2410 ↗(On Diff #302880)

Could you add this case as unit tests?

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-loops.ll
186 ↗(On Diff #308826)

I think here it would be better to change the test to make sure the optimization does not trigger, to preserve the original spirit of the test.

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-memoryphis.ll
17 ↗(On Diff #308826)

I think here it would be better to change the test to make sure the optimization does not trigger, to preserve the original spirit of the test.

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-simple.ll
41 ↗(On Diff #308826)

I think here it would be better to change the test to make sure the optimization does not trigger, to preserve the original spirit of the test.

102 ↗(On Diff #308826)

I think here it would be better to change the test to make sure the optimization does not trigger, to preserve the original spirit of the test.

156 ↗(On Diff #308826)

I think here it would be better to change the test to make sure the optimization does not trigger, to preserve the original spirit of the test.

llvm/test/Transforms/DeadStoreElimination/MSSA/pr16520.ll
6

this should not be required for DSE.

8

Could you also add some more variants of the scenarios from the other tests?

fhahn added a comment.Jan 7 2021, 6:09 AM

reverse ping.

I think this is almost ready, it would be great to get the last remaining comments with respect to wording & tests wrapped up.

If you won't be able to follow up on this, it would be great if you could let us know, so potentially someone else could pick it up.

reverse ping.

I think this is almost ready, it would be great to get the last remaining comments with respect to wording & tests wrapped up.

If you won't be able to follow up on this, it would be great if you could let us know, so potentially someone else could pick it up.

Sorry, still in a busy time.

I switched the name of the function to eliminateRedundantStoresOfExisitingValues. When I reverted my changes in some of the test files, one test now fails unexpectedly. The only reason those changed was because I ran that python script to update the tests.

For the additional tests, if someone else would like to pick that up, I am more than happy to go with that. I am not as well versed with my LLVM IR as many of you are :-).

Sorry about the slowness of this. I was planning on having it wrapped up weeks ago.

fhahn updated this revision to Diff 316512.Jan 13 2021, 2:03 PM
fhahn edited the summary of this revision. (Show Details)

reverse ping.

I think this is almost ready, it would be great to get the last remaining comments with respect to wording & tests wrapped up.

If you won't be able to follow up on this, it would be great if you could let us know, so potentially someone else could pick it up.

Sorry, still in a busy time.

I switched the name of the function to eliminateRedundantStoresOfExisitingValues. When I reverted my changes in some of the test files, one test now fails unexpectedly. The only reason those changed was because I ran that python script to update the tests.

For the additional tests, if someone else would like to pick that up, I am more than happy to go with that. I am not as well versed with my LLVM IR as many of you are :-).

I precommitted additional tests in 6077d55381a6aa3e947ef7abdc36a7515c598c8a, including changes to update some of the existing tests so the new optimization does not trigger.

Sorry about the slowness of this. I was planning on having it wrapped up weeks ago.

No worries, thanks for all the work on this so far. I think this needs an additional check for self-reads; I think the memmoves in llvm/test/Transforms/DeadStoreElimination/MSSA/simple.ll should not be removed. I'll take a look at that tomorrow.

fhahn updated this revision to Diff 316601.Jan 14 2021, 2:34 AM

Updated to skip self-reads and simplified the code a bit.

fhahn accepted this revision.Jan 14 2021, 2:35 AM

I think this looks good now. I am planning to submit the patch tomorrow or so, unless there are further comments.

This revision is now accepted and ready to land.Jan 14 2021, 2:35 AM
nikic added inline comments.Jan 14 2021, 12:22 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
2371 ↗(On Diff #316601)

SkipStores.contains(Def)

2372 ↗(On Diff #316601)

This isRemovable checks is performed not on the store we'll try to remove. I think this is mostly okay because isIdenticalTo makes sure that if DefI is removable, so is UseI. The only part that doesn't directly follow is the CB->use_empty() part. Could it be that we first have a libcall with an unused return, and then a libcall with a used return, but otherwise identical?

nikic added a comment.Jan 14 2021, 1:16 PM

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=a3904cc77f181cff7355357688edfc392a236f5d&to=b5ccbbb5b1d8fc76e2291a860e4cf0a4787b77d4&stat=instructions Seems mostly fine, only outlier is kimwitu++ in NewPM-O3 configuration (k.cc has 1.25% regression).

Something I don't understand about the general approach here is why we start at the Def store and walk all uses to find a Use store to eliminate. Wouldn't it be more efficient to start at the Use store, and check the defining access? With the current approach, I think we'll only find the cases where Def is the defining access of Use, as everything else would be an indirect use. This approach should also be much easier to make more powerful (by using getClobberingMemoryAccess()) without running into compile-time issues from visiting indirect uses.

fhahn added a comment.Jan 14 2021, 2:24 PM

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=a3904cc77f181cff7355357688edfc392a236f5d&to=b5ccbbb5b1d8fc76e2291a860e4cf0a4787b77d4&stat=instructions Seems mostly fine, only outlier is kimwitu++ in NewPM-O3 configuration (k.cc has 1.25% regression).

Something I don't understand about the general approach here is why we start at the Def store and walk all uses to find a Use store to eliminate. Wouldn't it be more efficient to start at the Use store, and check the defining access? With the current approach, I think we'll only find the cases where Def is the defining access of Use, as everything else would be an indirect use. This approach should also be much easier to make more powerful (by using getClobberingMemoryAccess()) without running into compile-time issues from visiting indirect uses.

I don't remember if there was any particular reason to choose this approach to start with. I can't think of anything right now. Let me see if there are any cases that we would miss if we flip things.

fhahn added a comment.Jan 14 2021, 2:45 PM

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=a3904cc77f181cff7355357688edfc392a236f5d&to=b5ccbbb5b1d8fc76e2291a860e4cf0a4787b77d4&stat=instructions Seems mostly fine, only outlier is kimwitu++ in NewPM-O3 configuration (k.cc has 1.25% regression).

Something I don't understand about the general approach here is why we start at the Def store and walk all uses to find a Use store to eliminate. Wouldn't it be more efficient to start at the Use store, and check the defining access? With the current approach, I think we'll only find the cases where Def is the defining access of Use, as everything else would be an indirect use. This approach should also be much easier to make more powerful (by using getClobberingMemoryAccess()) without running into compile-time issues from visiting indirect uses.

I don't remember if there was any particular reason to choose this approach to start with. I can't think of anything right now. Let me see if there are any cases that we would miss if we flip things.

Ah I think I remember now. I think I had an extension in mind that also looked at the uses that are loads to remove any redundant loads read the same location. Not sure how relevant this is in practice. I'll check.

xbolva00 added a comment.EditedJan 21 2021, 2:02 AM

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=a3904cc77f181cff7355357688edfc392a236f5d&to=b5ccbbb5b1d8fc76e2291a860e4cf0a4787b77d4&stat=instructions Seems mostly fine, only outlier is kimwitu++ in NewPM-O3 configuration (k.cc has 1.25% regression).

Something I don't understand about the general approach here is why we start at the Def store and walk all uses to find a Use store to eliminate. Wouldn't it be more efficient to start at the Use store, and check the defining access? With the current approach, I think we'll only find the cases where Def is the defining access of Use, as everything else would be an indirect use. This approach should also be much easier to make more powerful (by using getClobberingMemoryAccess()) without running into compile-time issues from visiting indirect uses.

I don't remember if there was any particular reason to choose this approach to start with. I can't think of anything right now. Let me see if there are any cases that we would miss if we flip things.

Ah I think I remember now. I think I had an extension in mind that also looked at the uses that are loads to remove any redundant loads read the same location. Not sure how relevant this is in practice. I'll check.

Looking at llvm tracker + “remove noop loads” it looks fine, maybe lencod regressed in CT a bit more than expected (0,5% - ReleaseLTO)

nikic added a comment.Mar 3 2021, 6:17 AM

I still think that this is not the right way to do it. Even if we want to perform store to load forwarding as well (which is not really the job of DSE, but even so) then it would still make more sense to iterate over all memory accesses (rather than over all defs) and look at the defining access for each.

Hi, is there anybody actively working on this? It looks like "almost" done. @dmccrevan: are you going to move on with patch and address last @nikic comment? If not I'd like to give it a shot.

Precommited tests for PR50339 and PR49927, so please rebase.

Hi, is there anybody actively working on this? It looks like "almost" done. @dmccrevan: are you going to move on with patch and address last @nikic comment? If not I'd like to give it a shot.

Please feel free to take this.

Precommited tests for PR50339 and PR49927, so please rebase.

Sure.

Please feel free to take this.

Ok. Thanks.

Opened new review since I couldn't figure out how to push it here (different authors): https://reviews.llvm.org/D111727