This is an archive of the discontinued LLVM Phabricator instance.

[DSE] Support traversing MemoryPhis.
ClosedPublic

Authored by fhahn on Jan 3 2020, 6:12 AM.

Details

Summary

For MemoryPhis, we have to avoid that the MemoryPhi may be executed
before before the access we are currently looking at.

To do this we do a post-order numbering of the basic blocks in the
function and bail out once we reach a MemoryPhi with a larger (or equal)
post-order block number than the current MemoryAccess.
This changes the order in which we visit stores for elimination.

This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access.

Diff Detail

Event Timeline

fhahn created this revision.Jan 3 2020, 6:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2020, 6:12 AM

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

fhahn added a comment.Jan 17 2020, 9:32 AM

Needs updating to work bottom-up.

fhahn updated this revision to Diff 241618.Jan 30 2020, 5:44 PM

Updated to work with bottom up approach, iteratively.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

fhahn updated this revision to Diff 242174.Feb 3 2020, 1:36 PM

Rebased after updates to D72700.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

fhahn updated this revision to Diff 243260.Feb 7 2020, 11:51 AM

Rebased after latest changes to earlier patch.

fhahn updated this revision to Diff 243555.Feb 10 2020, 7:28 AM

Ping. Rebased after D72700 landed.

One general comment: this seems to do more than just lifting the MemoryPhi restriction. It's adding some cases that were were not added to the initial version, and that's great, but it would be good to add this in the patch description at least.
(e.g. continue upward search in DSE.cpp:1871, adding removal of partially overlapping stores)

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1620–1621

cast and remove null check below?

1671–1672

Can you clarify this?
KillingDef is a MemoryDef, UseAccess is a MemoryPhi inside this if. How can they be equal?

fhahn edited the summary of this revision. (Show Details)Feb 21 2020, 9:11 AM
fhahn updated this revision to Diff 245890.Feb 21 2020, 9:15 AM
fhahn marked 3 inline comments as done.

j One general comment: this seems to do more than just lifting the MemoryPhi restriction. It's adding some cases that were were not added to the initial version, and that's great, but it would be good to add this in the patch description at least.

(e.g. continue upward search in DSE.cpp:1871, adding removal of partially overlapping stores)

It looks like the overlapping store handling slipped in at some point while rebasing the series. I've removed the changes and will update the separate patch. Sorry about that.

I've extended the description to include the change to how candidates for elimination are explored now. It's quite tied to handling MemoryPhis I think, but I can split it off, if you prefer.

fhahn added inline comments.Feb 21 2020, 9:19 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1671–1672

I think this was a left-over before switching to a SetVector worklist. With the SetVector, we will never visit the node twice. I've added an assertion to be sure to check for cycles.

asbirlea added inline comments.Feb 21 2020, 10:37 AM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1671–1672

Sure, but this equality cannot happen, right? They're different types, and one if not a subtype of the other. One is a MemoryPhi and one a MemoryDef (KillingDef, the store that started the search), they cannot be equal even if you didn't check for cycles.

Sounds like you're thinking of not seeing DomAccess again as the worklist is a set, but you may see it AFAICT. If at :1617 DomAccess uses itself, it will be added to the worklist. I'm inclined to believe the change you meant is:

if (DomAccess == UseAccess)
  continue;

But just as well not having anything is fine. The PushMemUses will be a noop, as all uses have already been added.

1690–1694

Should this be Current == UseAccess ?

fhahn updated this revision to Diff 246121.Feb 23 2020, 12:50 PM

Remove unnecessary assertion, add comment to check.

fhahn marked 4 inline comments as done.Feb 23 2020, 12:56 PM
fhahn added inline comments.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1671–1672

Sure, but this equality cannot happen, right?

Ah yes! I thought it was needed for a test case I added, but after the refactoring it is not necessary any longer. I've removed it

1690–1694

I've added a comment why this check is there. We have to make sure the KillingDef does not read the memory location. The easiest way is to just consider it as regular use, but skip the overwrite check. It probably makes more sense to just check for self reads before getDomMemoryDef. I've added a TODO.

On a related note, I've put up D75025, which should help with avoiding a number of unnecessary duplicated checks for the same killing def.

fhahn updated this revision to Diff 246779.Feb 26 2020, 10:03 AM
fhahn marked 2 inline comments as done.
uenoku added a subscriber: uenoku.Feb 29 2020, 6:11 PM
fhahn updated this revision to Diff 248813.Mar 6 2020, 1:33 PM

ping.

Add a missing isa<MemoryPhi> check after getting walker result + test case.

fhahn updated this revision to Diff 249983.Mar 12 2020, 10:49 AM

Ping.

@asbirlea I think all your comments should be addressed in the latest version.

Also rebased and fixed debug printing to account for MemoryPhis.

asbirlea accepted this revision.Mar 19 2020, 1:43 PM

Thank you!

This revision is now accepted and ready to land.Mar 19 2020, 1:43 PM
This revision was automatically updated to reflect the committed changes.

@fhahn is this patch supposed to support the following case?

define i8* @foo(i1 %cond, i8* %p1, i8* %p2) {
entry:

%0 = icmp ne i1 %cond, 0
br i1 %0, label %dead_store, label %merge

dead_store:

store i8 1, i8* %p1
br label %merge

merge:

%p3 = phi i8* [ %p1, %dead_store ], [ %p2, %entry ]
store i8 2, i8* %p3
ret i8* %p3

}

If not, is there any plans or prototype which can handle it? Thanks.

fhahn added a comment.Aug 26 2020, 9:38 AM

@fhahn is this patch supposed to support the following case?

define i8* @foo(i1 %cond, i8* %p1, i8* %p2) {
entry:

%0 = icmp ne i1 %cond, 0
br i1 %0, label %dead_store, label %merge

dead_store:

store i8 1, i8* %p1
br label %merge

merge:

%p3 = phi i8* [ %p1, %dead_store ], [ %p2, %entry ]
store i8 2, i8* %p3
ret i8* %p3

}

If not, is there any plans or prototype which can handle it? Thanks.

Currently we do not phi-translate the location of the 'killing def' and hence do not eliminate the store in this case. I haven't thought too much about it so far, but I think doing the phi translation of the 'killing loc' should not be too hard. It might be as simple as adjusting the ToCheck set to contain a pair (Access to check, Phi-translated DefLoc) and do phi translation when we queue the incoming defs for MemoryPhis (https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp#L2284)

I won't have time to look into this in the near future unfortunately, as I am mainly focused on getting things in good shape to enable DSE & MemorySSA by default. It might be good to file a bug report so this opportunity does not drop off our radar. If anyone is interested in giving implementing it a try, even better :)

I won't have time to look into this in the near future unfortunately, as I am mainly focused on getting things in good shape to enable DSE & MemorySSA by default. It might be good to file a bug report so this opportunity does not drop off our radar. If anyone is interested in giving implementing it a try, even better :)

Florian, thank you for quick feedback. I prototyped phi translation approach and it does work on the posted test case. Unfortunately, it fails on a bit more complicated case which I will take a look tomorrow. If I have something working reasonably well in the end I will share it.

fhahn added a comment.Aug 27 2020, 9:04 AM

I won't have time to look into this in the near future unfortunately, as I am mainly focused on getting things in good shape to enable DSE & MemorySSA by default. It might be good to file a bug report so this opportunity does not drop off our radar. If anyone is interested in giving implementing it a try, even better :)

Florian, thank you for quick feedback. I prototyped phi translation approach and it does work on the posted test case. Unfortunately, it fails on a bit more complicated case which I will take a look tomorrow. If I have something working reasonably well in the end I will share it.

That's great! There are some changes to how we find MemoryDefs that may be killed by a killing def, rGe717fdb0f155 and D86487. This might actually make things a bit easier, as we now fully control the traversal of the defining accesses.

fhahn added a comment.Sep 18 2020, 5:53 AM

I won't have time to look into this in the near future unfortunately, as I am mainly focused on getting things in good shape to enable DSE & MemorySSA by default. It might be good to file a bug report so this opportunity does not drop off our radar. If anyone is interested in giving implementing it a try, even better :)

Florian, thank you for quick feedback. I prototyped phi translation approach and it does work on the posted test case. Unfortunately, it fails on a bit more complicated case which I will take a look tomorrow. If I have something working reasonably well in the end I will share it.

That's great! There are some changes to how we find MemoryDefs that may be killed by a killing def, rGe717fdb0f155 and D86487. This might actually make things a bit easier, as we now fully control the traversal of the defining accesses.

Hi @ebrevnov,

I just wanted to briefly check in to see if you made any progress on this or if you think it is worth adding a bug report to keep track of the improvement?

Cheers,
Florian

Hi @ebrevnov,

I just wanted to briefly check in to see if you made any progress on this or if you think it is worth adding a bug report to keep track of the improvement?

Hi Florian,

I have a prototype which is able to handle my case. I can upload it to the phabricator for the reference. It's based on about month old sources because if I rebase to newer things stop working. Another missing piece is caching results of phi translation. In the current version phi translation is done from scratch for each new memory access. It shouldn't be hard to restructure the code to reuse phi translated address from "previous" memory access.

Thanks
Evgeniy

fhahn added a comment.Oct 6 2020, 9:04 AM

Hi @ebrevnov,

I just wanted to briefly check in to see if you made any progress on this or if you think it is worth adding a bug report to keep track of the improvement?

Hi Florian,

I have a prototype which is able to handle my case. I can upload it to the phabricator for the reference. It's based on about month old sources because if I rebase to newer things stop working. Another missing piece is caching results of phi translation. In the current version phi translation is done from scratch for each new memory access. It shouldn't be hard to restructure the code to reuse phi translated address from "previous" memory access.

Yeah there have been some substantial changes to the way we find the candidates to remove, but I think the same idea should still apply. It would be ideal to update & rebase the patch, but if you won't have time to do so it would be great if you could share the older version regardless.

Yeah there have been some substantial changes to the way we find the candidates to remove, but I think the same idea should still apply. It would be ideal to update & rebase the patch, but if you won't have time to do so it would be great if you could share the older version regardless.

Rebased and uploaded here https://reviews.llvm.org/D90095