This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Hoisting invariant.group loads
Needs ReviewPublic

Authored by Prazek on Apr 1 2018, 2:05 PM.

Details

Summary

This patch introduce hoisting !invariant.group loads that
are guaranteed to execute when the loop is executed. This way
we ain't gonna drop any metadata. This is crucial as we don't know
yet if it is profitable to hoist !invariant.group loads speculatively.

Diff Detail

Event Timeline

Prazek created this revision.Apr 1 2018, 2:05 PM
kuhar accepted this revision.Apr 3 2018, 8:26 PM
kuhar added inline comments.
lib/Transforms/Scalar/LICM.cpp
522 ↗(On Diff #140595)

nit: I'd remove the second 'only'

This revision is now accepted and ready to land.Apr 3 2018, 8:26 PM

Can someone with experience in Loop passes take a look at this?

You say that you don't yet know if this is profitable yet. Do you have reason to believe that it might not be profitable (e.g., some example where it seems like we might want to constrain the behavior)? We almost never favor keeping metadata over doing other transformations, so I think it's worth being explicit about our reasoning here. Just saying "I don't know yet" is probably not sufficient, as we could say that about nearly all metadata, and in that case, change the default logic.

lib/Transforms/Scalar/LICM.cpp
1052 ↗(On Diff #140595)

This now sounds fairly tautological. How about saying, "Except when we can prove the metadata independent of any such conditions, strip it." (instead of "Conservatively strip all metadata on the instruction unless we can keep it.")

1055 ↗(On Diff #140595)

isGuaranteedToExecute -> canKeepMetadata

Prazek added a comment.Jun 2 2018, 8:28 AM

You say that you don't yet know if this is profitable yet. Do you have reason to believe that it might not be profitable (e.g., some example where it seems like we might want to constrain the behavior)? We almost never favor keeping metadata over doing other transformations, so I think it's worth being explicit about our reasoning here. Just saying "I don't know yet" is probably not sufficient, as we could say that about nearly all metadata, and in that case, change the default logic.

The reasoning behind it is that given function:

void foo(A *a) {

other(a);
while(...)
  something();
  a->bar();

}

we could hoist vtable and virtual function loads from the loop, but it would strip the metadata.
This could be very bad after we would inline the function in the context where the dynamic type is known by invariant.group

void caller() {
A a;
foo(&a);
}

now because we stripped the metadata, and because call to "other" blocks us from const propagating vptr value we would not be able to determine a->bar.
If we would not hoist vptr load then by providing other store or load with invariant group we would be able to const propagte it to the one inside the loop.

I think that instead of hoisiting the vtable load all the time we should unroll the loop one time, so that vptr value can be propagated from the first iteration. This way we would still keep the metadata.

I did not yet collected benchmarking data with and without this patch, but looking at one microbenchmark from LNT that had speedup arround 70% I am pretty confident that can make very similar benchmark, but that would use the example that I showed to show that stripping metadata while hoisting vtable load would not be as beneficial.

Prazek added a comment.Jun 2 2018, 8:35 AM

s/loop unrolling/loop peeling/
I will had to take a look at this at some time, but from what I've heard the state of loop peeling in LLVM is not very sofisticated, and implementing the logic that I propose might be a little hard.

Prazek updated this revision to Diff 151158.Jun 13 2018, 7:21 AM
Prazek marked 3 inline comments as done.
  • Fixed comments

@hfinkel is my response resonable?

friendly ping

Can we spec !invariant.group in a way that lets us always keep the metadata when hoisting? Right now it isn't clear what happens if its contract is violated, i.e. what the behavior of this program is:

store i32 0, i32* %ptr, !invariant.group !0
%x = load i32, i32* %ptr, !invariant.group !0
store i32 1, i32* %ptr, !invariant.group !0
%y = load i32, i32* %ptr, !invariant.group !0

though it seems like you're assuming the *load* has UB? Can we instead say that the second store has UB? That way we should be able to hoist the load instruction without dropping the metadata.

llvm/lib/Transforms/Scalar/LICM.cpp
542

lose

1075

Not sure what this replacement adds -- do you plan to add more stuff to canKeepMetadata in the future?

Prazek added inline comments.Aug 13 2018, 4:00 PM
llvm/lib/Transforms/Scalar/LICM.cpp
1075

It is a good way to document the code. This way I do not need to comment both calls to
isGuaranteedToExecute saying what is the logic behind it. It is also more future proof - if someone would like to
change the logic of canKeepMetadata, then all callers will be affected, which could not be the case if someone would only change one place.

Can we spec !invariant.group in a way that lets us always keep the metadata when hoisting? Right now it isn't clear what happens if its contract is violated, i.e. what the behavior of this program is:

store i32 0, i32* %ptr, !invariant.group !0
%x = load i32, i32* %ptr, !invariant.group !0
store i32 1, i32* %ptr, !invariant.group !0
%y = load i32, i32* %ptr, !invariant.group !0

though it seems like you're assuming the *load* has UB? Can we instead say that the second store has UB? That way we should be able to hoist the load instruction without dropping the metadata.

I talked with Sanjoy ofline, but for the record, consider:

%ptr = alloc 
if (false) { 
  %val = load %ptr, !invariant.group 
} 
store %ptr, 1, !invariant.group 
load %ptr, !invariant.group

I think that the current LangRef says that the store with invariant.group must store the same value (and it could also be removed if there is dominating store/load with invariant.group),
but even without it, in this example we could forward undef to the second load based on the hoisted first load, if we would keep the !invariant.group md.

Sanjoy sugested that such load could return poison instead of undef. I am not familiar with it, and I would expect that getting such change would probably take the same amount of time as getting this simple change (posted on April 1st),
so if there are no objections, I would like to put this one in the trunk.

@hfinkel, you had some objections. Can you take a look again?

I'm sorry I can't be more decisive here since I wasn't deeply involved with the devirtualization work early on and so lack a lot of context. The general direction here seems fine to me -- given that we use metadata to express a large range of things, it seems ok to have metadata specific MD dropping policies. However, given that Hal originally objected to this, we should make sure he is on board.

I'm sorry I can't be more decisive here since I wasn't deeply involved with the devirtualization work early on and so lack a lot of context. The general direction here seems fine to me -- given that we use metadata to express a large range of things, it seems ok to have metadata specific MD dropping policies. However, given that Hal originally objected to this, we should make sure he is on board.

I understand the motivation, thanks. The problem here is that, while we might want to avoid hoisting so that we don't strip the metadata, that only is better if we happen to inline into a function that then provides a concrete type. Otherwise, we should have hoisted. How about this: Add a parameter to LICM to control this choice: We can choose not to hoist during the early runs on LICM (which happen during inlining), and then hoist later (during the LICM invocation that runs after loop unrolling).

I'm sorry I can't be more decisive here since I wasn't deeply involved with the devirtualization work early on and so lack a lot of context. The general direction here seems fine to me -- given that we use metadata to express a large range of things, it seems ok to have metadata specific MD dropping policies. However, given that Hal originally objected to this, we should make sure he is on board.

I understand the motivation, thanks. The problem here is that, while we might want to avoid hoisting so that we don't strip the metadata, that only is better if we happen to inline into a function that then provides a concrete type. Otherwise, we should have hoisted. How about this: Add a parameter to LICM to control this choice: We can choose not to hoist during the early runs on LICM (which happen during inlining), and then hoist later (during the LICM invocation that runs after loop unrolling).

Sorry for late response, didn't have time to look into this.
I though about this and I would like to stick to the current implementation. I think that having different LICM behavior introduces complexity that is not worth the cost.
We would also need to introduce new named passes, like licm-pre-inline and licm-after-inline. It is also not clear what we should do after inlining when doing LTO.

I would stick to this approach, as it does not pessimize current implementation -- in some cases we can hoist vtable loads, but all of the vtable loads that would be hoisted before are also hoisted right now.

kuhar resigned from this revision.Sep 30 2019, 9:48 AM
This revision now requires review to proceed.Sep 30 2019, 9:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2019, 9:48 AM
Herald added a subscriber: asbirlea. · View Herald Transcript
reames resigned from this revision.Mar 25 2020, 1:53 PM
asbirlea removed a subscriber: asbirlea.Mar 25 2020, 2:29 PM
sanjoy resigned from this revision.Jan 29 2022, 5:31 PM