This is an archive of the discontinued LLVM Phabricator instance.

Fix `InstCombine` to not widen metadata on store-to-load forwarding
ClosedPublic

Authored by yuyichao on Jun 12 2016, 2:27 PM.

Details

Summary

The original check for load CSE or store-to-load forwarding is wrong
when the forwarded stored value happened to be a load.

The current logic was added in r241886. It looks safe but I'm wondering what kind of code pattern it used to mis-optimize. If there are two loads from the same address marked with either metadata, shouldn't it be valid to keep either of them? (After all, either of them had better be valid to start with).

Ref https://github.com/JuliaLang/julia/issues/16894

Diff Detail

Event Timeline

yuyichao updated this revision to Diff 60480.Jun 12 2016, 2:27 PM
yuyichao retitled this revision from to Fix `InstCombine` to not widen metadata on store-to-load forwarding.
yuyichao updated this object.
yuyichao added a subscriber: llvm-commits.

Can you explain what the problem is ? This patch doesn't seems to do much more than moving the check for the value to be a load from within InstCombineLoadStoreAlloca to FindAvailableLoadedValue via an out parameter (which seems like an hidden dependency, I'm not super fan TBH).

What do I miss here ?

In the CSE case, we aren't guaranteed that the metadata on the first load makes sense: the optimizer is allowed to insert a load anywhere with any metadata as long as it can guarantee it's safe to dereference. So we have to merge the metadata to ensure correct behavior. (Realistically, we might end up in this situation by hoisting a dead load.)


While you're in the area, would you mind fixing JumpThreading as well? (If you don't have time, I'll just file a bug instead.)

lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
827

Nit: "cast<LoadInst>()".

Can you explain what the problem is

See the test case. The store-to-load forwarding case can return a load instruction too.

the optimizer is allowed to insert a load anywhere with any metadata as long as it can guarantee it's safe to dereference.

Does it mean that if that store is not merged with something else in a similar way it can be incorrectly optimized?

yuyichao added a comment.EditedJun 12 2016, 6:11 PM

would you mind fixing JumpThreading as well?

I assume you mean the use of this function in JumpThreading? I did a grep before submitting this and didn't see that one having similar problem since it is not conditioning on the type of the returned instruction (not in an obvious way for me anyway). I'm willing to fix it if I can understand what's the problem.

P.S. somehow I don't get email notification from this. Not sure why and I've definitely got email from here before... NVM something triggered my email filter (not spam filter)....

yuyichao updated this revision to Diff 60483.Jun 12 2016, 6:38 PM

static_cast<LoadInst*> -> cast<LoadInst>

eli.friedman accepted this revision.Jun 12 2016, 7:22 PM
eli.friedman added a reviewer: eli.friedman.

LGTM.


Does it mean that if that store is not merged with something else in a similar way it can be incorrectly optimized?

Not sure what the question is here. You don't need to touch the metadata in store->load forwarding: you're erasing the load, and the semantics of the store aren't affected by a subsequent load.

would you mind fixing JumpThreading as well?

I assume you mean the use of this function in JumpThreading? I did a grep before submitting this and didn't see that one having similar problem since it is not conditioning on the type of the returned instruction (not in an obvious way for me anyway). I'm willing to fix it if I can understand what's the problem.

Well, the primary issue in JumpThreading is that it doesn't strip the load metadata using combineMetadata or something like that where appropriate. It also uses the metadata from the store in store->load forwarding in some cases, sort of the opposite of the instcombine issue. Taking another look, it's more than just a few lines to fix, so probably best to submit for review separately.

This revision is now accepted and ready to land.Jun 12 2016, 7:22 PM
yuyichao marked an inline comment as done.EditedJun 12 2016, 7:57 PM

Not sure what the question is here. You don't need to touch the metadata in store->load forwarding: you're erasing the load, and the semantics of the store aren't affected by a subsequent load.

Erasing a load is not the question, the question is that why is the optimizer allowed to create load with wrong metadata and the load CSE doesn't feel like the right place to handle this. It seems that if the optimizer somehow can't prove that the metadata is valid, it should just strip the metadata first instead of letting other passes remove the invalid metadata.

In another word, IMHO,

%v1 = load i64, i64 *%p1, !tbaa !0
%v2 = load i64, i64 *%p1, !tbaa !1

should be invalid IR (and the optimizer should not be allowed to create this) if !1 and !2 are non-aliasing tbaa node. Are there examples for other metadata where this is not the case?

Ping.

Is this good to go? If it is, I'll need someone else to commit it for me since I don't have commit access.

Yes, it's good to go; I'll commit it tonight (assuming nobody else gets to it first).

anna added a subscriber: anna.EditedJun 20 2016, 5:26 AM

Could you please add a comment explaining the new parameter isLoadCSE same as done for other parameters in FindAvailableLoadedValue, and what's the purpose of recording the fact that it's a load. It's not very obvious by looking at the change in the code in InstCombineLoadStoreAlloca.cpp.

This way other optimizations using FindAvailableLoadedValue can *correctly* use this hidden parameter, isLoadCSE.

I agree with @deadalnix can we fix the issue of missing metadata without using hidden parameters, and adding to the list of parameters kept track in FindAvailableLoadedValue?

I can add a comment; it would look something like this:

/// If \c isLoadCSE is non-null and a previous load from the same pointer is found, the value
/// is set to true; if a previous store to the same pointer is found, it is set to false.
/// This allows the caller to tell whether it needs to update the metadata of the returned value.

Is that sufficient? Or do I need to include an explanation of why load CSE involves updating the metadata?

anna added a comment.Jun 20 2016, 10:17 AM

This looks good to me. So, all callers will confirm if the metadata of the returning instruction needs to be changed and how the metadata should be changed.

if a previous store to the same pointer is found, it is set to false.

My patch actually doesn't do that in such case. Probably better to add that too?

Oh, right... there's a fix for that in http://reviews.llvm.org/D21460, but that still waiting for review. I'll split it off into a separate commit.