Page MenuHomePhabricator

[GVN] Also invalidate users of instructions replaced due to conditionals.
Needs ReviewPublic

Authored by fhahn on Jul 24 2019, 5:52 AM.

Details

Summary

We also need to invalidate the users of the replaced value, as they could
be GEPs indexed by the replaced value. In that case, the cache will
contain stale information and return invalid information.

Fixes PR31651.

Event Timeline

fhahn created this revision.Jul 24 2019, 5:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 24 2019, 5:52 AM
Herald added a subscriber: hiraditya. · View Herald Transcript

I'm not sure this fix is actually solving a real issue, as opposed to merely hiding the issue for the given testcase.

Replacing a use of an integer value with a use of an equal integer value could make alias analysis return different results, but the old result should still be correct; you haven't actually changed the alias properties of any memory operations. So reusing the result of any queries cached by MemDep should still produce a correct result (maybe an overly conservative result, but that should be okay). Or does MemDep caching not work like that for some reason?

fhahn added a comment.Jul 24 2019, 2:43 PM

I'm not sure this fix is actually solving a real issue, as opposed to merely hiding the issue for the given testcase.

Replacing a use of an integer value with a use of an equal integer value could make alias analysis return different results, but the old result should still be correct; you haven't actually changed the alias properties of any memory operations. So reusing the result of any queries cached by MemDep should still produce a correct result (maybe an overly conservative result, but that should be okay). Or does MemDep caching not work like that for some reason?

My understanding of the non-local query caching is the following: MemDepAnalysis keeps a map of query pointer values to a list discovered (BB, Dependency) pairs. It derives the pointer values by starting with the original pointer and then translating the address through PHIs in the predecessors. PHITranslateAddr tries to translate the pointer to an existing pointer, by looking at the existing uses of the translated PHI. In the example we translate %_tmp20 through %step1.7.0 to getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %step1.7.0 and it returns the first matching existing matching value %_tmp47. Using the pointer, we find Defs %_tmp48 = load i16, i16* %_tmp47, align 2 and %_tmp58 = load i16, i16* %_tmp57, align 2 and cache them using the translated pointer (%_tmp47)

Now GVN comes along and replaces the use of %step1.7.0 in %_tmp47 with 0.

Later we translate _tmp78 to getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 0 through %i.8.0 and later find the equivalent value %_tmp47 and use that as key into the cache, which now returns the defs %_tmp47 and %_tmp57, even though they now access different addresses.

It seems like MemDepAnalysis is using syntactic equivalences for effective caching (and therefore assumes the IR is not modified under it) and GVN in this case destroys the syntactic equivalence, Unless I am missing something here, invalidating the cache seems reasonable here (I think the same argument would also hold for replacing equivalent pointers).

Now there is still a potential problem as we just invalidate values 2 levels deep (the value itself and its users), if MemDepAnalysis goes deeper than looking at the immediate operands to find equivalences, we still have invalid values in the cache. dberlin mentioned that he might have a test case for such a scenario.

Oh, okay, that makes more sense.

cache them using the translated pointer (%_tmp47)

Does it really make sense to cache the load %_tmp58 using the pointer %_tmp47, as opposed to the pointer %_tmp57? There isn't any dominance relationship between %_tmp47 and %_tmp58.

Now there is still a potential problem as we just invalidate values 2 levels deep

This seems concerning, yes.

Oh, okay, that makes more sense.

cache them using the translated pointer (%_tmp47)

Does it really make sense to cache the load %_tmp58 using the pointer %_tmp47, as opposed to the pointer %_tmp57? There isn't any dominance relationship between %_tmp47 and %_tmp58.

I think it makes sense in this use case, given we just want to find a single value to represent equivalent addresses and we are not using them to replace anything. But it requires careful cache invalidation and we get that wrong in a bunch of places.

We could ask for addresses that dominate the blocks we are translating through, but it would limit both the number of addresses we can translate and also increase the number of cache misses (6 existing GVN test cases fail when requiring dominating addresses). What do you think? Should we fix the issues with invalidation or restrict the caching/translating?