Details

Reviewers

eugenis
efriedma
fhahn
lebedev.ri

Summary

Support long chains of instructions where getUnderlyingObject fails.
getUnderlyingObjects already traverses a large set of instructions, so
there is no reason to limit some sequences by MaxLookup.

getUnderlyingObject needs MaxLookup to avoid infinite loops on cycles.

getUnderlyingObjects uses Visited, so it can't get into infinite loops
if it calls getUnderlyingObject with MaxLookup > 0.

So we can repeatedly call getUnderlyingObject, updating Visited, and
return all underlying Values.
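
A minimal sketch of the idea in the summary, written against a toy value graph rather than LLVM's classes. `Value`, `Graph`, `stripChain` and `underlyingObjects` are hypothetical names, and the default limit of 6 is an assumption about LLVM's default MaxLookup. It is not the actual patch; it only shows why a Visited set lets the bounded strip be resumed until every chain bottoms out, including on cycles.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical toy model, not LLVM's classes. A value is an index into the
// graph and `ops` lists the pointers it is derived from:
//   0 operands  -> a base object (or something opaque)
//   1 operand   -> a cast/GEP-like step that can be stripped
//   2+ operands -> a phi/select-like merge
struct Value {
  std::vector<int> ops;
};
using Graph = std::vector<Value>;

// Stand-in for getUnderlyingObject: strip at most maxLookup single-operand
// steps, so it terminates even if the chain is a cycle.
int stripChain(const Graph &g, int v, unsigned maxLookup) {
  for (unsigned i = 0; i < maxLookup && g[v].ops.size() == 1; ++i)
    v = g[v].ops[0];
  return v;
}

// Stand-in for the patched getUnderlyingObjects: whenever the bounded strip
// stops short, push the operands and resume. Every loop iteration either adds
// a new value to `visited` or discards a worklist entry, so it terminates.
std::vector<int> underlyingObjects(const Graph &g, int start,
                                   unsigned maxLookup = 6) {
  std::unordered_set<int> visited;
  std::vector<int> worklist{start};
  std::vector<int> objects;
  while (!worklist.empty()) {
    int v = stripChain(g, worklist.back(), maxLookup);
    worklist.pop_back();
    if (!visited.insert(v).second)
      continue; // already handled, this is what breaks cycles
    if (g[v].ops.empty()) {
      objects.push_back(v); // reached a base object
      continue;
    }
    for (int op : g[v].ops) // truncated chain or phi/select: keep going
      worklist.push_back(op);
  }
  return objects;
}
```

In this model a chain longer than the limit still resolves to its base, and a cycle of single-operand values yields an empty result instead of looping.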

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

vitalybuka created this revision. Aug 26 2020, 6:41 PM

Herald added a project: Restricted Project. Aug 26 2020, 6:41 PM

Herald added subscribers: llvm-commits, hiraditya.

typo

Harbormaster completed remote builds in B69703: Diff 288152. Aug 26 2020, 7:11 PM

Harbormaster completed remote builds in B69704: Diff 288153. Aug 26 2020, 7:17 PM

so there is no reason to limit some sequences by MaxLookup.

Except avoiding runaway compile time in some degenerate cases possibly?

In D86669#2241103, @lebedev.ri wrote:

so there is no reason to limit some sequences by MaxLookup.

Except avoiding runaway compile time in some degenerate cases possibly?

With the current approach it's easy to construct a similar degenerate case with selects or phi nodes.
It was O(N) in the number of instructions and it is still O(N).
MaxLookup makes getUnderlyingObject O(1), but it does not do the same for getUnderlyingObjects.

rebase

vitalybuka added a reviewer: lebedev.ri. Jun 16 2021, 2:41 PM

I think this would benefit from adding the testcase[s] for the pass that uses this utility that showcase the improvements here.

Harbormaster completed remote builds in B109603: Diff 352562. Jun 17 2021, 1:14 AM

rebase

vitalybuka added a parent revision: D104585: [NFC] Add getUnderlyingObjects test. Jun 18 2021, 6:44 PM

Harbormaster completed remote builds in B110032: Diff 353147. Jun 18 2021, 6:44 PM

Oh, so wait, we need this for correctness even?
That doesn't look good on the LAA's side.

lebedev.ri mentioned this in D104585: [NFC] Add getUnderlyingObjects test. Jun 19 2021, 3:01 AM

rebase

Harbormaster completed remote builds in B110541: Diff 353839. Jun 22 2021, 7:08 PM

getUnderlyingObjects already traverses a large set of instructions, so
there is no reason to limit some sequences by MaxLookup.

I am not sure I completely follow this reasoning. Even if it already traverses a large number of instructions, without the depth limit it will visit even more? Do you have any estimate on the impact in terms of extra work?

llvm/test/Analysis/LoopAccessAnalysis/underlying-objects-2.ll
Line 96 (on Diff #353839): Not sure if it is really incorrect, but it is certainly a bit misleading; we will create a runtime check between `gepB_plus_one` and `gepB9`, which I think will never succeed. The result with the patch is certainly an improvement.

In D86669#2836130, @fhahn wrote:

getUnderlyingObjects already traverses a large set of instructions, so
there is no reason to limit some sequences by MaxLookup.

I am not sure I completely follow this reasoning. Even if it already traverses a large number of instructions, without the depth limit it will visit even more? Do you have any estimate on the impact in terms of extra work?

On real programs both versions will almost always do the same work, and the default MaxLookup will not be reached.

However, there are two possible edge cases with a large number of nodes:
a) Long chain of N values:

a = b, b = c, c = d .... = underlying_object

b) Wide graph with total N values:

a = (a, b, c ....) , a = (a1, a2, ...), b = (b1, b2, ...)...

The existing algorithm fails on a) but exits in constant time; on b) it may be able to discover all objects (if the graph is not too deep), but in O(N) time.
The patched algorithm discovers all objects in O(N) time in both cases.

So my point is: if we accept O(N) for b), why not accept O(N) for a) and let callers avoid thinking about MaxLookup?
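
To make the two edge cases concrete, here is a rough sketch on a toy graph; it is not LLVM code. `Value`, `Result`, `collect`, `makeChain` and `makeWide` are hypothetical names. The `resume` flag switches between a model of the existing behaviour (give up where the bounded strip stops) and a model of the patched behaviour (continue from there), and `steps` counts how many values were touched.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical toy model: 0 operands = base object, 1 operand = cast/GEP-like
// step, 2+ operands = phi/select-like merge.
struct Value {
  std::vector<int> ops;
};
using Graph = std::vector<Value>;

struct Result {
  std::vector<int> objects; // what the walk reports as underlying objects
  std::size_t steps = 0;    // number of values touched
};

// resume == false models the existing behaviour: a chain cut short by
// maxLookup is reported as-is. resume == true models the patched behaviour:
// the walk continues from where the bounded strip stopped.
Result collect(const Graph &g, int start, unsigned maxLookup, bool resume) {
  Result r;
  std::unordered_set<int> visited;
  std::vector<int> worklist{start};
  while (!worklist.empty()) {
    int v = worklist.back();
    worklist.pop_back();
    for (unsigned i = 0; i < maxLookup && g[v].ops.size() == 1; ++i) {
      v = g[v].ops[0];
      ++r.steps;
    }
    if (!visited.insert(v).second)
      continue;
    ++r.steps;
    std::size_t n = g[v].ops.size();
    if (n == 0 || (n == 1 && !resume)) {
      r.objects.push_back(v);
      continue;
    }
    for (int op : g[v].ops)
      worklist.push_back(op);
  }
  return r;
}

// Case a): value n-1 derives from n-2, ..., down to base object 0.
Graph makeChain(int n) {
  Graph g(n);
  for (int i = 1; i < n; ++i)
    g[i].ops = {i - 1};
  return g;
}

// Case b): value 0 merges n-1 distinct base objects.
Graph makeWide(int n) {
  Graph g(n);
  for (int i = 1; i < n; ++i)
    g[0].ops.push_back(i);
  return g;
}
```

On `makeChain(100)` with a limit of 6, the `resume == false` walk stops after a handful of steps and reports an intermediate value, while the `resume == true` walk reaches base object 0 after touching the whole chain. On `makeWide(100)` both variants touch every value and report the same objects, which is the O(N) cost that is already accepted today.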

tingwang added a subscriber: tingwang. Jun 5 2022, 10:29 PM

Herald added a project: Restricted Project. Jun 5 2022, 10:29 PM

We hit a test case (attached test.ll) that is blocked by this MaxLookup: the default value of MaxLookup is not enough to identify the underlying object, and the alias info added by AddAliasScopeMetadata() is wrong, resulting in wrong scheduling decisions in the machine scheduler.

In function widget, the pointer %tmp147 in the instruction below is not correctly identified by getUnderlyingObjects():

%tmp148 = load i32, i32* %tmp147, align 4, !alias.scope !25, !noalias !26

The chain of pointers is below:

store [0 x %struct.ham.2]* %arg, [0 x %struct.ham.2]** %tmp, align 8
%tmp123 = load [0 x %struct.ham.2]*, [0 x %struct.ham.2]** %tmp, align 8
%tmp127 = bitcast [0 x %struct.ham.2]* %tmp123 to i8*
%tmp128 = getelementptr i8, i8* %tmp127, i64 %tmp126
%tmp136 = getelementptr inbounds i8, i8* %tmp128, i64 %tmp135
%tmp137 = bitcast i8* %tmp136 to [24 x i8]*
%tmp138 = bitcast [24 x i8]* %tmp137 to %struct.ham.2*
%tmp139 = getelementptr inbounds %struct.ham.2, %struct.ham.2* %tmp138, i32 0, i32 1
%tmp140 = bitcast [4 x %struct.spam.3]* %tmp139 to i8*
%tmp141 = getelementptr i8, i8* %tmp140, i64 -4
%tmp146 = getelementptr inbounds i8, i8* %tmp141, i64 %tmp145
%tmp147 = bitcast i8* %tmp146 to i32*

It looks like there is a real program affected by this hard limit, and it produced a wrong result. Can we reconsider this?

By the way, I used the command line below to reproduce the issue.

opt --vector-library=MASSV -passes='default<O3>' -aa-pipeline=default -mcpu=pwr9 -o test.err.bc test.ll

test.ll (40 KB)

If you're seeing a miscompile, something is probably wrong with the caller. getUnderlyingObjects() is, in general, not guaranteed to produce an identifiable object. If the caller cares, it should check; for example, GlobalsAAResult::getModRefInfoForArgument checks all_of(Objects, isIdentifiedObject).
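
As a rough illustration of the caller-side check described above, here is a self-contained sketch on toy types. `Kind`, `Obj`, `isIdentified` and `canAssumeDistinct` are hypothetical stand-ins, not LLVM's API, and the set of kinds treated as identified is a simplification of what isIdentifiedObject accepts. The point is only that the caller refuses to draw aliasing conclusions unless every returned object is one it can reason about.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical classification of what a walk over pointer definitions ended on.
enum class Kind { Alloca, Global, NoAliasArg, Opaque };

struct Obj {
  Kind kind;
};

// Toy analogue of isIdentifiedObject: anything the walk could not see through
// (for example a truncated chain or an unknown call result) is not identified.
bool isIdentified(const Obj &o) { return o.kind != Kind::Opaque; }

// Caller-side guard: only rely on the result if the walk found at least one
// object and every one of them is identified.
bool canAssumeDistinct(const std::vector<Obj> &objects) {
  return !objects.empty() &&
         std::all_of(objects.begin(), objects.end(), isIdentified);
}
```

With a guard like this, a walk that stops early because of the lookup limit makes the caller conservative instead of producing wrong metadata.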

In D86669#3560983, @efriedma wrote:

If you're seeing a miscompile, something is probably wrong with the caller. getUnderlyingObjects() is, in general, not guaranteed to produce an identifiable object. If the caller cares, it should check; for example, GlobalsAAResult::getModRefInfoForArgument checks all_of(Objects, isIdentifiedObject).

I am not sure why we don't want to guarantee it with a patch like this?

In D86669#3561125, @vitalybuka wrote:

In D86669#3560983, @efriedma wrote:

If you're seeing a miscompile, something is probably wrong with the caller. getUnderlyingObjects() is, in general, not guaranteed to produce an identifiable object. If the caller cares, it should check; for example, GlobalsAAResult::getModRefInfoForArgument checks all_of(Objects, isIdentifiedObject).

I am not sure why we don't want to guarantee it with a patch like this?

How could we possibly guarantee that? In general, we're going to find some opaque thing that both getUnderlyingObjects() and its caller can't understand.

I mean, I guess we could define an API that narrowly guarantees it looks through all bitcast, addrspacecast, gep, phi, and select operations. But that would be expensive in general, and it's not obvious to me it solves anything.

In D86669#3561225, @efriedma wrote:

In D86669#3561125, @vitalybuka wrote:

In D86669#3560983, @efriedma wrote:

If you're seeing a miscompile, something is probably wrong with the caller. getUnderlyingObjects() is, in general, not guaranteed to produce an identifiable object. If the caller cares, it should check; for example, GlobalsAAResult::getModRefInfoForArgument checks all_of(Objects, isIdentifiedObject).

I am not sure why we don't want to guarantee it with a patch like this?

How could we possibly guarantee that? In general, we're going to find some opaque thing that both getUnderlyingObjects() and its caller can't understand.

I mean, I guess we could define an API that narrowly guarantees it looks through all bitcast, addrspacecast, gep, phi, and select operations. But that would be expensive in general, and it's not obvious to me it solves anything.

Right, I totally forgot the point of the patch :)
I still think removing MaxLookup is good and simplifies the caller side.

In D86669#3560983, @efriedma wrote:

If you're seeing a miscompile, something is probably wrong with the caller. getUnderlyingObjects() is, in general, not guaranteed to produce an identifiable object. If the caller cares, it should check; for example, GlobalsAAResult::getModRefInfoForArgument checks all_of(Objects, isIdentifiedObject).

Thank you for pointing out the direction. It seems the caller AddAliasScopeMetadata() is making a false assumption without an additional check. I will dig a little into the history and maybe submit a patch to fix that.

tingwang mentioned this in D127202: [InlineFunction] don't add noalias metadata for unknown objects. Jun 7 2022, 4:38 AM

jeroen.dobbelaere added a subscriber: jeroen.dobbelaere. Jun 15 2022, 2:17 AM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

Herald added a subscriber: StephenFan. Jan 12 2023, 5:30 PM

vitalybuka planned changes to this revision. May 24 2023, 5:25 PM

This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking] Remove MaxLookup from getUnderlyingObjects
Changes Planned · Public

Revision Contents

Diff 288152

llvm/include/llvm/Analysis/ValueTracking.h

llvm/lib/Analysis/ValueTracking.cpp

llvm/unittests/Analysis/ValueTrackingTest.cpp
