This is an archive of the discontinued LLVM Phabricator instance.

[CaptureTracking] Add optimistic capture tracker for stores
AbandonedPublic

Authored by JDevlieghere on Dec 8 2016, 11:43 AM.

Details

Summary

The current capture tracking implementation always considers a store instruction to capture the given use. The example below shows a situation where this is overly pessimistic. The current implementation considers %ptr to be captured because it's the operand of the second store.

define void @sample() {
entry:
  %ptr = alloca i32
  store i32 1, i32* %ptr
  %ptrtoptr = alloca i32*
  store i32* %ptr, i32** %ptrtoptr
  ret void
}

This patch adds a new capture tracker implementation called OptimisticCaptureTracker. It can be used through a slightly modified interface that expects alias analysis results as an additional argument. If storeCaptures is set to true or no AA is provided, the OptimisticCaptureTracker behaves exactly as the current SimpleCaptureTracker. Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

Diff Detail

Repository
rL LLVM

Event Timeline

JDevlieghere retitled this revision from to [CaptureTracking] Add optimistic capture tracker for stores.
JDevlieghere updated this object.
JDevlieghere added reviewers: hfinkel, reames.
JDevlieghere set the repository for this revision to rL LLVM.
hfinkel edited edge metadata.Dec 13 2016, 9:31 PM

Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

I suppose this is named "optimistic" because it might return a false negative? If the pointer value is stored into some alloca then read and the read value stored into some global, then the value is captured but your analysis will return false. Is this what you intend?

Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

I suppose this is named "optimistic" because it might return a false negative? If the pointer value is stored into some alloca then read and the read value stored into some global, then the value is captured but your analysis will return false. Is this what you intend?

Does the example below cover what you have in mind?

@global = external global i32*

define void @sample() {
entry:
  %ptr = alloca i32
  store i32 1, i32* %ptr
  %ptrtoptr = alloca i32*
  store i32* %ptr, i32** %ptrtoptr
  %deref = load i32*, i32** %ptrtoptr
  store i32* %deref , i32** @global
  ret void
}

If so, the escape of %ptr is properly detected by my change. I have a small printer pass for capture tracking which enables me to easily verify this kind of stuff. I will update my diff to include it, we can always remove it again later.

JDevlieghere edited edge metadata.

Added small printer pass for capture tracking.

Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

I suppose this is named "optimistic" because it might return a false negative? If the pointer value is stored into some alloca then read and the read value stored into some global, then the value is captured but your analysis will return false. Is this what you intend?

Does the example below cover what you have in mind?

@global = external global i32*

define void @sample() {
entry:
  %ptr = alloca i32
  store i32 1, i32* %ptr
  %ptrtoptr = alloca i32*
  store i32* %ptr, i32** %ptrtoptr
  %deref = load i32*, i32** %ptrtoptr
  store i32* %deref , i32** @global
  ret void
}

If so, the escape of %ptr is properly detected by my change. I have a small printer pass for capture tracking which enables me to easily verify this kind of stuff. I will update my diff to include it, we can always remove it again later.

Can you please explain how this is detected? The store is to an alloca, and I don't believe that the system does any kind of memory-based data-flow tracking.

In any cases, some regression test cases will be necessary.

lib/Analysis/CaptureTracking.cpp
81

An Argument is always a Value, so this cast/check should never be necessary.

82

This check won't work correctly if the value is stored into some offset based on the argument. You can make the right kind of query, but only if you construct a Location with an unknown size based on the value.

90

Again, this cast/check is not necessary (a Global is always a Value).

91

Same comment here about the size/offset issue.

101

Remove this blank line.

218

Add a space in between the for and the ( [same for the lines below for for/if].

  • Addressed @hfinkel's comments
  • Added a few simple tests
  • Fixed style comment (not sure why clang-format missed it?)
  • Only print values that have a name
JDevlieghere marked 6 inline comments as done.Dec 14 2016, 12:05 PM

Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

I suppose this is named "optimistic" because it might return a false negative? If the pointer value is stored into some alloca then read and the read value stored into some global, then the value is captured but your analysis will return false. Is this what you intend?

Does the example below cover what you have in mind?

@global = external global i32*

define void @sample() {
entry:
  %ptr = alloca i32
  store i32 1, i32* %ptr
  %ptrtoptr = alloca i32*
  store i32* %ptr, i32** %ptrtoptr
  %deref = load i32*, i32** %ptrtoptr
  store i32* %deref , i32** @global
  ret void
}

If so, the escape of %ptr is properly detected by my change. I have a small printer pass for capture tracking which enables me to easily verify this kind of stuff. I will update my diff to include it, we can always remove it again later.

Can you please explain how this is detected? The store is to an alloca, and I don't believe that the system does any kind of memory-based data-flow tracking.

I don't know enough about AA to make a definitive statement about this. Conceptually, everything makes sense up the isNoAlias call. The two pointers do point to the same value, so assuming a perfect AA implementation, the results is not totally unexpected. Maybe the default is very conservative and says that's it can't say for sure that the pointer and the global don't alias. As a result the pointer is considered to be captured. The test case I added for storing to the function argument behaves exactly the same, it reports the pointer being captured as desired. I'll dig deeper into AA, because I want to understand what happens, but for now this is the best explanation I can give.

In any cases, some regression test cases will be necessary.

I've added 3 tests to illustrate the added merit of this change. Since the printer pass makes it easy to test the current CT implementation, I'm planning on adding more tests for the other parts of the algorithm in a separate differential.

Otherwise, it only consider %ptr to escape in the previous example if it is stored to either a global or to one of the arguments of the function.

I suppose this is named "optimistic" because it might return a false negative? If the pointer value is stored into some alloca then read and the read value stored into some global, then the value is captured but your analysis will return false. Is this what you intend?

Does the example below cover what you have in mind?

@global = external global i32*

define void @sample() {
entry:
  %ptr = alloca i32
  store i32 1, i32* %ptr
  %ptrtoptr = alloca i32*
  store i32* %ptr, i32** %ptrtoptr
  %deref = load i32*, i32** %ptrtoptr
  store i32* %deref , i32** @global
  ret void
}

If so, the escape of %ptr is properly detected by my change. I have a small printer pass for capture tracking which enables me to easily verify this kind of stuff. I will update my diff to include it, we can always remove it again later.

Can you please explain how this is detected? The store is to an alloca, and I don't believe that the system does any kind of memory-based data-flow tracking.

I don't know enough about AA to make a definitive statement about this. Conceptually, everything makes sense up the isNoAlias call. The two pointers do point to the same value, so assuming a perfect AA implementation, the results is not totally unexpected. Maybe the default is very conservative and says that's it can't say for sure that the pointer and the global don't alias. As a result the pointer is considered to be captured. The test case I added for storing to the function argument behaves exactly the same, it reports the pointer being captured as desired. I'll dig deeper into AA, because I want to understand what happens, but for now this is the best explanation I can give.

Yea, we need to figure this out. Do you have BasicAA enabled? What happens if you add -aa-pipeline=basic-aa to the opt command?

In any cases, some regression test cases will be necessary.

I've added 3 tests to illustrate the added merit of this change. Since the printer pass makes it easy to test the current CT implementation, I'm planning on adding more tests for the other parts of the algorithm in a separate differential.

Yea, we need to figure this out. Do you have BasicAA enabled? What happens if you add -aa-pipeline=basic-aa to the opt command?

With basic AA it is indeed not working; The results are only as good as the AA method provided, and in a sense it's optimistic in that regard. Would you agree that this is fair behavior, because the interface explicitly expects AA results and a flag indicating whether or not a store should capture? This would need to be documented of course.

Yea, we need to figure this out. Do you have BasicAA enabled? What happens if you add -aa-pipeline=basic-aa to the opt command?

With basic AA it is indeed not working; The results are only as good as the AA method provided, and in a sense it's optimistic in that regard. Would you agree that this is fair behavior, because the interface explicitly expects AA results and a flag indicating whether or not a store should capture? This would need to be documented of course.

Can you explain the use case for a capture-tracking mode that might return false negatives?

JDevlieghere added a comment.EditedDec 15 2016, 8:01 AM

Let's consider the following table:

No AABasic AAAdvanced AA
Current Capture Trackingfalse positives (a)n/an/a
Optimistic Capture Trackingfalse positives (b)false negatives (c)less false positives (d)

Right now, only situation (a) exists, where we get false positives for stores to non-aliasing function arguments and globals, as illustrated by the example in the original description of this patch. Optimistic capture tracking behaves exactly the same if no AA is provided, so situation (a) and (b) are identical.

The interesting cases are (c) and (d), especially in combination with the situation you described, where a "value is stored into some alloca then read and the read value stored into some global".

  • For (c), basic AA doesn't perform memory-based data-flow tracking and it returns a false negative, which in turn causes the optimistic capture tracker to return a false negative as well. Returning a false negative is less desirable than returning a false positive, i.e. situation (a). So I suggest we enforce that this scenario can not occur, e.g. with an assert.
  • For (d), it properly detects that the two variables might alias, and we reduce the set of false positives generated by stores to globals and function argument. The lit tests verify this.

The net result is that we ether end up in (a) or (d) which is in my opinion more desirable than what we currently have.

sanjoy added a subscriber: sanjoy.Dec 22 2016, 2:28 PM

Let's consider the following table:

No AABasic AAAdvanced AA
Current Capture Trackingfalse positives (a)n/an/a
Optimistic Capture Trackingfalse positives (b)false negatives (c)less false positives (d)

Right now, only situation (a) exists, where we get false positives for stores to non-aliasing function arguments and globals, as illustrated by the example in the original description of this patch. Optimistic capture tracking behaves exactly the same if no AA is provided, so situation (a) and (b) are identical.

The interesting cases are (c) and (d), especially in combination with the situation you described, where a "value is stored into some alloca then read and the read value stored into some global".

  • For (c), basic AA doesn't perform memory-based data-flow tracking and it returns a false negative, which in turn causes the optimistic capture tracker to return a false negative as well. Returning a false negative is less desirable than returning a false positive, i.e. situation (a). So I suggest we enforce that this scenario can not occur, e.g. with an assert.

What would you assert? That is, what would the assert look like?

  • For (d), it properly detects that the two variables might alias, and we reduce the set of false positives generated by stores to globals and function argument. The lit tests verify this.

Which lit test?

In any case we can't rely on the AA being smart or advanced for correctness. That is, a less aggressive / more conservative AA should not cause us to miscompile code.

The net result is that we ether end up in (a) or (c) which is in my opinion more desirable than what we currently have.

Optimizations based on (c) will generally be incorrect and so (a) or (c) is less desirable than (a). That is, (a) == "always correct", (a) or (c) == "sometimes wrong".

I think Hal's question was more of the tune of: do you have optimizations that will be correct even with false negatives? That is, even if the optimization thinks a value did not escape when it actually did, the transform it does will be be correct? If so, that is interestingly different from what LLVM does today, and worth discussion _before_ we add this new capture tracker to LLVM.

sanjoy requested changes to this revision.Dec 22 2016, 2:28 PM
sanjoy added a reviewer: sanjoy.

As per previous comment.

This revision now requires changes to proceed.Dec 22 2016, 2:28 PM

Let's consider the following table:

...

I think Hal's question was more of the tune of: do you have optimizations that will be correct even with false negatives? That is, even if the optimization thinks a value did not escape when it actually did, the transform it does will be be correct? If so, that is interestingly different from what LLVM does today, and worth discussion _before_ we add this new capture tracker to LLVM.

Yes, this is exactly what I'd like to know.

Let's consider the following table:

No AABasic AAAdvanced AA
Current Capture Trackingfalse positives (a)n/an/a
Optimistic Capture Trackingfalse positives (b)false negatives (c)less false positives (d)

Right now, only situation (a) exists, where we get false positives for stores to non-aliasing function arguments and globals, as illustrated by the example in the original description of this patch. Optimistic capture tracking behaves exactly the same if no AA is provided, so situation (a) and (b) are identical.

The interesting cases are (c) and (d), especially in combination with the situation you described, where a "value is stored into some alloca then read and the read value stored into some global".

  • For (c), basic AA doesn't perform memory-based data-flow tracking and it returns a false negative, which in turn causes the optimistic capture tracker to return a false negative as well. Returning a false negative is less desirable than returning a false positive, i.e. situation (a). So I suggest we enforce that this scenario can not occur, e.g. with an assert.

What would you assert? That is, what would the assert look like?

Originally I wanted to assert the algorithm used for AA. However, now I understand that unless the AA algorithm is perfect, there will always be a case where this generates a false negative. What I'd need for this to work properly is for AA to provide a guaranteed must not alias result. Unfortunately for me, this is inherently incompatible with what we have now. LLVM is (correctly) conservative in saying that something aliases, while I would need an implementation that is conservative in saying that something does not alias.

  • For (d), it properly detects that the two variables might alias, and we reduce the set of false positives generated by stores to globals and function argument. The lit tests verify this.

Which lit test?

captureindirectstore.ll

In any case we can't rely on the AA being smart or advanced for correctness. That is, a less aggressive / more conservative AA should not cause us to miscompile code.

Of course, this goes without saying!

The net result is that we ether end up in (a) or (c) which is in my opinion more desirable than what we currently have.

Optimizations based on (c) will generally be incorrect and so (a) or (c) is less desirable than (a). That is, (a) == "always correct", (a) or (c) == "sometimes wrong".

This was a typo on my part, it should've said (a) or (d). However, it doesn't really matter anymore, because of the reason I mentioned above.

I think Hal's question was more of the tune of: do you have optimizations that will be correct even with false negatives? That is, even if the optimization thinks a value did not escape when it actually did, the transform it does will be be correct? If so, that is interestingly different from what LLVM does today, and worth discussion _before_ we add this new capture tracker to LLVM.

No, I don't have an optimization that can deal with a false negative. I originally extended the capture tracker because I ran into a lot of false negatives for my use cases. This extension made sense from a theoretical point of view, and after testing it, I was able to correctly classify every case I could come up with. I think I confused both you and myself by calling it optimistic. I didn't mean for it to be optimistic in the sense that it would misclassify captures, rather that it could do better than what we currently have with the help of AA. Anyway, now I understand that I can't guarantee my desired scenario (d). I don't really see a way around this limitation, without implementing a inverse-conservative variant of AA.

Do you guys have an alternative idea for reducing the amount of false positives? If not I guess I'll have to abandon this change. :-(

Yes, this is exactly what I'd like to know.

Thank you both for the clarification. I guess I must have misunderstood the question, sorry!

Let's consider the following table:

...

Do you guys have an alternative idea for reducing the amount of false positives? If not I guess I'll have to abandon this change. :-(

To literally do what you'd like, you'll need to do some kind of data-flow analysis. This is possible, but first we should understand the use case. Specifically, why are the variables in question being stored such that the values don't escape but nevertheless mem2reg is not promoting them to SSA values.

JDevlieghere abandoned this revision.Apr 27 2017, 6:00 AM