This is an archive of the discontinued LLVM Phabricator instance.

[CaptureTracker] Comparisons of allocation pointers do not capture
Changes PlannedPublic

Authored by jdoerfert on Aug 19 2019, 9:47 PM.

Details

Summary

If two pointers compared in an ICmp instruction are both the result
of an allocation, or pointing into/to the end of an allocation, hence:

  • dereferenceable pointers,
  • noalias return pointers, (including the unique value potentiall resulting from a malloc(0) call)
  • null, (if null is not a valid pointer)

then there are no "bit-tricks" possible to learn about the pointer
bits. Put differently, one cannot learn about a pointer in these
category because one cannot manipulate the other pointer freely while
keeping it in these categories. So the "other pointer" is either
"fixed" (null or something and special returned by malloc(0)) or for
the most part not controllable, e.g., the "random" allocation location
(alloc, malloc, etc.) which only allows manipulation of a few bits up
to the (log of the) allocation size.

Event Timeline

jdoerfert created this revision.Aug 19 2019, 9:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2019, 9:47 PM

I'm missing something here. If I have two pointers (e.g., returned from malloc), and I compare them so I know they're equal, then I've learned that the pointers are equal -- and, thus, if I know all of the bits of one pointer I now know all of the bits of the other pointer. As a result, I can still reconstruct it later using that information.

I'm missing something here. If I have two pointers (e.g., returned from malloc), and I compare them so I know they're equal, then I've learned that the pointers are equal -- and, thus, if I know all of the bits of one pointer I now know all of the bits of the other pointer. As a result, I can still reconstruct it later using that information.

I'll try to explain why I think this is fine:
If you compare the same pointer (or derived from the same), you need to capture at least one from them "explicitly" to learn the bits. You cannot use ICmps between them for that (without leaving the categories specified) so you need an "explicit" capture somehow. The comparison we would allow here might give you the offset of the second pointer, which, assuming you know the bits of the first, tells you the second one. However, I think this is no more information than the captured pointer alone, assuming "capturing" does not only leak the bits of the pointer but the address range of the whole underlying object.

Yes, you can't capture a pointer that's already captured.

The more interesting scenario is something like this: suppose you have three consecutive 16-byte allocations, laid out one after another. The first and the third are captured; the second is not. Can you capture a pointer to the second allocation using comparisons against the first and third allocations? I think the only reasonable answer here is yes; you've proven the value of every bit of the pointer.

It doesn't really matter if three allocations don't reliably produce this layout; for any particular run of the program, we have to choose some memory layout.

If you compare the same pointer (or derived from the same), you need to capture at least one from them "explicitly" to learn the bits.

Yes, but then we'd need some way to account for that. If pointer A is compared to pointer B, and then pointer B is unambiguously captured, then so has been pointer A (at least in the part of the control flow dominated by the true edge of the comparison).

Yes, you can't capture a pointer that's already captured.

! In D66461#1640565, @hfinkel wrote:
If you compare the same pointer (or derived from the same), you need to capture at least one from them "explicitly" to learn the bits.

Yes, but then we'd need some way to account for that. If pointer A is compared to pointer B, and then pointer B is unambiguously captured, then so has been pointer A (at least in the part of the control flow dominated by the true edge of the comparison).

You can capture A that way if A and B are (basically) equal. Now in the part in which the capture of A through the icmp matters you already have B captured for sure, that is an alias of A is captured so A is captured even if not directly through the expression of A.

The more interesting scenario is something like this: suppose you have three consecutive 16-byte allocations, laid out one after another. The first and the third are captured; the second is not. Can you capture a pointer to the second allocation using comparisons against the first and third allocations? I think the only reasonable answer here is yes; you've proven the value of every bit of the pointer.

It doesn't really matter if three allocations don't reliably produce this layout; for any particular run of the program, we have to choose some memory layout.

I see your point but I think the basic idea is still sound, the patch needs an addition though.
As I described, the compare will capture only if the other pointer was captured. So additionally to the checks in the patch ask the tracker if that is the case, e.g.,

if (IsDerefOrNoAlias0 && IsDerefOrNoAlias1)
  if (!Tracker->capture(Op1))
      break;

Maybe we can even get rid of the categories and say sth along the lines of:

// Op0, the use of V in the ICmp is capturing if Op1 is captured. Otherwise,
// Op1 is opaque and comparing it with another opaque pointer is not going
// to yield information.
if (!Tracker->capture(Op1)
   break;

What do you think?

As I described, the compare will capture only if the other pointer was captured. So additionally to the checks in the patch ask the tracker if that is the case

You can't actually reconstruct the pointer in that case, but the bits are still significant; if two functions sort a list of pointers, they have to sort the same way. Whether that means the pointer is "captured" depends on the definition of "captured"; I can think of transforms that don't care, and transforms that do care.

As I described, the compare will capture only if the other pointer was captured. So additionally to the checks in the patch ask the tracker if that is the case

You can't actually reconstruct the pointer in that case, but the bits are still significant; if two functions sort a list of pointers, they have to sort the same way. Whether that means the pointer is "captured" depends on the definition of "captured"; I can think of transforms that don't care, and transforms that do care.

You lost me. When I say "captured" I mean (a bit of) the pointer value is leaked. When I above said you have two pointers, if neither is captured so far a comparison between them doesn't capture either I mean that we do not need to worry about the bits being leaked and later used to access the memory or doing other odd things.

Thinking about it a bit more, that definition is probably okay.

It would be possible to define a stronger property, that the value of the pointer isn't significant for the operation. So if a pointer points to the value "3", you can replace it with some other pointer that also points to "3". Sort of like unnamed_addr on globals. But that's stronger than you need for alias analysis.

aqjune added a subscriber: aqjune.Aug 22 2019, 2:09 PM
uenoku added a subscriber: uenoku.Aug 22 2019, 4:18 PM

As I described, the compare will capture only if the other pointer was captured.

But the other pointer can be captured afterward, no?

jdoerfert planned changes to this revision.Aug 22 2019, 11:58 PM

As I described, the compare will capture only if the other pointer was captured.

But the other pointer can be captured afterward, no?

Yes, but the idea was that because it is captured we would not rule the icmp as "benign".
However, trying to update the patch I realized this is not as simple as the scope in which the other pointer is "not captured" seems important.

I will think about this further and (hopefully) come up with a proposal that might use capture information and the categories I had here as they restrict how a pointer can be "prepared" in the first place.

If anyone has a good idea, let me know, I am almost certain we can do better here and I believe we should as pointer comparisons seem like something that happens frequently.

xbolva00 resigned from this revision.Nov 23 2019, 12:55 PM
sanjoy resigned from this revision.Jan 29 2022, 5:40 PM