This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] PtrToInt on non-integral pointers is allowed
ClosedPublic

Authored by lebedev.ri on Jun 15 2021, 2:29 PM.

Details

Summary

As per (committed without review) @reames's rGac81cb7e6dde9b0890ee1780eae94ab96743569b change,
we are now allowed to produce ptrtoint for non-integral pointers.
This will unblock further unbreaking of SCEV regarding int-vs-pointer type confusion.

Diff Detail

Event Timeline

lebedev.ri created this revision.Jun 15 2021, 2:29 PM
lebedev.ri requested review of this revision.Jun 15 2021, 2:29 PM
mkazantsev accepted this revision.Jun 15 2021, 9:48 PM
This revision is now accepted and ready to land.Jun 15 2021, 9:48 PM
This revision was automatically updated to reflect the committed changes.

JFYI, I don't object to this change, but I am hesitant about the direction you seem to be indicating. Please don't use the excuse of our handling of non-integral types being somewhat broken to further break them. I'll comment on future reviews if appropriate.

nikic added a comment.Jun 16 2021, 8:24 AM

As a drive-by note, it would be great if you could expand LangRef on non-integral pointers a bit. It made sense to me when it specified that you can't use ptrtoint on a non-integral pointer, but without that limitation, it's not really clear to me what the actual difference between a non-integral pointer and a normal one is. What transforms are you not allowed to perform on a non-integral pointer that you can perform on a normal one?

Roughly, we can either end up with SCEV effectively being inoperable for non-integral pointers,
or the users of non-integral pointers having issues with lowering the ptrtoint's produced by SCEV.
I don't know what is worse, and i'm presently not sure i care, but i do believe that we can not
leave the current SCEV casual-ness of treating pointers as integers as-is.
I actually had some of the patches @efriedma posted locally, and i agree with them.

As a drive-by note, it would be great if you could expand LangRef on non-integral pointers a bit. It made sense to me when it specified that you can't use ptrtoint on a non-integral pointer, but without that limitation, it's not really clear to me what the actual difference between a non-integral pointer and a normal one is. What transforms are you not allowed to perform on a non-integral pointer that you can perform on a normal one?

Yes, i would also like to see such documentation, especially if non-integral pointers
are going to be used as an "arbitrary" roadblock for SCEV changes. I would have posted
that in the review for the commit mentioned, but there was none, which also highlights
the problem around non-integral pointer status in llvm :)

As a drive-by note, it would be great if you could expand LangRef on non-integral pointers a bit. It made sense to me when it specified that you can't use ptrtoint on a non-integral pointer, but without that limitation, it's not really clear to me what the actual difference between a non-integral pointer and a normal one is. What transforms are you not allowed to perform on a non-integral pointer that you can perform on a normal one?

Yes, i would also like to see such documentation, especially if non-integral pointers
are going to be used as an "arbitrary" roadblock for SCEV changes. I would have posted
that in the review for the commit mentioned, but there was none, which also highlights
the problem around non-integral pointer status in llvm :)

On the documentation side, I'd love to, but I'm honestly not sure *how* to. There's two inter-related problems here. The first is the semantics of an inttoptr is highly dependent on the target for non-integral pointers. At the moment, it can basically only be used to implement non-inlined built-in routines in any practical way. The second issue is that our definition of the integral pointer types themselves appear to be in flux, and are very vague about certain key details. The result is I'm left unsure how to formally specify them. This is why I used the implementation defined wording I did.

On the SCEV side, I understand the frustration, but I think you're also mischaracterizing slightly. SCEV has long had the notion of subtracting two pointers which has been "questionable" the whole time as the semantics of subtracting two unrelated pointers is unclear from the underlying IE. (IR doesn't have subtract, but it does have icmp which is more or less the same.) Eli's recent changes - which are making progress btw, even if slow - are the first I've seen to really raise the question if subtract is a primitive which should be representable for pointers in SCEV. (I'll also add that the confusion around whether we could still have sizeless pointers - which I admittedly contributed to - has only recently been cleared up.)

In terms of forward progress, I am willing to accept crippling SCEV for NI pointers provided that all the changes are otherwise well structured and make sense. I'm not thrilled, and I will be a skeptical reviewer, but I won't block changes which are well justified.

Hm, random vaguely OT thought, would it be worthwhile to add an explicit pointer difference node in SCEV? This would allow us to form subtracts of pointers without needing the nasty multiply by negative one trick, and we could optimize them to the lossless inttoptr form for integral pointers, and not for NI pointers. Might be worth a bit of further thought. I'll follow up if I still think this is a good idea in an hour. :)

As a drive-by note, it would be great if you could expand LangRef on non-integral pointers a bit. It made sense to me when it specified that you can't use ptrtoint on a non-integral pointer, but without that limitation, it's not really clear to me what the actual difference between a non-integral pointer and a normal one is. What transforms are you not allowed to perform on a non-integral pointer that you can perform on a normal one?

Yes, i would also like to see such documentation, especially if non-integral pointers
are going to be used as an "arbitrary" roadblock for SCEV changes. I would have posted
that in the review for the commit mentioned, but there was none, which also highlights
the problem around non-integral pointer status in llvm :)

On the documentation side, I'd love to, but I'm honestly not sure *how* to. There's two inter-related problems here. The first is the semantics of an inttoptr is highly dependent on the target for non-integral pointers. At the moment, it can basically only be used to implement non-inlined built-in routines in any practical way. The second issue is that our definition of the integral pointer types themselves appear to be in flux, and are very vague about certain key details. The result is I'm left unsure how to formally specify them. This is why I used the implementation defined wording I did.

On the SCEV side, I understand the frustration, but I think you're also mischaracterizing slightly. SCEV has long had the notion of subtracting two pointers which has been "questionable" the whole time as the semantics of subtracting two unrelated pointers is unclear from the underlying IE. (IR doesn't have subtract, but it does have icmp which is more or less the same.) Eli's recent changes - which are making progress btw, even if slow - are the first I've seen to really raise the question if subtract is a primitive which should be representable for pointers in SCEV. (I'll also add that the confusion around whether we could still have sizeless pointers - which I admittedly contributed to - has only recently been cleared up.)

In terms of forward progress, I am willing to accept crippling SCEV for NI pointers provided that all the changes are otherwise well structured and make sense. I'm not thrilled, and I will be a skeptical reviewer, but I won't block changes which are well justified.

Glad that we have established this.

Hm, random vaguely OT thought, would it be worthwhile to add an explicit pointer difference node in SCEV? This would allow us to form subtracts of pointers without needing the nasty multiply by negative one trick, and we could optimize them to the lossless inttoptr form for integral pointers, and not for NI pointers. Might be worth a bit of further thought. I'll follow up if I still think this is a good idea in an hour. :)

Doesn't sound that thrilling to me.

Took a shot at a minimal crippling of SCEV over in https://reviews.llvm.org/D104403. That isn't a patch ready to land, but it should let us collect some perf numbers on the impact of crippling SCEV's ability to reason about non-integral pointers.

Slightly OT, but I realized an increasing reliance on ptrtoint in code SCEV analyzes and expands might interact badly with the current discussion around pointer provenance and end up negatively impacting AA results. Just a thought at the moment. (This is not related to NI pointers at all.)