This is an archive of the discontinued LLVM Phabricator instance.

Introduce Value::stripPointerCastsSameRepresentation
ClosedPublic

Authored by jdoerfert on May 6 2019, 12:50 PM.

Details

Summary
This patch allows current users of Value::stripPointerCasts() to force
the result of the function to have the same representation as the value
it was called on. This is useful in various cases, e.g., (non-)null
checks.

In this patch only a single call site was adjusted to fix an existing
misuse (which is hard to test). Uses in attribute deduction and other
areas, e.g., D60047, are to be expected.

For a discussion on this topic, please see [0].

[0] http://lists.llvm.org/pipermail/llvm-dev/2018-December/128423.html

Event Timeline

jdoerfert created this revision.May 6 2019, 12:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2019, 12:50 PM
arsenm added inline comments.May 6 2019, 1:03 PM
llvm/include/llvm/IR/Value.h
502

I think a separate function name would be more helpful, implemented with a hidden bool argument

jdoerfert marked an inline comment as done.May 6 2019, 1:17 PM
jdoerfert added inline comments.
llvm/include/llvm/IR/Value.h
502

It depends on different things, including taste. I don't feel strong so I'd be fine with a new function as well. Two things to consider though:

Do we want to prevent as-cast stripping in other strip-casts scenarios as well?
This patch enables it for PSK_ZeroIndicesAndAliases but not for PSK_ZeroIndices and PSK_ZeroIndicesAndAliasesAndInvariantGroups. If the answer is yes, then we would end up with even more versions in the stripXXXX family (we have a bunch already).

Do we want to be able to set KeepBitPattern as the default in order for us to "guarantee" correctness and give people a way to opt-in? We can achieve this also if the new function would look through as-casts and stripPointerCasts wouldn't anymore.

hfinkel added inline comments.May 6 2019, 1:46 PM
llvm/include/llvm/IR/Value.h
502

If we want to be able to switch the default at some point, I certainly recommend something with a name so that we can find the various forms with grep (etc.). We can always add more forms as needed (or we can switch the argument type to an enum).

jdoerfert marked an inline comment as done.May 6 2019, 3:34 PM
jdoerfert added inline comments.
llvm/include/llvm/IR/Value.h
502

If we want to be able to switch the default at some point, I certainly recommend something with a name so that we can find the various forms with grep (etc.).

Later, the easiest "switch situation" is with the default argument (= this patch), especially because of down-stream code. Adding a new function now will fix the behavior of the new function. Switching stripPointerCasts later would then require us to replace all uses or introduce yet another function as a way to look through as-casts.

We can always add more forms as needed (or we can switch the argument type to an enum).

Now I'm confused. Do you want an argument to stripPointerCasts or a new function which will have a fixed semantic wrt. as-casts. For the latter, we would introduce either: stripTrivialPointerCasts (= w/o as-casts) or stripAddressSpaceAndPointerCasts (= w/ as-casts) and make stripPointerCasts do the oposite.

jdoerfert updated this revision to Diff 198375.May 6 2019, 5:08 PM

Use a new function for the new behavior

aykevl added a subscriber: aykevl.EditedMay 18 2019, 8:08 AM

To me (someone from outside LLVM), the "keep bit pattern" argument looks confusing. Pointer casts do not necessarily change the bit pattern? In fact, it seems to me that most pointer casts preserve the bit pattern but change the meaning of the pointer (example: address space 1 on AVR which points into flash instead of RAM - no bits are changed with an addrspacecast). Names like stripTrivialPointerCasts and stripAddressSpaceAndPointerCasts look much more obvious to me. Or did I miss the real intent of the change here?
In other words, I would suggest renaming stripPointerCastsSameBitPattern to stripTrivialPointerCasts.

To me (someone from outside LLVM), the "keep bit pattern" argument looks confusing. Pointer casts do not necessarily change the bit pattern? In fact, it seems to me that most pointer casts preserve the bit pattern but change the meaning of the pointer (example: address space 1 on AVR which points into flash instead of RAM - no bits are changed with an addrspacecast).

This isn't how the IR is defined. addrspacecast is explicitly allowed to change the bit pattern of the pointer, and it does for some addrspace combinations on some targets (like AMDGPU). It has to be assumed it could have changed the bitpattern without specific target information.

Yes I know it can change the bit pattern. And if it is only about the bit pattern, the name is fine. However, it seems to me that the only difference between stripPointerCastsSameBitPattern and stripPointerCasts is the AS, which may or may not be relevant to the bit pattern. This is confusing: it isn't clear to me from the name stripPointerCastsSameBitPattern whether it may strip any addrspacecast that has the same bit pattern.

Yes I know it can change the bit pattern. And if it is only about the bit pattern, the name is fine. However, it seems to me that the only difference between stripPointerCastsSameBitPattern and stripPointerCasts is the AS, which may or may not be relevant to the bit pattern. This is confusing: it isn't clear to me from the name stripPointerCastsSameBitPattern whether it may strip any addrspacecast that has the same bit pattern.

I think bitcasts between equivalently sized pointers with different address spaces should be allowed, rather than requiring addrspacecast like today, so I'm not sure I would want to specify this is an address space property

Yes I know it can change the bit pattern. And if it is only about the bit pattern, the name is fine. However, it seems to me that the only difference between stripPointerCastsSameBitPattern and stripPointerCasts is the AS, which may or may not be relevant to the bit pattern. This is confusing: it isn't clear to me from the name stripPointerCastsSameBitPattern whether it may strip any addrspacecast that has the same bit pattern.

I think I do not understand the problem. For me, it is all about the bit-pattern. But in general, I think users should not care what casts are stripped and which are not. It should be clear from the name and description what they can expect. Maybe someone adds logic here to determine if as-casts preserve the bit pattern. That will help users that need the same bit pattern, e.g., the isNonNull family of calls, and it will break with the strict no-as rule. If we add another "as-cast-like" instruction (like D59065) that might change the bit-pattern but not the "logical object" users of any stripPointerCastsXXX call should get the right value.

Yes I know it can change the bit pattern. And if it is only about the bit pattern, the name is fine. However, it seems to me that the only difference between stripPointerCastsSameBitPattern and stripPointerCasts is the AS, which may or may not be relevant to the bit pattern. This is confusing: it isn't clear to me from the name stripPointerCastsSameBitPattern whether it may strip any addrspacecast that has the same bit pattern.

I think I do not understand the problem. For me, it is all about the bit-pattern. But in general, I think users should not care what casts are stripped and which are not. It should be clear from the name and description what they can expect. Maybe someone adds logic here to determine if as-casts preserve the bit pattern. That will help users that need the same bit pattern, e.g., the isNonNull family of calls, and it will break with the strict no-as rule. If we add another "as-cast-like" instruction (like D59065) that might change the bit-pattern but not the "logical object" users of any stripPointerCastsXXX call should get the right value.

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

I'd like to allow bitcasts between pointers with the same size for this purpose

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

There are non-integral pointers, so I don't think the name should refer to integers

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

I'd like to allow bitcasts between pointers with the same size for this purpose

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

There are non-integral pointers, so I don't think the name should refer to integers

Good point. I'll revise my suggestion stripPointerCastsSameRepresentation.

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

I'd like to allow bitcasts between pointers with the same size for this purpose

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

There are non-integral pointers, so I don't think the name should refer to integers

Good point. I'll revise my suggestion stripPointerCastsSameRepresentation.

I don't want to argue about the name at this point. I gladly change it if I get a LGTM with the name change request. (@arsenm?)

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

I'd like to allow bitcasts between pointers with the same size for this purpose

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

There are non-integral pointers, so I don't think the name should refer to integers

Good point. I'll revise my suggestion stripPointerCastsSameRepresentation.

I don't want to argue about the name at this point. I gladly change it if I get a LGTM with the name change request. (@arsenm?)

+1

I agree with this. I recall past conversations about adding something to DataLayout to say that some address-space pairs share the same pointer representation, or some aspects of the representation, I don't know if we ever had a strong-enough use case to write the code, but should such a worthwhile case emerge, I can certainly see us teaching the optimizer that some as-casts don't change (certain relevant aspects of) the underlying representation.

I'd like to allow bitcasts between pointers with the same size for this purpose

As a bit of bikeshedding, I like the term 'representation' better than 'pattern'. I wonder if the name stripPointerCastsSameIntRepresentation would be clearer? (The idea being that pointers are opaque types at the IR level, but get integer representations via ptrtoint).

There are non-integral pointers, so I don't think the name should refer to integers

Good point. I'll revise my suggestion stripPointerCastsSameRepresentation.

I don't want to argue about the name at this point. I gladly change it if I get a LGTM with the name change request. (@arsenm?)

+1

Should I take this as "LGTM with the name change" or as a "name change seems fine, someone else needs to sign-off"? (I will also adjust the commit message accordingly)

jdoerfert updated this revision to Diff 201237.May 24 2019, 7:30 AM

Rename method and rephrase commit message

jdoerfert retitled this revision from Introduce an option to stripPointerCasts to force the same bit pattern to Introduce Value::stripPointerCastsSameRepresentation.May 24 2019, 7:35 AM
jdoerfert edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.May 24 2019, 12:03 PM
This revision was automatically updated to reflect the committed changes.