This is an archive of the discontinued LLVM Phabricator instance.

[Bitcode][OpaquePtr] Remove usages of PointerType's getElementType()
AbandonedPublic

Authored by zequanwu on Jul 1 2021, 4:18 PM.

Details

Reviewers
aeubanks
dblaikie
Summary

Keep track of pointer's element type when parsing typeTable and remove usages of PointerType's getElementType().

Diff Detail

Event Timeline

zequanwu created this revision.Jul 1 2021, 4:18 PM
zequanwu requested review of this revision.Jul 1 2021, 4:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2021, 4:18 PM
nikic added a subscriber: nikic.Jul 1 2021, 11:53 PM

Can you please add some tests? Running some existing bitcode upgrade tests under --force-opaque-pointers should do it.

I don't really get how this is supposed to work. If opaque pointers are enabled, all PointerTypes (in the same address space) are going to be the same, so you can't create a meaningful map from PointerType to element type. The map will effectively store the last element type it has seen on any pointer.

I don't really get how this is supposed to work. If opaque pointers are enabled, all PointerTypes (in the same address space) are going to be the same, so you can't create a meaningful map from PointerType to element type. The map will effectively store the last element type it has seen on any pointer.

This is to be compatible with llvm bitcode file with opaque pointer disabled. Those files have pointers' pointee types encoded. So, we need the map to keep track of pointers' pointee types.

Ah, so this is intended to be an NFC patch - a different way of implementing the same functionality in the bitcode reader? A way that doesn't rely on PointerType having a pointee type present?

Seems pretty reasonable to me.

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
5055–5056

Is this reachable? If so, it should be tested. If not, it should be an assertion, or omitted?

I think the original code with only the direct replacement of "X->getElementType()" -> "getElementType(X)" would be OK? Because we know by construction that it can't return null here? (if it'd be helpful, could have a separate getElementType that asserts (rather than returning null) and one (if needed called getElementTypeOrNull which does the returning null thing?)

But I think most of the getElementType queries will be the non-null kind, so maybe there's no need to have both, and the existing one you're adding here could be changed to assert, rather than returning null?

I don't really get how this is supposed to work. If opaque pointers are enabled, all PointerTypes (in the same address space) are going to be the same, so you can't create a meaningful map from PointerType to element type. The map will effectively store the last element type it has seen on any pointer.

This is to be compatible with llvm bitcode file with opaque pointer disabled. Those files have pointers' pointee types encoded. So, we need the map to keep track of pointers' pointee types.

the issue is what nikic said, that if we ignore address spaces, there will only be one singleton PointerType. We can't map the same PointerType to multiple pointee types, that doesn't do anything.
Rather, we need to map each Value to a potential pointee type.

I don't really get how this is supposed to work. If opaque pointers are enabled, all PointerTypes (in the same address space) are going to be the same, so you can't create a meaningful map from PointerType to element type. The map will effectively store the last element type it has seen on any pointer.

This is to be compatible with llvm bitcode file with opaque pointer disabled. Those files have pointers' pointee types encoded. So, we need the map to keep track of pointers' pointee types.

the issue is what nikic said, that if we ignore address spaces, there will only be one singleton PointerType. We can't map the same PointerType to multiple pointee types, that doesn't do anything.
Rather, we need to map each Value to a potential pointee type.

oooh, right - thanks for walking me through it. Yeah, works for now and correctly removes the calls to getElementType - but relies on the resulting Type* being distinct, so it's still indirectly relying on pointee types... cunning.

Presumably at the time we parse the type we know whether it's distinct or not - but then we don't have a good way of tracking it once we've collapsed the types down to opaque pointers....

Tracking every Value* of pointer type seems expensive to me (but I'm just guessing), but maybe that's what it'd take.

Tracking every Value* of pointer type seems expensive to me (but I'm just guessing), but maybe that's what it'd take.

Yeah it's pretty expensive, although that's what tnorthover's previous attempt did (which was in-tree for more than a year before I reverted it).
If we don't keep track of pointee types for bitcode that only uses opaque pointers, it shouldn't be noticeable for future bitcode that doesn't use typed pointers.

I don't really get how this is supposed to work. If opaque pointers are enabled, all PointerTypes (in the same address space) are going to be the same, so you can't create a meaningful map from PointerType to element type. The map will effectively store the last element type it has seen on any pointer.

This is to be compatible with llvm bitcode file with opaque pointer disabled. Those files have pointers' pointee types encoded. So, we need the map to keep track of pointers' pointee types.

the issue is what nikic said, that if we ignore address spaces, there will only be one singleton PointerType. We can't map the same PointerType to multiple pointee types, that doesn't do anything.
Rather, we need to map each Value to a potential pointee type.

This map only mappes from non-opaque pointer to its pointee type.

We may need a map from Value to potential pointee type to get opaque opinter's pointee type, which could be another patch.

A non-opaque pointer already has its pointee type available, there's no reason to have a separate map.

I was thinking, maybe we should temporarily put aside the goal of removing the pointee type from PointerType so that the bitcode upgrade is easier to implement. So we'd keep things mostly as they are in BitcodeReader with calls to getElementType(), but we'd have a pass immediately after reading the entire bitcode file that makes all pointer types opaque. This seems very doable and hopefully not complicated.

Then in the future, after the opaque pointer transition, we can consider removing the pointee type and having the Value -> pointee type map in BitcodeReader if we really want to remove the pointee type field in PointerType.

A non-opaque pointer already has its pointee type available, there's no reason to have a separate map.

I was thinking, maybe we should temporarily put aside the goal of removing the pointee type from PointerType so that the bitcode upgrade is easier to implement. So we'd keep things mostly as they are in BitcodeReader with calls to getElementType(), but we'd have a pass immediately after reading the entire bitcode file that makes all pointer types opaque. This seems very doable and hopefully not complicated.

Then in the future, after the opaque pointer transition, we can consider removing the pointee type and having the Value -> pointee type map in BitcodeReader if we really want to remove the pointee type field in PointerType.

Adding a transformation pipeline (or is there already a pipeline in use there, that you're proposing adding a pass to?) after bitcode reading seems like non-trivial complexity/technical debt to incur if we can write the code we would like to have in the future (the one that doesn't involve keeping a pointee type field on PointerType) now.

What's the advantage to not writing the code we'd want in the future, today?

zequanwu abandoned this revision.Nov 24 2021, 11:08 AM