This is an archive of the discontinued LLVM Phabricator instance.

[SCEVExpander] Support expanding nonintegral pointers with constant base.
ClosedPublic

Authored by fhahn on Sep 17 2020, 5:39 AM.

Details

Summary

Currently SCEVExpander creates inttoptr for non-integral pointers if the
base is a null constant for example. This results in invalid IR.

This patch changes InsertNoopCastOfTo to emit a GEP to convert to a
non-integral pointer. The GEP uses null as base. The original value is
converted to an index by dividing by the alloc size of the type.

Note that this requires the original value to be divisible by the alloc
size, otherwise the pointer value will be changed. I am not sure if that
is an issue in practice (ideally the expander would only be used for
expanding pointer arithmetic that keeps the result divisible by the
alloc size) and there is not much we could do otherwise.

I guess we could instead try to create a pointer with alloc size of 1
byte, create a GEP with the original value as index and bitcast to the
target type. I am not sure if we can always do that, but I guess we
could add an assertion if that happens.

This was exposed by D71539.

Diff Detail

Event Timeline

fhahn created this revision.Sep 17 2020, 5:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 17 2020, 5:39 AM
fhahn requested review of this revision.Sep 17 2020, 5:39 AM
arsenm added inline comments.Sep 17 2020, 7:15 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
134

It seems broken that getCastOpcode would return IntToPtr in a situation where it's illegal

fhahn added inline comments.Sep 17 2020, 7:56 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
134

Yes, that seems a rather big hole in getCastOpcode unfortunately and should probably also be changed.

But I think the key question is how to best deal with the problem during SCEV expansion here, because AFAIK we do not have a way to indicate that we failed to expand an expression in the expander.

Is it enough to emit a GEP with a computed index? Or should we better try to get a type with an alloc size of 1 byte and use that? I guess there could be configurations which do not have such a type, not sure what to do then.

I have a proposal in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which would solve this, among other issues. Not sure when I'll get around to actually implementing it.

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
134

In general, for non-null pointers, SCEVExpander uses a GEP over i8* (in the appropriate address space). We should do that even if the implicit null is missing. The offset isn't necessarily divisible by the element type: SCEV looks through bitcasts.

I guess there could be configurations which do not have such a type

i8 is always one byte. (This should be true even with an out-of-tree patchset that makes bytes bigger than 8 bits: it would just be a non-byte-sized type, similar to i1.)

fhahn updated this revision to Diff 292731.Sep 18 2020, 2:45 AM

Change generated code to use bitcast (GEP i8* null, %v) to Ty. Also adds an assertion that i8 has an alloc size of 1 byte.

fhahn added a comment.Sep 18 2020, 2:50 AM

I have a proposal in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which would solve this, among other issues. Not sure when I'll get around to actually implementing it.

This looks very interesting, but like a non-trivial amount of work. The expander problem is currently blocking the re-landing of D71539 and it seems the fix in the expander is relatively small/straight-forward and it would be great to unblock that change :)

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
134

In general, for non-null pointers, SCEVExpander uses a GEP over i8* (in the appropriate address space). We should do that even if the implicit null is missing. The offset isn't necessarily divisible by the element type: SCEV looks through bitcasts.

I guess there could be configurations which do not have such a type

i8 is always one byte. (This should be true even with an out-of-tree patchset that makes bytes bigger than 8 bits: it would just be a non-byte-sized type, similar to i1.)

Oh, I thought there might be out-of-tree targets that have i8 with different alloc sizes, but if that is nothing to worry about we can just emit a GEP with i8* null as base pointer. I updated the code and added an extra assertion to make things blow up if the assumption is broken.

efriedma added inline comments.Sep 18 2020, 2:34 PM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
131

It's probably worth noting the reason we know this is safe. Semantically, a GEP from null is quite different from an inttoptr: a GEP from null can't be used to access memory. I guess the reason this is safe is that we know the original IR must also have been a GEP from null.

fhahn updated this revision to Diff 292920.Sep 18 2020, 3:50 PM

expand comment stating why it should be safe to use GEP of null instead of inttoptr as Eli suggested, thanks!

fhahn added inline comments.Sep 18 2020, 3:54 PM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
131

Updated, thanks! I think in theory the user could also use expandCodeFor to expand a SCEV expression that does not originate from a pointer and force the result to be casted to a pointer (e.g. expandCodeFor(2, PtrTy)), but this seems like an invalid use of the API to me. (Which we unfortunately cannot easily catch/detect)

This revision is now accepted and ready to land.Sep 18 2020, 8:19 PM
This revision was landed with ongoing or failed builds.Sep 19 2020, 9:22 AM
This revision was automatically updated to reflect the committed changes.