Page MenuHomePhabricator

Fix occurrences that size and range of pointers are assumed to be the same.
ClosedPublic

Authored by Joe on Oct 2 2019, 3:33 AM.

Details

Summary

GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places
in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues
with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered.

Diff Detail

Event Timeline

Joe created this revision.Oct 2 2019, 3:33 AM
Joe retitled this revision from Fix occurances that size and range of pointers are assumed to be the same. to Fix occurrences that size and range of pointers are assumed to be the same..Oct 2 2019, 3:34 AM
Nicola added a subscriber: Nicola.Oct 2 2019, 3:35 AM
delena added inline comments.Oct 2 2019, 6:06 AM
llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
11–12

This change does not work for us.
Doing this change we added a way to specify type for "ptrtoint" transformation and for size of GEP index. If you do not specify the index type, this change should be NFC. What exactly does not work for you?

Joe marked an inline comment as done.Oct 2 2019, 6:49 AM
Joe added inline comments.
llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
11–12

I don't believe that size of GEP index and type for "ptrtoint" transformation are the same thing. At least, that's how I interpreted langref: 'The ‘ptrtoint’ instruction converts value to integer type ty2 by interpreting the pointer value as an integer'.

InstCombine was transforming 'ptrtoint i8* to i64' to 'ptrtoint i8* to i32', followed by a zext. Here, where the pointer size is 40 bits, that would mean 8 bits have been chopped off.

If you do not specify the index type, this change is NFC because index size defaults to pointer size.

delena added inline comments.Oct 3 2019, 6:26 AM
llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
11–12

I agree that we loose 8 bits in this case. Our promblem is in i40 which is not an integer and we do not have any arithmetic ISA for i40.
Your current changes may cause many problems in our (out-of-tree) compiler.
Now we all have about 3 weeks local hollidays and we'd like to check your changes interlally when we back. So we whould ask you to hold on with this fix meanwhile.

theraven added inline comments.Oct 4 2019, 6:36 AM
llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
11–12

I am not entirely sure what the transformation that's happening in this test is supposed to show (I would be very happy if we required explanatory comments in tests!), but if the pointer range and the pointer size are not the same then I think any transformations involving ptrtoint / inttoptr are unsound. The only information that this gives us is that a pointer is not just an integer address. It may be an integer address that is sign / zero extended, an integer address that has some metadata in the high bits, a fat pointer, and so on.

Joe marked an inline comment as done.Oct 4 2019, 7:23 AM
Joe added inline comments.
llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
11–12

I think this test was originally written to check that Instcombine is behaving correctly when specifying pointer index range. I.e (in)correctly transforming a ptrtoint instruction to an integer size of that of the pointer index size. (Though, with this change, it would be the pointer size instead)

if the pointer range and the pointer size are not the same then I think any transformations involving ptrtoint / inttoptr are unsound

I'm sorry, I don't quite understand this. Why would they be unsound? Pointer index range surely should have nothing to do with converting pointer values to/from integers.

Is this missing test coverage?
Since it only requires datalayout, i wouldn't think it's impossible to test this.

igorb added a comment.Nov 5 2019, 7:52 AM

Hi Joseph,
Sorry for the delay. I am currently checking your changes, I need a few more days to complete the investigation.

Regards,
Igor

Joe added a comment.Nov 7 2019, 8:09 AM

After further review and testing, I have found that changing the effective SCEV type to be the pointer index type does not work. Though neither does it work when assumed to be pointer size. In many places, these are assumed to be the same. I'm unsure of which way to go with this. Do I force effective SCEV type to be the pointer size and enforce this, or change it to be index size and enforce this?

In my mind, SCEV would treat the evolution of a pointer to be that of it's offset, so it should bear the same type.

Does anyone have an opinion on this?

igorb added a comment.Nov 11 2019, 8:35 AM

Hi Joseph,
Thank you very much for looking into this

I think effective SCEV type must be index size and it should be enforced. Without it, indeed many tests fail. Any pointer arithmetic should fit into index size, in other case target may not be able to handle it.
Something like

--- a/lib/Analysis/ScalarEvolution.cpp
+++ b/lib/Analysis/ScalarEvolution.cpp
@@ -5692,6 +5692,8 @@ ScalarEvolution::getRangeRef(const SCEV *S,
     if (SignHint == ScalarEvolution::HINT_RANGE_UNSIGNED) {
       // For a SCEVUnknown, ask ValueTracking.
       KnownBits Known = computeKnownBits(U->getValue(), DL, 0, &AC, nullptr, &DT);
+      if (Known.getBitWidth() > BitWidth)
+        Known = Known.trunc(BitWidth);
       if (Known.One != ~Known.Zero + 1)

I think you also missed

@@ -9261,7 +9261,7 @@ unsigned SelectionDAG::InferPtrAlignment(SDValue Ptr) const {
   const GlobalValue *GV;
   int64_t GVOffset = 0;
   if (TLI->isGAPlusOffset(Ptr.getNode(), GV, GVOffset)) {
-    unsigned IdxWidth = getDataLayout().getIndexTypeSizeInBits(GV->getType());
+    unsigned IdxWidth = getDataLayout().getPointerTypeSizeInBits(GV->getType());
     KnownBits Known(IdxWidth);
     llvm::computeKnownBits(GV, Known, getDataLayout());

With changes above most of the tests pass, but a few still failing . For example Analysis/ScalarEvolution/implied-via-addition.ll with target datalayout="p:40:64:64:32"

Currently we keep Pointer SCEV type as Index type because we don't have any special SCEV Expression type for pointers.
I suppose, we should add scAddPtrExpr (add index to pointer) and scPtrDiffExpr (pointers diff) to solve this problem. The scAddPtrExpr will always return PonterSizeType and will be expanded to GEP, not to "add". The scPtrDiffExpr will return IndexSizeType and will be expanded to trunc+sub.

Joe updated this revision to Diff 229081.Nov 13 2019, 6:22 AM
Joe edited the summary of this revision. (Show Details)

Updated diff:

I am quite new to SCEV, but it really looks like it just shouldn't work on pointers sometimes. I've fixed it such that it works with custom dl on all tests, so hopefully this is okay.

Also fixed a few more places in which idx and ptr size are assumed to be the same.

Added tests which touch as many places I could manage where I changed ptr type to idx type. Many of the failures were caused due to the ptrtoint.

I think you also missed

@@ -9261,7 +9261,7 @@ unsigned SelectionDAG::InferPtrAlignment(SDValue Ptr) const {
   const GlobalValue *GV;
   int64_t GVOffset = 0;
   if (TLI->isGAPlusOffset(Ptr.getNode(), GV, GVOffset)) {
-    unsigned IdxWidth = getDataLayout().getIndexTypeSizeInBits(GV->getType());
+    unsigned IdxWidth = getDataLayout().getPointerTypeSizeInBits(GV->getType());
     KnownBits Known(IdxWidth);
     llvm::computeKnownBits(GV, Known, getDataLayout());

Surely this should stay as IndexType?

igorb added a comment.Nov 14 2019, 7:59 AM
In D68328#1743867, @Joe wrote:

Updated diff:

I am quite new to SCEV, but it really looks like it just shouldn't work on pointers sometimes. I've fixed it such that it works with custom dl on all tests, so hopefully this is okay.

Also fixed a few more places in which idx and ptr size are assumed to be the same.

Added tests which touch as many places I could manage where I changed ptr type to idx type. Many of the failures were caused due to the ptrtoint.

I think you also missed

@@ -9261,7 +9261,7 @@ unsigned SelectionDAG::InferPtrAlignment(SDValue Ptr) const {
   const GlobalValue *GV;
   int64_t GVOffset = 0;
   if (TLI->isGAPlusOffset(Ptr.getNode(), GV, GVOffset)) {
-    unsigned IdxWidth = getDataLayout().getIndexTypeSizeInBits(GV->getType());
+    unsigned IdxWidth = getDataLayout().getPointerTypeSizeInBits(GV->getType());
     KnownBits Known(IdxWidth);
     llvm::computeKnownBits(GV, Known, getDataLayout());

Surely this should stay as IndexType?

Without the change it fail in computeKnownBits on assert(ExpectedWidth == BitWidth && "V and Known should have same BitWidth");
And we should check pointer alignment.

I think emitPointerArithmetic (clang) should be updated also

diff --git a/lib/CodeGen/CGExprScalar.cpp b/lib/CodeGen/CGExprScalar.cpp
index 105e937ca5b..01eb1a1f76e 100644
--- a/lib/CodeGen/CGExprScalar.cpp
+++ b/lib/CodeGen/CGExprScalar.cpp
@@ -3194,10 +3194,10 @@ static Value *emitPointerArithmetic(CodeGenFunction &CGF,
                                                        expr->getRHS()))
     return CGF.Builder.CreateIntToPtr(index, pointer->getType());
 
-  if (width != DL.getTypeSizeInBits(PtrTy)) {
+  if (width != DL.getIndexTypeSizeInBits(PtrTy)) {
     // Zero-extend or sign-extend the pointer value according to
     // whether the index is signed or not.
-    index = CGF.Builder.CreateIntCast(index, DL.getIntPtrType(PtrTy), isSigned,
+    index = CGF.Builder.CreateIntCast(index, DL.getIndexType(PtrTy), isSigned,
                                       "idx.ext");
   }

Thanks

Joe updated this revision to Diff 229340.Nov 14 2019, 9:59 AM

Addressed @igorb 's comments.

igorb added a subscriber: sanjoy.Nov 17 2019, 5:42 AM

adding @sanjoy for more thoughts on such a proposed SCEV change

delena added inline comments.Nov 17 2019, 6:12 AM
llvm/lib/Analysis/ScalarEvolution.cpp
5720

Your patch will work when Index type < Pointer type. So it's ok for us.
In general, I'd add a function to ScalarEvoultion bool truncatesEffectiveSCEVType(Type *, int *NumOfTruncatedBits) that returns true if SCEV effective type smaller than real type and use it here:

if (Known.getBitWidth() > BitWidth && truncatesEffectiveSCEVType(U->getValue()->getType())

Known = Known.trunc(BitWidth);

And I still assume that new SCEV types scAddPtr and scPtrDiff will allow to remove these compensations.

igorb accepted this revision.Nov 24 2019, 11:01 PM

LGTM,
Thanks

This revision is now accepted and ready to land.Nov 24 2019, 11:01 PM
Joe added a comment.Nov 25 2019, 1:42 AM

LGTM,
Thanks

Brilliant. Thank you.
I don't have commit access. Can you commit this on my behalf please? Or I can try and request commit access.

igorb added a comment.Nov 25 2019, 4:24 AM
In D68328#1758448, @Joe wrote:

LGTM,
Thanks

Brilliant. Thank you.
I don't have commit access. Can you commit this on my behalf please? Or I can try and request commit access.

Hi,
Sorry, at recent time i am not active contributor, please request commit access.

Joe updated this revision to Diff 231658.Dec 2 2019, 3:05 AM

Rebased on current head.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2019, 2:16 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript