Derive GEP index type from Data Layout
ClosedPublic

Authored by delena on Jan 16 2018, 12:06 PM.

Details

Summary

In the current version InstCombiner “normalizes” GEPs and extends Index operand to the pointer width.

It works fine if you can convert pointer to integer for address calculation and all registered targets do this.
The target I’m working on has very restricted ISA for the pointer calculation. Hal suggested to retrieve information for GEP index width from Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html

I added the interface to Data Layout and I changed the InstCombiner.
I know that I didn't touch all GEP creation points, but all changes that you see in the review are covered by our internal test system.

Diff Detail

Repository
rL LLVM
delena created this revision.Jan 16 2018, 12:06 PM
craig.topper added inline comments.Jan 16 2018, 5:15 PM
lib/Transforms/InstCombine/InstructionCombining.cpp
1887 ↗(On Diff #129964)

Should this assert message be updated since it not guaranteed to be pointer width now?

1927 ↗(On Diff #129964)

Same with this assert.

delena updated this revision to Diff 130096.Jan 16 2018, 10:41 PM
delena marked 2 inline comments as done.

Fixed 2 "assert" messages.

delena updated this revision to Diff 130138.Jan 17 2018, 5:10 AM
delena edited the summary of this revision. (Show Details)

Added tests for a data layout, where pointer is wider than the largest supported integer type.

craig.topper added inline comments.Jan 18 2018, 5:05 PM
lib/Analysis/ConstantFolding.cpp
812 ↗(On Diff #130138)

Can you put curly braces after this if and the for loop below it to help with readability? I know the for loop didn't have them before, but I feel like it should have. I tend to think that if the inner scope is curly braced, the outer scope should be too.

theraven requested changes to this revision.Jan 19 2018, 4:19 AM

I don't like this patch as is, for several reasons.

  1. It's adding a hack that assumes that the offset should be the width of the widest integer operation. This is probably true in most cases (it is for us), but if we're going to introduce the idea that an address offset is distinct from the size of the pointer then we should do it properly and add that to the TargetInfo string explicitly (defaulting to the same size, if not specified).
  2. We're computing the correct width every time it's requested, which looks expensive. TargetInfo should store the width for each address space and, for non-vector types, not have to do any calculation to determine the kind of integer to return.
  3. It fixes only around 20% of the places that we've found that assume that the size and range of the pointer are the same.
This revision now requires changes to proceed.Jan 19 2018, 4:19 AM
delena updated this revision to Diff 130600.Jan 19 2018, 6:17 AM
delena marked an inline comment as done.

Updated, following Craig's comments.

I don't like this patch as is, for several reasons.

  1. It's adding a hack that assumes that the offset should be the width of the widest integer operation. This is probably true in most cases (it is for us), but if we're going to introduce the idea that an address offset is distinct from the size of the pointer then we should do it properly and add that to the TargetInfo string explicitly (defaulting to the same size, if not specified).

So you propose to extend Data Layout string and add index size to it, right? It was one of options that Hal suggested. Ok.

  1. We're computing the correct width every time it's requested, which looks expensive. TargetInfo should store the width for each address space and, for non-vector types, not have to do any calculation to determine the kind of integer to return.

We calculated getIntPtrType() anyway, getIndexType() is not more expensive. If I extend TargetInfo, the extension will be optional and all other targets will calculate getIntPtrType() anyway.

  1. It fixes only around 20% of the places that we've found that assume that the size and range of the pointer are the same.

I can't derive all places from your code. You can show me them all, one-by-one, or we'll fix more places gradually on top of this patch.

I don't like this patch as is, for several reasons.

  1. It's adding a hack that assumes that the offset should be the width of the widest integer operation. This is probably true in most cases (it is for us), but if we're going to introduce the idea that an address offset is distinct from the size of the pointer then we should do it properly and add that to the TargetInfo string explicitly (defaulting to the same size, if not specified).

So you propose to extend Data Layout string and add index size to it, right? It was one of options that Hal suggested. Ok.

Yes, if we're going to fix this upstream, let's fix it properly.

  1. We're computing the correct width every time it's requested, which looks expensive. TargetInfo should store the width for each address space and, for non-vector types, not have to do any calculation to determine the kind of integer to return.

We calculated getIntPtrType() anyway, getIndexType() is not more expensive. If I extend TargetInfo, the extension will be optional and all other targets will calculate getIntPtrType() anyway.

If you read it from the DataLayout string, you'll either construct it at that parsing time from the specified version or from the default version.

  1. It fixes only around 20% of the places that we've found that assume that the size and range of the pointer are the same.

I can't derive all places from your code. You can show me them all, one-by-one, or we'll fix more places gradually on top of this patch.

Greping the code for all uses of getPointerBaseSize should show them all, but I can send you a list.

I want to deprecate SCEVs for pointers if the index size is not equal to pointer size.
What do you think?

bool ScalarEvolution::isSCEVable(Type *Ty) const {
  if (Ty->isIntegerTy())
    return true;
  if (Ty->isPointerTy()) {
    // Pointer can't be scevable if index type and pointer type have different
    // width.
    const DataLayout& DL = getDataLayout();
    if (DL.getIndexTypeSizeInBits(Ty) == DL.getPointerTypeSizeInBits(Ty))
      return true;
  }
  return false;

}

I want to deprecate SCEVs for pointers if the index size is not equal to pointer size.
What do you think?

This will mean that you don't get a load of loop optimisations. I think that's a pretty big hammer. There's no reason why SCEV can't work here - we use it and have a bunch of patches against it to make it work in this context. Please take a look at our code and see how much of it is applicable to you.

I looked at your code:
/ Return the size in bits of the specified type, for which isSCEVable must
/ return true.
uint64_t ScalarEvolution::getTypeSizeInBits(Type *Ty) const {

assert(isSCEVable(Ty) && "Type is not SCEVable!");
const DataLayout &DL = getDataLayout();
if (PointerType *PT = dyn_cast<PointerType>(Ty))
  return DL.getPointerBaseSizeInBits(PT->getPointerAddressSpace());
return DL.getTypeSizeInBits(Ty);

}
I can't say that size of pointer is smaller that it is. I can't truncate pointer to integer in order to expand all SCEV expressions.

Ayal added a subscriber: Ayal.Mon, Jan 29, 9:31 AM
delena updated this revision to Diff 131955.Tue, Jan 30, 6:07 AM

Added index width specification to the DataLayout. Updated the langref.
Fixed Pointer vs Index sizes in the code.
Added more tests.

sanjoy added inline comments.Tue, Jan 30, 10:25 AM
../lib/Analysis/ScalarEvolution.cpp
3667 ↗(On Diff #131955)

Generally speaking; the SCEV changes need to be tested.

3675 ↗(On Diff #131955)

I don't think this is a correct place to make this change -- the size of a pointer is the size of a pointer. I think you need to change the SCEV corresponding to GEP(Ptr, Idx) to be "sext(Ptr) + Idx" or "Ptr + sext(Idx)" depending on their relative sizes.

delena added inline comments.Wed, Jan 31, 1:46 AM
../lib/Analysis/ScalarEvolution.cpp
3667 ↗(On Diff #131955)

I added several tests that go through the SCEV. Looks ok right now. I can't say that I cover all corner cases, but we can do further changes gradually, there is no impact on in-tree targets. If you see something specific that requires more testing now, please let me know.

3675 ↗(On Diff #131955)

I can't create SCEV expressions with ptr+ind, it will fail with assertion on different types.

Ayal added inline comments.Thu, Feb 1, 2:47 AM
../lib/Analysis/ScalarEvolution.cpp
3675 ↗(On Diff #131955)

Elena, Sanjoy's thought above to change SCEV to be "sext(Ptr) + Idx" or "Ptr + sext(Idx)" will bring the two addends to be of the same type, i.e., of the larger type. The challenge in your case is lack of target support for integer addition of pointer-sized integers; which seems similar to CHERI's case. Except CHERI pointers (or capabilities) hold in addition to a standard-sized address additional information, such that the latter can be stripped out for SCEV purposes (IIUC - @theraven please correct if needed); whereas in your case the address itself is larger than a standard-sized integer. Perhaps for your case too the pointer can be stripped down to standard-sized integers to leverage SCEV's capabilities on "legal" types, which seems to be what your patch is doing, coupled with separate logic that deals with the stripped out bits(?).

theraven added inline comments.Thu, Feb 1, 3:49 AM
../lib/Analysis/ScalarEvolution.cpp
3675 ↗(On Diff #131955)

That's pretty much the case for us: our pointers are 128 bits, but have a 64-bit range (64 bits of metadata). We have modified DataLayout to explicitly understand that pointer size and range are different (in a slightly hacky way, which we should improve before we think about upstreaming). In scalar evolution, we always use the pointer's range as the type.

We don't support arbitrary integer operations on pointers and in our back end we have added some new MVTs to represent non-integer pointer types. Our architecture provides pointer + integer addressing modes.

I believe that, in the motivating example for this change, the existing ScalarEvolution code is correct: it should use pointer-sized integers, because otherwise the analyses are likely to be wrong in some exciting corner cases.

We have addressed this by adding explicit PTRADD SelectionDAG nodes, which perform pointer + integer addition. For complex addressing modes, we end up with (ptradd base (some complex integer ops)). This works well as long as the underlying hardware supports address register + integer register addressing, which I presume is the case for Intel (it is for all Harvard architectures that I've come across).

If you are targeting an architecture for which pointer operations and integer operations are not the same, then you should follow the same approach: in the back end, lower pointers to some non-integer type and match pointer operations with different patterns to integer ones. We have a bunch of SelectionDAG and TableGen patches that make this work well, which we'd be happy to upstream.

delena added a comment.Thu, Feb 1, 4:09 AM
" > We have addressed this by adding explicit PTRADD SelectionDAG nodes, which perform pointer + integer addition. For complex addressing modes, we end up with (ptradd base (some complex integer ops)). This works well as long as the underlying hardware supports address register + integer register addressing, which I presume is the case for Intel (it is for all Harvard architectures that I've come across)."

Yes, we also added ADDPTR node for SelectionDAG and we have more changes related to the special pointer type. Apparently, the codegen does not work with MVT::Ptr.
We can try to upstream the part of DAG builder, that makes ADDPTR from GEP.

@theraven , the latest uploaded version is aligned with what you implemented out of the tree. Could you, please, take a look?

theraven accepted this revision.Tue, Feb 6, 2:29 AM

Two very small nits (which I'd be happy to see fixed after commit, but might be easier to fix first), but otherwise it looks like a significantly cleaned-up version of what we have.

Thank you very much for working on this! Our next merge will be a little bit painful, but subsequent ones should be a lot easier.

Are you planning on upstreaming your ADDPTR SelectionDAG stuff? We have added PTRADD, INTTOPTR and PTRTOINT nodes and if they're useful to someone apart from us then we can upstream them.

../include/llvm/IR/DataLayout.h
357 ↗(On Diff #131955)

Please can we not have a default for AS? We've added defaults for other things like this because they were existing APIs and we didn't want to have to update all of the callers at once, but all of the callers of this are already being updated and so should specify the correct AS.

../lib/Analysis/ScalarEvolution.cpp
3675 ↗(On Diff #131955)

I agree with @sanjoy that this isn't the correct place for this change, but it does happen to be the least disruptive place for the change. The correct solution is probably to rename this method to something like getTypeArithmeticSizeInBits so that it's clear that it's returning a size as a proxy for a range and not a storage size.

This revision is now accepted and ready to land.Tue, Feb 6, 2:29 AM
sanjoy added inline comments.Tue, Feb 6, 1:06 PM
../lib/Analysis/ScalarEvolution.cpp
3667 ↗(On Diff #131955)

I don't see these tests in this current version of the patch.

3675 ↗(On Diff #131955)

Given what you said, the right fix seems to be to truncate 128 bit pointers to 64 bits in getSCEV instead of lying about the pointer's size. SCEV calls into other parts of LLVM like ValueTracking, and other parts of LLVM call into SCEV (invdvars, lsr, scev-aa etc.) and I'm worried that a discrepancy like this (pointer size = 64 in SCEV but 128 elsewhere) will cause bugs.

delena added a comment.Tue, Feb 6, 9:36 PM

I don't see these tests in this current version of the patch.

All tests that you see *-custom.ll" go through the scev calculations.

theraven added inline comments.Wed, Feb 7, 2:54 AM
../lib/Analysis/ScalarEvolution.cpp
3675 ↗(On Diff #131955)

Truncating for us would be absolutely the wrong thing, because it changes the semantics (throws away all of the bounds metadata, falls back to some per-environment bounds which may not even allow access to this address). In our case, we have a difference between the size and the range.

All of the places we've seen this used in ScalarEvolution, it cares about the range, not the storage size, it just happens that on most architectures these are the same (because a 32-bit pointer has a range of 2^32 bytes, a 64-bit pointer has a range of 2^64 bytes).

The correct solution is to either rename this method something like getTypeArithmeticSizeInBits and update all of the callers, or add a new getTypeArithmeticSizeInBits method and update all of the callers (I don't believe that there are any that care about the storage size, but I might have missed one).

We've been running with a change that's semantically identical to @delena's for a few years and have not encountered any miscompilations as a result (building a whole of the FreeBSD base system, a bunch of benchmarks, and a few other large programs), so I'm fairly confident that it's safe.

Thank you very much for working on this. It will make our future upstream merges much easier.

../docs/LangRef.rst
1913 ↗(On Diff #131955)

fourth parameter

../include/llvm/IR/DataLayout.h
357 ↗(On Diff #131955)

Yes, please remove the default value here. We have run into lots of issues due to using the size of AS0 instead of the correct one.

../lib/Transforms/InstCombine/InstructionCombining.cpp
1511 ↗(On Diff #131955)

Index width may not be the same width as pointer width

delena updated this revision to Diff 133387.Thu, Feb 8, 2:21 AM
delena marked 3 inline comments as done.

Updated according to the latest comments.

delena updated this revision to Diff 133840.Mon, Feb 12, 5:35 AM

Added more tests with custom data layout.

This revision was automatically updated to reflect the committed changes.