This is an archive of the discontinued LLVM Phabricator instance.

[BasicAA] Fix adjustToPointerSize in BasicAliasAnalysis.cpp for ptr > 64b
Needs Revision · Public

Authored by mppf on Oct 3 2017, 6:12 AM.

Details

Summary

The function adjustToPointerSize in BasicAliasAnalysis.cpp includes
an assertion that the pointer size is <= 64 bits. I'm working on some
uses of LLVM that need 128-bit pointers. This change simply adjusts
this function to allow 128-bit pointers and to not worry about adjusting
offsets to handle integer sizes < int64_t in that case.
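
For reference, a rough sketch of the shape of the change (the exact assertion bound and early return here are illustrative, not the committed diff). Before, the function sign-extends the offset from PointerSize bits, which only works for pointers of at most 64 bits:

static int64_t adjustToPointerSize(int64_t Offset, unsigned PointerSize) {
  assert(PointerSize <= 64 && "Invalid PointerSize!");
  unsigned ShiftBits = 64 - PointerSize;
  return (int64_t)((uint64_t)Offset << ShiftBits) >> ShiftBits;
}

After (sketch), wider pointers are accepted and the adjustment is skipped when the pointer is at least 64 bits, since there is nothing to truncate:

static int64_t adjustToPointerSize(int64_t Offset, unsigned PointerSize) {
  assert(PointerSize <= 128 && "Invalid PointerSize!");
  if (PointerSize >= 64)
    return Offset; // offset already fits in int64_t; leave it alone
  unsigned ShiftBits = 64 - PointerSize;
  return (int64_t)((uint64_t)Offset << ShiftBits) >> ShiftBits;
}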

The logic being adjusted has existed in some form in LLVM since
at least r111433 in 2010. LLVM has become much better at handling
varied pointer sizes since then.

Includes a test case that fails without this change.

Event Timeline

mppf created this revision. Oct 3 2017, 6:12 AM
aprantl accepted this revision. Oct 3 2017, 9:13 AM
aprantl added a subscriber: aprantl.

This seems to match the intended semantics of the function based on the doxygen comment, so this change LGTM. Ideally, the test should check an actual result of the analysis pass rather than "don't crash".

lib/Analysis/BasicAliasAnalysis.cpp:363

s/32b/32bit/ ?

test/Analysis/BasicAA/128-bit-ptr.ll:18

I would prefer it if the CHECK would test for some actual result of the analysis if possible.
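
For illustration, a hypothetical shape such a test could take, checking an aa-eval result rather than just not crashing (the datalayout string and CHECK line below are assumptions, not the test in this revision):

; RUN: opt < %s -basicaa -aa-eval -print-all-alias-modref-info -disable-output 2>&1 | FileCheck %s

; Hypothetical datalayout with 128-bit pointers.
target datalayout = "p:128:128:128"

; Accesses at distinct constant offsets from the same base should be NoAlias.
; CHECK: NoAlias: i8* %p0, i8* %p1
define void @distinct_offsets(i8* %base) {
  %p0 = getelementptr i8, i8* %base, i128 0
  %p1 = getelementptr i8, i8* %base, i128 1
  store i8 0, i8* %p0
  store i8 1, i8* %p1
  ret void
}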

This revision is now accepted and ready to land. Oct 3 2017, 9:13 AM
hfinkel added a subscriber: hfinkel. Oct 3 2017, 9:59 AM

LGTM too. Please do update the test so it checks that the result makes sense. We need more test coverage in this area.

davide accepted this revision. Oct 3 2017, 11:21 AM

This looks good after the other reviewers' comments have been addressed.

efriedma requested changes to this revision. Oct 3 2017, 11:40 AM
efriedma added a subscriber: efriedma.

Hang on, there's a more fundamental problem here this is papering over. If your pointers are larger than 64 bits, those pointers can have offsets larger than 64 bits. Since BasicAA is using 64-bit integers to represent pointer offsets, the math in DecomposeGEPExpression will overflow, so you'll get incorrect results, and eventually cause a miscompile.

This revision now requires changes to proceed. Oct 3 2017, 11:40 AM

Hang on, there's a more fundamental problem here this is papering over. If your pointers are larger than 64 bits, those pointers can have offsets larger than 64 bits. Since BasicAA is using 64-bit integers to represent pointer offsets, the math in DecomposeGEPExpression will overflow, so you'll get incorrect results, and eventually cause a miscompile.

Indeed; I made a similar comment in D38501. In this case, it looks like the main potential overflow comes from:

Index = GetLinearExpression(Index, IndexScale, IndexOffset, ZExtBits,
                            SExtBits, DL, 0, AC, DT, NSW, NUW);

// The GEP index scale ("Scale") scales C1*V+C2, yielding (C1*V+C2)*Scale.
// This gives us an aggregate computation of (C1*Scale)*V + C2*Scale.
Decomposed.OtherOffset += IndexOffset.getSExtValue() * Scale;
Scale *= IndexScale.getSExtValue();

where Scale is multiplied by IndexScale.getSExtValue(). We might just bail out here if IndexScale or IndexOffset can't be represented in 64 bits.
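
A minimal sketch of that kind of bail-out (the helper and where it gets called are assumptions; the point is just the representability check):

#include "llvm/ADT/APInt.h"

// GetLinearExpression returns IndexScale/IndexOffset as APInts, so check that
// they are representable as signed 64-bit values before doing the
// getSExtValue() arithmetic quoted above; otherwise give up on decomposing
// this index rather than silently overflowing.
static bool fitsInInt64(const llvm::APInt &V) {
  return V.isSignedIntN(64);
}

// Usage sketch:
//   if (!fitsInInt64(IndexScale) || !fitsInInt64(IndexOffset))
//     ... bail out of DecomposeGEPExpression for this GEP ...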

I'm not sure, however, that I'd hold this patch up over that issue. We'll need to audit all of BasicAA to put in the necessary checks. We should probably do that as a series of small patches, however, (so that we can bisect usefully if we mess something up).

If we switch to using APInt for offsets, adjustToPointerSize() just goes away, so I don't see how this is forward progress.

I guess the alternative is to try to add a bunch of overflow checks, but that seems a lot more complicated for no benefit.

If we switch to using APInt for offsets, adjustToPointerSize() just goes away, so I don't see how this is forward progress.

Granted. There's a risk that, if we try that, we'll find an unacceptable compile-time slowdown. Nevertheless, I'll go with you on this: there are only around 40 int64_t variables in BasicAA, so it's worth trying.

I guess the alternative is to try to add a bunch of overflow checks, but that seems a lot more complicated for no benefit.

mppf added a comment. Oct 4 2017, 7:48 AM

Hang on, there's a more fundamental problem here this is papering over. If your pointers are larger than 64 bits, those pointers can have offsets larger than 64 bits. Since BasicAA is using 64-bit integers to represent pointer offsets, the math in DecomposeGEPExpression will overflow, so you'll get incorrect results, and eventually cause a miscompile.

In my use case, I have 128-bit pointers but "adding" to one of these will only ever modify the lower 64 bits. So for my use case, offsets > 64 bits just don't matter.

More generally, in the case of BasicAA, my understanding is that these int64_ts represent *fixed* offsets into pointed-to data. I.e. you'd have to have a struct field or other fixed-width type that is 16 exabytes or so. Sure, you could have a getelementptr accessing i8* with whatever fixed integer we want, say 2^80 for the sake of argument. I just don't think it's likely that any front-end would generate such a thing. (But of course they *could* if somebody comes up with a reason to... I'm just explaining why I'm not particularly worried about it.) In practice, the fixed GEPs our front-end generates have offsets that almost certainly fit into an int32...

If we switch to using APInt for offsets, adjustToPointerSize() just goes away

Granted. There's a risk that, if we try that, we'll find an unacceptable compile-time slowdown. Nevertheless, I'll go with you on this: there are only around 40 int64_t variables in BasicAA, so it's worth trying.

I'm not sure if you're asking me to try to do that? I've never seen this code before attempting this fix... so I'm not sure I'm a good candidate for that effort. This patch request is really a bug report with a suggested fix. If you prefer, I could switch to doing a bug report and let one of you fix it. (Among other things, I don't know how to appropriately analyze the performance impact).

Anyway, my recent experience with APInt and offsets in D38501 suggests to me that it's not as simple as just switching to APInt. These GEP offsets have specific overflow behavior that needs to be considered. From LangRef.html for getelementptr:

If the inbounds keyword is not present, the offsets are added to the base address with silently-wrapping two’s complement arithmetic. If the offsets have a different width from the pointer, they are sign-extended or truncated to the width of the pointer.

(As an aside, I don't know why this arithmetic overflow has anything to do with inbounds, since things can overflow and still be in an allocated object).
I think the IR is designed this way because LLVM IR doesn't generally treat signed/unsigned integers differently. So to make negative offsets work right (like -1), the definition actually relies on overflow behavior. That means we'd need to be doing all of this address arithmetic with APInts of the same width as the pointer. Maybe that's what you were proposing, but I don't think such things are obvious...
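
For concreteness, a sketch of what doing the arithmetic at pointer width means (illustrative names, not proposed BasicAA code):

#include "llvm/ADT/APInt.h"
using llvm::APInt;

// Accumulate a GEP offset in an APInt whose width equals the pointer width,
// so negative indices and wraparound follow LangRef's two's-complement rule.
static APInt addScaledIndex(APInt PtrOffset, const APInt &Index, uint64_t Scale) {
  unsigned PtrWidth = PtrOffset.getBitWidth();
  // Sign-extend or truncate the index to the pointer width, per LangRef.
  APInt Idx = Index.sextOrTrunc(PtrWidth);
  // Wrapping multiply-and-add at pointer width: an index of -1 scaled by 8
  // becomes -8 in PtrWidth bits; no int64_t truncation is involved.
  return PtrOffset + Idx * APInt(PtrWidth, Scale);
}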

so I don't see how this is forward progress.

It's forward progress because it solves an immediately blocking issue for me, it does so in a way that has low impact on anything else (performance, correctness, ...), and it exists already.
Don't let the perfect be the enemy of the good (enough), and all that.

I guess the alternative is to try to add a bunch of overflow checks, but that seems a lot more complicated for no benefit.

It seems to me that we could check that the computed GEP offsets fit into int64_t without too much trouble... but I'd do that by doing the computation with APInts, say in the code calling GetLinearExpression that hfinkel mentioned.

In my use case, I have 128-bit pointers but "adding" to one of these will only ever modify the lower 64 bits. So for my use case, offsets > 64 bits just don't matter.

That's completely different.

LangRef says "[...] the offsets are added to the base address with silently-wrapping two’s complement arithmetic. If the offsets have a different width from the pointer, they are sign-extended or truncated to the width of the pointer." And we have a bunch of code which actually assumes it represents two's complement arithmetic in the width of the pointer; for example, instcombine has a transform which turns a pointer comparison into a comparison of pointer offsets, and some analysis passes use an APInt whose width is the pointer width rather than the int64_t AA uses. If pointer arithmetic wraps differently on your target, we probably need to add an attribute to the data layout, and adjust LangRef to define what it means.

mppf added a comment. Oct 4 2017, 2:09 PM

In my use case, I have 128-bit pointers but "adding" to one of these will only ever modify the lower 64 bits. So for my use case, offsets > 64 bits just don't matter.

That's completely different.

... we have a bunch of code which actually assumes it represents two's complement arithmetic in the width of the pointer; for example, instcombine has a transform which turns a pointer comparison into a comparison of pointer offsets, and some analysis passes use an APInt whose width is the pointer width rather than the int64_t AA uses. If pointer arithmetic wraps differently on your target, we probably need to add an attribute to the data layout, and adjust LangRef to define what it means.

I don't think that's necessary (for me at least) because it's OK for me to get undefined behavior / core dumps if one of these pointers actually overflows/underflows (assuming signed arithmetic). I.e. as long as adding 1, -1 or other small values works, that's good enough for me at the present time. In other words, you could assume that all my pointer operations are 'inbounds'. *technically* adding -1 causes overflow, if you look at it in the right way, but it happens in a way that plays well with truncation to a smaller sized pointer. My use-case involves a late optimization pass that changes these 128-bit pointers to {i64, 64-bit pointer} structures, at which point handling any necessary truncation is my problem. I'm saying that i64 arithmetic for my pointers is fine, but so is i128 arithmetic. I don't care what happens if you do something like load from NULL - 1.

It's certainly the case that we *could* try to modify LangRef to define pointers with offsets of a different size than the pointer, and I *could* use such semantics. But we'd have to adjust every optimization. I just don't think it's called for right now.

Remember, I'm looking for a bug fix?

But anyway this leaves me wondering - what in particular do we need to do, from your point of view, to move this patch forward? There have been quite a few suggestions and I can't tell which ones would be satisfactory (or which ones would be my problem to solve).

Thanks.

In D38499#888784, @mppf wrote:

In my use case, I have 128-bit pointers but "adding" to one of these will only ever modify the lower 64 bits. So for my use case, offsets > 64 bits just don't matter.

That's completely different.

... we have a bunch of code which actually assumes it represents two's complement arithmetic in the width of the pointer; for example, instcombine has a transform which turns a pointer comparison into a comparison of pointer offsets, and some analysis passes use an APInt whose width is the pointer width rather than the int64_t AA uses. If pointer arithmetic wraps differently on your target, we probably need to add an attribute to the data layout, and adjust LangRef to define what it means.

I don't think that's necessary (for me at least) because it's OK for me to get undefined behavior / core dumps if one of these pointers actually overflows/underflows (assuming signed arithmetic). I.e. as long as adding 1, -1 or other small values works, that's good enough for me at the present time. In other words, you could assume that all my pointer operations are 'inbounds'. *technically* adding -1 causes overflow, if you look at it in the right way, but it happens in a way that plays well with truncation to a smaller sized pointer. My use-case involves a late optimization pass that changes these 128-bit pointers to {i64, 64-bit pointer} structures, at which point handling any necessary truncation is my problem. I'm saying that i64 arithmetic for my pointers is fine, but so is i128 arithmetic. I don't care what happens if you do something like load from NULL - 1.

It's certainly the case that we *could* try to modify LangRef to define pointers with offsets of a different size than the pointer, and I *could* use such semantics. But we'd have to adjust every optimization. I just don't think it's called for right now.

Remember, I'm looking for a bug fix?

But anyway this leaves me wondering - what in particular do we need to do, from your point of view, to move this patch forward? There have been quite a few suggestions and I can't tell which ones would be satisfactory (or which ones would be my problem to solve).

I'm working on a patch for this (replacing essentially all of the internal computations with APInt). Currently, my patch causes Analysis/BasicAA/gep-and-alias.ll to fail. I think this might be a real bug (I've made what I believe is a 64-bit version of this test case and it fails on trunk too). I'm investigating...

(where by real bug I meant pre-existing bug)

mppf added a comment. Jan 2 2019, 9:31 AM

This patch has been replaced by D38662