Technically, non-integral pointers "have an unspecified bitwise
representation", so they are not bytewise values. In particular,
don't use memset on those values.
Details
Diff Detail
Event Timeline
include/llvm/Analysis/ValueTracking.h | ||
---|---|---|
227 | Provided a default value as this function is public so it won't break external uses. External uses probably don't care about non-integral pointer anyway. (E.g. clang has one use.) |
This patch made me curious because I remembered fixing this one. Turned out that's still a patch we're carrying downstream. Oops. :(
The general approach we took here was a bit different. I don't have a strong preference, but I want to see what you think of the alternate before continuing.
The basic approach we took was two fold: 1) Disallow stores of non-integral types as "legal stores" in the sense of this file, and 2) inhibit formation of geps off of null in an non-integral address space so as to prevent formation of inttoptrs.
What do you think of the merits of the two approaches? Is there something your approach catches ours doesn't?
Thank you for the comment!
The basic approach we took was two fold: 1) Disallow stores of non-integral types as "legal stores" in the sense of this file,
I guess "this file" you refers is ValueTracking.cpp? But it doesn't seem to mention legal stores. There is "legal store" in LoopIdiomRecognize.cpp. But in this case it probably won't completely fix our problem, where memset could be generated in MemCpyOpt. I initially thought of fixing just MemCpyOpt. As the spec says non-integral pointers "have an unspecified bitwise representation", I turn to the current patch, which I think is more aligned with the spec.
and 2) inhibit formation of geps off of null in an non-integral address space so as to prevent formation of inttoptrs.
Sorry, I'm not sure I fully understand this. This seems a reasonable thing to do. But I'm sure how it is related to storing null values into non-integral pointer typed memory.
If you would like to share your patch, I'm happy to take a look and give it a try. Thank you!
test/Transforms/MemCpyOpt/non-integral-ptr.ll | ||
---|---|---|
19 | It's not clear to me why a memset is "wrong" here; either way, you're setting all the bits to zero. Is the issue that const0 is not actually an all-zero constant? The LLVM null pointer constant is treated as an all-zero value in LLVM, even on targets where a null pointer constant in C would have some non-zero value. If you think that isn't appropriate, please start a thread on llvmdev; I haven't really thought it through completely, but it would probably get complicated due to the impact on various transforms like constant folding. If you need an "opaque" null pointer constant, there's probably a simpler solution: define a special global @llvm.opaque.nullptr or something like that, and make your frontend use it instead of ConstantPointerNull. |
test/Transforms/MemCpyOpt/non-integral-ptr.ll | ||
---|---|---|
19 | Thanks for the comment. Generating memset isn't immediately wrong, but other parts of the optimizer, like store-to-load forwarding, may try to do further optimizations and bad things could happen. For example, it may construct an integer 0 and convert to non-integral pointer type, which fails, or do other things that invalidate the non-integral pointer semantics. I'm not suggesting that it should have a non-zero value. It is a zero value, just cannot be converted to/from integer zero value. Also, there is a similar test test/Transforms/LoopIdiom/non-integral-pointers.ll that checks memset is not used for non-integral pointers. |
include/llvm/Analysis/ValueTracking.h | ||
---|---|---|
227 | I'd prefer not to do this; constructing a DataLayout like this is probably sort of expensive. It's not a big deal to update clang. | |
test/Transforms/MemCpyOpt/non-integral-ptr.ll | ||
19 | Either the bits of a null pointer can't be inspected at all, or optimizations can assume the bits are actually zero; there is no in-between state. And whichever choice is correct, changing isBytewiseValue like this doesn't help make LLVM consistent. If the bits can't be inspected, the IR shouldn't contain ConstantPointerNull, and isBytewiseValue won't be able to prove anything anyway. If the bits can be inspected, isBytewiseValue is already returning the correct answer. That said, this doesn't really have any impact on targets which don't have non-integral pointers, so I don't really care that much about what happens here; maybe this is helpful in practice even if it isn't logically consistent. |
include/llvm/Analysis/ValueTracking.h | ||
---|---|---|
227 | Ok, will do. | |
test/Transforms/MemCpyOpt/non-integral-ptr.ll | ||
19 | I'm not sure why ConstantPointerNull implies the bits can be inspected, or why memset must be used. In the documentation it says non-integral pointers "have an unspecified bitwise representation". If the zero value of non-integral pointer, unlike all other values, has a defined bitwise representation, it probably need to be documented this way, although it looks weird to me. Maybe @reames could help clarify the semantics. Thanks. |
include/llvm/Analysis/ValueTracking.h | ||
---|---|---|
227 | Done. |
I'm okay with this, I guess... but someone more familiar with non-integral pointers in practice should approve.
Also, it would be nice to fix LangRef to clarify the interaction of non-integral pointers with a known bit-pattern.
Also, it would be nice to fix LangRef to clarify the interaction of non-integral pointers with a known bit-pattern.
Added a clarification about the zero value. Thanks.
After taking a look at this again, I remain unconvinced of the need for this patch. In particular, I took your test case and ran it against our downstream copy, and discovered my previous comment was incorrect. We actually *don't* prevent the transform you're preventing here. I'm getting a strong sense that this is working around a deeper problem, and we need to figure out what that deeper problem is before going forward.
Just to check the obvious, you're not trying to use a non-zero value for null are you? If so, then this patch *is* warranted, but is definitely not anywhere near sufficient to resolve the whole problem.
test/Transforms/MemCpyOpt/non-integral-ptr.ll | ||
---|---|---|
19 | A bit of background here. For all practical GCs I know of, even relocating and compacting GCs, the bit pattern of null is in fact zero, and remains zero throughout the GC cycle. As such, we've chosen to take a practical approach where an non-integral address space pointer value has an unspecified value, but we've baked in null as the zero pattern in a few spots. As Eli points out, the optimize makes it practically hard to describe an address space for which the null value is not zero. We can describe an AS where null is dereferenceable, but not one where null != 0. </end background> |
Provided a default value as this function is public so it won't break external uses. External uses probably don't care about non-integral pointer anyway. (E.g. clang has one use.)