This is an archive of the discontinued LLVM Phabricator instance.

[FIX] Nonnull is not always implied by dereferenceable
Changes PlannedPublic

Authored by jdoerfert on Aug 23 2019, 9:58 AM.

Details

Summary

Depending on the function and address space nunnull is not implied by
dereferenceable. With this patch Value::getPointerDereferenceableBytes
will take into account that NULL might be a dereferenceable pointer.

This shows the problem and a solution but the TODO mentions how we
should actually go about it. See also D66618

This will also affect the following not yet patched tests:
CodeGen/AMDGPU/legalize-fp-load-invariant.ll
CodeGen/AMDGPU/parallelandifcollapse.ll
Transforms/InstCombine/addrspacecast.ll
Transforms/InstCombine/memcpy-addrspace.ll

Event Timeline

jdoerfert created this revision.Aug 23 2019, 9:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2019, 9:58 AM

getPointerDereferenceableBytes returns some number of dereferenceable bytes. If CanBeNull is true, that result is modified: if the pointer value is null, the number of known dereferenceable bytes is actually zero.

As far as I can tell, without your patch, the function implements this logic correctly, and the callers use the result correctly. "CanBeNull == false" doesn't mean the pointer is non-null; it means the caller doesn't have to prove the value is non-null before using the "known dereferenceable bytes" result. The parameter CanBeNull should probably be named to clarify that, though; maybe "OrNull" would be better?

getPointerDereferenceableBytes returns some number of dereferenceable bytes. If CanBeNull is true, that result is modified: if the pointer value is null, the number of known dereferenceable bytes is actually zero.

That is the part that is not true because we conflate two concepts, "can be a nullptr" and "is known to be deref". If the pointer value is null the there can well be dereferenceable bytes. And if there are dereferenceable bytes, the value can as well be null.

As far as I can tell, without your patch, the function implements this logic correctly, and the callers use the result correctly. "CanBeNull == false" doesn't mean the pointer is non-null; it means the caller doesn't have to prove the value is non-null before using the "known dereferenceable bytes" result. The parameter CanBeNull should probably be named to clarify that, though; maybe "OrNull" would be better?

I don't believe it makes sense to cling on the "null is special" concept (e.g., by naming it OrNull) here given that we give up on that for non-0 address spaces.
In D66618 I made the different cases explicit.

If you want to really expand out the meaning of "CanBeNull", it means "was the number of dereferenceable bytes computed using a dereferenceable_or_null attribute/metadata". It has nothing to do with whether a null pointer is generally valid in the given address space. The logic has always worked this way, since before it was extracted into a separate function in D17572.

getPointerDereferenceableBytes is not a good API, sure; if you want to refactor it, fine. But this patch just breaks it.

joerg added a subscriber: joerg.Aug 24 2019, 11:33 AM

It would be nice to also default to this for -ffreestanding.

If you want to really expand out the meaning of "CanBeNull", it means "was the number of dereferenceable bytes computed using a dereferenceable_or_null attribute/metadata". It has nothing to do with whether a null pointer is generally valid in the given address space. The logic has always worked this way, since before it was extracted into a separate function in D17572.

The logic always worked this way is hardly an argument, given that dereferencebale alone is already misused (=broken). While initially constructed for something, "CanBeNull" seems like a freestanding definition of the return value to me.

getPointerDereferenceableBytes is not a good API, sure; if you want to refactor it, fine.

That is what I proposed to do in my RFC to the list and prototyped already (D66618). Please take a look.

But this patch just breaks it.

I doubt it breaks the API, arguably it makes CanBeNull return false only if the pointer "cannot be null", which seems logical to me. When we additionally return "IsKnownDeref" (see D66618) we do not regress anything (probably what you mean by break here) and we could start to use the logic in the isKnownNonZero instead of duplicating stuff.

It would be nice to also default to this for -ffreestanding.

What default do you want exactly? Did you see D66618 and the RFC email I send to the list?

jdoerfert planned changes to this revision.Aug 25 2019, 6:54 PM

(This will never go in like this, I will keep it open nor for the discussions sake, D66618 is a better solution)

sanjoy resigned from this revision.Jan 29 2022, 5:40 PM
lebedev.ri resigned from this revision.Jan 12 2023, 4:57 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 4:57 PM