Page MenuHomePhabricator

[Attr] Introduce dereferenceable_globally
Needs ReviewPublic

Authored by jdoerfert on May 7 2019, 1:26 PM.

Details

Summary

The current uses of the "dereferenceable" attribute throughout
LLVM/Clang are not consistent in their interpretation of the semantics.

Clang, and some parts of LLVM, assume the annotated pointer value to be
dereferenceable "now", thus at the program point where the annotation is
(which happens to be the definition of the value).

LLVM optimizations and high-level front-ends tend to interpret the
dereferenceable attribute as a persistent property for the pointer.

Now having different interpretations is obviously problematic [1].

There has been ample discussion about which definition we should use and
which one should get its very own attribute [1,2]. (That we need both
seems to be unanimously accepted.)

With this patch we limit the meaning of "dereferenceable" to
"dereferenceable at the definition of the value".
For dereferenceability stretching the whole lifetime of a pointer value,
or at least all program points where the SSA value can be accessed, we
introduced "dereferenceable_in_globally".

Some pros and cons where summarized by @hfinkel [6].

The most important reason why we should go this way is that this
change is conservatively correct with regards to existing/currently
produced IR. This is especially important because auto-update is not
applicable when we keep the original dereferenceable name around (what
was updated what wasn't?). Again, this will not break front-ends that
produce dereferenceable attributes/metadata, regardless of which (of the
two) interpretations they use. This way forward was also "suggested" by
@rsmith [3] and @chandlerc [4]. The alternative approach, which was
preferred by people as well ([5] to highlight only one), gives us
"correct" and "better" optimizations but defines existing IR as broken.

Either way, it is clear that we need to fix something and introduce a
new attribute.

[1] http://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html
[2] https://reviews.llvm.org/D48239
[3] https://reviews.llvm.org/D48239#1135582
[4] https://reviews.llvm.org/D48239#1155026
[5] https://reviews.llvm.org/D48239#1134816
[6] http://lists.llvm.org/pipermail/llvm-dev/2018-July/124557.html

Event Timeline

jdoerfert created this revision.May 7 2019, 1:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2019, 1:26 PM
Herald added a subscriber: bollu. · View Herald Transcript

This patch goes the opposite way as proposed by D48239 and adds the new attribute we will need either way.

I'm fine with this. Probably worth adding a release note.

sanjoy added inline comments.May 14 2019, 9:05 PM
llvm/docs/LangRef.rst
1166

I'm not sure if this is a good semantic property. Consider (the CFG form of):

if (cond) {
  a = dereferenceable_in_scope(4) ...
} else {
  ...
}
side_effect_that_frees_a()

I think this program above is well defined because the definition of a does not dominate side_effect_that_frees_a(). But that means we can't tail duplicate the program into:

if (cond) {
  a = dereferenceable_in_scope(4) ...
  side_effect_that_frees_a()
} else {
  ...
  side_effect_that_frees_a()
}

Same applies for inlining. Any transform that can extend the dominance region of some value needs to be audited.

What is the use case for this attribute? Can we do something more targeted for this use case?

jdoerfert marked an inline comment as done.May 14 2019, 10:00 PM
jdoerfert added a subscriber: reames.
jdoerfert added inline comments.
llvm/docs/LangRef.rst
1166

What is the use case for this attribute? Can we do something more targeted for this use case?

My "use case" is basically the clang interpretation of (current) derferenceable which I tried to replicate with the new derferenceable definition. There are users of (current) derferenceable that interpret it as derferenceable_in_scope and they should have a way to do so.

I think this program above is well defined because the definition of a does not dominate
Same applies for inlining. Any transform that can extend the dominance region of some value needs to be audited.

I agree that there is a problem with the current wording wrt. inlining. I'll try to fix that.
Maybe @reames can help me define what they want/need.

For the record, the example above is not sufficient to expose the inlining problem though. Above, a was "accessible" at the location where a was freed, so the dereferenceable_in_scope was violated already. Thinking about it, even if we inline, SSA dominance property should be combined with reachability to a user to give you what you want.

sanjoy added inline comments.May 14 2019, 10:13 PM
llvm/docs/LangRef.rst
1166

I agree that there is a problem with the current wording wrt. inlining. I'll try to fix that. Maybe @reames can help me define what they want/need.

Maybe we need something stronger that says that a pointer is always dereferenceable(n), like a global variable?

Thinking about it, even if we inline, SSA dominance property should be combined with reachability to a user to give you what you want.

I was thinking about something like:

void f() {
  a = dereferenceable_in_scope(4) ...
}

void main() {
  f();
  free_a();
}

I think this is well defined according to the current wording since the free_a does not lie within the scope of a? If yes, then inlining becomes illegal.

jdoerfert marked an inline comment as done.May 15 2019, 9:27 AM
jdoerfert added inline comments.
llvm/docs/LangRef.rst
1166

What do you think about dereferenceable_globally(<n>), which would be the wording without the "in its scope (asdefined by the SSA dominance property)"?

So:

This indicates that the annotated pointer value has the
``dereferenceable(<n>)`` property *at any program point*. Thus, unlike pointer values
annotated with ``dereferenceable(<n>)``, ``dereferenceable_in_scope(<n>)``
pointer values can never lose the ``dereferenceable(<n>)``.
sanjoy accepted this revision.May 15 2019, 8:09 PM
sanjoy added inline comments.
llvm/docs/LangRef.rst
1166

I think this is implied, but how about "the annotated pointer value has the
`dereferenceable(<n>)` starting from the definition of the value to the termination of the program."?

This revision is now accepted and ready to land.May 15 2019, 8:09 PM
reames requested changes to this revision.May 15 2019, 8:32 PM

I think the framing of this in terms of scopes is likely to be problematic. The current definition of dereferenceable is (I think) *stronger* than a simple assertion of the dereferenceability within a particular scope. As an example, consider:
call void @foo(i8* dereferenceable(8) %p)
call void @may_trap_but_not_write()
%val = load i64, i64 @G

In this example, the current semantic seem to allow the load to be lifted above the second call, even though it is *not* contained within any scope marked dereferenceable. Now, that's my understanding of the current wording, but I have to admit, I can't find an optimization which actually does that style of reasoning, so maybe I'm mistaken.

Switching topics, I have a much more fundamental objection to your patch which comes in two parts.

Part 1 - By introducing a new attribute and not switching *all* users within the optimizer over, you are implicitly a) causing a bunch of miscompiles and b) setting up a situation where optimizations are likely to be weakened to "fix" bugs. This to me is a showstopper with the current patch, and on this basis I *strongly object* to this landing without this being addressed.

Part 2 - It's really not clear to me that your weaker "deref on entry" is strong enough for any useful optimization. This seems conceptually similar to a "llvm.mark_deref_here" intrinsic which just happens to be placed at the begining of each function. The problem with such a marker is that you can't (by the intention you've laid out), use it's presence for any path insensative reasoning. So for instance, you *can't* use isSafeToSpeculate's context free speculation or LICM's path insensative hoisting/sinking.

My concern here is purely a generic "is this worth the complexity" one. If you and others are strongly motivated to have a semantic like this, I will not resist. In your shoes, I would want to prototype out some key pieces, and would maybe start with an intrinsic before moving straight to an attribute/metadata.

This revision now requires changes to proceed.May 15 2019, 8:32 PM
jdoerfert updated this revision to Diff 199824.May 16 2019, 7:19 AM

Replace "scope" with "globally" (or global scope).

I think the framing of this in terms of scopes is likely to be problematic. The current definition of dereferenceable is (I think) *stronger* than a simple assertion of the dereferenceability within a particular scope. As an example, consider:
call void @foo(i8* dereferenceable(8) %p)
call void @may_trap_but_not_write()
%val = load i64, i64 @G

In this example, the current semantic seem to allow the load to be lifted above the second call, even though it is *not* contained within any scope marked dereferenceable. Now, that's my understanding of the current wording, but I have to admit, I can't find an optimization which actually does that style of reasoning, so maybe I'm mistaken.

The discussion with @sanjoy lead to the change from scope to globally, maybe that resolves this?

Switching topics, I have a much more fundamental objection to your patch which comes in two parts.

Part 1 - By introducing a new attribute and not switching *all* users within the optimizer over, you are implicitly a) causing a bunch of miscompiles and b) setting up a situation where optimizations are likely to be weakened to "fix" bugs. This to me is a showstopper with the current patch, and on this basis I *strongly object* to this landing without this being addressed.

The situation right now is that LLVM is broken. The front-end and middle-end disagree over the definition of dereferenceable. That is a problem because we cannot even "fix" because we don't know what part is "right" and what part is "wrong".

The situation after this patch would be the same but we would define which parts of LLVM are "wrong", the front-end or the middle-end (it would be the middle-end). This is a prerequirement to fixing LLVM as it is.

The only alternative (I see) is to define dereferenceable as dereferenceable_globally and introduce dereferenceable_here which would encode what Clang uses dereferenceable today for. I'd prefer it this way but at the end of the day, I need some disambiguation to start fixing the existing code.

Part 2 - It's really not clear to me that your weaker "deref on entry" is strong enough for any useful optimization. This seems conceptually similar to a "llvm.mark_deref_here" intrinsic which just happens to be placed at the begining of each function. The problem with such a marker is that you can't (by the intention you've laid out), use it's presence for any path insensative reasoning. So for instance, you *can't* use isSafeToSpeculate's context free speculation or LICM's path insensative hoisting/sinking.

You can't use this reasoning, correct. However, we do right now when Clang uses dereferenceable for dereferenceable_on_entry for lowering C++ references.

I wan this because I assume we have "deref on entry" annotation and an analysis that could reason about "no-free" for functions (D49165 integrated into D59918), we could propagate this annotation to pointer uses which would make it useful. Agreed?

My concern here is purely a generic "is this worth the complexity" one. If you and others are strongly motivated to have a semantic like this, I will not resist. In your shoes, I would want to prototype out some key pieces, and would maybe start with an intrinsic before moving straight to an attribute/metadata.

The main motivation is that this is broken right now. If we start to deduce and propagate dereferenceable we will expose these inconsistencies. I want to get ahead and make the semantics clear so we can decide which parts have to be fixed.

jdoerfert updated this revision to Diff 199829.May 16 2019, 7:45 AM
jdoerfert marked an inline comment as done.

Grammar + reintroduce existing "pointer only" restriction

I think the framing of this in terms of scopes is likely to be problematic. The current definition of dereferenceable is (I think) *stronger* than a simple assertion of the dereferenceability within a particular scope. As an example, consider:
call void @foo(i8* dereferenceable(8) %p)
call void @may_trap_but_not_write()
%val = load i64, i64 @G

In this example, the current semantic seem to allow the load to be lifted above the second call, even though it is *not* contained within any scope marked dereferenceable. Now, that's my understanding of the current wording, but I have to admit, I can't find an optimization which actually does that style of reasoning, so maybe I'm mistaken.

The discussion with @sanjoy lead to the change from scope to globally, maybe that resolves this?

It does. The globally wording seems broad enough. You're missing a bunch of detail from the original spec which needs to be preserved, but the basic notion seems reasonable.

Switching topics, I have a much more fundamental objection to your patch which comes in two parts.

Part 1 - By introducing a new attribute and not switching *all* users within the optimizer over, you are implicitly a) causing a bunch of miscompiles and b) setting up a situation where optimizations are likely to be weakened to "fix" bugs. This to me is a showstopper with the current patch, and on this basis I *strongly object* to this landing without this being addressed.

The situation right now is that LLVM is broken. The front-end and middle-end disagree over the definition of dereferenceable. That is a problem because we cannot even "fix" because we don't know what part is "right" and what part is "wrong".

Ok, no. I'm going to have to be really pedantic about this. *Clang* might be broken, but LLVM is not. Clang is simple one of several frontends which target LLVM. If Clang chose to ignore the documented meaning of an attribute, well, that's too bad for Clang.

To be clear here, I am not objecting to the addition of an attribute which fits the behaviour Clang wants. I am *strongly* objecting to the notion that it's okay to break other frontends which got it right because Clang didn't. To be very explicit, this patch as currently posted would badly break our Java frontend.

I would strongly suggest that you leave the existing attribute with the existing meaning and introduce a new dereferenceable_at_entry attribute with your desired semantics. I won't actively object to introducing a new attribute with the prior semantics and requiring a rename for frontends which used the old one correctly, but even that feels distinctly hostile to other frontends. It's not an acute issue for us, so I won't fight it actively, but the attitude it conveys is not exactly welcoming for new frontends.

I think the framing of this in terms of scopes is likely to be problematic. The current definition of dereferenceable is (I think) *stronger* than a simple assertion of the dereferenceability within a particular scope. As an example, consider:
call void @foo(i8* dereferenceable(8) %p)
call void @may_trap_but_not_write()
%val = load i64, i64 @G

In this example, the current semantic seem to allow the load to be lifted above the second call, even though it is *not* contained within any scope marked dereferenceable. Now, that's my understanding of the current wording, but I have to admit, I can't find an optimization which actually does that style of reasoning, so maybe I'm mistaken.

The discussion with @sanjoy lead to the change from scope to globally, maybe that resolves this?

It does. The globally wording seems broad enough. You're missing a bunch of detail from the original spec which needs to be preserved, but the basic notion seems reasonable.

Please elaborate on the details I'm missing, I want to get this "right". Even if we do not take this approach but the opposite one, I think we should clarify everything as good as possible. I say this also because I personally do not think the Clang interpretation violates the current wording of dereferenceable. While the intention is not 100% known to me [1], the wording right now is vague enough to be interpreted either way, or maybe even some other way. With the proposed patch, both versions have the same wording "a pointer is dereferenceable" as the current attribute has. The meaning is disambiguated only due to the added specifiers.

[1] According to r213385, the commit that introduced the attribute, C++ references were a use case. This would make the Clang interpretation the intended one. Though, the problems that come with that interpretation were not discussed/considered at that time.

Switching topics, I have a much more fundamental objection to your patch which comes in two parts.

Part 1 - By introducing a new attribute and not switching *all* users within the optimizer over, you are implicitly a) causing a bunch of miscompiles and b) setting up a situation where optimizations are likely to be weakened to "fix" bugs. This to me is a showstopper with the current patch, and on this basis I *strongly object* to this landing without this being addressed.

The situation right now is that LLVM is broken. The front-end and middle-end disagree over the definition of dereferenceable. That is a problem because we cannot even "fix" because we don't know what part is "right" and what part is "wrong".

Ok, no. I'm going to have to be really pedantic about this. *Clang* might be broken, but LLVM is not. Clang is simple one of several frontends which target LLVM. If Clang chose to ignore the documented meaning of an attribute, well, that's too bad for Clang.

I guess my comment from above represents my view point. Though, I actually do not care who is right and wrong (see below).

To be clear here, I am not objecting to the addition of an attribute which fits the behaviour Clang wants. I am *strongly* objecting to the notion that it's okay to break other frontends which got it right because Clang didn't. To be very explicit, this patch as currently posted would badly break our Java frontend.

I would strongly suggest that you leave the existing attribute with the existing meaning and introduce a new dereferenceable_at_entry attribute with your desired semantics. I won't actively object to introducing a new attribute with the prior semantics and requiring a rename for frontends which used the old one correctly, but even that feels distinctly hostile to other frontends. It's not an acute issue for us, so I won't fight it actively, but the attitude it conveys is not exactly welcoming for new frontends.

By making the existing attribute spelling less powerful I take away optimization opportunities for front-ends until they moved to deref_globally but I don't see how that would break them.
I proposed this solution not to make "Clang correct" but because I think it does not "break" anything. Doing it the other way around, however, would make old Clang code definitively invalid.

@reames What do you think?

jdoerfert updated this revision to Diff 204366.Jun 12 2019, 3:18 PM

Remove the alignment part

hfinkel accepted this revision.Jun 12 2019, 3:20 PM

This LGTM. I suggest that we commit this along with the implementation.

llvm/docs/LangRef.rst
1158

We also need to add the "at its definition" qualifier here?

1158

Ignore this (old comment).

jdoerfert updated this revision to Diff 204372.Jun 12 2019, 3:27 PM

Add argument definitions

reames requested changes to this revision.Jun 12 2019, 4:16 PM

Sorry for the late reply. Replying selectively to key points.

The discussion with @sanjoy lead to the change from scope to globally, maybe that resolves this?

It does. The globally wording seems broad enough. You're missing a bunch of detail from the original spec which needs to be preserved, but the basic notion seems reasonable.

Please elaborate on the details I'm missing,

The main thing is your missing the wording about restrictions on where the attribute can be applied (i.e. pointer types). Reading the existing dereferenceable_or_null wording more careful, we have precedent for just assuming it carries over to other dereferenceable* flavours, so I'll drop this objection. I may separately post a patch to refine the wording, but this is not a key issue.

Now, my remaining concern. This patch, as structured, introduces a number of purely definitional miscompiles into LLVM. There are existing transformations which assume the dereferenceable global property for correctness. By introducing a new attribute with the old semantics, and not updating all of the existing code to use the new attribute (code, tests, etc..), you're are defining (via the langref change) existing optimizations to be incorrect.

I really don't think that's okay. Particular, since it should be very easy to fix. A simple find replace in source and tests which replaces checks for dereferenceable with checks for dereferenceable_globally should be all that's needed.

I'm specifically asking that this change not be committed - despite Hal's LGTM - before this point is addressed.

p.s. I wouldn't be thrilled by it, but I'd even be okay with a clear statement of intent to handle this in follow up changes. It is more of a theoretical problem than a practical one in the near term.

This revision now requires changes to proceed.Jun 12 2019, 4:16 PM

I'm specifically asking that this change not be committed - despite Hal's LGTM - before this point is addressed.

I'm fine with this too. We'll soon have the nofree/nosync patches/functionality which can be used to change everything over in a hopefully-reasonable fashion.

Sorry for the late reply. Replying selectively to key points.

The discussion with @sanjoy lead to the change from scope to globally, maybe that resolves this?

It does. The globally wording seems broad enough. You're missing a bunch of detail from the original spec which needs to be preserved, but the basic notion seems reasonable.

Please elaborate on the details I'm missing,

The main thing is your missing the wording about restrictions on where the attribute can be applied (i.e. pointer types). Reading the existing dereferenceable_or_null wording more careful, we have precedent for just assuming it carries over to other dereferenceable* flavours, so I'll drop this objection. I may separately post a patch to refine the wording, but this is not a key issue.

Now, my remaining concern. This patch, as structured, introduces a number of purely definitional miscompiles into LLVM. There are existing transformations which assume the dereferenceable global property for correctness. By introducing a new attribute with the old semantics, and not updating all of the existing code to use the new attribute (code, tests, etc..), you're are defining (via the langref change) existing optimizations to be incorrect.

I really don't think that's okay. Particular, since it should be very easy to fix. A simple find replace in source and tests which replaces checks for dereferenceable with checks for dereferenceable_globally should be all that's needed.

I'm specifically asking that this change not be committed - despite Hal's LGTM - before this point is addressed.

p.s. I wouldn't be thrilled by it, but I'd even be okay with a clear statement of intent to handle this in follow up changes. It is more of a theoretical problem than a practical one in the near term.

I'm working on all the remaining changes now and I'll update this patch with the mechanics first before I create a new one to move the users.
We can wait with committing both until we merged the nofree and nosync patches and I updated the isDerefPointer helper function to use them.
Would that be OK?

I'm working on all the remaining changes now and I'll update this patch with the mechanics first before I create a new one to move the users.
We can wait with committing both until we merged the nofree and nosync patches and I updated the isDerefPointer helper function to use them.
Would that be OK?

Absolutely. That's exactly what I'm asking for.

jdoerfert retitled this revision from [LangRef][Attr] Clarify dereferenceable(_in_scope) to [Attr] Introduce dereferenceable_globally.Jun 12 2019, 5:55 PM
jdoerfert edited the summary of this revision. (Show Details)
jdoerfert updated this revision to Diff 204403.Jun 12 2019, 6:02 PM

Add all the mechanics for dereferenceable_globally we have for dereferenceable

The getter/setter functions are called "DereferenceableBytesGlobally" not "DereferenceableGloballyBytes". I would like to get some feedback on this.

Wow, we have a huge amount of boilerplate when adding an attribute.

Minor comments only. Seems straight forward and I didn't see any copy paste errors.

llvm/include/llvm/IR/Argument.h
67

For consistency, I'd prefer getDereferenableGloballyBytes. Doesn't read well, but it's consistent with all the other naming.

llvm/lib/IR/Verifier.cpp
4126–4127

Very minor style item, since we have multple of these now, maybe:
for (auto ID : {MD_...})

if (MDNode ...)
  visitDeref(...)
jdoerfert updated this revision to Diff 204409.Jun 12 2019, 8:18 PM
jdoerfert marked 2 inline comments as done.

Addressed comments: Getter/Setter rename + for loop

Wow, we have a huge amount of boilerplate when adding an attribute.

I agree...

Minor comments only. Seems straight forward and I didn't see any copy paste errors.

Thanks for checking, I passed run test and including the new ones for the new attributes but you never know.

Minor fixes

DereferenceableOrNullBytesGlobally -> DereferenceableOrNullGloballyBytes

@reames Was the last comment a "LGTM"? I need to start getting these patches of my plate :(

@reames Was the last comment a "LGTM"? I need to start getting these patches of my plate :(

Yes.

jdoerfert updated this revision to Diff 207705.Jul 2 2019, 9:45 PM

Rename leftover cases

uenoku added a subscriber: uenoku.Jul 17 2019, 9:48 AM