This is an archive of the discontinued LLVM Phabricator instance.

docs/LangRef.rst
1423	IIRC, annoyingly, the backend considers that an instruction can access memory and still not have side effects. It'd be nice to align (but I think the backend is "wrong" on this one).

I added two new Intrinsic attributes IntrNoSideEffects and IntrHasSideEffects,
which make it possible to specify all the possible memory interaction / side effect
combinations. With these properties in place, it should be possible in the future
to drop the 'no side effect' portion of the intrinsic memory properties once targets
have been updated to use these new properties.

• tstellarAMD updated this object.May 13 2016, 6:39 PM

• tstellarAMD updated this object.

I don't really like the lack of orthogonality here. I'd rather get rid of the fact that IntrNoMem, IntrReadMem, IntrWriteMem, and IntrArgMemOnly specify, in addition to memory interaction, that an intrinsic has no side effects, and have only IntrNoSideEffects as an attribute.

Oh I missed it the first time: it was an acknowledged goal in your last sentence...

Please also see D18714. I'd appreciate if you could at least use the invalid.ll.bc from there, since they use attribute kind 63 to avoid the need for future changes (63 is the highest possible given the binary layout).

Ping.

In D20116#458920, @tstellarAMD wrote:

Ping.

You're pinging, but my impression is that this is waiting on you to address nhaehnle's comment. Did you miss it, or did I miss an answer?

eli.friedman added a subscriber: eli.friedman.Jun 15 2016, 11:40 PM

eli.friedman added inline comments.

docs/LangRef.rst
1423	This description needs to be more thorough... "can be safely speculated" is an extremely fuzzy description. Your commit message says that "divide-by-zero" counts as a side-effect, but that isn't listed here. Does an infinite loop count as a side-effect? Can a read from or write to a global? An argument? A volatile load?
utils/TableGen/CodeGenIntrinsics.h
24	Saying that, for example, memcpy is nosideeffects seems very weird. "memcpy(0,0,8)" will crash. The same issue applies to basically any intrinsic that reads from or writes to its arguments.

In D20116#459637, @mehdi_amini wrote:

In D20116#458920, @tstellarAMD wrote:

Ping.

You're pinging, but my impression is that this is waiting on you to address nhaehnle's comment. Did you miss it, or did I miss an answer?

I didn't address it specifically, but I was planning to address it by waiting for D18714 to be committed first, since that patch has the correct test case. I was actually hoping that pinging this patch would get people to look at D18714 too, since that's been outstanding for much longer.

Rebase on top of ToT, and clarify nosideeffects definition.

• tstellarAMD updated this object.Jul 13 2016, 1:11 PM

• tstellarAMD updated this object.

• tstellarAMD added inline comments.Jul 13 2016, 1:23 PM

docs/LangRef.rst
1423	With my original definition, I was trying to match what we already have in the .td files for intrinsics. I've updated the definition in this patch to be: nosideeffects tells the optimizer that the function does not modify any state that isn't accessible from the IR (e.g. floating-point exception registers). I'm not sure if this is what you were thinking, but hopefully this gives us a better starting point for discussion.
utils/TableGen/CodeGenIntrinsics.h
128	This is a problem with the current definitions of TableGen's intrinsic properties. Any intrinsic with IntrNoMem, IntrReadMem, IntrWriteMem, or IntrArgMemOnly is defined as having no side effects. The goal of this patch is to make it possible to have an intrinsic, like memcpy, which only reads/writes argument memory but may have other side effects.

mehdi_amini added inline comments.Jul 13 2016, 2:05 PM

docs/LangRef.rst
1423	Typo: `accessbile`. Also, it isn't clear how it interacts with memory. Are we only considering "non-memory" effects with this attribute? What are the side effects we want to track, and what will they be used for? Is it just about "this won't trap or exit"?

arsenm added inline comments.Jul 13 2016, 2:10 PM

docs/LangRef.rst
1423	I kind of think it should be renamed speculatable, since the intention is to cover any kind of operation that would prevent speculating.

hfinkel added a subscriber: hfinkel.Jul 13 2016, 2:17 PM

hfinkel added inline comments.

docs/LangRef.rst
1423	I agree. This is really the "safe to speculatively execute" attribute.

eli.friedman added inline comments.Jul 13 2016, 3:48 PM

docs/LangRef.rst
1424	The floating point status register is a weird example. The floating point status register is basically memory; it isn't actually addressable on most processors, but it behaves like a hidden global in every other way.
utils/TableGen/CodeGenIntrinsics.h
128	Oh, I see, this is an existing problem. :( I'd definitely like to see this resolved before you start changing optimizations to use this flag, but I guess you can change it in a followup. Not that it helps memcpy in particular, but it might be worth considering some approach which allows one to say "this intrinsic has no side-effects if the pointer arguments are dereferenceable(n)".

Rename the attribute to speculatable. This simplifies the patch a lot since
we are no longer trying to solve the problem where specifying one of
the memory properties for intrinsics implies that it has no side effects
(this should probably still be addressed in another patch).

I wasn't sure exactly how to define the speculatable attribute
in the language ref, so I copied the definition from the
isSafeToSpeculativelyExecute() function.

• tstellarAMD retitled this revision from Add nosideeffects function attribute to Add speculatable function attribute.Jul 13 2016, 9:39 PM

• tstellarAMD updated this object.

In D20116#483719, @tstellarAMD wrote:

Rename the attribute to speculatable. This simplifies the patch a lot since
we are no longer trying to solve the problem where specifying one of
the memory properties for intrinsics implies that it has no side effects
(this should probably still be addressed in another patch).

I wasn't sure exactly how to define the speculatable attribute
in the language ref, so I copied the definition from the
isSafeToSpeculativelyExecute() function.

Makes sense to me. Is the idea to have isSafeToSpeculativelyExecute() return true on functions with this attribute? I'd find it strange if this were not the plan. Do we plan to have FuncAttrs infer the attribute? I'm wondering if we should say something about cost and how that will be handled. We don't want to speculate expensive-to-execute functions even if it is legal.

docs/LangRef.rst
1540	Saying "its result", instead of "the result", reads better to me.

In D20116#484877, @hfinkel wrote:

Makes sense to me. Is the idea to have isSafeToSpeculativelyExecute() return true on functions with this attribute? I'd find it strange if this were not the plan. Do we plan to have FuncAttrs infer the attribute? I'm wondering if we should say something about cost and how that will be handled. We don't want to speculate expensive-to-execute functions even if it is legal.

I don't think we should avoid inferring speculatable via a cost model in the same way that we don't slap noinline on huge functions.
IMO, speculating function calls should likely be done via an IPO pass which is driven by some cost model.

We should probably have another, symmetric, attribute: sinkable. But that shouldn't be done with this patch.

docs/LangRef.rst
1538–1540	We should say something to indicate that speculatable does not imply CSE-able. Unless I am mistaken, it is possible for a function to be speculatable but return different results given the same parameters.

In D20116#484894, @majnemer wrote:

I don't think we should avoid inferring speculatable via a cost model in the same way that we don't slap noinline on huge functions.
IMO, speculating function calls should likely be done via an IPO pass which is driven by some cost model.

I agree; although we'll need to think about how the API will work for this. We can't take our current code which says:

if (isSafeToSpeculativelyExecute(I, ...)) {
  // speculate something
}

and turn it into:

if (!isa<CallInst>(I) && isSafeToSpeculativelyExecute(I, ...)) {
  // speculate something
}

because there are intrinsics calls that isSafeToSpeculativelyExecute currently handles, and we'd like that to continue.

Maybe we need to add some 'PotentiallyExpensive' output to isSafeToSpeculativelyExecute. In any case, we can have this discussion in a separate review for the isSafeToSpeculativelyExecute change so long as it does not impact the design of the attribute itself.

We should probably have another, symmetric, attribute: sinkable. But that shouldn't be done with this patch.

In D20116#484905, @hfinkel wrote:
I agree; although we'll need to think about how the API will work for this. We can't take our current code which says:
if (isSafeToSpeculativelyExecute(I, ...)) {
  // speculate something
}
and turn it into:
if (!isa<CallInst>(I) && isSafeToSpeculativelyExecute(I, ...)) {
  // speculate something
}
because there are intrinsics calls that isSafeToSpeculativelyExecute currently handles, and we'd like that to continue.

Maybe we need to add some 'PotentiallyExpensive' output to isSafeToSpeculativelyExecute.

Yeah, we could have it return an enum with Unsafe = 0, Cheap, Call. This is nice because existing callers would be correct without source changes.
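A minimal sketch of the enum idea, assuming a toy instruction type. `SpeculationSafety`, `ToyInst`, `classifySpeculation`, and both caller functions are hypothetical names for illustration; none of them is LLVM's actual API. The point it tries to show is that, because `Unsafe` is 0, an existing `if (...)` test keeps its meaning, while a cost-aware caller can distinguish cheap instructions from calls.

```cpp
// Hypothetical sketch only: these names are illustrative, not LLVM's API.
enum SpeculationSafety {
  Unsafe = 0, // must not be speculated
  Cheap,      // safe and inexpensive (e.g. simple arithmetic)
  Call        // safe, but a call whose cost the caller should weigh
};

// Toy stand-in for an instruction; a real query would inspect the IR.
struct ToyInst {
  bool MayTrap;
  bool IsCall;
};

SpeculationSafety classifySpeculation(const ToyInst &I) {
  if (I.MayTrap)
    return Unsafe;
  return I.IsCall ? Call : Cheap;
}

// Existing-style caller: only asks "is it safe at all?".
bool legacyCallerWouldSpeculate(const ToyInst &I) {
  if (classifySpeculation(I)) // Unsafe == 0 converts to false
    return true;
  return false;
}

// Cost-aware caller: speculates only when the result is Cheap.
bool costAwareCallerWouldSpeculate(const ToyInst &I) {
  return classifySpeculation(I) == Cheap;
}
```

With this shape, a safe call still passes the legacy check, so intrinsic calls that are handled today would keep working, and only callers that care about cost need to look at the extra enumerator.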

In any case, we can have this discussion in a separate review for the isSafeToSpeculativelyExecute change so long as it does not impact the design of the attribute itself.

Agreed.

We should probably have another, symmetric, attribute: sinkable. But that shouldn't be done with this patch.

In D20116#484877, @hfinkel wrote:

Makes sense to me. Is the idea to have isSafeToSpeculativelyExecute() return true on functions with this attribute?

Yes, this is the idea.

hfinkel added inline comments.Jul 14 2016, 7:34 PM

docs/LangRef.rst
1538–1540	We should say something to indicate that speculatable does not imply CSE-able. Unless I am mistaken, it is possible for a function to be speculatable but return different results given the same parameters.

True; only a readnone speculatable function can be CSE'd. It might be readonly, but then you can't CSE it unless you know more about the memory it might access.
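To make the distinction concrete, here is a small C++ sketch under the assumption that a function which only reads a global is a fair stand-in for a "readonly, speculatable" function. `Counter`, `readCounter`, and the four driver functions are hypothetical names for illustration. Executing the read early is harmless, but reusing one call's result for a later call is wrong once the memory it reads has changed.

```cpp
// Toy model: readCounter() cannot trap, so executing it early is harmless,
// but its result depends on memory, so two calls are not interchangeable.
static int Counter = 0;

int readCounter() { return Counter; } // analogous to readonly, not readnone

// Original program: two calls separated by a store.
int original() {
  Counter = 1;
  int A = readCounter();
  Counter = 2;
  int B = readCounter();
  return A * 10 + B;
}

// Incorrect "CSE": the second call is replaced by the first call's result.
int wronglyMerged() {
  Counter = 1;
  int A = readCounter();
  Counter = 2;
  int B = A; // stale value; the store to Counter was ignored
  return A * 10 + B;
}

// Speculation, by contrast, preserves behavior: the call is hoisted above
// the branch and its result is simply unused when Cond is false.
int guarded(bool Cond) {
  Counter = 5;
  int R = 0;
  if (Cond)
    R = readCounter();
  return R;
}

int hoisted(bool Cond) {
  Counter = 5;
  int T = readCounter(); // executed unconditionally
  return Cond ? T : 0;
}
```

Here `original()` and `wronglyMerged()` disagree, while `guarded()` and `hoisted()` agree for both values of `Cond`.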

In D20116#484877, @hfinkel wrote:

Makes sense to me. Is the idea to have isSafeToSpeculativelyExecute() return true on functions with this attribute? I'd find it strange if this were not the plan. Do we plan to have FuncAttrs infer the attribute? I'm wondering if we should say something about cost and how that will be handled. We don't want to speculate expensive-to-execute functions even if it is legal.

The SpeculativeExecution pass already considers the cost.

In D20116#484894, @majnemer wrote:

I don't think we should avoid inferring speculatable via a cost model in the same way that we don't slap noinline on huge functions.
IMO, speculating function calls should likely be done via an IPO pass which is driven by some cost model.

We should probably have another, symmetric, attribute: sinkable. But that shouldn't be done with this patch.

We sort of already have this: convergent. Owen was talking about splitting convergent so that it really means "do not sink", and then adding a speculatable attribute. The current convergent semantics would be the combination of no-sink and no-speculate.

-Matt

Fix wording in language ref and add a note about speculatable not implying
the function can be CSE'd.

• tstellarAMD mentioned this in D22413: [ValueTracking] Teach isSafeToSpeculativelyExecute() about the speculatable attribute.Jul 15 2016, 9:45 AM

hfinkel added inline comments.Jul 15 2016, 10:11 AM

docs/LangRef.rst
1541	I don't think that we should use "CSE'd" here, as a term. We should say something about this attribute not being enough to conclude that the number of calls executed along any particular execution path being externally observable, or something along those lines. Akin, perhaps, to what we say for volatile.

Replace the term CSE'd in the language ref with a more precise definition.

I reworded Hal's suggested description slightly to try to emphasize the
fact that it's talking about the *number* of calls being externally
observable, not just that the call itself is externally observable.

hfinkel added inline comments.Jul 15 2016, 4:55 PM

docs/LangRef.rst
1542	Do you mean "will not be"?

Fix typo in language ref.

• tstellarAMD marked an inline comment as done.Jul 18 2016, 3:57 AM

• tstellarAMD added inline comments.

docs/LangRef.rst
1542	Yes, I did. This is fixed now.

mehdi_amini added inline comments.Jul 18 2016, 3:30 PM

docs/LangRef.rst
1543	"does not have any effects besides calculating its result" and "speculatable is not enough to conclude that [...] the number of calls to this function will not be externally observable." seem contradictory to me. (Also you have a typo with `exection` instead of `execution`)

• tstellarAMD marked an inline comment as done.Nov 15 2016, 5:08 AM

• tstellarAMD added inline comments.

docs/LangRef.rst
1543	I would like to try to revive this discussion. We've gone back and forth a lot on the attribute description. The intention is that speculatable allows something to be speculatively executed, but is not enough by itself to determine whether or not the function can be CSE'd. Would it make sense to replace 'does not have any effects besides calculating its result' with 'does not have any effects other than possibly reading/writing memory and calculating its result'?

hfinkel added inline comments.Nov 15 2016, 6:59 AM

docs/LangRef.rst
1543	It also can't have any undefined behavior.

mehdi_amini added inline comments.Nov 15 2016, 8:56 AM

docs/LangRef.rst
1543	Isn't it enough to say: `This function attribute indicates that the function does not have undefined behavior, for any possible combination of arguments or global memory state.` ?

hfinkel added inline comments.Nov 15 2016, 2:10 PM

docs/LangRef.rst
1543	Yes, I think that sounds right (the comma is not necessary).

hfinkel mentioned this in D26930: Teach optimizer that pthread_self does not trap. It can be speculatively executed..Nov 21 2016, 3:43 PM

Update definition in LangRef.

Harbormaster completed remote builds in B3494: Diff 86656. Feb 1 2017, 9:35 AM

Herald added a subscriber: wdng. Feb 1 2017, 9:35 AM

LGTM. But let's wait a little to give @sanjoy or @chandlerc a chance to comment if they feel the need.

I repeat here one of the important earlier comments from @tstellarAMD, since it is not in the description and is easy to miss:

I added two new Intrinsic attributes IntrNoSideEffects and IntrHasSideEffects,
which make it possible to specify all the possible memory interaction / side effect
combinations. With these properties in place, it should be possible in the future
to drop the 'no side effect' portion of the intrinsic memory properties once targets
have been updated to use these new properties.

This revision is now accepted and ready to land.Feb 1 2017, 5:18 PM

In D20116#664236, @mehdi_amini wrote:
LGTM. But let's wait a little to give @sanjoy or @chandlerc a chance to comment if they feel the need.

I repeat here one of the important earlier comments from @tstellarAMD, since it is not in the description and is easy to miss:
I added two new Intrinsic attributes IntrNoSideEffects and IntrHasSideEffects,
which make it possible to specify all the possible memory interaction / side effect
combinations. With these properties in place, it should be possible in the future
to drop the 'no side effect' portion of the intrinsic memory properties once targets
have been updated to use these new properties.

This comment is actually from an earlier version of the patch. I've dropped this part and written this patch instead: https://reviews.llvm.org/D22459

ping

Sorry, I didn't know that this was blocked on me. I'll review this by end of day today.

I'm okay with this as long as this is allowed only on specific intrinsics (i.e. it cannot be used by external clients, but we know that certain intrinsics are speculatable). I don't think we can allow this as a generic function attribute, since that allows dead code to affect program behavior. E.g.:

if (false)
  puts("hi") speculatable;

which can get transformed to

puts("hi") speculatable;
if (false)
  ;

The speculatable attribute on puts is incorrect, but we need to allow such "bogus" IR down dead paths. For instance, the original program could have been:

void do_call(fnptr f, bool is_speculatable) {
  if (is_speculatable)
    f("hi") speculatable;
  else
    f("hi");
}

// Later
do_call(puts, false);

In D20116#706967, @sanjoy wrote:
I'm okay with this as long as this is allowed only on specific intrinsics (i.e. it cannot be used by external clients, but we know that certain intrinsics are speculatable). I don't think we can allow this as a generic function attribute, since that allows dead code to affect program behavior. E.g.:
if (false)
  puts("hi") speculatable;
which can get transformed to
puts("hi") speculatable;
if (false)
  ;
The speculatable attribute on puts is incorrect, but we need to allow such "bogus" IR down dead paths. For instance, the original program could have been:
void do_call(fnptr f, bool is_speculatable) {
  if (is_speculatable)
    f("hi") speculatable;
  else
    f("hi");
}

// Later
do_call(puts, false);

Then the program is broken to begin with? Why shouldn't speculatable apply to functions that only call other speculatable instructions, similar to how FunctionAttrs can infer this for readnone? Do you mean not expose a user visible attribute in the frontend?

In D20116#706969, @arsenm wrote:
Then the program is broken to begin with? Why shouldn't speculatable apply to functions that only call other speculatable instructions, similar to how FunctionAttrs can infer this for readnone? Do you mean not expose a user visible attribute in the frontend?

readnone etc. are different from speculatable, in that once you mark a call site as speculatable you've marked the said call site as speculatable throughout the lifetime of the program (since, by definition, it can be arbitrarily speculated). readnone, readonly etc. do not have that property.

But thinking about it a bit, I concede that speculatable as a generic (i.e. both intrinsic and non-intrinsic) function attribute is fine. However, it doesn't make sense as a call site attribute: being speculatable only down a control flow path is basically the antithesis of speculatable.

In D20116#707083, @sanjoy wrote:

readnone etc. are different from speculatable, in that once you mark a call site as speculatable you've marked the said call site as speculatable throughout the lifetime of the program (since, by definition, it can be arbitrarily speculated). readnone, readonly etc. do not have that property.

I don't follow why readnone and readonly based transformations don't need the same guarantee?

In D20116#707083, @sanjoy wrote:
readnone etc. are different from speculatable, in that once you mark a call site as speculatable you've marked the said call site as speculatable throughout the lifetime of the program (since, by definition, it can be arbitrarily speculated). readnone, readonly etc. do not have that property.

But thinking about it a bit, I concede that speculatable as a generic (i.e. both intrinsic and non-intrinsic) function attribute is fine. However, it doesn't make sense as a call site attribute: being speculatable only down a control flow path is basically the antithesis of speculatable.

True, but it can still be a callsite attribute. It can represent a data dependency:

int div(int a, int b) {
  return a/b;
}

div(q, 1); /* speculatable */

In D20116#707103, @hfinkel wrote:
True, but it can still be a callsite attribute. It can represent a data dependency:
int div(int a, int b) {
  return a/b;
}

div(q, 1); /* speculatable */

But div is not well defined for any possible combination of arguments or global memory state. I think I understand what you were getting at (that for all values of q the expression div(q, 1) is well defined), but the LangRef definition mentioned in this change does not phrase things that way.

However, this should be fine IMO:

int div_specialized(int a) speculatable {
  return div(a, 1);
}
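The difference between the two functions can be sketched in C++ as follows. This is an illustration under assumptions, not part of the patch: the general function is named `divide` here (rather than `div`) to avoid clashing with the C library's `div`, and `divIsDefined` and `divideSpecialized` are hypothetical helper names. The general division has inputs for which it is undefined, whereas the wrapper that fixes the divisor to 1 is defined for every `int` argument.

```cpp
#include <climits>

// True when a / b is well defined for int division.
bool divIsDefined(int a, int b) {
  if (b == 0)
    return false; // division by zero
  if (a == INT_MIN && b == -1)
    return false; // quotient does not fit in int
  return true;
}

// General division: undefined for some argument combinations, so it does
// not meet a "no undefined behavior for any arguments" definition.
int divide(int a, int b) { return a / b; }

// Specialized wrapper: the divisor is always 1, so every int argument is
// fine and the wrapper as a whole could carry the attribute.
int divideSpecialized(int a) { return divide(a, 1); }
```

For example, `divIsDefined(5, 0)` is false, while `divideSpecialized` can be called with any value, including `INT_MIN`.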

In D20116#707100, @mehdi_amini wrote:

I don't follow why readnone and readonly based transformations don't need the same guarantee?

I'm talking about cases like this:

void f(bool to_write, int* ptr) {
  if (to_write) *ptr = 20;
}

void g() {
  int a;
  f(false, &a) readnone;
}

Now f is not readnone generally (there are argument values for which it writes to memory), but in that specific call site we know that it does not.
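A runnable C++ rendering of this case, assuming a write counter as instrumentation. `WriteCount`, `callWithFalse`, and `callWithTrue` are hypothetical names added for illustration, and the counter itself is of course a memory write that a real readnone function would not have. It shows that the same function writes through its pointer for some arguments and not for others, which is why the property can hold at one call site without holding for the function in general.

```cpp
// Hypothetical instrumentation: counts stores made through the pointer.
static int WriteCount = 0;

void f(bool ToWrite, int *Ptr) {
  if (ToWrite) {
    *Ptr = 20;
    ++WriteCount;
  }
}

// Call site where the first argument is known to be false: no store happens.
int callWithFalse() {
  int A = 7;
  f(false, &A);
  return A;
}

// Call site where the store does happen.
int callWithTrue() {
  int A = 7;
  f(true, &A);
  return A;
}
```

Calling `callWithFalse()` leaves both `A` and `WriteCount` unchanged, whereas `callWithTrue()` overwrites `A` and bumps the counter.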

What I was trying to say before is that, because of how we've defined speculatable, "speculatable for this specific argument / control dependence" does not make sense -- if you expand it out, the statement would be "for this specific set of arguments, global memory state and control dependence, this function does not have undefined behavior for any possible combination of arguments or global memory state", which contradicts itself.

Of course, due to this attribute, we will start being able to hoist calls and invokes. If those are marked readnone etc. we will have to strip those attributes if we hoisted said call or invoke out of control flow. But that's a different story.

In D20116#707120, @sanjoy wrote:
But div is not well defined for any possible combination of arguments or global memory state. I think I understand what you were getting at (that for all values of q the expression div(q, 1) is well defined), but the LangRef definition mentioned in this change does not phrase things that way.

We might want to adjust the wording. When we say "any possible", for a call site attribute, we mean the values that can possibly present themselves at *that* call site (which may be constrained to be a subset of the values allowed for the input types by the semantics of the program). We mean the same things for other call site attributes (e.g. readnone).

However, this should be fine IMO:
int div_specialized(int a) speculatable {
  return div(a, 1);
}

In D20116#707126, @sanjoy wrote:
I'm talking about cases like this:
void f(bool to_write, int* ptr) {
  if (to_write) *ptr = 20;
}

void g() {
  int a;
  f(false, &a) readnone;
}
Now f is not readnone generally (there are argument values for which it writes to memory), but in that specific call site we know that it does not.

What I was trying to say before is that, because of how we've defined speculatable, "speculatable for this specific argument / control dependence" does not make sense -- if you expand it out, the statement would be "for this specific set of arguments, global memory state and control dependence, this function does not have undefined behavior for any possible combination of arguments or global memory state", which contradicts itself.

Of course, due to this attribute, we will start being able to hoist calls and invokes. If those are marked readnone etc. we will have to strip those attributes if we hoisted said call or invoke out of control flow. But that's a different story.

But I think this is exactly the point. We can consider value constraints due to data dependencies, but not from control dependencies. The same infects the other attributes as well (if it is tagged readnone and speculatable, the readnone can also not be control dependent - attributes are not metadata; we shouldn't be stripping them).

I can phrase my concerns as a question -- what is the intended behavior of the following program if condition is false?

int f(int a, int b) { return a / b; }

void main() {
  if (condition)
    f(5, 0) speculatable;
}

As far as I can tell, there are two possibilities:

1. It is well defined -- main is a no-op.
2. It has undefined behavior, due to the incorrect speculatable attribute.

If we go with (1), then we cannot hoist the call to f above the if -- we'd introduce UB if we do that. This is the interpretation I'm leaning towards, but this interpretation makes the speculatable attribute useless for general functions, since we're prevented from doing what it was supposed to enable in the first place.

If we go with (2), then we've admitted that "dead code" (that is, code that is not run) can influence program behavior, which I find troubling. In particular, I think foo1 and foo2 should be equivalent:

void foo1() {
}

void foo2() {
  if (false) {
    // Arbitrary syntactically valid stuff
  }
}

In other words, adding dead code should not change program behavior.
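To make the concern concrete, here is a toy Python model of the f(5, 0) example, offered as a sketch only. The function names are hypothetical, and Python's ZeroDivisionError stands in for undefined behavior, which is an assumption of the model (real UB need not trap).

```python
def f(a, b):
    # Integer division; b == 0 raises ZeroDivisionError, our stand-in for UB.
    return a // b


def main_original(condition):
    # The call is control dependent on `condition`; it is dead when condition is False.
    if condition:
        f(5, 0)
    return "ok"


def main_hoisted(condition):
    # What a hoisting transform would produce if it trusted `speculatable` on the call.
    tmp = f(5, 0)
    if condition:
        _ = tmp
    return "ok"


print(main_original(False))
try:
    main_hoisted(False)
except ZeroDivisionError:
    print("hoisted version faults even though the original call was dead")
```

Under interpretation (1) the hoist is simply an invalid transform; under interpretation (2) the original program is already considered broken, even though the faulting call never runs in it.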

Even

int f(int a, int b) speculatable { return a / b; }

void main() {
  if (condition)
    f(5, 0);
}

has the same problem, so I take back my earlier concession on function annotations.

Now it is fine for us to have a speculatable analysis which would infer speculatable in limited cases, like int f() { return 10; }. But that's a module analysis, and not an attribute.
Having them on intrinsics is fine too, since we axiomatically know that certain intrinsics are speculatable, just like we know the add instruction is speculatable.

It has undefined behavior, due to the incorrect speculatable attribute.

I don't see it as being anything other than this. Marking it as speculatable is saying it has no undefined behavior a condition could possibly be avoiding. If you are lying about this, you get what you get.

In D20116#707984, @arsenm wrote:

It has undefined behavior, due to the incorrect speculatable attribute.

I don't see it as being anything other than this. Marking it as speculatable is saying it has no undefined behavior a condition could possibly be avoiding. If you are lying about this, you get what you get.

I agree.

In D20116#707954, @sanjoy wrote:

If we go with (2), then we've admitted that "dead code" (that is, code that is not run) can influence program behavior, which I find troubling.

That troubles (and worries) me as well.

In D20116#708076, @mehdi_amini wrote:

In D20116#707954, @sanjoy wrote:

If we go with (2), then we've admitted that "dead code" (that is, code that is not run) can influence program behavior, which I find troubling.

That troubles (and worries) me as well.

Why? That's part of the promise we make when we tag the code as speculatable. We promise that doing the operation early, even in cases where it might not otherwise have been performed, is fine. Furthermore, we're promising that any properties the call has (e.g. promising that a certain argument is nonnull) must not be control dependent. As a result, looking at it as dead code affecting live code is suboptimal; any other properties are just a kind of global assertion.

In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

That troubles (and worries) me as well.

Why?

Irrational fear? ;-)
It seems unusual, and I'm cautious about introducing unusual properties in the compiler in general, it makes it harder to reason about "stuff" when there aren't "simple" rules to guide the logic.
Are there existing other cases of UB induced by unreachable/dead code?

In D20116#708174, @mehdi_amini wrote:

In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

That troubles (and worries) me as well.

Why?

Irrational fear? ;-)

;-)

It seems unusual, and I'm cautious about introducing unusual properties in the compiler in general, it makes it harder to reason about "stuff" when there aren't "simple" rules to guide the logic.
Are there existing other cases of UB induced by unreachable/dead code?

Not as far as I know, but this is because we normally need to conservatively assume there might be control dependencies on just about everything we can't completely understand (i.e. a call to a function). This is novel because we're explicitly saying that there aren't such dependencies. This makes validity assertions more "global" (this is how I look at it), meaning that it matters less where in the CFG you put them (even in dead regions). The point is that, in some cases, this is exactly what we want and mean.

In D20116#708174, @mehdi_amini wrote:

In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

That troubles (and worries) me as well.

Why?

Irrational fear? ;-)
It seems unusual, and I'm cautious about introducing unusual properties in the compiler in general, it makes it harder to reason about "stuff" when there aren't "simple" rules to guide the logic.
Are there existing other cases of UB induced by unreachable/dead code?

I suppose it's the same as speculating a load from a pointer marked as dereferenceable that isn't really, which is already done.

In D20116#708191, @arsenm wrote:

In D20116#708174, @mehdi_amini wrote:

In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

That troubles (and worries) me as well.

Why?

Irrational fear? ;-)
It seems unusual, and I'm cautious about introducing unusual properties in the compiler in general, it makes it harder to reason about "stuff" when there aren't "simple" rules to guide the logic.
Are there existing other cases of UB induced by unreachable/dead code?

I suppose it's the same as speculating a load from a pointer marked as dereferenceable that isn't really, which is already done.

My understanding of what gives people pause is that:

if (p != nullptr)
  something();

if (false)
  foo(p /*nonnull*/) /* speculatable */

depending on the pass ordering, we might end up hoisting the call to foo and then using the nonnull assumption to simplify the if condition (even though that call is dynamically dead). We might not, however (and probably won't because we pretty eagerly remove dead code).

The thing to realize about speculatable is that it promotes all argument restrictions to properties of the argument values themselves. This might certainly seem surprising.

In D20116#708191, @arsenm wrote:

I suppose it's the same as speculating a load from a pointer marked as dereferenceable that isn't really, which is already done.

As far as I remember, dereferenceable and !dereferenceable are carefully designed to avoid this UB-from-dead-code situation.

In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

In D20116#707954, @sanjoy wrote:

If we go with (2), then we've admitted that "dead code" (that is, code that is not run) can influence program behavior, which I find troubling.

That troubles (and worries) me as well.

Why? That's part of the promise we make when we tag the code as speculatable. We promise that doing the operation early, even in cases where it might not otherwise have been performed, is fine. Furthermore, we're promising that any properties the call has (e.g. promising that a certain argument is nonnull) must not be control dependent. As a result, looking at it as dead code affecting live code is suboptimal; any other properties are just a kind of global assertion.

Making even the behavior of a program dependent on instructions that are never executed seems like a fundamentally new thing, and I'm not yet convinced that that's safe. It may be possible to come up with a consistent model of this new thing, but I think the model will still be tricky to work with.

For instance, certain kinds of devirtualization optimizations are wrong in that model. Say we had:

void f() { }
void g() { 1 / 0; } // unused

void main() {
  fnptr p = f;
  *p() speculatable;
}

Now you're saying you can't specialize the call via function pointer like this:

void f() { }
void g() { 1 / 0; } // unused

void main() {
  fnptr p = f;
  if (p == g)
    g() speculatable;  // incorrect
  else
    *p() speculatable;
}

which seems odd.

There are also (somewhat more complex) cases like this that do not involve indirect speculatable calls:

struct Base {
  int k;
  Base(int k) : k(k) {}
};

struct S : public Base {
  S(bool c) : Base(c ? 10 : 20) {}
  virtual void f() {
    div(1, k) speculatable;  // k is either 10 or 20
  }
};

struct T : public Base {
  T() : Base(0) {}
  virtual void f() { }
};

void bug(Base* b) {
  b->f();
}

We have a problem in bug if we ever devirtualize, inline and hoist a load:

void bug(Base *b) {
  int k = b->k;
  if (b->type == S) {
    div(1, k) speculatable;
  } else if (b->type == T) {
  }
}

It also breaks "code compression" type optimizations:

if (a == 1 || a == 2) {
  switch (a) {
  case 1:
    div(m, 1) speculatable;
    break;
  case 2:
    div(m, 2) speculatable;
    break;
  }
}

to

if (a == 1 || a == 2) {
  div(m, a) speculatable;
}

to

if (a == 1 || a == 2) {
  if (a == 0)
    div(m, 0) speculatable;
  else
    div(m, a) speculatable;
}

In D20116#708238, @sanjoy wrote:

It may be possible to come up with a consistent model of this new thing, but I think the model will still be tricky to work with.

This ^ ended up sounding more FUD'dy than I intended. What I meant to say is that I'm being cautious because this isn't obviously okay, not because this is obviously not okay.

In D20116#708238, @sanjoy wrote:
In D20116#708109, @hfinkel wrote:

In D20116#708076, @mehdi_amini wrote:

In D20116#707954, @sanjoy wrote:

If we go with (2), then we've admitted that "dead code" (that is, code that is not run) can influence program behavior, which I find troubling.

That troubles (and worries) me as well.

Why? That's part of the promise we make when we tag the code as speculatable. We promise that doing the operation early, even in cases where it might not otherwise have been performed, is fine. Furthermore, we're promising that any properties the call has (e.g. promising that a certain argument is nonnull) must not be control dependent. As a result, looking at it as dead code affecting live code is suboptimal; any other properties are just a kind of global assertion.

Making even the behavior of a program dependent on instructions that are never executed seems like a fundamentally new thing, and I'm not yet convinced that that's safe. It may be possible to come up with a consistent model of this new thing, but I think the model will still be tricky to work with.

For instance, certain kinds of devirtualization optimizations are wrong in that model. Say we had:
void f() { }
void g() { 1 / 0; } // unused

void main() {
  fnptr p = f;
  *p() speculatable;
}
Now you're saying you can't specialize the call via function pointer like this:
void f() { }
void g() { 1 / 0; } // unused

void main() {
  fnptr p = f;
  if (p == g)
    g() speculatable;  // incorrect
  else
    *p() speculatable;
}
which seems odd.

There are also (somewhat more complex) cases like this that do not involve indirect speculatable calls:
struct Base {
  int k;
  Base(int k) : k(k) {}
};

struct S : public Base {
  S(bool c) : Base(c ? 10 : 20) {}
  virtual void f() {
    div(1, k) speculatable;  // k is either 10 or 20
  }
};

struct T : public Base {
  T() : Base(0) {}
  virtual void f() { }
};

void bug(Base* b) {
  b->f();
}
We have a problem in bug if we ever devirtualize, inline and hoist a load:
void bug(Base *b) {
  int k = b->k;
  if (b->type == S) {
    div(1, k) speculatable;
  } else if (b->type == T) {
  }
}
It also breaks "code compression" type optimizations:
if (a == 1 || a == 2) {
  switch (a) {
  case 1:
    div(m, 1) speculatable;
    break;
  case 2:
    div(m, 2) speculatable;
    break;
  }
}
to
if (a == 1 || a == 2) {
  div(m, a) speculatable;
}
to
if (a == 1 || a == 2) {
  if (a == 0)
    div(m, 0) speculatable;

Your "code compression" optimization just introduced dead code ;)

else
  div(m, a) speculatable;

}

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute. I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).

In D20116#708268, @hfinkel wrote:
It also breaks "code compression" type optimizations:
if (a == 1 || a == 2) {
  switch (a) {
  case 1:
    div(m, 1) speculatable;
    break;
  case 2:
    div(m, 2) speculatable;
    break;
  }
}
to
if (a == 1 || a == 2) {
  div(m, a) speculatable;
}
to
if (a == 1 || a == 2) {
  if (a == 0)
    div(m, 0) speculatable;
Your "code compression" optimization just introduced dead code ;)

Yea, I don't even know why I called it "code compression". :)

else
  div(m, a) speculatable;
}
I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute. I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).

I'd say as an initial step support for intrinsics that we _know_ are speculatable (like we _know_ that add is speculatable) can land without any further discussion.

As for a generic speculatable attribute -- I need some time to think about this. Perhaps if done with sufficient care, it is possible.

I'd also like to ping @whitequark for comments. I had blocked D18738 some time back on similar grounds, but if we can reach a conclusion on speculatable, perhaps some of that can be transferable to !unconditionally_dereferenceable as well.

In D20116#708273, @sanjoy wrote:
In D20116#708268, @hfinkel wrote:
It also breaks "code compression" type optimizations:
if (a == 1 || a == 2) {
  switch (a) {
  case 1:
    div(m, 1) speculatable;
    break;
  case 2:
    div(m, 2) speculatable;
    break;
  }
}
to
if (a == 1 || a == 2) {
  div(m, a) speculatable;
}
to
if (a == 1 || a == 2) {
  if (a == 0)
    div(m, 0) speculatable;
Your "code compression" optimization just introduced dead code ;)
Yea, I don't even know why I called it "code compression". :)
else
  div(m, a) speculatable;
}
I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute. I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).
I'd say as an initial step support for intrinsics that we _know_ are speculatable (like we _know_ that add is speculatable) can land without any further discussion.

I'm fine with restricting speculatable to only appear where it appears on a function declaration/definition unless/until we can figure out semantics for it on a call site in general. I don't want it restricted to intrinsics specifically, but I don't think that's the problem.

As for a generic speculatable attribute -- I need some time to think about this. Perhaps if done with sufficient care, it is possible.

I'd also like to ping @whitequark for comments. I had blocked D18738 some time back on similar grounds, but if we can reach a conclusion on speculatable, perhaps some of that can be transferable to !unconditionally_dereferenceable as well.

In D20116#708332, @hfinkel wrote:

I'm fine with restricting speculatable to only appear where it appears on a function declaration/definition unless/until we can figure out semantics for it on a call site in general. I don't want it restricted to intrinsics specifically, but I don't think that's the problem.

Only function-level speculatable (and no call site specific speculatable) seems less problematic. It would mean having a function declaration or definition incorrectly marked as speculatable, even if it is never called, is UB; but I can live with that as long as that is properly documented.

Let me maybe zoom out and give a different perspective:
Right now call site and function attributes are an AND of predicates that are always guaranteed to hold for that specific call site or for all call sites, respectively.
Predicates include things like doesn't write to memory, only writes to memory that is not observable, etc. Using attributes we can state that several of these predicates hold.
In the ideal world, predicates wouldn't overlap, although since we can only state ANDs of predicates and not ORs, some overlap may be needed in practice.

Then we have an orthogonal concern which is what's the precondition that is sufficient to justify an optimization. For example, whether a function call can be executed speculatively is one of such preconditions. It can be derived by looking at the function attributes. For example, for speculative execution we probably need to know that the function doesn't write to memory and that it terminates.

So I feel that this speculatable attribute is not the right answer. It should be a helper function that derives its result from a set of function attributes, but shouldn't be an attribute on its own.
The only reason I could see to have it as an attribute would be to carry cost information. Just because a function can be executed speculatively, it doesn't mean it should be. We have infrastructure to record cost of intra-procedural edges, but not across functions AFAIK. That may need a more general solution for LTO anyway.
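A minimal sketch of the "helper function" idea, assuming a purely hypothetical set of behavior predicates. The names `no_memory_writes`, `always_terminates` and `no_undefined_behavior` are made up for this illustration and are not a list of real LLVM attributes; whether any such complete set exists is exactly what the thread goes on to dispute.

```python
# Hypothetical behavior predicates, assumed for this sketch only.
REQUIRED_FOR_SPECULATION = frozenset(
    {"no_memory_writes", "always_terminates", "no_undefined_behavior"}
)


def can_speculate(function_attrs):
    """Derive speculatability from behavior attributes instead of storing it directly."""
    return REQUIRED_FOR_SPECULATION <= set(function_attrs)


# A function known to satisfy every predicate (plus an unrelated one).
print(can_speculate({"no_memory_writes", "always_terminates",
                     "no_undefined_behavior", "nounwind"}))
# Not writing memory and terminating is not sufficient on its own in this model.
print(can_speculate({"no_memory_writes", "always_terminates"}))
```

The design choice being argued for is that transforms query a derived predicate like `can_speculate`, while the stored attributes describe only how the function behaves.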

I'm just slightly concerned that the set of function attributes is growing pretty quickly without a more thorough discussion of the whole set of attributes, rather than adding a new attribute every time someone wants to fix a problem. Some attributes are not well supported across the pipeline. It really feels like we need to stop for a moment and refactor function attributes.
Probably what we are missing is a "halts"/"terminating" attribute (one that states that the function always returns). It has been discussed multiple times, and now we have another use case.

In D20116#710866, @nlopes wrote:

Then we have an orthogonal concern which is what's the precondition that is sufficient to justify an optimization. For example, whether a function call can be executed speculatively is one of such preconditions. It can be derived by looking at the function attributes. For example, for speculative execution we probably need to know that the function doesn't write to memory and that it terminates.

It is not enough (for example division by zero).

So I feel that this speculatable attribute is not the right answer. It should be a helper function that derives its result from a set of function attributes, but shouldn't be an attribute on its own.

Feel free to propose an alternative solution, right now there is no combination of attributes that is enough.

The only reason I could see to have it as an attribute would be to carry cost information.

I'm not convinced that attributes should carry "cost" information, is there a precedent for this?

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.

A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.

I believe the set of conditions under which you could clone the attribute are:

1. The new call is CDEQ the original call (i.e. the set of conditions under which it executes is identical). If you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.
2. The arguments are identical.
3. The function called is identical or marked speculatable.

Note this is not entirely shocking: the "no cloning" rule is true of other attributes too (you can't clone and apply readonly like is done in the devirt example above); it's just that, being an attribute about control dependence, the effects relate to control dependence.
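The three cloning conditions listed above can be sketched as a single predicate. This is a toy Python model under stated assumptions: the `Call` record, its fields, and the representation of control dependence as a set of condition strings are all hypothetical, and it is not how LLVM represents calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    callee: str
    args: tuple
    control_conditions: frozenset  # conditions under which the call executes
    callee_is_speculatable: bool = False  # attribute on the callee's declaration


def may_keep_speculatable(original, clone):
    """True only if all three conditions from the list above hold for the clone."""
    same_control = clone.control_conditions == original.control_conditions
    same_args = clone.args == original.args
    callee_ok = clone.callee == original.callee or clone.callee_is_speculatable
    return same_control and same_args and callee_ok


indirect = Call("*p", (), frozenset())
# Devirtualized clone guarded by a new condition, calling a callee not marked speculatable.
guarded_g = Call("g", (), frozenset({"p == g"}))
print(may_keep_speculatable(indirect, guarded_g))
# An exact copy under the same control conditions may keep the attribute.
print(may_keep_speculatable(indirect, Call("*p", (), frozenset())))
```

In the devirtualization example, the guarded call to g fails both the control-dependence check and the callee check, so the clone would have to drop the attribute.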

Note: a lot of this is premature anyway. There is no possible way you could ever apply any of these optimizations correctly to speculatable, at all, until we fix post-dom to not be broken.

I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).

Definitely true.

Either the CDEQ set of the call must not change, or you must be able to prove that changes cannot impact the function (i.e. you don't make it any less dead, etc).

const int foo = bar = fred = 0;
if (foo)
if (bar)
if (fred)
   call baz() speculatable

You can prove hoisting into if (foo) cannot make it any less dead.

In D20116#710923, @mehdi_amini wrote:

In D20116#710866, @nlopes wrote:

Then we have an orthogonal concern which is what's the precondition that is sufficient to justify an optimization. For example, whether a function call can be executed speculatively is one of such preconditions. It can be derived by looking at the function attributes. For example, for speculative execution we probably need to know that the function doesn't write to memory and that it terminates.

It is not enough (for example division by zero).

Sure; I didn't mean to have enumerated a complete list.

So I feel that this speculatable attribute is not the right answer. It should be a helper function that derives its result from a set of function attributes, but shouldn't be an attribute on its own.

Feel free to propose an alternative solution, right now there is no combination of attributes that is enough.

My point is that attributes should be about the function behavior: they should be about the *how*, and not what kind of transformations you can do with respective call sites.
In the same way, today we don't have attributes like "call can be removed if result unused", "call can be removed if multiple calls with same argument", "call can be sunk into a loop", etc.
Instead we have things like "doesn't write to memory", "only writes to memory that is not observable by the caller", etc.
From our set of attributes we can infer if a transformation is valid or not. This approach scales much better: it's one attribute per behavior we want to capture instead of 1 attribute per transformation we want to do, which is surely much higher.

The other point is that this approach separates concerns: we have an analysis pass that decorates functions with attributes about how they behave, and then we have transformations that consume this information in some way. To me it seems useful to have this separation: it makes the analysis part much easier to reason about and to implement correctly.

My biggest concern with speculatable is that there's no intuitive semantics for it. If people already have different opinions of what "readonly" means (and it was supposed to have trivial semantics), then something as complex as speculatable seems like a can of worms.
And a month from now people will want more and more speculatable attributes. For example, "can be speculated across stores", "can be speculated across stores and function calls", etc. Doesn't seem to scale.

The only reason I could see to have it as an attribute would be to carry cost information.

I'm not convinced that attributes should carry "cost" information, is there a precedent for this?

No, and I didn't propose we do that. But at some point for applications like ThinLTO and PGO it seems that an inter-proc cost/information framework will be needed. ThinLTO needs to summarize what functions do. Imagine extending ThinLTO summaries to include information like "simplifies a lot if 2nd argument is false", "returns a number in range [0, 4]", etc. So it feels that eventually we will need an additional set of information to be attached to functions that we can probably not cover with the attribute framework we have today.
Anyway, that's a separate discussion.

In D20116#711097, @nlopes wrote:

In D20116#710923, @mehdi_amini wrote:

In D20116#710866, @nlopes wrote:

Then we have an orthogonal concern which is what's the precondition that is sufficient to justify an optimization. For example, whether a function call can be executed speculatively is one of such preconditions. It can be derived by looking at the function attributes. For example, for speculative execution we probably need to know that the function doesn't write to memory and that it terminates.

It is not enough (for example division by zero).

Sure; I didn't mean to have enumerated a complete list.

Still: my point was you *can't* provide a complete list because we're not capturing everything with the existing attributes.

So I feel that this speculatable attribute is not the right answer. It should be a helper function that derives its result from a set of function attributes, but shouldn't be an attribute on its own.

Feel free to propose an alternative solution, right now there is no combination of attributes that is enough.

My point is that attributes should be about the function behavior: they should be about the *how*, and not what kind of transformations you can do with respective call sites.
In the same way, today we don't have attributes like "call can be removed if result unused", "call can be removed if multiple calls with same argument", "call can be sunk into a loop", etc.
Instead we have things like "doesn't write to memory", "only writes to memory that is not observable by the caller", etc.
From our set of attributes we can infer if a transformation is valid or not. This approach scales much better: it's one attribute per behavior we want to capture instead of 1 attribute per transformation we want to do, which is surely much higher.

I get that, I'm still not sure what you're suggesting *in this case*.
Is it just the name that bothers you?
Looking at the description of the attribute, it fits your definition somehow: "This function attribute indicates that the function does not have undefined behavior for any possible combination of arguments or global memory state."

The other point is that this approach separates concerns: we have an analysis pass that decorates functions with attributes about how they behave, and then we have transformations that consume this information in some way. To me it seems useful to have this separation: it makes the analysis part much easier to reason about and to implement correctly.

I don't see how this is related to this attribute in any way.

My biggest concern with speculatable is that there's no intuitive semantics for it. If people already have different opinions of what "readonly" means (and it was supposed to have trivial semantics), then something as complex as speculatable seems like a can of worms.
And a month from now people will want more and more speculatable attributes. For example, "can be speculated across stores", "can be speculated across stores and function calls", etc. Doesn't seem to scale.

I don't have this impression. If you feel some parts of the definition need to be more carefully specified, that's fine. But the way you're putting it forward above seems like FUD to me.

The only reason I could see to have it as an attribute would be to carry cost information.

I'm not convinced that attributes should carry "cost" information, is there a precedent for this?

No, and I didn't propose we do that. But at some point for applications like ThinLTO and PGO it seems that an inter-proc cost/information framework will be needed. ThinLTO needs to summarize what functions do. Imagine extending ThinLTO summaries to include information like "simplifies a lot if 2nd argument is false", "returns a number in range [0, 4]", etc. So it feels that eventually we will need an additional set of information to be attached to functions that we can probably not cover with the attribute framework we have today.

The way this is structured in ThinLTO is using in-memory analyses that don't attach their results to the IR. The results are then part of the summaries directly. The summaries are just a serialization of the in-memory analyses; they don't need any IR construct like attributes or metadata.

In D20116#710866, @nlopes wrote:

Let me maybe zoom out and give a different perspective:
Right now call site and function attributes are an AND of predicates that are always guaranteed to hold for that specific call site or for all call sites, respectively.
Predicates include things like doesn't write to memory, only writes to memory that is not observable, etc. Using attributes we can state that several of these predicates hold.
In the ideal world, predicates wouldn't overlap, although since we can only state ANDs of predicates and not ORs, some overlap may be needed in practice.

It behaves as an OR right now? Last time I checked the implementation, call sites just scan their own attributes and then check the set on the declaration.

Disallow on call sites

In D20116#710937, @dberlin wrote:

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.

A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.

I believe the set of conditions under which you could clone the attribute are:

1. The new call is CDEQ the original call (i.e. the set of conditions under which it executes is identical). If you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.
2. The arguments are identical.
3. The function called is identical or marked speculatable.

Note this is not entirely shocking.

(Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)

I think (1) is somewhat shocking. I'd say normally we follow a weaker constraint: "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".

For instance, if we unroll loops (say by a factor of 2):

for (i = 0; i < N; i++)
  f(i) readonly;      // X

it is reasonable that the result be:

for (i = 0; i + 1 < N; i += 2) {
  f(i) readonly;       // A
  f(i + 1) readonly;   // B
}

if (i < N)
  f(i++) readonly;     // C

Assuming I correctly understood what you meant by "the set of conditions under which it executes is identical", we won't be able to keep any of the readonly attributes.

X is executed for all i < N.
A is executed for all even i if N > 1.
B is executed for all odd i if N > 1.
C is executed if N is odd.

Of course, in the program trace the instances in which f was executed stay the same in the pre-unroll and post-unroll program; and that's related to what I was trying to say earlier: with the speculatable attribute, the behavior of a program is no longer a property of its trace -- there could be a "hidden" speculatable call somewhere that influences the behavior of the program without actually being executed (in the initial program), by getting hoisted into the executable bits of the program. In order to mark these programs as "buggy", we will have to rule such "hidden" speculatable calls (that nevertheless have side effects) as tainting the program with UB, despite not being present in the program trace.
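The claim that the trace is unchanged by unrolling can be checked with a small Python sketch. It assumes the unrolled main loop runs while i + 1 < N and records the argument of each call to f in a list; the function names are hypothetical.

```python
def run_original(n):
    trace = []
    for i in range(n):
        trace.append(i)  # call site X: f(i)
    return trace


def run_unrolled(n):
    trace = []
    i = 0
    while i + 1 < n:
        trace.append(i)      # call site A: f(i)
        trace.append(i + 1)  # call site B: f(i + 1)
        i += 2
    if i < n:
        trace.append(i)      # call site C: remainder iteration
    return trace


# Same dynamic sequence of calls for every trip count tried here.
print(all(run_original(n) == run_unrolled(n) for n in range(10)))
```

The static call sites A, B and C each execute under different conditions than X, yet the dynamic sequence of calls is identical, which is the gap between a "same trace" rule and a "same control dependence" rule.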

the "no cloning" is true of other attributes (you can't clone and apply readonly like is done in the devirt example above)

Why can't you apply readonly to the devirtualization example?

it's just that, being an attribute about control dependence, the effects relate to control dependence.

Yes -- IMO that's the key problem with speculatable. Since it says "there is no control dependence", we cannot apply it in a control dependent manner.

I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).

Definitely true.

Either the CDEQ set of the call must not change, or you must be able to prove that changes cannot impact the function (IE you don't make it any less dead, etc).

What exactly do you mean by "the CDEQ set"? I could not find anything easily on Google. [edit: by CDEQ did you mean the cdequiv (control dependence equivalent) set? If so I think I know what you mean.]

Btw, is "make it any less dead" == "speculatively execute it", or something more subtle?

const int foo = bar = fred = 0;
if (foo)
if (bar)
if (fred)
   call baz() speculatable
You can prove hoisting into if (foo) cannot make it any less dead.

In D20116#711367, @arsenm wrote:

In D20116#710866, @nlopes wrote:

In the ideal world, predicates wouldn't overlap, although since we can only state ANDs of predicates and not ORs, some overlap may be needed in practice.

It behaves as an OR right now? Last time I checked the implementation, a call site just scans its own attributes and then checks the set on the declaration.

That behavior precisely means it is an AND -- a call site marked as readonly nounwind is both readonly AND nounwind.

One concrete example illustrating Nuno's point is the dereferenceable_or_null attribute. If we had an is_null attribute (or emulated it via !range), then we would not need a new attribute to denote that a value is either dereferenceable or null.

In D20116#711697, @sanjoy wrote:

In D20116#711367, @arsenm wrote:

In D20116#710866, @nlopes wrote:

In the ideal world, predicates wouldn't overlap, although since we can only state ANDs of predicates and not ORs, some overlap may be needed in practice.

It behaves as an OR right now? Last time I checked the implementation, a call site just scans its own attributes and then checks the set on the declaration.

That behavior precisely means it is an AND -- a call site marked as readonly nounwind is both readonly AND nounwind.

One concrete example illustrating Nuno's point is the dereferenceable_or_null attribute. If we had an is_null attribute (or emulated it via !range), then we would not need a new attribute to denote that a value is either dereferenceable or null.

OK, I was thinking about it like: Is this call readnone? It is if the call site OR the declaration is readnone.
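The two readings can be put side by side in a minimal sketch. This is a toy model with hypothetical names (`AttrSet`, `call_has_attr`, `effective_attrs`), not the LLVM API: the lookup for one attribute is an OR over where it may be written, while the resulting set of guarantees is an AND of every attribute found.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model (assumption): an attribute list is just a set of names.
using AttrSet = std::set<std::string>;

// Lookup view: a call "has" an attribute if the call site OR the
// declaration carries it.
bool call_has_attr(const AttrSet& call_site, const AttrSet& decl,
                   const std::string& attr) {
  return call_site.count(attr) > 0 || decl.count(attr) > 0;
}

// Semantic view: every attribute in the combined set holds at once,
// i.e. the call is constrained by the AND of all of them.
AttrSet effective_attrs(const AttrSet& call_site, const AttrSet& decl) {
  AttrSet out = decl;
  out.insert(call_site.begin(), call_site.end());
  return out;
}
```

With `readonly` on the call site and `nounwind` on the declaration, `effective_attrs` contains both, so the call is both readonly AND nounwind; there is no way in this model to say "readonly OR nounwind".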

In D20116#711696, @sanjoy wrote:
In D20116#710937, @dberlin wrote:

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.

A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.

I believe the set of conditions under which you could clone the attribute are:

The new call is CDEQ the original call (IE the set of conditions under which it executes is identical). If you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.

The arguments are identical.

The function called is identical or marked speculatable

Note this is not entirely shocking.

(Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)

I think (1) is somewhat shocking. I'd say normally we follow a weaker constraint: "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".

For instance, if we unroll loops (say by a factor of 2):
for (i = 0; i < N; i++)
  f(i) readonly;      // X
it is reasonable that the result be:
for (i = 0; i < N / 2; i += 2) {
  f(i) readonly;       // A
  f(i + 1) readonly;   // B
}

I'm presuming you meant this loop to go half as much but still cover the same values in the same order in calls to f?
(it doesn't)

If so, agree we allow it in practice, but that's in practice
Here is an, IMHO, valid callsite attribute readonly implementation of f, not legally doable in C, but we're talking about llvm IR anyway.

void f(int a)
{
  // go scrobbling through instruction stream to see what the increment is
  if (increment == 2)
    write some memory
  otherwise
    do nothing
}

Now, obviously, i'm cheating, but the point is, f can read whatever state it wants here, and detect what you've done, and not be readonly in that case.
Again, if i'm understanding things right (the langref really is somewhat confusing to me here, but i assume it's legal for f to not be readonly except for that call. That's what it seems like we are talking about here)

Given they can also unwind, and read state of other functions, i'm pretty positive you can come up with implementations easier than what i just did to detect and write memory in the two cases differently, without resorting to this trickery.

If you meant to make the loop iteration space change, i disagree with you that it's allowed even harder :)

Assuming I correctly understood what you meant by "the set of conditions under which it executes is identical", we won't be able to keep any of the readonly attributes.

X is executed for all i < N.

A is executed for all even i if N > 1

B is executed for all odd i if N > 1

C is executed if N is 1.

Of course, in the program trace the instances in which f was executed stay the same in the pre-unroll and post-unroll program; and that's related to what I was trying to say earlier: with the speculatable attribute, the behavior of a program is no longer a property of its trace -- there could be a "hidden" speculatable call somewhere that influences the behavior of the program without actually being executed (in the initial program), by getting hoisted into the executable bits of the program. In order to mark these programs as "buggy", we will have to rule such "hidden" speculatable calls (that nevertheless have side effects) as tainting the program with UB, despite not being present in the program trace.

Again, this is literally what it means to play with control dependence, so i *don't* find it shocking.

the "no cloning" is true of other attributes (you can't clone and apply readonly like is done in the devirt example above)

Why can't you apply readonly to the devirtualization example?

Sorry, meant to delete it. you can because it's dead code, otherwise, you can't.

it's just that, being an attribute about control dependence, the effects relate to control dependence.

Yes -- IMO that's the key problem with speculatable. Since it says "there is no control dependence", we cannot apply it in a control dependent manner.

Sure, i'd agree with that, but you really cannot work around that.

I agree, however, that we need to think carefully about how to define what speculatable means on an individual call site. Perhaps they're like convergent functions in this regard: you can't introduce new control dependencies (at least not in general).

Definitely true.

Either the CDEQ set of the call must not change, or you must be able to prove that changes cannot impact the function (IE you don't make it any less dead, etc).

What exactly do you mean by "the CDEQ set"? I could not find anything easily on Google. [edit: by CDEQ did you mean the cdequiv (control dependence equivalent) set? If so I think I know what you mean.]

yes, cdequiv, sorry. it's called CDEQ by some papers, and cdequiv by the other.

It's the equivalence classes of the control dependence graph, basically. The set of blocks/statements/etc (depends on what level you look) that execute only under the same conditions.
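As a rough sketch of that definition, blocks can be grouped by their set of control dependences, with blocks that share an identical set landing in one class. Everything here is an assumption made for illustration: the control dependences are given as input rather than computed from a CFG, and `Dep`, `cdequiv_classes`, and `example_deps` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A control dependence: (controlling block, which branch edge was taken).
using Dep = std::pair<std::string, bool>;
using DepSet = std::set<Dep>;

// Group blocks whose control-dependence sets are identical. Each group is
// one equivalence class: its members execute under the same conditions.
std::vector<std::vector<std::string>> cdequiv_classes(
    const std::map<std::string, DepSet>& deps) {
  std::map<DepSet, std::vector<std::string>> groups;
  for (const auto& kv : deps)
    groups[kv.second].push_back(kv.first);
  std::vector<std::vector<std::string>> out;
  for (const auto& kv : groups)
    out.push_back(kv.second);
  return out;
}

// Example input: "entry" branches to "then" and "then2" on its true edge,
// and control always reaches "exit".
std::map<std::string, DepSet> example_deps() {
  std::map<std::string, DepSet> deps;
  deps["entry"] = DepSet{};
  deps["exit"] = DepSet{};
  deps["then"] = DepSet{Dep("entry", true)};
  deps["then2"] = DepSet{Dep("entry", true)};
  return deps;
}
```

For the example input the classes are {entry, exit} and {then, then2}. Under the cloning rule quoted above, a cloned speculatable call would have to stay in the class of the call it was cloned from.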

Btw, is "make it any less dead" == "speculatively execute it", or something more subtle?

yes, that's what we'd call it. but it's not executed in any case, so i guess it'd be "speculatively hoist it".
:)

const int foo = bar = fred = 0;
if (foo)
if (bar)
if (fred)
   call baz() speculatable
You can prove hoisting into if (foo) cannot make it any less dead.

In D20116#711778, @dberlin wrote:
In D20116#711696, @sanjoy wrote:
In D20116#710937, @dberlin wrote:

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.

A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.

I believe the set of conditions under which you could clone the attribute are:

The new call is CDEQ the original call (IE the set of conditions under which it executes is identical). If you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.

The arguments are identical.

The function called is identical or marked speculatable

Note this is not entirely shocking.

(Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)

I think (1) is somewhat shocking. I'd say normally we follow a weaker constraint: "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".

For instance, if we unroll loops (say by a factor of 2):
for (i = 0; i < N; i++)
  f(i) readonly;      // X
it is reasonable that the result be:
for (i = 0; i < N / 2; i += 2) {
  f(i) readonly;       // A
  f(i + 1) readonly;   // B
}
I'm presuming you meant this loop to go half as much but still cover the same values in the same order in calls to f?
(it doesn't)

D'oh. I did a split-brain there. :)

If so, agree we allow it in practice, but that's in practice
Here is an, IMHO, valid callsite attribute readonly implementation of f, not legally doable in C, but we're talking about llvm IR anyway.
void f(int a)
{
  // go scrobbling through instruction stream to see what the increment is
  if (increment == 2)
    write some memory
  otherwise
    do nothing
}
Now, obviously, i'm cheating, but the point is, f can read whatever state it wants here, and detect what you've done, and not be readonly in that case.

I don't think we can reasonably allow functions to base their behavior on the instruction stream of their callers (or anywhere else, for that matter), in IR or in C.

For instance, if we allowed things like that, even basic optimizations like this:

void g() {
  int a = 2, b = 3;
  f(a + b);
}

to

void g() {
  f(5);
}

would not be valid, since an implementation of f could be:

void f() {
  cond = does the caller's instruction stream have an add instruction?
  if (cond)
    print("hi");
}

and we'd change observable behavior by optimizing out the add instruction.

Again, if i'm understanding things right (the langref really is somewhat confusing to me here, but i assume it's legal for f to not be readonly except for that call. That's what it seems like we are talking about here)

I think code like this is fine:

void f(bool do_store, int* ptr) {
  if (do_store)
    *ptr = 42;
}

...
f(false, ptr) readnone
...

The example you gave above seems odd to me because the callee has different behavior based on the instruction stream of the caller, but the conditionally-readonly aspect of it is fine.
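The conditionally-readnone call can be made concrete with a small runnable sketch. `f` mirrors the function in the example above; the `readnone` annotation has no C++ spelling, so it appears only in a comment, and `value_after_call` is a hypothetical helper added for illustration.

```cpp
#include <cassert>

// f may write through ptr in general, so the function as a whole cannot
// be declared readnone.
void f(bool do_store, int* ptr) {
  if (do_store)
    *ptr = 42;
}

// Hypothetical helper: runs one call and reports what memory holds after.
// A call with do_store == false is the kind of call site that could carry
// a call-site readnone annotation, because that particular call never
// touches memory.
int value_after_call(bool do_store) {
  int x = 7;
  f(do_store, &x);
  return x;
}
```

Here `value_after_call(false)` leaves the stored value at 7, while `value_after_call(true)` observes the write of 42. The property depends only on the arguments of the call, not on the caller's instruction stream.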

Given they can also unwind, and read state of other functions, i'm pretty positive you can come up with implementations easier than what i just did to detect and write memory in the two cases differently, without resorting to this trickery.

If there was a legal way to detect a difference between the two cases above, then loop unrolling would be illegal. So even if there is some way to detect the difference, I'd consider that a bug in the LLVM IR semantics since it disallows an important optimization.

If you meant to make the loop iteration space change, i disagree with you that it's allowed even harder :)

Yes, we can't transform the iteration space, since if instead of executing f(0) first, we execute f(N - 1) first; we'd break an f that was implemented as:

void f(int i) {
  // I don't think this is possible to do without writing memory in C++,
  // but it may be possible in other languages.
  throw i;
}

in which case the pre-transformed program would throw 0, but the post-transform program would throw N - 1.
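A minimal sketch of that observation, assuming a callee that throws its argument as in the example above; `first_thrown_forward` and `first_thrown_reversed` are hypothetical helpers that stand in for the pre-transform and post-transform loops.

```cpp
#include <cassert>

// The callee from the example: it writes no memory, but it unwinds with
// a value that depends on its argument.
void f(int i) { throw i; }

// Original iteration order: f(0) runs first.
int first_thrown_forward(int n) {
  try {
    for (int i = 0; i < n; i++)
      f(i);
  } catch (int v) {
    return v;
  }
  return -1;  // no iteration ran
}

// Reversed iteration space: f(n - 1) runs first.
int first_thrown_reversed(int n) {
  try {
    for (int i = n - 1; i >= 0; i--)
      f(i);
  } catch (int v) {
    return v;
  }
  return -1;  // no iteration ran
}
```

For `n = 4` the forward loop reports 0 and the reversed loop reports 3, so the reordering is observable through unwinding alone.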

Assuming I correctly understood what you meant by "the set of conditions under which it executes is identical", we won't be able to keep any of the readonly attributes.

X is executed for all i < N.

A is executed for all even i if N > 1

B is executed for all odd i if N > 1

C is executed if N is 1.

Of course, in the program trace the instances in which f was executed stay the same in the pre-unroll and post-unroll program; and that's related to what I was trying to say earlier: with the speculatable attribute, the behavior of a program is no longer a property of its trace -- there could be a "hidden" speculatable call somewhere that influences the behavior of the program without actually being executed (in the initial program), by getting hoisted into the executable bits of the program. In order to mark these programs as "buggy", we will have to rule such "hidden" speculatable calls (that nevertheless have side effects) as tainting the program with UB, despite not being present in the program trace.

Again, this is literally what it means to play with control dependence, so i *don't* find it shocking.

I'm not entirely sure what you mean here -- did you mean that it is okay to have stuff outside the program trace affect program definedness? Or did you mean the converse?

the "no cloning" is true of other attributes (you can't clone and apply readonly like is done in the devirt example above)

Why can't you apply readonly to the devirtualization example?

Sorry, meant to delete it. you can because it's dead code, otherwise, you can't.

it's just that, being an attribute about control dependence, the effects relate to control dependence.

Yes -- IMO that's the key problem with speculatable. Since it says "there is no control dependence", we cannot apply it in a control dependent manner.

Sure, i'd agree with that, but you really cannot work around that.

Yes!

In D20116#711865, @sanjoy wrote:
In D20116#711778, @dberlin wrote:
In D20116#711696, @sanjoy wrote:
In D20116#710937, @dberlin wrote:

I think that all of this is right, you can't apply some of these optimizations to call sites with the speculatable attribute.

A lot of these (but not all of these) amount to "you cannot clone speculatable", ie if you clone the call, you must remove the attribute.

I believe the set of conditions under which you could clone the attribute are:

The new call is CDEQ the original call (IE the set of conditions under which it executes is identical). If you are cloning from one function to another, it must be CDEQ using the interprocedural control dependence.

The arguments are identical.

The function called is identical or marked speculatable

Note this is not entirely shocking.

(Caveat: I think we're arguing some subtle points here, so apologies if I misunderstood your intent.)

I think (1) is somewhat shocking. I'd say normally we follow a weaker constraint: "The new call is executed only if the old call was executed", not "The new call is executed if and only [edit: previously this incorrectly said "only if and only if"] if the old call was executed".

For instance, if we unroll loops (say by a factor of 2):
for (i = 0; i < N; i++)
  f(i) readonly;      // X
it is reasonable that the result be:
for (i = 0; i < N / 2; i += 2) {
  f(i) readonly;       // A
  f(i + 1) readonly;   // B
}
I'm presuming you meant this loop to go half as much but still cover the same values in the same order in calls to f?
(it doesn't)
D'oh. I did a split-brain there. :)
If so, agree we allow it in practice, but that's in practice
Here is an, IMHO, valid callsite attribute readonly implementation of f, not legally doable in C, but we're talking about llvm IR anyway.
void f(int a)
{
  // go scrobbling through instruction stream to see what the increment is
  if (increment == 2)
    write some memory
  otherwise
    do nothing
}
Now, obviously, i'm cheating, but the point is, f can read whatever state it wants here, and detect what you've done, and not be readonly in that case.
I don't think we can reasonably allow functions to base their behavior on the instruction stream of their callers (or anywhere else, for that matter), in IR or in C.

As mentioned, i don't think that's strictly necessary to screw up what you did. But i'm willing to admit those are likely semantic bugs.
:)
In any case, I was just half-pointing out that i don't think the semantics of the other attribute are so cut and dried in their control dependence, etc, when applied to callsites, that speculatable on callsites is that bad.

For instance, if we allowed things like that, even basic optimizations like this:
void g() {
  int a = 2, b = 3;
  f(a + b);
}
to
void g() {
  f(5);
}
would not be valid, since an implementation of f could be:
void f() {
  cond = does the caller's instruction stream have an add instruction?
  if (cond)
    print("hi");
}
and we'd change observable behavior by optimizing out the add instruction.

Right, my point was more on the side of "we are pretending we've got the semantics of these other instructions down really well and they are easy to understand", when, honestly, there is nothing in the langref that outlaws what i wrote :)

The closest it comes is "when called with the same set of arguments and global state"
But you aren't calling it with the same global state depending on the definition of global state, which .. we don't define anywhere.

The example you gave above seems odd to me because the callee has different behavior based on the instruction stream of the caller, but the conditionally-readonly aspect of it is fine.

As mentioned I'm pretty positive i could construct an example just as bad given the limitations expressed in the langref :)

Given they can also unwind, and read state of other functions, i'm pretty positive you can come up with implementations easier than what i just did to detect and write memory in the two cases differently, without resorting to this trickery.

If there was a legal way to detect a difference between the two cases above, then loop unrolling would be illegal.

FWIW, i'm honestly too lazy to go construct one, but i'm basically positive you can. At least, legal given the semantics and restrictions on LLVM IR as it exists today, because LLVM IR is not *that* well defined.
I would not believe it well-defined in C, but only because unwinders, etc are usually not well defined.
This is basically an exercise in the moral equivalent of godel numbering.
I'm going to go with "yes"
Is it possible to define our IR well enough to outlaw that: Maybe?
I'm not going to tug at this thread any longer though :P

So even if there is some way to detect the difference, I'd consider that a bug in the LLVM IR semantics since it disallows an important optimization.

I guess my basic point was exactly this - the semantics of our other attributes, compared to speculatable, are not so easy or non-shocking that i think artificially limiting speculatable to non-callsites only makes sense. Particularly because those attributes may not be so well defined that their behavior makes complete and total sense when applied to callsites.
Instead, when we discover they are wrong we fix them. So why not just mark speculatable on callsites as experimental, see what happens, and go from there?

Bottom line: IMHO, We are unlikely to ever figure out the specific set of issues we will hit on callsites with speculatable if we don't allow it there. Yeah, we'll figure out if it's completely broken if we don't, but we all seem to agree that it's probably not?

I'm going to drop the rest of this, fwiw, because i don't think it's worth pushing the readonly example any further.
I'm positive i can come up with a program that meets whatever requirements you throw at it and still breaks in your example, precisely because we don't define our IR well enough to prevent it (I'm on vacation, so i'm not going to think about whether i could prove that you can't make the IR well defined enough to prevent it and still have it express useful programs :P)

In D20116#711882, @dberlin wrote:

Right, my point was more on the side of "we are pretending we've got the semantics of these other instructions down really well and they are easy to understand", when, honestly, there is nothing in the langref that outlaws what i wrote :)

The closest it comes is "when called with the same set of arguments and global state"
But you aren't calling it with the same global state depending on the definition of global state, which .. we don't define anywhere.

Of course. I'm not claiming that the LangRef is a mathematically precise document (though parts of it could be, if we incorporated Vellvm). However, the converse of "it is mathematically precise" is not "anything goes". :)

In other words, I think despite the semantics of LLVM IR being only informally specified, we can still reasonably draw *some* boundaries on what the semantics of various constructs *can* be, based on the optimizations we want to be correct.

So even if there is some way to detect the difference, I'd consider that a bug in the LLVM IR semantics since it disallows an important optimization.

I guess my basic point was exactly this - the semantics of our other attributes, compared to speculatable, are not so easy or non-shocking that i think artificially limiting speculatable to non-callsites only makes sense.

I'm not sure if I agree with the "artificially" characterization. My objection was based around (what I think are) concrete problems that we'll have if we allow this.

Particularly because those attributes may not be so well defined that their behavior makes complete and total sense when applied to callsites

I agree we have room to improve today. However, I don't see how two wrongs make a right here.

Instead, when we discover they are wrong we fix them. So why not just mark speculatable on callsites as experimental, see what happens, and go from there?
Bottom line: IMHO, We are unlikely to ever figure out the specific set of issues we will hit on callsites with speculatable if we don't allow it there. Yeah, we'll figure out if it's completely broken if we don't, but we all seem to agree that it's probably not?

As I said, my objection was based on my opinion that they're already discovered to be wrong. Of course, my (counter-)examples may not stand up to scrutiny, in which case my objection is moot.

Having said all this: while I won't exactly be happy with it, I would be fine with allowing speculatable on normal call sites if that helps some really compelling use case. But I want to make it clear that we're making a tradeoff here.

I'm going to drop the rest of this, fwiw, because i don't think it's worth pushing the readonly example any further.
I'm positive i can come up with a program that meets whatever requirements you throw at it and still breaks in your example, precisely because we don't define our IR well enough to prevent it (I'm on vacation, so i'm not going to think about whether i could prove that you can't make the IR well defined enough to prevent it and still have it express useful programs :P)

So even if there is some way to detect the difference, I'd consider that a bug in the LLVM IR semantics since it disallows an important optimization.

I guess my basic point was exactly this - the semantics of our other attributes, compared to speculatable, are not so easy or non-shocking that i think artificially limiting speculatable to non-callsites only makes sense.

I'm not sure if I agree with the "artificially" characterization. My objection was based around (what I think are) concrete problems that we'll have if we allow this.

Sorry, let me try to explain:
You have identified concrete problems.
IMHO, your concrete problems are not with the definition of the attribute, but instead are "things wanting to optimize callsites and keep this attribute have to understand control dependence".
But this is pretty much a truism to me given *any* sane definition of the attribute on callsites.
Additionally, IMHO, there can be *no* sensible way to define this attribute such that the problems go away.
So from my perspective, the concrete problems you've identified are going to get solved in exactly the same way whether we add them for callsites today, tomorrow, or five years from now.
Hence i see the limitation as fairly artificial. We aren't going to solve these problems other than by auditing all the callsite optimizations and either making them drop this attribute, or deeply understand control dependence enough to do a correct thing.

As I said, my objection was based on my opinion that they're already discovered to be wrong.

I guess this is where we disagree.
I do not believe you have pointed out that there is something wrong with the semantic, at least in a way that you could solve.
I believe you have pointed out it will interact badly if we don't audit our code.

I'd be fine if our answer was "hey, start a patch with all the fixes necessary to make this work properly on callsites".
But otherwise, what's the path forward?
IE what do you see as a definition of these attributes on callsites that is sane but doesn't have the issues you foresee?

In D20116#713430, @dberlin wrote:

So from my perspective, the concrete problems you've identified are going to get solved in exactly the same way whether we add them for callsites today, tomorrow, or five years from now.

I agree.

Hence i see the limitation as fairly artificial. We aren't going to solve these problems other than by auditing all the callsite optimizations and either making them drop this attribute, or deeply understand control dependence enough to do a correct thing.

I guess my English isn't strong enough to quite grok that use of "artificial". :) I partially agree with "We aren't going to solve these problems other ... enough to do a correct thing.", but please see below.

As I said, my objection was based on my opinion that they're already discovered to be wrong.

I guess this is where we disagree.
I do not believe you have pointed out that there is something wrong with the semantic, at least in a way that you could solve.
I believe you have pointed out it will interact badly if we don't audit our code.

What I was *trying* to show was that the context-sensitive-speculatable semantic introduces a fundamentally new behavior -- that dead code can (now) affect the semantics of a program. I don't think we have this in LLVM today (today the behavior of an LLVM program can be deduced solely from the *trace* of the instructions that were actually executed), and the impact of changing LLVM to allow dead code to have this "action at a distance" is unknown to me.

That ^ is really the core of my objection. Almost everything else I said can be traced back to the above.

And making dead code affect behavior sets off all sorts of alarms in my head. I've mentioned some of the more concrete problems earlier, but the fundamental thing that is bugging me is that if (false) { X } will not be a NOP for some values of X.
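The concern can be sketched with a toy model of hoisting. The assumptions are spelled out in the code: `baz` is a hypothetical callee that is wrongly treated as speculatable even though it has a visible side effect, and `hoisted` stands for what an optimizer might produce by moving the call above its guard.

```cpp
#include <cassert>

// Counts visible side effects performed by baz.
static int side_effects = 0;

// Hypothetical callee that is incorrectly assumed to be speculatable:
// it does have an observable side effect.
int baz() {
  ++side_effects;
  return 1;
}

// Original program: with cond == false the call is dead code.
int original(bool cond) {
  side_effects = 0;
  if (cond)
    baz();
  return side_effects;
}

// After hoisting the "speculatable" call above its guard, the call runs
// even when cond == false.
int hoisted(bool cond) {
  side_effects = 0;
  int tmp = baz();  // hoisted out of the branch
  if (cond)
    (void)tmp;      // the result is only consumed on the guarded path
  return side_effects;
}
```

With `cond == false`, `original` reports 0 side effects and `hoisted` reports 1, although the call never appears in the trace of the original program. That is the sense in which `if (false) { X }` stops being a NOP for such an X.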

I guess what you're saying is that there is nothing fundamental about the dead-code-influencing-behavior problem, and that it is just a matter of fixing passes to do the right thing?

If so, I do not have a better answer to that than a vague sense of unease. :)

I'd be fine if our answer was "hey, start a patch with all the fixes necessary to make this work properly on callsites".
But otherwise, what's the path forward?
IE what do you see as a definition of these attributes on callsites that is sane but doesn't have the issues you foresee?

If I was able to sell you on the idea that semantically-relevant-dead-code is a bad thing, then there is no path forward with a call-specific speculatable that is as strong as is implied in this patch. We can probably do something weaker though (say allow it only on calls with all constant arguments).

If you're okay with semantically-relevant-dead-code and its consequences, then it is a matter of fixing all of the passes that assume arbitrary dead code is "okay".

I'm leaning towards the former, but I'd understand if people wanted to do the latter.

FWIW I don't see much of a practical need for speculatable to apply to individual call sites. There are a few cases where I think it might be useful on specific intrinsic call sites but aren't really interesting enough to worry about.

Only allow for intrinsics

In D20116#740922, @arsenm wrote:

Only allow for intrinsics

Why only for intrinsics? I thought we had concluded that we'd only allow it for declarations and not on call sites (which may technically mean on call sites but only matching the declaration). I think it is important that we can apply it to regular functions.

In D20116#741087, @hfinkel wrote:

In D20116#740922, @arsenm wrote:

Only allow for intrinsics

Why only for intrinsics? I thought we had concluded that we'd only allow it for declarations and not on call sites (which may technically mean on call sites but only matching the declaration). I think it is important that we can apply it to regular functions.

I think so too, but @sanjoy said to restrict it to intrinsics for now. Intrinsics are the important part. I also ran into one minor issue with the call site restriction in D32655 where speculatable intrinsic calls are sometimes replaced with non-speculatable libcalls.

In D20116#741087, @hfinkel wrote:

In D20116#740922, @arsenm wrote:

Why only for intrinsics? I thought we had concluded that we'd only allow it for declarations and not on call sites (which may technically mean on call sites but only matching the declaration). I think it is important that we can apply it to regular functions.

I think a general speculatable attribute that is allowed only on function decls is *less problematic*[0] than a context sensitive one, but I think speculatable intrinsics are clearly okay. Therefore my opinion (which I expressed on IRC to Matt) is to first land the intrinsic variant of this, since that's what he's blocked on; and then we can go ahead with more aggressive variants on subsequent patches.

[0] https://reviews.llvm.org/D20116#709352

In D20116#741130, @arsenm wrote:

In D20116#741087, @hfinkel wrote:

In D20116#740922, @arsenm wrote:

Only allow for intrinsics

Why only for intrinsics? I thought we had concluded that we'd only allow it for declarations and not on call sites (which may technically mean on call sites but only matching the declaration). I think it is important that we can apply it to regular functions.

I think so too, but @sanjoy said to restrict it to intrinsics for now.

He suggested that, and I said that I did not want that restriction, and he said that he was fine with that:

In D20116#709352, @sanjoy wrote:

In D20116#708332, @hfinkel wrote:

I'm fine with restricting speculatable to only appear where it appears on a function declaration/definition unless/until we can figure out semantics for it on a call site in general. I don't want it restricted to intrinsics specifically, but I don't think that's the problem.

Only function-level speculatable (and no call site specific speculatable) seems less problematic. It would mean having a function declaration or definition incorrectly marked as speculatable, even if it is never called, is UB; but I can live with that as long as that is properly documented.

Intrinsics are the important part.

Not to me ;)

I also ran into one minor issue with the call site restriction in D32655 where speculatable intrinsic calls are sometimes replaced with non-speculatable libcalls.

This seems like it is an unfortunate information loss that we should fix, but why is that a problem?

In D20116#741132, @sanjoy wrote:

In D20116#741087, @hfinkel wrote:

In D20116#740922, @arsenm wrote:

Why only for intrinsics? I thought we had concluded that we'd only allow it for declarations and not on call sites (which may technically mean on call sites but only matching the declaration). I think it is important that we can apply it to regular functions.

I think a general speculatable attribute that is allowed only on function decls is *less problematic*[0] than a context-sensitive one, but I think speculatable intrinsics are clearly okay. Therefore my opinion (which I expressed on IRC to Matt) is to first land the intrinsic variant of this, since that's what he's blocked on; then we can go ahead with more aggressive variants in subsequent patches.

[0] https://reviews.llvm.org/D20116#709352

Okay, unfortunately, this is only useful to me if we allow it on function declarations, and I don't see how kicking this can down the road helps in this regard. If he adds this for intrinsics and I immediately turn around and propose a patch to remove the restriction, that's a waste of everyone's time. I thought that we had agreed that allowing it on function declarations was okay so long as we documented the fact that this introduces potential UB just by declaring such a function, so let's do that.

In D20116#741170, @hfinkel wrote:

Okay, unfortunately, this is only useful to me if we allow it on function declarations

I had somehow missed this bit ^ and I was under the impression that the main motivation for a general attribute was more completeness than anything else.

I thought that we had agreed that allowing it on function declarations was okay so long as we documented the fact that this introduces potential UB just by declaring such a function, so let's do that.

I had not phrased my concession clearly. :)

Just to be clear, I don't think they're okay, but I can live with them in the spirit of being pragmatic.

So yes, if this attribute will be useless to you without the generalization to non-intrinsics, then I won't object to checking in the previous version of this patch.

In D20116#741181, @sanjoy wrote:

In D20116#741170, @hfinkel wrote:

Okay, unfortunately, this is only useful to me if we allow it on function declarations

I had somehow missed this bit ^ and I was under the impression that the main motivation for a general attribute was more completeness than anything else.

No problem.

I thought that we had agreed that allowing it on function declarations was okay so long as we documented the fact that this introduces potential UB just by declaring such a function, so let's do that.

I had not phrased my concession clearly. :)

Just to be clear, I don't think they're okay, but I can live with them in the spirit of being pragmatic.

I understand. In a theoretical sense, I see adding them on function declarations as the same as adding them to intrinsics. Obviously there are practical differences; however, I'm not sure that in practice you're more likely to introduce a call to an arbitrary function, just because it happens to have been declared as speculatable, than you are to introduce a call to an intrinsic, just because it happens to be similarly available. I can't imagine any general transformation doing anything for each declared function just on the basis of it being declared. You'd need to be operating in a very restricted environment for that to make sense, and in such an environment, you should reasonably have the power not to mark functions as speculatable in a problematic way.

So yes, if this attribute will be useless to you without the generalization to non-intrinsics, then I won't object to checking in the previous version of this patch.

Thanks!

r301680

whitequark mentioned this in D18738: Add new !unconditionally_dereferenceable load instruction metadata. May 11 2017, 6:50 AM

Revision Contents

Path                                              Size
docs/LangRef.rst                                  10 lines
include/llvm/Bitcode/LLVMBitCodes.h               3 lines
include/llvm/IR/Attributes.td                     3 lines
include/llvm/IR/Function.h                        8 lines
include/llvm/IR/Intrinsics.td                     3 lines
lib/AsmParser/LLLexer.cpp                         1 line
lib/AsmParser/LLParser.cpp                        1 line
lib/AsmParser/LLToken.h                           1 line
lib/Bitcode/Reader/BitcodeReader.cpp              3 lines
lib/Bitcode/Writer/BitcodeWriter.cpp              2 lines
lib/IR/Attributes.cpp                             2 lines
lib/IR/Verifier.cpp                               38 lines
test/Bitcode/attributes.ll                        8 lines
test/Bitcode/compatibility.ll                     10 lines
test/Verifier/speculatable-callsite-invalid.ll    24 lines
test/Verifier/speculatable-callsite.ll            21 lines
utils/TableGen/CodeGenIntrinsics.h                3 lines
utils/TableGen/CodeGenTarget.cpp                  3 lines
utils/TableGen/IntrinsicEmitter.cpp               11 lines

Diff 97117

docs/LangRef.rst


Show First 20 Lines • Show All 1,414 Lines • ▼ Show 20 Lines	``noreturn``
normally. This produces undefined behavior at runtime if the		normally. This produces undefined behavior at runtime if the
function ever does dynamically return.		function ever does dynamically return.
``norecurse``		``norecurse``
This function attribute indicates that the function does not call itself		This function attribute indicates that the function does not call itself
either directly or indirectly down any possible call path. This produces		either directly or indirectly down any possible call path. This produces
undefined behavior at runtime if the function ever does recurse.		undefined behavior at runtime if the function ever does recurse.
``nounwind``		``nounwind``
This function attribute indicates that the function never raises an		This function attribute indicates that the function never raises an
exception. If the function does raise an exception, its runtime		exception. If the function does raise an exception, its runtime
		mehdi_amini: IIRC, annoyingly, the backend considers that an instruction can access memory and still not have side effects. It'd be nice to align (but I think the backend is "wrong" on this one).
		eli.friedman: This description needs to be more thorough... "can be safely speculated" is an extremely fuzzy description. Your commit message says that "divide-by-zero" counts as a side-effect, but that isn't listed here. Does an infinite loop count as a side-effect? Can a read from or write to a global? An argument? A volatile load?
		tstellarAMD: With my original definition, I was trying to match what we already have in the .td files for intrinsics. I've updated the definition in this patch to be: nosideeffects tells the optimizer that the function does not modify any state that isn't accessible from the IR (e.g. floating-point exception registers). I'm not sure if this is what you were thinking, but hopefully this gives us a better starting point for discussion.
		mehdi_amini: Typo: `accessbile`. Also, it isn't clear how it interacts with memory. Are we only considering "non-memory" effects with this attribute? What are the side effects we want to track and what will they be used for? Is it just about "this won't trap or exit"?
		arsenm (Author): I kind of think it should be renamed speculatable, since the intention is to rule out any kind of operation that would prevent speculating.
		hfinkel: I agree. This is really the "safe to speculatively execute" attribute.
behavior is undefined. However, functions marked nounwind may still		behavior is undefined. However, functions marked nounwind may still
		eli.friedman: The floating point status register is a weird example. The floating point status register is basically memory; it isn't actually addressable on most processors, but it behaves like a hidden global in every other way.
trap or generate asynchronous exceptions. Exception handling schemes		trap or generate asynchronous exceptions. Exception handling schemes
that are recognized by LLVM to handle asynchronous exceptions, such		that are recognized by LLVM to handle asynchronous exceptions, such
as SEH, will still provide their implementation defined semantics.		as SEH, will still provide their implementation defined semantics.
``optnone``		``optnone``
This function attribute indicates that most optimization passes will skip		This function attribute indicates that most optimization passes will skip
this function, with the exception of interprocedural optimization passes.		this function, with the exception of interprocedural optimization passes.
Code generation defaults to the "fast" instruction selector.		Code generation defaults to the "fast" instruction selector.
This attribute cannot be used together with the ``alwaysinline``		This attribute cannot be used together with the ``alwaysinline``
▲ Show 20 Lines • Show All 97 Lines • ▼ Show 20 Lines	``sanitize_address``
This attribute indicates that AddressSanitizer checks		This attribute indicates that AddressSanitizer checks
(dynamic address safety analysis) are enabled for this function.		(dynamic address safety analysis) are enabled for this function.
``sanitize_memory``		``sanitize_memory``
This attribute indicates that MemorySanitizer checks (dynamic detection		This attribute indicates that MemorySanitizer checks (dynamic detection
of accesses to uninitialized memory) are enabled for this function.		of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``		``sanitize_thread``
This attribute indicates that ThreadSanitizer checks		This attribute indicates that ThreadSanitizer checks
(dynamic thread safety analysis) are enabled for this function.		(dynamic thread safety analysis) are enabled for this function.
		``speculatable``
		This function attribute indicates that the function does not have any
		effects besides calculating its result and does not have undefined behavior.
		hfinkel: Saying "its result", instead of "the result", reads better to me.
		majnemer: We should say something to indicate that speculatable does not imply CSE-able. Unless I am mistaken, it is possible for a function to be speculatable but return different results given the same parameters.
		hfinkel: > We should say something to indicate that speculatable does not imply CSE-able. Unless I am mistaken, it is possible for a function to be speculatable but return different results given the same parameters.
		hfinkel: True; only a readnone speculatable function can be CSE'd. It might be readonly, but then you can't CSE it unless you know more about the memory it might access.
		Note that ``speculatable`` is not enough to conclude that along any
		hfinkel: I don't think that we should use "CSE'd" here, as a term. We should say something about this attribute not being enough to conclude that the number of calls executed along any particular execution path is not externally observable, or something along those lines. Akin, perhaps, to what we say for volatile.
		particular exection path the number of calls to this function will not be
		hfinkel: Do you mean "will not be"?
		tstellarAMD: Yes, I did. This is fixed now.
		externally observable. This attribute is only valid on intrinsic declarations
		mehdi_amini: "does not have any effects besides calculating its result" and "speculatable is not enough to conclude that [...] the number of calls to this function will not be externally observable." seem contradictory to me. (Also, you have a typo: `exection` instead of `execution`.)
		tstellarAMD: I would like to try to revive this discussion. We've gone back and forth a lot on the attribute description. The intention is that speculatable allows something to be speculatively executed, but is not enough by itself to determine whether or not the function can be CSE'd. Would it make sense to replace 'does not have any effects besides calculating its result' with 'does not have any effects other than possibly reading/writing memory and calculating its result'?
		hfinkel: It also can't have any undefined behavior.
		mehdi_amini: Isn't it enough to say: `This function attribute indicates that the function does not have undefined behavior, for any possible combination of arguments or global memory state.`?
		hfinkel: Yes, I think that sounds right (the comma is not necessary).
		declarations, not on general functions or individual call
		sites. If a function is incorrectly marked as speculatable and
		really does exhibit undefined behavior, the undefined behavior may
		be observed even if the call site is dead code.
``ssp``		``ssp``
This attribute indicates that the function should emit a stack		This attribute indicates that the function should emit a stack
smashing protector. It is in the form of a "canary" --- a random value		smashing protector. It is in the form of a "canary" --- a random value
placed on the stack before the local variables that's checked upon		placed on the stack before the local variables that's checked upon
return from the function to see if it has been overwritten. A		return from the function to see if it has been overwritten. A
heuristic is used to determine if a function needs stack protectors		heuristic is used to determine if a function needs stack protectors
or not. The heuristic used will enable protectors for functions with:		or not. The heuristic used will enable protectors for functions with:

▲ Show 20 Lines • Show All 11,724 Lines • Show Last 20 Lines
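
The distinction drawn in the inline comments above (safe to speculate, but not necessarily CSE-able) can be illustrated with a small stand-alone C++ sketch. This is not LLVM code: `read_state_plus`, `guarded`, `hoisted`, and `two_calls` are hypothetical names, and the function merely stands in for something that would carry `speculatable` together with `readonly`.

```cpp
#include <cassert>

// Hypothetical stand-in for a speculatable, readonly function: it has no
// undefined behavior for any argument or memory state, but it reads a global.
static int g_state = 0;
int read_state_plus(int x) { return g_state + x; }

// Original program: the call only executes when cond is true.
int guarded(bool cond, int x) {
  int r = 0;
  if (cond)
    r = read_state_plus(x);
  return r;
}

// After speculation: the call is hoisted above the branch and its result is
// ignored on the path that did not need it. This is the kind of
// transformation the attribute is meant to permit.
int hoisted(bool cond, int x) {
  int tmp = read_state_plus(x);
  return cond ? tmp : 0;
}

// Two calls with the same argument, separated by a store to the global.
// Merging them into a single call would change the result, so being
// speculatable is not, by itself, a license to CSE.
int two_calls(int x) {
  int a = read_state_plus(x);
  g_state += 1;
  int b = read_state_plus(x);
  return b - a;
}
```

In this sketch `guarded` and `hoisted` agree for both values of `cond`, while `two_calls` returns a nonzero difference, which is the case hfinkel describes for a function that is `readonly` rather than `readnone`.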

include/llvm/Bitcode/LLVMBitCodes.h

Show First 20 Lines • Show All 539 Lines • ▼ Show 20 Lines	enum AttributeKindCodes {
ATTR_KIND_SAFESTACK = 44,		ATTR_KIND_SAFESTACK = 44,
ATTR_KIND_ARGMEMONLY = 45,		ATTR_KIND_ARGMEMONLY = 45,
ATTR_KIND_SWIFT_SELF = 46,		ATTR_KIND_SWIFT_SELF = 46,
ATTR_KIND_SWIFT_ERROR = 47,		ATTR_KIND_SWIFT_ERROR = 47,
ATTR_KIND_NO_RECURSE = 48,		ATTR_KIND_NO_RECURSE = 48,
ATTR_KIND_INACCESSIBLEMEM_ONLY = 49,		ATTR_KIND_INACCESSIBLEMEM_ONLY = 49,
ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEMONLY = 50,		ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEMONLY = 50,
ATTR_KIND_ALLOC_SIZE = 51,		ATTR_KIND_ALLOC_SIZE = 51,
ATTR_KIND_WRITEONLY = 52		ATTR_KIND_WRITEONLY = 52,
		ATTR_KIND_SPECULATABLE = 53
};		};

enum ComdatSelectionKindCodes {		enum ComdatSelectionKindCodes {
COMDAT_SELECTION_KIND_ANY = 1,		COMDAT_SELECTION_KIND_ANY = 1,
COMDAT_SELECTION_KIND_EXACT_MATCH = 2,		COMDAT_SELECTION_KIND_EXACT_MATCH = 2,
COMDAT_SELECTION_KIND_LARGEST = 3,		COMDAT_SELECTION_KIND_LARGEST = 3,
COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,		COMDAT_SELECTION_KIND_NO_DUPLICATES = 4,
COMDAT_SELECTION_KIND_SAME_SIZE = 5,		COMDAT_SELECTION_KIND_SAME_SIZE = 5,
Show All 10 Lines

include/llvm/IR/Attributes.td

	Show First 20 Lines • Show All 131 Lines • ▼ Show 20 Lines

	/// Sign extended before/after call.			/// Sign extended before/after call.
	def SExt : EnumAttr<"signext">;			def SExt : EnumAttr<"signext">;

	/// Alignment of stack for function (3 bits) stored as log2 of alignment with			/// Alignment of stack for function (3 bits) stored as log2 of alignment with
	/// +1 bias 0 means unaligned (different from alignstack=(1)).			/// +1 bias 0 means unaligned (different from alignstack=(1)).
	def StackAlignment : EnumAttr<"alignstack">;			def StackAlignment : EnumAttr<"alignstack">;

				/// Function can be speculated.
				def Speculatable : EnumAttr<"speculatable">;

	/// Stack protection.			/// Stack protection.
	def StackProtect : EnumAttr<"ssp">;			def StackProtect : EnumAttr<"ssp">;

	/// Stack protection required.			/// Stack protection required.
	def StackProtectReq : EnumAttr<"sspreq">;			def StackProtectReq : EnumAttr<"sspreq">;

	/// Strong Stack protection.			/// Strong Stack protection.
	def StackProtectStrong : EnumAttr<"sspstrong">;			def StackProtectStrong : EnumAttr<"sspstrong">;
	▲ Show 20 Lines • Show All 66 Lines • Show Last 20 Lines

include/llvm/IR/Function.h

Show First 20 Lines • Show All 410 Lines • ▼ Show 20 Lines	public:
}		}
void setConvergent() {		void setConvergent() {
addFnAttr(Attribute::Convergent);		addFnAttr(Attribute::Convergent);
}		}
void setNotConvergent() {		void setNotConvergent() {
removeFnAttr(Attribute::Convergent);		removeFnAttr(Attribute::Convergent);
}		}

		/// @brief Determine if the call has sideeffects.
		bool isSpeculatable() const {
		return hasFnAttribute(Attribute::Speculatable);
		}
		void setSpeculatable() {
		addFnAttr(Attribute::Speculatable);
		}

/// Determine if the function is known not to recurse, directly or		/// Determine if the function is known not to recurse, directly or
/// indirectly.		/// indirectly.
bool doesNotRecurse() const {		bool doesNotRecurse() const {
return hasFnAttribute(Attribute::NoRecurse);		return hasFnAttribute(Attribute::NoRecurse);
}		}
void setDoesNotRecurse() {		void setDoesNotRecurse() {
addFnAttr(Attribute::NoRecurse);		addFnAttr(Attribute::NoRecurse);
}		}
▲ Show 20 Lines • Show All 280 Lines • Show Last 20 Lines
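
As a rough illustration of how a client might use the two accessors added above, here is a self-contained mock. `MockFunction` is not `llvm::Function`, attributes are modeled as plain strings, and `mayHoistCall` is a hypothetical helper; only the names `isSpeculatable` and `setSpeculatable` come from the diff.

```cpp
#include <cassert>
#include <set>
#include <string>

// Mock, NOT llvm::Function: just enough state to mirror the two accessors.
class MockFunction {
  std::set<std::string> Attrs;

public:
  bool hasFnAttribute(const std::string &A) const {
    return Attrs.count(A) != 0;
  }
  void addFnAttr(const std::string &A) { Attrs.insert(A); }

  // Same shape as the accessors in the diff, over the mock attribute set.
  bool isSpeculatable() const { return hasFnAttribute("speculatable"); }
  void setSpeculatable() { addFnAttr("speculatable"); }
};

// Hypothetical client-side check: a transformation would consult the callee
// before moving a call to a point where it executes unconditionally.
bool mayHoistCall(const MockFunction &Callee) {
  return Callee.isSpeculatable();
}
```

A real pass would need further checks (for example on the call's operands); the mock only shows the query and the setter working together.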

include/llvm/IR/Intrinsics.td

	Show First 20 Lines • Show All 92 Lines • ▼ Show 20 Lines
	// Parallels the noduplicate attribute on LLVM IR functions.			// Parallels the noduplicate attribute on LLVM IR functions.
	def IntrNoDuplicate : IntrinsicProperty;			def IntrNoDuplicate : IntrinsicProperty;

	// IntrConvergent - Calls to this intrinsic are convergent and may not be made			// IntrConvergent - Calls to this intrinsic are convergent and may not be made
	// control-dependent on any additional values.			// control-dependent on any additional values.
	// Parallels the convergent attribute on LLVM IR functions.			// Parallels the convergent attribute on LLVM IR functions.
	def IntrConvergent : IntrinsicProperty;			def IntrConvergent : IntrinsicProperty;

				// This property indicates that the intrinsic is safe to speculate.
				def IntrSpeculatable : IntrinsicProperty;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Types used by intrinsics.			// Types used by intrinsics.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	class LLVMType<ValueType vt> {			class LLVMType<ValueType vt> {
	ValueType VT = vt;			ValueType VT = vt;
	}			}

	▲ Show 20 Lines • Show All 703 Lines • Show Last 20 Lines

lib/AsmParser/LLLexer.cpp

Show First 20 Lines • Show All 642 Lines • ▼ Show 20 Lines	#define KEYWORD(STR) \
KEYWORD(nounwind);		KEYWORD(nounwind);
KEYWORD(optnone);		KEYWORD(optnone);
KEYWORD(optsize);		KEYWORD(optsize);
KEYWORD(readnone);		KEYWORD(readnone);
KEYWORD(readonly);		KEYWORD(readonly);
KEYWORD(returned);		KEYWORD(returned);
KEYWORD(returns_twice);		KEYWORD(returns_twice);
KEYWORD(signext);		KEYWORD(signext);
		KEYWORD(speculatable);
KEYWORD(sret);		KEYWORD(sret);
KEYWORD(ssp);		KEYWORD(ssp);
KEYWORD(sspreq);		KEYWORD(sspreq);
KEYWORD(sspstrong);		KEYWORD(sspstrong);
KEYWORD(safestack);		KEYWORD(safestack);
KEYWORD(sanitize_address);		KEYWORD(sanitize_address);
KEYWORD(sanitize_thread);		KEYWORD(sanitize_thread);
KEYWORD(sanitize_memory);		KEYWORD(sanitize_memory);
▲ Show 20 Lines • Show All 359 Lines • Show Last 20 Lines

lib/AsmParser/LLParser.cpp

Show First 20 Lines • Show All 1,089 Lines • ▼ Show 20 Lines	while (true) {
case lltok::kw_norecurse: B.addAttribute(Attribute::NoRecurse); break;		case lltok::kw_norecurse: B.addAttribute(Attribute::NoRecurse); break;
case lltok::kw_nounwind: B.addAttribute(Attribute::NoUnwind); break;		case lltok::kw_nounwind: B.addAttribute(Attribute::NoUnwind); break;
case lltok::kw_optnone: B.addAttribute(Attribute::OptimizeNone); break;		case lltok::kw_optnone: B.addAttribute(Attribute::OptimizeNone); break;
case lltok::kw_optsize: B.addAttribute(Attribute::OptimizeForSize); break;		case lltok::kw_optsize: B.addAttribute(Attribute::OptimizeForSize); break;
case lltok::kw_readnone: B.addAttribute(Attribute::ReadNone); break;		case lltok::kw_readnone: B.addAttribute(Attribute::ReadNone); break;
case lltok::kw_readonly: B.addAttribute(Attribute::ReadOnly); break;		case lltok::kw_readonly: B.addAttribute(Attribute::ReadOnly); break;
case lltok::kw_returns_twice:		case lltok::kw_returns_twice:
B.addAttribute(Attribute::ReturnsTwice); break;		B.addAttribute(Attribute::ReturnsTwice); break;
		case lltok::kw_speculatable: B.addAttribute(Attribute::Speculatable); break;
case lltok::kw_ssp: B.addAttribute(Attribute::StackProtect); break;		case lltok::kw_ssp: B.addAttribute(Attribute::StackProtect); break;
case lltok::kw_sspreq: B.addAttribute(Attribute::StackProtectReq); break;		case lltok::kw_sspreq: B.addAttribute(Attribute::StackProtectReq); break;
case lltok::kw_sspstrong:		case lltok::kw_sspstrong:
B.addAttribute(Attribute::StackProtectStrong); break;		B.addAttribute(Attribute::StackProtectStrong); break;
case lltok::kw_safestack: B.addAttribute(Attribute::SafeStack); break;		case lltok::kw_safestack: B.addAttribute(Attribute::SafeStack); break;
case lltok::kw_sanitize_address:		case lltok::kw_sanitize_address:
B.addAttribute(Attribute::SanitizeAddress); break;		B.addAttribute(Attribute::SanitizeAddress); break;
case lltok::kw_sanitize_thread:		case lltok::kw_sanitize_thread:
▲ Show 20 Lines • Show All 5,466 Lines • Show Last 20 Lines

lib/AsmParser/LLToken.h

Show First 20 Lines • Show All 192 Lines • ▼ Show 20 Lines	enum Kind {
kw_nounwind,		kw_nounwind,
kw_optnone,		kw_optnone,
kw_optsize,		kw_optsize,
kw_readnone,		kw_readnone,
kw_readonly,		kw_readonly,
kw_returned,		kw_returned,
kw_returns_twice,		kw_returns_twice,
kw_signext,		kw_signext,
		kw_speculatable,
kw_ssp,		kw_ssp,
kw_sspreq,		kw_sspreq,
kw_sspstrong,		kw_sspstrong,
kw_safestack,		kw_safestack,
kw_sret,		kw_sret,
kw_sanitize_thread,		kw_sanitize_thread,
kw_sanitize_memory,		kw_sanitize_memory,
kw_swifterror,		kw_swifterror,
▲ Show 20 Lines • Show All 160 Lines • Show Last 20 Lines

lib/Bitcode/Reader/BitcodeReader.cpp

Show First 20 Lines • Show All 1,113 Lines • ▼ Show 20 Lines	static uint64_t getRawAttributeMask(Attribute::AttrKind Val) {
case Attribute::Convergent: return 1ULL << 46;		case Attribute::Convergent: return 1ULL << 46;
case Attribute::SafeStack: return 1ULL << 47;		case Attribute::SafeStack: return 1ULL << 47;
case Attribute::NoRecurse: return 1ULL << 48;		case Attribute::NoRecurse: return 1ULL << 48;
case Attribute::InaccessibleMemOnly: return 1ULL << 49;		case Attribute::InaccessibleMemOnly: return 1ULL << 49;
case Attribute::InaccessibleMemOrArgMemOnly: return 1ULL << 50;		case Attribute::InaccessibleMemOrArgMemOnly: return 1ULL << 50;
case Attribute::SwiftSelf: return 1ULL << 51;		case Attribute::SwiftSelf: return 1ULL << 51;
case Attribute::SwiftError: return 1ULL << 52;		case Attribute::SwiftError: return 1ULL << 52;
case Attribute::WriteOnly: return 1ULL << 53;		case Attribute::WriteOnly: return 1ULL << 53;
		case Attribute::Speculatable: return 1ULL << 54;
case Attribute::Dereferenceable:		case Attribute::Dereferenceable:
llvm_unreachable("dereferenceable attribute not supported in raw format");		llvm_unreachable("dereferenceable attribute not supported in raw format");
break;		break;
case Attribute::DereferenceableOrNull:		case Attribute::DereferenceableOrNull:
llvm_unreachable("dereferenceable_or_null attribute not supported in raw "		llvm_unreachable("dereferenceable_or_null attribute not supported in raw "
"format");		"format");
break;		break;
case Attribute::ArgMemOnly:		case Attribute::ArgMemOnly:
▲ Show 20 Lines • Show All 180 Lines • ▼ Show 20 Lines	static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
case bitc::ATTR_KIND_READ_ONLY:		case bitc::ATTR_KIND_READ_ONLY:
return Attribute::ReadOnly;		return Attribute::ReadOnly;
case bitc::ATTR_KIND_RETURNED:		case bitc::ATTR_KIND_RETURNED:
return Attribute::Returned;		return Attribute::Returned;
case bitc::ATTR_KIND_RETURNS_TWICE:		case bitc::ATTR_KIND_RETURNS_TWICE:
return Attribute::ReturnsTwice;		return Attribute::ReturnsTwice;
case bitc::ATTR_KIND_S_EXT:		case bitc::ATTR_KIND_S_EXT:
return Attribute::SExt;		return Attribute::SExt;
		case bitc::ATTR_KIND_SPECULATABLE:
		return Attribute::Speculatable;
case bitc::ATTR_KIND_STACK_ALIGNMENT:		case bitc::ATTR_KIND_STACK_ALIGNMENT:
return Attribute::StackAlignment;		return Attribute::StackAlignment;
case bitc::ATTR_KIND_STACK_PROTECT:		case bitc::ATTR_KIND_STACK_PROTECT:
return Attribute::StackProtect;		return Attribute::StackProtect;
case bitc::ATTR_KIND_STACK_PROTECT_REQ:		case bitc::ATTR_KIND_STACK_PROTECT_REQ:
return Attribute::StackProtectReq;		return Attribute::StackProtectReq;
case bitc::ATTR_KIND_STACK_PROTECT_STRONG:		case bitc::ATTR_KIND_STACK_PROTECT_STRONG:
return Attribute::StackProtectStrong;		return Attribute::StackProtectStrong;
▲ Show 20 Lines • Show All 4,297 Lines • Show Last 20 Lines
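
The first hunk above assigns `Speculatable` bit 54 of a 64-bit raw attribute mask, right after `WriteOnly` at bit 53. A minimal sketch of that one-bit-per-attribute scheme follows; the `AttrKind` enum here is a hypothetical two-entry subset, and only the two bit positions are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-entry subset of the attribute kinds, for illustration.
enum class AttrKind { WriteOnly, Speculatable };

// One bit per attribute in a 64-bit word; the positions match the hunk above.
constexpr std::uint64_t rawAttributeMask(AttrKind Kind) {
  return Kind == AttrKind::WriteOnly ? (1ULL << 53) : (1ULL << 54);
}

// Test whether a raw attribute word has the bit for Kind set.
inline bool hasRawAttr(std::uint64_t Bits, AttrKind Kind) {
  return (Bits & rawAttributeMask(Kind)) != 0;
}
```

Because the word is 64 bits wide, bit positions above 63 cannot be represented in this encoding.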

lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 682 Lines • ▼ Show 20 Lines	static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
case Attribute::ReadOnly:		case Attribute::ReadOnly:
return bitc::ATTR_KIND_READ_ONLY;		return bitc::ATTR_KIND_READ_ONLY;
case Attribute::Returned:		case Attribute::Returned:
return bitc::ATTR_KIND_RETURNED;		return bitc::ATTR_KIND_RETURNED;
case Attribute::ReturnsTwice:		case Attribute::ReturnsTwice:
return bitc::ATTR_KIND_RETURNS_TWICE;		return bitc::ATTR_KIND_RETURNS_TWICE;
case Attribute::SExt:		case Attribute::SExt:
return bitc::ATTR_KIND_S_EXT;		return bitc::ATTR_KIND_S_EXT;
		case Attribute::Speculatable:
		return bitc::ATTR_KIND_SPECULATABLE;
case Attribute::StackAlignment:		case Attribute::StackAlignment:
return bitc::ATTR_KIND_STACK_ALIGNMENT;		return bitc::ATTR_KIND_STACK_ALIGNMENT;
case Attribute::StackProtect:		case Attribute::StackProtect:
return bitc::ATTR_KIND_STACK_PROTECT;		return bitc::ATTR_KIND_STACK_PROTECT;
case Attribute::StackProtectReq:		case Attribute::StackProtectReq:
return bitc::ATTR_KIND_STACK_PROTECT_REQ;		return bitc::ATTR_KIND_STACK_PROTECT_REQ;
case Attribute::StackProtectStrong:		case Attribute::StackProtectStrong:
return bitc::ATTR_KIND_STACK_PROTECT_STRONG;		return bitc::ATTR_KIND_STACK_PROTECT_STRONG;
▲ Show 20 Lines • Show All 3,315 Lines • Show Last 20 Lines
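
The writer hunk above and the reader's `getAttrFromCode` hunk are meant to be inverses of each other. The sketch below shows that round trip for two attributes. The numeric codes 52 and 53 are copied from the `AttributeKindCodes` hunk; the three-entry `AttrKind` enum, its `None` member, and the use of 0 for "no encoding" are assumptions made only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of attribute kinds; None is an assumption of the sketch.
enum class AttrKind { None, WriteOnly, Speculatable };

// Values copied from the AttributeKindCodes hunk in this diff.
enum AttributeKindCodes : std::uint64_t {
  ATTR_KIND_WRITEONLY = 52,
  ATTR_KIND_SPECULATABLE = 53
};

// Writer direction: in-memory kind to on-disk code.
std::uint64_t getAttrKindEncoding(AttrKind Kind) {
  switch (Kind) {
  case AttrKind::WriteOnly:
    return ATTR_KIND_WRITEONLY;
  case AttrKind::Speculatable:
    return ATTR_KIND_SPECULATABLE;
  case AttrKind::None:
    break;
  }
  return 0; // sketch only: 0 stands for "no encoding"
}

// Reader direction: on-disk code back to in-memory kind.
AttrKind getAttrFromCode(std::uint64_t Code) {
  switch (Code) {
  case ATTR_KIND_WRITEONLY:
    return AttrKind::WriteOnly;
  case ATTR_KIND_SPECULATABLE:
    return AttrKind::Speculatable;
  default:
    return AttrKind::None; // unknown codes map to None in this sketch
  }
}
```

Keeping the two switch statements in sync is what lets a module written with the new attribute be read back unchanged.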

lib/IR/Attributes.cpp

Show First 20 Lines • Show All 309 Lines • ▼ Show 20 Lines	std::string Attribute::getAsString(bool InAttrGrp) const {
if (hasAttribute(Attribute::WriteOnly))		if (hasAttribute(Attribute::WriteOnly))
return "writeonly";		return "writeonly";
if (hasAttribute(Attribute::Returned))		if (hasAttribute(Attribute::Returned))
return "returned";		return "returned";
if (hasAttribute(Attribute::ReturnsTwice))		if (hasAttribute(Attribute::ReturnsTwice))
return "returns_twice";		return "returns_twice";
if (hasAttribute(Attribute::SExt))		if (hasAttribute(Attribute::SExt))
return "signext";		return "signext";
		if (hasAttribute(Attribute::Speculatable))
		return "speculatable";
if (hasAttribute(Attribute::StackProtect))		if (hasAttribute(Attribute::StackProtect))
return "ssp";		return "ssp";
if (hasAttribute(Attribute::StackProtectReq))		if (hasAttribute(Attribute::StackProtectReq))
return "sspreq";		return "sspreq";
if (hasAttribute(Attribute::StackProtectStrong))		if (hasAttribute(Attribute::StackProtectStrong))
return "sspstrong";		return "sspstrong";
if (hasAttribute(Attribute::SafeStack))		if (hasAttribute(Attribute::SafeStack))
return "safestack";		return "safestack";
▲ Show 20 Lines • Show All 1,330 Lines • Show Last 20 Lines

lib/IR/Verifier.cpp

Show First 20 Lines • Show All 1,197 Lines • ▼ Show 20 Lines	void Verifier::visitComdat(const Comdat &C) {
// with private linkage don't have entries in the symbol table.		// with private linkage don't have entries in the symbol table.
if (const GlobalValue *GV = M.getNamedValue(C.getName()))		if (const GlobalValue *GV = M.getNamedValue(C.getName()))
Assert(!GV->hasPrivateLinkage(), "comdat global value has private linkage",		Assert(!GV->hasPrivateLinkage(), "comdat global value has private linkage",
GV);		GV);
}		}

void Verifier::visitModuleIdents(const Module &M) {		void Verifier::visitModuleIdents(const Module &M) {
const NamedMDNode *Idents = M.getNamedMetadata("llvm.ident");		const NamedMDNode *Idents = M.getNamedMetadata("llvm.ident");
if (!Idents)		if (!Idents)
return;		return;

// llvm.ident takes a list of metadata entry. Each entry has only one string.		// llvm.ident takes a list of metadata entry. Each entry has only one string.
// Scan each llvm.ident entry and make sure that this requirement is met.		// Scan each llvm.ident entry and make sure that this requirement is met.
for (const MDNode *N : Idents->operands()) {		for (const MDNode *N : Idents->operands()) {
Assert(N->getNumOperands() == 1,		Assert(N->getNumOperands() == 1,
"incorrect number of operands in llvm.ident metadata", N);		"incorrect number of operands in llvm.ident metadata", N);
Assert(dyn_cast_or_null<MDString>(N->getOperand(0)),		Assert(dyn_cast_or_null<MDString>(N->getOperand(0)),
("invalid value for llvm.ident metadata entry operand"		("invalid value for llvm.ident metadata entry operand"
"(the operand should be a string)"),		"(the operand should be a string)"),
N->getOperand(0));		N->getOperand(0));
}		}
}		}

void Verifier::visitModuleFlags(const Module &M) {		void Verifier::visitModuleFlags(const Module &M) {
const NamedMDNode *Flags = M.getModuleFlagsMetadata();		const NamedMDNode *Flags = M.getModuleFlagsMetadata();
if (!Flags) return;		if (!Flags) return;

// Scan each flag, and track the flags and requirements.		// Scan each flag, and track the flags and requirements.
DenseMap<const MDString, const MDNode> SeenIDs;		DenseMap<const MDString, const MDNode> SeenIDs;
▲ Show 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	static bool isFuncOnlyAttr(Attribute::AttrKind Kind) {
case Attribute::OptimizeNone:		case Attribute::OptimizeNone:
case Attribute::JumpTable:		case Attribute::JumpTable:
case Attribute::Convergent:		case Attribute::Convergent:
case Attribute::ArgMemOnly:		case Attribute::ArgMemOnly:
case Attribute::NoRecurse:		case Attribute::NoRecurse:
case Attribute::InaccessibleMemOnly:		case Attribute::InaccessibleMemOnly:
case Attribute::InaccessibleMemOrArgMemOnly:		case Attribute::InaccessibleMemOrArgMemOnly:
case Attribute::AllocSize:		case Attribute::AllocSize:
		case Attribute::Speculatable:
return true;		return true;
default:		default:
break;		break;
}		}
return false;		return false;
}		}

/// Return true if this is a function attribute that can also appear on		/// Return true if this is a function attribute that can also appear on
[469 lines hidden]	Assert(NumDeoptArgs >= 0, "gc.statepoint number of deoptimization arguments "
"must be positive",		"must be positive",
&CI);		&CI);

const int ExpectedNumArgs =		const int ExpectedNumArgs =
7 + NumCallArgs + NumTransitionArgs + NumDeoptArgs;		7 + NumCallArgs + NumTransitionArgs + NumDeoptArgs;
Assert(ExpectedNumArgs <= (int)CS.arg_size(),		Assert(ExpectedNumArgs <= (int)CS.arg_size(),
"gc.statepoint too few arguments according to length fields", &CI);		"gc.statepoint too few arguments according to length fields", &CI);

// Check that the only uses of this gc.statepoint are gc.result or		// Check that the only uses of this gc.statepoint are gc.result or
// gc.relocate calls which are tied to this statepoint and thus part		// gc.relocate calls which are tied to this statepoint and thus part
// of the same statepoint sequence		// of the same statepoint sequence
for (const User *U : CI.users()) {		for (const User *U : CI.users()) {
const CallInst *Call = dyn_cast<const CallInst>(U);		const CallInst *Call = dyn_cast<const CallInst>(U);
Assert(Call, "illegal use of statepoint token", &CI, U);		Assert(Call, "illegal use of statepoint token", &CI, U);
if (!Call) continue;		if (!Call) continue;
Assert(isa<GCRelocateInst>(Call) || isa<GCResultInst>(Call),		Assert(isa<GCRelocateInst>(Call) || isa<GCResultInst>(Call),
"gc.result or gc.relocate are the only value uses "		"gc.result or gc.relocate are the only value uses "
[171 lines hidden]	for (const Argument &Arg : F.args()) {

// Check that swifterror argument is only used by loads and stores.		// Check that swifterror argument is only used by loads and stores.
if (Attrs.hasParamAttribute(i, Attribute::SwiftError)) {		if (Attrs.hasParamAttribute(i, Attribute::SwiftError)) {
verifySwiftErrorValue(&Arg);		verifySwiftErrorValue(&Arg);
}		}
++i;		++i;
}		}

if (!isLLVMdotName)		if (!isLLVMdotName) {
Assert(!F.getReturnType()->isTokenTy(),		Assert(!F.getReturnType()->isTokenTy(),
"Functions returns a token but isn't an intrinsic", &F);		"Functions returns a token but isn't an intrinsic", &F);

		Assert(!Attrs.hasFnAttribute(Attribute::Speculatable),
		"Attribute 'speculatable' only applies to intrinsics");
		}

// Get the function metadata attachments.		// Get the function metadata attachments.
SmallVector<std::pair<unsigned, MDNode *>, 4> MDs;		SmallVector<std::pair<unsigned, MDNode *>, 4> MDs;
F.getAllMetadata(MDs);		F.getAllMetadata(MDs);
assert(F.hasMetadata() != MDs.empty() && "Bit out-of-sync");		assert(F.hasMetadata() != MDs.empty() && "Bit out-of-sync");
verifyFunctionMetadata(MDs);		verifyFunctionMetadata(MDs);

// Check validity of the personality function		// Check validity of the personality function
if (F.hasPersonalityFn()) {		if (F.hasPersonalityFn()) {
[565 lines hidden]	Assert(CS.getArgument(i)->getType() == FTy->getParamType(i),
"Call parameter type does not match function signature!",		"Call parameter type does not match function signature!",
CS.getArgument(i), FTy->getParamType(i), I);		CS.getArgument(i), FTy->getParamType(i), I);

AttributeList Attrs = CS.getAttributes();		AttributeList Attrs = CS.getAttributes();

Assert(verifyAttributeCount(Attrs, CS.arg_size()),		Assert(verifyAttributeCount(Attrs, CS.arg_size()),
"Attribute after last parameter!", I);		"Attribute after last parameter!", I);

		if (Attrs.hasAttribute(AttributeList::FunctionIndex, Attribute::Speculatable)) {
		// Don't allow speculatable on call sites, unless the underlying function
		// declaration is also speculatable.
		Function *Callee
		= dyn_cast<Function>(CS.getCalledValue()->stripPointerCasts());
		Assert(Callee && Callee->isSpeculatable(),
		"speculatable attribute may not apply to call sites", I);
		}

// Verify call attributes.		// Verify call attributes.
verifyFunctionAttrs(FTy, Attrs, I);		verifyFunctionAttrs(FTy, Attrs, I);

// Conservatively check the inalloca argument.		// Conservatively check the inalloca argument.
// We have a bug if we can find that there is an underlying alloca without		// We have a bug if we can find that there is an underlying alloca without
// inalloca.		// inalloca.
if (CS.hasInAllocaArgument()) {		if (CS.hasInAllocaArgument()) {
Value *InAllocaArg = CS.getArgument(FTy->getNumParams() - 1);		Value *InAllocaArg = CS.getArgument(FTy->getNumParams() - 1);
[1,282 lines hidden]	void Verifier::visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS) {
Assert(ExpectedName == IF->getName(),		Assert(ExpectedName == IF->getName(),
"Intrinsic name not mangled correctly for type arguments! "		"Intrinsic name not mangled correctly for type arguments! "
"Should be: " +		"Should be: " +
ExpectedName,		ExpectedName,
IF);		IF);

// If the intrinsic takes MDNode arguments, verify that they are either global		// If the intrinsic takes MDNode arguments, verify that they are either global
// or are local to this function.		// or are local to this function.
for (Value *V : CS.args())		for (Value *V : CS.args())
if (auto *MD = dyn_cast<MetadataAsValue>(V))		if (auto *MD = dyn_cast<MetadataAsValue>(V))
visitMetadataAsValue(*MD, CS.getCaller());		visitMetadataAsValue(*MD, CS.getCaller());

switch (ID) {		switch (ID) {
default:		default:
break;		break;
case Intrinsic::coro_id: {		case Intrinsic::coro_id: {
auto *InfoArg = CS.getArgOperand(3)->stripPointerCasts();		auto *InfoArg = CS.getArgOperand(3)->stripPointerCasts();
[56 lines hidden]	case Intrinsic::memcpy_element_atomic: {
Assert(ElementSizeVal.isPowerOf2(),		Assert(ElementSizeVal.isPowerOf2(),
"element size of the element-wise atomic memory intrinsic "		"element size of the element-wise atomic memory intrinsic "
"must be a power of 2",		"must be a power of 2",
CS);		CS);

auto IsValidAlignment = [&](uint64_t Alignment) {		auto IsValidAlignment = [&](uint64_t Alignment) {
return isPowerOf2_64(Alignment) && ElementSizeVal.ule(Alignment);		return isPowerOf2_64(Alignment) && ElementSizeVal.ule(Alignment);
};		};

uint64_t DstAlignment = CS.getParamAlignment(1),		uint64_t DstAlignment = CS.getParamAlignment(1),
SrcAlignment = CS.getParamAlignment(2);		SrcAlignment = CS.getParamAlignment(2);

Assert(IsValidAlignment(DstAlignment),		Assert(IsValidAlignment(DstAlignment),
"incorrect alignment of the destination argument",		"incorrect alignment of the destination argument",
CS);		CS);
Assert(IsValidAlignment(SrcAlignment),		Assert(IsValidAlignment(SrcAlignment),
"incorrect alignment of the source argument",		"incorrect alignment of the source argument",
[222 lines hidden]	void Verifier::visitIntrinsicCallSite(Intrinsic::ID ID, CallSite CS) {
case Intrinsic::eh_exceptioncode:		case Intrinsic::eh_exceptioncode:
case Intrinsic::eh_exceptionpointer: {		case Intrinsic::eh_exceptionpointer: {
Assert(isa<CatchPadInst>(CS.getArgOperand(0)),		Assert(isa<CatchPadInst>(CS.getArgOperand(0)),
"eh.exceptionpointer argument must be a catchpad", CS);		"eh.exceptionpointer argument must be a catchpad", CS);
break;		break;
}		}
case Intrinsic::masked_load: {		case Intrinsic::masked_load: {
Assert(CS.getType()->isVectorTy(), "masked_load: must return a vector", CS);		Assert(CS.getType()->isVectorTy(), "masked_load: must return a vector", CS);

Value *Ptr = CS.getArgOperand(0);		Value *Ptr = CS.getArgOperand(0);
//Value *Alignment = CS.getArgOperand(1);		//Value *Alignment = CS.getArgOperand(1);
Value *Mask = CS.getArgOperand(2);		Value *Mask = CS.getArgOperand(2);
Value *PassThru = CS.getArgOperand(3);		Value *PassThru = CS.getArgOperand(3);
Assert(Mask->getType()->isVectorTy(),		Assert(Mask->getType()->isVectorTy(),
"masked_load: mask must be vector", CS);		"masked_load: mask must be vector", CS);

// DataTy is the overloaded type		// DataTy is the overloaded type
Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();		Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();
Assert(DataTy == CS.getType(),		Assert(DataTy == CS.getType(),
"masked_load: return must match pointer type", CS);		"masked_load: return must match pointer type", CS);
Assert(PassThru->getType() == DataTy,		Assert(PassThru->getType() == DataTy,
"masked_load: pass through and data type must match", CS);		"masked_load: pass through and data type must match", CS);
Assert(Mask->getType()->getVectorNumElements() ==		Assert(Mask->getType()->getVectorNumElements() ==
DataTy->getVectorNumElements(),		DataTy->getVectorNumElements(),
"masked_load: vector mask must be same length as data", CS);		"masked_load: vector mask must be same length as data", CS);
break;		break;
}		}
case Intrinsic::masked_store: {		case Intrinsic::masked_store: {
Value *Val = CS.getArgOperand(0);		Value *Val = CS.getArgOperand(0);
Value *Ptr = CS.getArgOperand(1);		Value *Ptr = CS.getArgOperand(1);
//Value *Alignment = CS.getArgOperand(2);		//Value *Alignment = CS.getArgOperand(2);
Value *Mask = CS.getArgOperand(3);		Value *Mask = CS.getArgOperand(3);
Assert(Mask->getType()->isVectorTy(),		Assert(Mask->getType()->isVectorTy(),
"masked_store: mask must be vector", CS);		"masked_store: mask must be vector", CS);

// DataTy is the overloaded type		// DataTy is the overloaded type
Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();		Type *DataTy = cast<PointerType>(Ptr->getType())->getElementType();
Assert(DataTy == Val->getType(),		Assert(DataTy == Val->getType(),
"masked_store: storee must match pointer type", CS);		"masked_store: storee must match pointer type", CS);
Assert(Mask->getType()->getVectorNumElements() ==		Assert(Mask->getType()->getVectorNumElements() ==
DataTy->getVectorNumElements(),		DataTy->getVectorNumElements(),
"masked_store: vector mask must be same length as data", CS);		"masked_store: vector mask must be same length as data", CS);
break;		break;
}		}

case Intrinsic::experimental_guard: {		case Intrinsic::experimental_guard: {
Assert(CS.isCall(), "experimental_guard cannot be invoked", CS);		Assert(CS.isCall(), "experimental_guard cannot be invoked", CS);
Assert(CS.countOperandBundlesOfType(LLVMContext::OB_deopt) == 1,		Assert(CS.countOperandBundlesOfType(LLVMContext::OB_deopt) == 1,
"experimental_guard must have exactly one "		"experimental_guard must have exactly one "
[629 lines hidden]
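
The two checks this revision adds to lib/IR/Verifier.cpp can be summarised with a small standalone model. This is only a sketch: `FunctionModel`, `CallSiteModel`, and the `verify*` helpers are hypothetical stand-ins, not LLVM classes, and it assumes the verifier's `isLLVMdotName` flag simply means the function name starts with `llvm.`. An empty string means the check passes; otherwise the string is the diagnostic text from the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::Function: only the fields the checks read.
struct FunctionModel {
  std::string Name;
  bool Speculatable;
};

// Hypothetical stand-in for a call site. Callee is null when the called value
// is not a function after stripping pointer casts (e.g. a bitcast global).
struct CallSiteModel {
  const FunctionModel *Callee;
  bool Speculatable;
};

// Assumption: "is an intrinsic" is approximated by the "llvm." name prefix.
bool isLLVMdotName(const std::string &Name) {
  return Name.compare(0, 5, "llvm.") == 0;
}

// Function-level rule: speculatable is rejected on non-intrinsics.
std::string verifyFunction(const FunctionModel &F) {
  if (!isLLVMdotName(F.Name) && F.Speculatable)
    return "Attribute 'speculatable' only applies to intrinsics";
  return "";
}

// Call-site rule: speculatable is rejected unless the callee is a function
// that is itself speculatable.
std::string verifyCallSite(const CallSiteModel &CS) {
  if (CS.Speculatable && !(CS.Callee && CS.Callee->Speculatable))
    return "speculatable attribute may not apply to call sites";
  return "";
}

// Usage sketch mirroring the cases in the two new Verifier tests.
void checkVerifierModel() {
  const FunctionModel Spec{"llvm.speculatable", true};
  const FunctionModel NotSpec{"llvm.not_speculatable", false};
  assert(verifyCallSite(CallSiteModel{&Spec, true}).empty());
  assert(!verifyCallSite(CallSiteModel{&NotSpec, true}).empty());
  assert(!verifyCallSite(CallSiteModel{nullptr, true}).empty());
  assert(!verifyFunction(FunctionModel{"f", true}).empty());
}
```

The null-callee case corresponds to the bitcast-of-a-global call in speculatable-callsite-invalid.ll, where `dyn_cast<Function>` yields no function and the assertion fails.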

test/Bitcode/attributes.ll

	[198 lines hidden]
	}			}

	declare void @nobuiltin()			declare void @nobuiltin()

	define void @f34()			define void @f34()
	; CHECK: define void @f34()			; CHECK: define void @f34()
	{			{
	call void @nobuiltin() nobuiltin			call void @nobuiltin() nobuiltin
	; CHECK: call void @nobuiltin() #33			; CHECK: call void @nobuiltin() #34
	ret void;			ret void;
	}			}

	define void @f35() optnone noinline			define void @f35() optnone noinline
	; CHECK: define void @f35() #23			; CHECK: define void @f35() #23
	{			{
	ret void;			ret void;
	}			}
	[113 lines hidden]
	}			}

	; CHECK: define void @f56() #32			; CHECK: define void @f56() #32
	define void @f56() writeonly			define void @f56() writeonly
	{			{
	ret void			ret void
	}			}

				; CHECK: declare void @llvm.test.speculatable() #33
				declare void @llvm.test.speculatable() speculatable

	; CHECK: attributes #0 = { noreturn }			; CHECK: attributes #0 = { noreturn }
	; CHECK: attributes #1 = { nounwind }			; CHECK: attributes #1 = { nounwind }
	; CHECK: attributes #2 = { readnone }			; CHECK: attributes #2 = { readnone }
	; CHECK: attributes #3 = { readonly }			; CHECK: attributes #3 = { readonly }
	; CHECK: attributes #4 = { noinline }			; CHECK: attributes #4 = { noinline }
	; CHECK: attributes #5 = { alwaysinline }			; CHECK: attributes #5 = { alwaysinline }
	; CHECK: attributes #6 = { optsize }			; CHECK: attributes #6 = { optsize }
	; CHECK: attributes #7 = { ssp }			; CHECK: attributes #7 = { ssp }
	[17 lines hidden]
	; CHECK: attributes #25 = { convergent }			; CHECK: attributes #25 = { convergent }
	; CHECK: attributes #26 = { argmemonly }			; CHECK: attributes #26 = { argmemonly }
	; CHECK: attributes #27 = { norecurse }			; CHECK: attributes #27 = { norecurse }
	; CHECK: attributes #28 = { inaccessiblememonly }			; CHECK: attributes #28 = { inaccessiblememonly }
	; CHECK: attributes #29 = { inaccessiblemem_or_argmemonly }			; CHECK: attributes #29 = { inaccessiblemem_or_argmemonly }
	; CHECK: attributes #30 = { allocsize(0) }			; CHECK: attributes #30 = { allocsize(0) }
	; CHECK: attributes #31 = { allocsize(0,1) }			; CHECK: attributes #31 = { allocsize(0,1) }
	; CHECK: attributes #32 = { writeonly }			; CHECK: attributes #32 = { writeonly }
	; CHECK: attributes #33 = { nobuiltin }			; CHECK: attributes #33 = { speculatable }
				; CHECK: attributes #34 = { nobuiltin }

test/Bitcode/compatibility.ll

; Bitcode compatibility test for llvm		; Bitcode compatibility test for llvm
;		;
; Please update this file when making any IR changes. Information on the		; Please update this file when making any IR changes. Information on the
; release process for this file is available here:		; release process for this file is available here:
;		;
; http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility		; http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility

; RUN: llvm-as < %s | llvm-dis | llvm-as | llvm-dis | FileCheck %s		; RUN: llvm-as < %s | llvm-dis | llvm-as | llvm-dis | FileCheck %s
; RUN-PR24755: verify-uselistorder < %s		; RUN-PR24755: verify-uselistorder < %s

target datalayout = "E"		target datalayout = "E"
; CHECK: target datalayout = "E"		; CHECK: target datalayout = "E"

target triple = "x86_64-apple-macosx10.10.0"		target triple = "x86_64-apple-macosx10.10.0"
[1,226 lines hidden]	exit:
; CHECK: phi i32 [ %v1, %L1 ], [ %v2, %L2 ], [ %op1, %entry ]		; CHECK: phi i32 [ %v1, %L1 ], [ %v2, %L2 ], [ %op1, %entry ]

select i1 true, i32 0, i32 1		select i1 true, i32 0, i32 1
; CHECK: select i1 true, i32 0, i32 1		; CHECK: select i1 true, i32 0, i32 1
select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 2, i8 3>, <2 x i8> <i8 3, i8 2>		select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 2, i8 3>, <2 x i8> <i8 3, i8 2>
; CHECK: select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 2, i8 3>, <2 x i8> <i8 3, i8 2>		; CHECK: select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 2, i8 3>, <2 x i8> <i8 3, i8 2>

call void @f.nobuiltin() builtin		call void @f.nobuiltin() builtin
; CHECK: call void @f.nobuiltin() #41		; CHECK: call void @f.nobuiltin() #42

call fastcc noalias i32* @f.noalias() noinline		call fastcc noalias i32* @f.noalias() noinline
; CHECK: call fastcc noalias i32* @f.noalias() #12		; CHECK: call fastcc noalias i32* @f.noalias() #12
tail call ghccc nonnull i32* @f.nonnull() minsize		tail call ghccc nonnull i32* @f.nonnull() minsize
; CHECK: tail call ghccc nonnull i32* @f.nonnull() #7		; CHECK: tail call ghccc nonnull i32* @f.nonnull() #7

ret void		ret void
}		}
[350 lines hidden]
normal:		normal:
ret void		ret void
}		}


declare void @f.writeonly() writeonly		declare void @f.writeonly() writeonly
; CHECK: declare void @f.writeonly() #40		; CHECK: declare void @f.writeonly() #40

		declare void @llvm.f.speculatable() speculatable
		; CHECK: declare void @llvm.f.speculatable() #41

;; Constant Expressions		;; Constant Expressions

define i8** @constexpr() {		define i8** @constexpr() {
; CHECK: ret i8** getelementptr inbounds ({ [4 x i8], [4 x i8] }, { [4 x i8], [4 x i8] }* null, i32 0, inrange i32 1, i32 2)		; CHECK: ret i8** getelementptr inbounds ({ [4 x i8], [4 x i8] }, { [4 x i8], [4 x i8] }* null, i32 0, inrange i32 1, i32 2)
ret i8** getelementptr inbounds ({ [4 x i8], [4 x i8] }, { [4 x i8], [4 x i8] }* null, i32 0, inrange i32 1, i32 2)		ret i8** getelementptr inbounds ({ [4 x i8], [4 x i8] }, { [4 x i8], [4 x i8] }* null, i32 0, inrange i32 1, i32 2)
}		}

; CHECK: attributes #0 = { alignstack=4 }		; CHECK: attributes #0 = { alignstack=4 }
[32 lines hidden]
; CHECK: attributes #33 = { inaccessiblememonly }		; CHECK: attributes #33 = { inaccessiblememonly }
; CHECK: attributes #34 = { inaccessiblemem_or_argmemonly }		; CHECK: attributes #34 = { inaccessiblemem_or_argmemonly }
; CHECK: attributes #35 = { nounwind readnone }		; CHECK: attributes #35 = { nounwind readnone }
; CHECK: attributes #36 = { argmemonly nounwind readonly }		; CHECK: attributes #36 = { argmemonly nounwind readonly }
; CHECK: attributes #37 = { argmemonly nounwind }		; CHECK: attributes #37 = { argmemonly nounwind }
; CHECK: attributes #38 = { nounwind readonly }		; CHECK: attributes #38 = { nounwind readonly }
; CHECK: attributes #39 = { inaccessiblemem_or_argmemonly nounwind }		; CHECK: attributes #39 = { inaccessiblemem_or_argmemonly nounwind }
; CHECK: attributes #40 = { writeonly }		; CHECK: attributes #40 = { writeonly }
; CHECK: attributes #41 = { builtin }		; CHECK: attributes #41 = { speculatable }
		; CHECK: attributes #42 = { builtin }

;; Metadata		;; Metadata

; Metadata -- Module flags		; Metadata -- Module flags
!llvm.module.flags = !{!0, !1, !2, !4, !5, !6}		!llvm.module.flags = !{!0, !1, !2, !4, !5, !6}
; CHECK: !llvm.module.flags = !{!0, !1, !2, !4, !5, !6}		; CHECK: !llvm.module.flags = !{!0, !1, !2, !4, !5, !6}

!0 = !{i32 1, !"mod1", i32 0}		!0 = !{i32 1, !"mod1", i32 0}
[23 lines hidden]

test/Verifier/speculatable-callsite-invalid.ll

This file was added.

				; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s

				; Make sure that speculatable is not allowed on a call site if the
				; declaration is not also speculatable.

				declare i32 @llvm.not_speculatable()

				; CHECK: speculatable attribute may not apply to call sites
				; CHECK-NEXT: %ret = call i32 @llvm.not_speculatable() #0
				define i32 @call_not_speculatable() {
				%ret = call i32 @llvm.not_speculatable() #0
				ret i32 %ret
				}

				@gv = internal unnamed_addr constant i32 0

				; CHECK: speculatable attribute may not apply to call sites
				; CHECK-NEXT: %ret = call float bitcast (i32* @gv to float ()*)() #0
				define float @call_bitcast_speculatable() {
				%ret = call float bitcast (i32* @gv to float()*)() #0
				ret float %ret
				}

				attributes #0 = { speculatable }

test/Verifier/speculatable-callsite.ll

This file was added.

				; RUN: llvm-as %s -o /dev/null

				; Make sure speculatable is accepted on a call site if the declaration
				; is also speculatable.

				declare i32 @llvm.speculatable() #0

				; Make sure this the attribute is accepted on the call site if the
				; declaration matches.
				define i32 @call_speculatable() {
				%ret = call i32 @llvm.speculatable() #0
				ret i32 %ret
				}

				; This should be tesed when speculatable may apply to arbitrary functions
				; define float @call_bitcast_speculatable() {
				; %ret = call float bitcast (i32()* @speculatable to float()*)() #0
				; ret float %ret
				; }

				attributes #0 = { speculatable }

utils/TableGen/CodeGenIntrinsics.h

[15 lines hidden]

#include "llvm/CodeGen/MachineValueType.h"		#include "llvm/CodeGen/MachineValueType.h"
#include <string>		#include <string>
#include <vector>		#include <vector>

namespace llvm {		namespace llvm {
class Record;		class Record;
class RecordKeeper;		class RecordKeeper;
class CodeGenTarget;		class CodeGenTarget;
		eli.friedman: Saying that, for example, memcpy is nosideeffects seems very weird. "memcpy(0,0,8)" will crash. The same issue applies to basically any intrinsic that reads from or writes to its arguments.

struct CodeGenIntrinsic {		struct CodeGenIntrinsic {
Record *TheDef; // The actual record defining this intrinsic.		Record *TheDef; // The actual record defining this intrinsic.
std::string Name; // The name of the LLVM function "llvm.bswap.i32"		std::string Name; // The name of the LLVM function "llvm.bswap.i32"
std::string EnumName; // The name of the enum "bswap_i32"		std::string EnumName; // The name of the enum "bswap_i32"
std::string GCCBuiltinName; // Name of the corresponding GCC builtin, or "".		std::string GCCBuiltinName; // Name of the corresponding GCC builtin, or "".
std::string MSBuiltinName; // Name of the corresponding MS builtin, or "".		std::string MSBuiltinName; // Name of the corresponding MS builtin, or "".
std::string TargetPrefix; // Target prefix, e.g. "ppc" for t-s intrinsics.		std::string TargetPrefix; // Target prefix, e.g. "ppc" for t-s intrinsics.
[85 lines hidden]	struct CodeGenIntrinsic {
bool isNoDuplicate;		bool isNoDuplicate;

/// True if the intrinsic is no-return.		/// True if the intrinsic is no-return.
bool isNoReturn;		bool isNoReturn;

/// True if the intrinsic is marked as convergent.		/// True if the intrinsic is marked as convergent.
bool isConvergent;		bool isConvergent;

		// True if the intrinsic is marked as speculatable.
		bool isSpeculatable;

		tstellarAMD: This is a problem with the current definitions of TableGen's intrinsic properties. Any intrinsic with IntrNoMem, IntrReadMem, IntrWriteMem, or IntrArgMemOnly is defined as having no side effects. The goal of this patch is to make it possible to have an intrinsic, like memcpy, that only reads/writes argument memory but may have other side effects.
		eli.friedman: Oh, I see, this is an existing problem. :( I'd definitely like to see this resolved before you start changing optimizations to use this flag, but I guess you can change it in a followup. Not that it helps memcpy in particular, but it might be worth considering some approach which allows one to say "this intrinsic has no side-effects if the pointer arguments are dereferenceable(n)".
enum ArgAttribute { NoCapture, Returned, ReadOnly, WriteOnly, ReadNone };		enum ArgAttribute { NoCapture, Returned, ReadOnly, WriteOnly, ReadNone };
std::vector<std::pair<unsigned, ArgAttribute>> ArgumentAttributes;		std::vector<std::pair<unsigned, ArgAttribute>> ArgumentAttributes;

CodeGenIntrinsic(Record *R);		CodeGenIntrinsic(Record *R);
};		};

class CodeGenIntrinsicTable {		class CodeGenIntrinsicTable {
std::vector<CodeGenIntrinsic> Intrinsics;		std::vector<CodeGenIntrinsic> Intrinsics;
[22 lines hidden]
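
The point debated in the inline comments on this file is that memory behaviour and side effects are separate questions. A minimal sketch of that separation follows. All type and function names are hypothetical, and the `NoneIfArgsDereferenceable` state only illustrates eli.friedman's suggestion, which this revision does not implement.

```cpp
#include <cassert>

// Hypothetical model; these are not TableGen's or LLVM's types.
enum class MemoryAccess { None, ReadArg, WriteArg, ReadWriteArg, ReadWriteAny };
enum class SideEffects { MayHave, None, NoneIfArgsDereferenceable };

// The two properties are stored independently, so knowing which memory an
// intrinsic touches says nothing about whether it is free of side effects.
struct IntrinsicModel {
  MemoryAccess Memory;
  SideEffects Effects;
};

// Decide whether the intrinsic is free of side effects, given whether its
// pointer arguments are known to be dereferenceable.
bool hasNoSideEffects(const IntrinsicModel &I, bool ArgsKnownDereferenceable) {
  switch (I.Effects) {
  case SideEffects::None:
    return true;
  case SideEffects::NoneIfArgsDereferenceable:
    return ArgsKnownDereferenceable;
  case SideEffects::MayHave:
    return false;
  }
  return false;
}

// Usage sketch with three illustrative intrinsics.
void checkSideEffectModel() {
  // memcpy-like: touches only argument memory, yet memcpy(0,0,8) can crash.
  const IntrinsicModel MemcpyLike{MemoryAccess::ReadWriteArg,
                                  SideEffects::MayHave};
  // bswap-like: pure computation on its operands.
  const IntrinsicModel BswapLike{MemoryAccess::None, SideEffects::None};
  // Hypothetical fixed-size read through a pointer argument.
  const IntrinsicModel LoadLike{MemoryAccess::ReadArg,
                                SideEffects::NoneIfArgsDereferenceable};
  assert(!hasNoSideEffects(MemcpyLike, true));
  assert(hasNoSideEffects(BswapLike, false));
  assert(hasNoSideEffects(LoadLike, true));
  assert(!hasNoSideEffects(LoadLike, false));
}
```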

utils/TableGen/CodeGenTarget.cpp

[509 lines hidden]	CodeGenIntrinsic::CodeGenIntrinsic(Record *R) {
std::string DefName = R->getName();		std::string DefName = R->getName();
ModRef = ReadWriteMem;		ModRef = ReadWriteMem;
isOverloaded = false;		isOverloaded = false;
isCommutative = false;		isCommutative = false;
canThrow = false;		canThrow = false;
isNoReturn = false;		isNoReturn = false;
isNoDuplicate = false;		isNoDuplicate = false;
isConvergent = false;		isConvergent = false;
		isSpeculatable = false;

if (DefName.size() <= 4 ||		if (DefName.size() <= 4 ||
std::string(DefName.begin(), DefName.begin() + 4) != "int_")		std::string(DefName.begin(), DefName.begin() + 4) != "int_")
PrintFatalError("Intrinsic '" + DefName + "' does not start with 'int_'!");		PrintFatalError("Intrinsic '" + DefName + "' does not start with 'int_'!");

EnumName = std::string(DefName.begin()+4, DefName.end());		EnumName = std::string(DefName.begin()+4, DefName.end());

if (R->getValue("GCCBuiltinName")) // Ignore a missing GCCBuiltinName field.		if (R->getValue("GCCBuiltinName")) // Ignore a missing GCCBuiltinName field.
[122 lines hidden]	for (unsigned i = 0, e = PropList->size(); i != e; ++i) {
else if (Property->getName() == "Throws")		else if (Property->getName() == "Throws")
canThrow = true;		canThrow = true;
else if (Property->getName() == "IntrNoDuplicate")		else if (Property->getName() == "IntrNoDuplicate")
isNoDuplicate = true;		isNoDuplicate = true;
else if (Property->getName() == "IntrConvergent")		else if (Property->getName() == "IntrConvergent")
isConvergent = true;		isConvergent = true;
else if (Property->getName() == "IntrNoReturn")		else if (Property->getName() == "IntrNoReturn")
isNoReturn = true;		isNoReturn = true;
		else if (Property->getName() == "IntrSpeculatable")
		isSpeculatable = true;
else if (Property->isSubClassOf("NoCapture")) {		else if (Property->isSubClassOf("NoCapture")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, NoCapture));		ArgumentAttributes.push_back(std::make_pair(ArgNo, NoCapture));
} else if (Property->isSubClassOf("Returned")) {		} else if (Property->isSubClassOf("Returned")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
ArgumentAttributes.push_back(std::make_pair(ArgNo, Returned));		ArgumentAttributes.push_back(std::make_pair(ArgNo, Returned));
} else if (Property->isSubClassOf("ReadOnly")) {		} else if (Property->isSubClassOf("ReadOnly")) {
unsigned ArgNo = Property->getValueAsInt("ArgNo");		unsigned ArgNo = Property->getValueAsInt("ArgNo");
[14 lines hidden]

utils/TableGen/IntrinsicEmitter.cpp

[470 lines hidden]	if (L->isNoDuplicate != R->isNoDuplicate)
return R->isNoDuplicate;		return R->isNoDuplicate;

if (L->isNoReturn != R->isNoReturn)		if (L->isNoReturn != R->isNoReturn)
return R->isNoReturn;		return R->isNoReturn;

if (L->isConvergent != R->isConvergent)		if (L->isConvergent != R->isConvergent)
return R->isConvergent;		return R->isConvergent;

		if (L->isSpeculatable != R->isSpeculatable)
		return R->isSpeculatable;

// Try to order by readonly/readnone attribute.		// Try to order by readonly/readnone attribute.
CodeGenIntrinsic::ModRefBehavior LK = L->ModRef;		CodeGenIntrinsic::ModRefBehavior LK = L->ModRef;
CodeGenIntrinsic::ModRefBehavior RK = R->ModRef;		CodeGenIntrinsic::ModRefBehavior RK = R->ModRef;
if (LK != RK) return (LK > RK);		if (LK != RK) return (LK > RK);

// Order by argument attributes.		// Order by argument attributes.
// This is reliable because each side is already sorted internally.		// This is reliable because each side is already sorted internally.
return (L->ArgumentAttributes < R->ArgumentAttributes);		return (L->ArgumentAttributes < R->ArgumentAttributes);
[108 lines hidden]	if (ae) {
OS << " AS[" << numAttrs++ << "] = AttributeList::get(C, "		OS << " AS[" << numAttrs++ << "] = AttributeList::get(C, "
<< argNo + 1 << ", AttrParam" << argNo + 1 << ");\n";		<< argNo + 1 << ", AttrParam" << argNo + 1 << ");\n";
}		}
}		}

if (!intrinsic.canThrow ||		if (!intrinsic.canThrow ||
intrinsic.ModRef != CodeGenIntrinsic::ReadWriteMem ||		intrinsic.ModRef != CodeGenIntrinsic::ReadWriteMem ||
intrinsic.isNoReturn || intrinsic.isNoDuplicate ||		intrinsic.isNoReturn || intrinsic.isNoDuplicate ||
intrinsic.isConvergent) {		intrinsic.isConvergent || intrinsic.isSpeculatable) {
OS << " const Attribute::AttrKind Atts[] = {";		OS << " const Attribute::AttrKind Atts[] = {";
bool addComma = false;		bool addComma = false;
if (!intrinsic.canThrow) {		if (!intrinsic.canThrow) {
OS << "Attribute::NoUnwind";		OS << "Attribute::NoUnwind";
addComma = true;		addComma = true;
}		}
if (intrinsic.isNoReturn) {		if (intrinsic.isNoReturn) {
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::NoReturn";		OS << "Attribute::NoReturn";
addComma = true;		addComma = true;
}		}
if (intrinsic.isNoDuplicate) {		if (intrinsic.isNoDuplicate) {
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::NoDuplicate";		OS << "Attribute::NoDuplicate";
addComma = true;		addComma = true;
}		}
if (intrinsic.isConvergent) {		if (intrinsic.isConvergent) {
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::Convergent";		OS << "Attribute::Convergent";
addComma = true;		addComma = true;
}		}
		if (intrinsic.isSpeculatable) {
		if (addComma)
		OS << ",";
		OS << "Attribute::Speculatable";
		addComma = true;
		}

switch (intrinsic.ModRef) {		switch (intrinsic.ModRef) {
case CodeGenIntrinsic::NoMem:		case CodeGenIntrinsic::NoMem:
if (addComma)		if (addComma)
OS << ",";		OS << ",";
OS << "Attribute::ReadNone";		OS << "Attribute::ReadNone";
break;		break;
case CodeGenIntrinsic::ReadArgMem:		case CodeGenIntrinsic::ReadArgMem:
[183 lines hidden]
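
The attribute list in utils/TableGen/IntrinsicEmitter.cpp is built with an `addComma` flag so that every kind after the first is preceded by a comma. Below is a minimal standalone sketch of that pattern. `IntrinsicFlags` and `emitAttrKinds` are hypothetical substitutes for `CodeGenIntrinsic` and the emitter's output stream, and the `ModRef` switch that follows in the real code is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of the boolean fields on CodeGenIntrinsic.
struct IntrinsicFlags {
  bool canThrow;
  bool isNoReturn;
  bool isNoDuplicate;
  bool isConvergent;
  bool isSpeculatable;
};

// Build the comma-separated initializer list for the Atts[] array.
std::string emitAttrKinds(const IntrinsicFlags &I) {
  std::string Out;
  bool addComma = false;
  // Append one kind, inserting a separator for everything after the first.
  auto emit = [&](const char *Kind) {
    if (addComma)
      Out += ",";
    Out += Kind;
    addComma = true;
  };
  if (!I.canThrow)
    emit("Attribute::NoUnwind");
  if (I.isNoReturn)
    emit("Attribute::NoReturn");
  if (I.isNoDuplicate)
    emit("Attribute::NoDuplicate");
  if (I.isConvergent)
    emit("Attribute::Convergent");
  if (I.isSpeculatable)
    emit("Attribute::Speculatable");
  return Out;
}

// Usage sketch: start from all-false flags, as the CodeGenIntrinsic
// constructor does, then mark the intrinsic speculatable.
void checkEmitAttrKinds() {
  IntrinsicFlags F = {false, false, false, false, false};
  F.isSpeculatable = true;
  assert(emitAttrKinds(F) == "Attribute::NoUnwind,Attribute::Speculatable");
}
```

Because `canThrow` defaults to false, `Attribute::NoUnwind` comes first for most intrinsics, and the new `Attribute::Speculatable` entry lands after `Attribute::Convergent` when both are set.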

Add speculatable function attribute (Closed, Public)

Revision Contents

Diff 97117

docs/LangRef.rst

include/llvm/Bitcode/LLVMBitCodes.h

include/llvm/IR/Attributes.td

include/llvm/IR/Function.h

include/llvm/IR/Intrinsics.td

lib/AsmParser/LLLexer.cpp

lib/AsmParser/LLParser.cpp

lib/AsmParser/LLToken.h

lib/Bitcode/Reader/BitcodeReader.cpp

lib/Bitcode/Writer/BitcodeWriter.cpp

lib/IR/Attributes.cpp

lib/IR/Verifier.cpp

test/Bitcode/attributes.ll

test/Bitcode/compatibility.ll

test/Verifier/speculatable-callsite-invalid.ll

test/Verifier/speculatable-callsite.ll

utils/TableGen/CodeGenIntrinsics.h

utils/TableGen/CodeGenTarget.cpp

utils/TableGen/IntrinsicEmitter.cpp
