This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/docs/
-
docs/
4/6
LangRef.rst

Differential D79636

[LangRef] Clarify the semantics of the `byval` attribute
Needs ReviewPublic

Authored by jdoerfert on May 8 2020, 10:48 AM.

Download Raw Diff

Details

Reviewers

arsenm
spatel
efriedma
fhahn
rnk
hfinkel
regehr
nlopes
lebedev.ri

Summary

After the discussion in D79454 it seemed clear we need to clarify the
byval attribute semantics. This is an attempt to make it consistent
with the de-facto behavior and expectations of different LLVM passes.

The clarified rules are phrased such that:

an empty function with a byval argument is correctly classified as readnone (which it is today: https://godbolt.org/z/nePXM6).
a call of a readnone function with a byval argument is not classified as readnone (which it is today: https://godbolt.org/z/dDfQ5r)
we can eliminate the Clang rules that prevent the effect of pure or const on functions that take a byval argument (as they are circumvented by the middle-end anyway)

I doubt this will break existing backends or other parts but actually
avoid us creating (temporary) broken IR during the middle-end
optimizations.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jdoerfert created this revision.May 8 2020, 10:48 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 8 2020, 10:48 AM

Herald added subscribers: dmgreen, bollu, wdng. · View Herald Transcript

efriedma added inline comments.May 8 2020, 11:29 AM

llvm/docs/LangRef.rst
1050	"Attributes on the call site argument and function argument are associated with the original and copied memory respectively" This seems to fly in the face of existing practice, which is that function attributes are copied to each callsite. I'd strongly prefer to keep the meaning of the attributes consistent, even if it leads to a weird result like writing to an argument marked "readonly".
1051	"wrtie "

Harbormaster failed remote builds in B56170: Diff 262900!May 8 2020, 12:20 PM

Some copy editing comments, but I agree with the semantics: From the IR perspective, it is better to think of argument stack memory as belonging to the callee. A byval argument has more in common with a local static alloca than a passed in pointer.

llvm/docs/LangRef.rst
1046–1057	Missing period? "scalars The copy"
1047	typo "memmory"

Fix spelling

"Attributes on the call site argument and function argument are associated with the original and copied memory respectively"

This seems to fly in the face of existing practice, which is that function attributes are copied to each callsite. I'd strongly prefer to keep the meaning of the attributes consistent, even if it leads to a weird result like writing to an argument marked "readonly".

First, this doesn't break with the practice that argument attributes are copied/applied to each call site, byval is still present at both the call site argument and the argument. It does state however that the pointer of the attribute is pointing to something different for the call site argument and for the argument. That is no different to the current semantic, as far as I can tell, just spelled out. Do you object to say that the call site argument and the argument point to distinct memory locations or something else?

Minor wording tweak

Harbormaster completed remote builds in B56239: Diff 263024.May 9 2020, 11:09 AM

Do you object to say that the call site argument and the argument point to distinct memory locations or something else?

Like I said, my issue is with the "Attributes on the call site argument and function argument are associated with the original and copied memory respectively". I assume this means "attributes other than byval". If I'm understanding this correctly, this means it isn't legal to copy attributes from the caller to the callee. If an argument is marked "readnone byval" on a function, it's illegal to copy that "readnone" to the callsite, because the readnone would then be associated with the original memory, not the copied memory. Or, a more silly example, say you had "byval returned" on the called function, and that got copied to the callsite: that clearly can't mean the original pointer is returned by the function.

Harbormaster completed remote builds in B56241: Diff 263026.May 9 2020, 12:12 PM

I have a minor question:

a call of a readnone function with a byval argument is not classified as readnone (which it is today: https://godbolt.org/z/dDfQ5r)

%0 at caller has readnone attribute - is it related with the propagation of readnone attribute from %0 of empty function to the caller?
Some comments above seems to be related with this question, but I rather wonder about the validity of the propagation of readnone in this example.

Actually I wonder whether things will become clearer if an alloca-and-copy (or something that is equivalent with this) is explicitly used to show the behavior of pass-as-value rather than byval implicitly encoding the behavior; my impression is that byval is different from other attributes like readnone or nonnull, because it isn't the result of value analysis. This will be a lot of work though...

arsenm added inline comments.May 9 2020, 3:57 PM

llvm/docs/LangRef.rst
1052–1053	Should this also specify the meaning of readonly/readnone as a callee side parameter attribute? is it disallowed to write to a readonly byval argument?

In D79636#2028294, @efriedma wrote:

Do you object to say that the call site argument and the argument point to distinct memory locations or something else?

Like I said, my issue is with the "Attributes on the call site argument and function argument are associated with the original and copied memory respectively". I assume this means "attributes other than byval".

Yes. byval applies to both.

If I'm understanding this correctly, this means it isn't legal to copy attributes from the caller to the callee.

Correct. It is not legal to do that now either, see below, but we just don't say so. TBH, I have not heard a proposal in which it would be legal but the copy would still happen implicitly (somewhere).

If an argument is marked "readnone byval" on a function, it's illegal to copy that "readnone" to the callsite, because the readnone would then be associated with the original memory, not the copied memory.

Right. That is what I think needs to be the semantics.

Or, a more silly example, say you had "byval returned" on the called function, and that got copied to the callsite: that clearly can't mean the original pointer is returned by the function.

I think byval returned example shows nicely that we cannot copy the attributes, right?

@efriedma I'm a bit confused. Could you propose some wording so I get a feeling where you want to go?

In D79636#2028363, @aqjune wrote:

I have a minor question:

a call of a readnone function with a byval argument is not classified as readnone (which it is today: https://godbolt.org/z/dDfQ5r)

%0 at caller has readnone attribute - is it related with the propagation of readnone attribute from %0 of empty function to the caller?
Some comments above seems to be related with this question, but I rather wonder about the validity of the propagation of readnone in this example.

The propagation in this example is *not* valid. This patch makes this clear (I hope).

Actually I wonder whether things will become clearer if an alloca-and-copy (or something that is equivalent with this) is explicitly used to show the behavior of pass-as-value rather than byval implicitly encoding the behavior; my impression is that byval is different from other attributes like readnone or nonnull, because it isn't the result of value analysis. This will be a lot of work though...

You are not wrong. Making it explicit would actually help us. I am in favor. Nonetheless, we currently seem to have no clear semantics on what byval means and how it interacts with other attributes. Clang strips __attribute__((pure)) if byval arguments are present but functionattrs will just add it again. More generally, this does currently not work:

for (Instruction *I : instructions(Fn))
  MayReadOrWrite |= I->mayReadOrWriteMemory();

if a call takes a byval argument.

Long story short, I would prefer this change to make the current behavior consistent and then a transition away from byval to some explicit copy model.

jdoerfert marked an inline comment as done.May 9 2020, 4:27 PM

jdoerfert added inline comments.

llvm/docs/LangRef.rst
1052–1053	Should this also specify the meaning of readonly/readnone as a callee side parameter attribute? I thought by specifying what memory the call site and argument pointer refer to, the readonly/readnone (and other attributes) fall in line. They apply to the respective memory. is it disallowed to write to a readonly byval argument? Writing any `readonly` argument is UB, I mean the store instruction causes UB.

@efriedma I'm a bit confused. Could you propose some wording so I get a feeling where you want to go?

Depending on which direction we go, either:

"Attributes on a function or callsite describe the behavior of the callee excluding the implied copy. For example, an optimization can infer readnone on a a byval argument if the callee does not access the copied memory."
"Attributes on a function or callsite describe the behavior of a call including the implied copy. For example, an optimization can never infer readnone on a call with a byval argument readnone because the implied copy reads memory. byval implies nocapture: there isn't any way to retrieve the original address in the callee."

I don't really care which we choose, we can easily model it either way. But I think we need to choose one or the other. Making attributes describe different things on each side of the call is much worse than either of those alternatives; it confuses what it means for a function or callsite to have an attribute.

In D79636#2028479, @efriedma wrote:

@efriedma I'm a bit confused. Could you propose some wording so I get a feeling where you want to go?

Depending on which direction we go, either:

"Attributes on a function or callsite describe the behavior of the callee excluding the implied copy. For example, an optimization can infer readnone on a a byval argument if the callee does not access the copied memory."

"Attributes on a function or callsite describe the behavior of a call including the implied copy. For example, an optimization can never infer readnone on a call with a byval argument readnone because the implied copy reads memory. byval implies nocapture: there isn't any way to retrieve the original address in the callee."

I don't really care which we choose, we can easily model it either way. But I think we need to choose one or the other. Making attributes describe different things on each side of the call is much worse than either of those alternatives; it confuses what it means for a function or callsite to have an attribute.

I think this helped. Thanks. I also understand your concerns now.

Given the choice between the two, I vote for the second:
Pros:

There is a place the read is attributed to, thus no need for D79454.
The byval -> nocapture step at the call site can be useful (done already by the Attributor).
It also is probably important to have byval -> noalias in the callee.

Cons:

You don't get byval -> readonly at the call site anymore (done by the Attributor).
Alignment at the call site might be higher than of the copy, breaking with the idea that the call site and callee "properties" match. Though, the attributes can probably be kept in sync if we teach the relevant parts.

aeubanks added a subscriber: aeubanks.May 9 2020, 9:35 PM

Sure, I'm happy with the second option.

Alignment at the call site might be higher than of the copy, breaking with the idea that the call site and callee "properties" match. Though, the attributes can probably be kept in sync if we teach the relevant parts.

The meaning of the "align" attribute with byval is weird from any perspective: it specifies both the alignment of the allocation, and the required alignment of the input. It would be nice to separate those at some point. (That would probably mean something like byval(<ty>, <align>) to specify the alignment of the copied memory, and then the align attribute would just specify the alignment of the input, like it does for other calls.)

jdoerfert mentioned this in D79744: clang: Use byref for aggregate kernel arguments.May 20 2020, 3:39 PM

This review seems to be stuck/dead, consider abandoning if no longer relevant.

Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 5:18 PM

Herald added a subscriber: StephenFan. · View Herald Transcript

Revision Contents

Path

Size

llvm/

docs/

LangRef.rst

16 lines

Diff 263026

llvm/docs/LangRef.rst

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 1,037 Lines • ▼ Show 20 Lines	``inreg``
target-specific.		target-specific.
``byval`` or ``byval(<ty>)``		``byval`` or ``byval(<ty>)``
This indicates that the pointer parameter should really be passed by		This indicates that the pointer parameter should really be passed by
value to the function. The attribute implies that a hidden copy of		value to the function. The attribute implies that a hidden copy of
the pointee is made between the caller and the callee, so the callee		the pointee is made between the caller and the callee, so the callee
is unable to modify the value in the caller. This attribute is only		is unable to modify the value in the caller. This attribute is only
valid on LLVM pointer arguments. It is generally used to pass		valid on LLVM pointer arguments. It is generally used to pass
structs and arrays by value, but is also valid on pointers to		structs and arrays by value, but is also valid on pointers to
scalars. The copy is considered to belong to the caller not the		scalars. The copy is conceptually made on the call edge. This means that
callee (for example, ``readonly`` functions should not write to		the pointer argument at the call site refers to the original memory and
		rnkUnsubmitted Done Reply Inline Actions typo "memmory" rnk: typo "memmory"
``byval`` parameters). This is not a valid attribute for return		the pointer argument in the callee is referring to the copy. Attributes on
values.		the call site argument and function argument are associated with the
		original and copied memory respectively. The copy is considered to be local
		efriedmaUnsubmitted Not Done Reply Inline Actions "Attributes on the call site argument and function argument are associated with the original and copied memory respectively" This seems to fly in the face of existing practice, which is that function attributes are copied to each callsite. I'd strongly prefer to keep the meaning of the attributes consistent, even if it leads to a weird result like writing to an argument marked "readonly". efriedma: "Attributes on the call site argument and function argument are associated with the original…
		memory of the callee. That means, a callee can write a ``byval`` parameter
		efriedmaUnsubmitted Done Reply Inline Actions "wrtie " efriedma: "wrtie "
		and still be ``readonly`` or ``readnone`` because the write is, similar to
		a write to an ``alloca``, not visible from the outside. To create the copy
		arsenmUnsubmitted Not Done Reply Inline Actions Should this also specify the meaning of readonly/readnone as a callee side parameter attribute? is it disallowed to write to a readonly byval argument? arsenm: Should this also specify the meaning of readonly/readnone as a callee side parameter attribute?
		jdoerfertAuthorUnsubmitted Done Reply Inline Actions Should this also specify the meaning of readonly/readnone as a callee side parameter attribute? I thought by specifying what memory the call site and argument pointer refer to, the readonly/readnone (and other attributes) fall in line. They apply to the respective memory. is it disallowed to write to a readonly byval argument? Writing any `readonly` argument is UB, I mean the store instruction causes UB. jdoerfert: > Should this also specify the meaning of readonly/readnone as a callee side parameter…
		the original memory, which is the call site argument, is always read. This
		effect is also associated with the call site, thus it cannot be `readnone`
		or `writeonly` regardless of the callee. This is not a valid attribute for
		return values.
		rnkUnsubmitted Done Reply Inline Actions Missing period? "scalars The copy" rnk: Missing period? "scalars The copy"

The byval attribute also supports an optional type argument, which must be		The byval attribute also supports an optional type argument, which must be
the same as the pointee type of the argument.		the same as the pointee type of the argument.

The byval attribute also supports specifying an alignment with the		The byval attribute also supports specifying an alignment with the
align attribute. It indicates the alignment of the stack slot to		align attribute. It indicates the alignment of the stack slot to
form and the known alignment of the pointer specified to the call		form and the known alignment of the pointer specified to the call
site. If the alignment is not specified, then the code generator		site. If the alignment is not specified, then the code generator
▲ Show 20 Lines • Show All 18,755 Lines • Show Last 20 Lines