This is an archive of the discontinued LLVM Phabricator instance.

Emit section information for extern variables
ClosedPublic

Authored by eandrews on Aug 14 2017, 1:58 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

eandrews created this revision.Aug 14 2017, 1:58 PM
efriedma added inline comments.
docs/LangRef.rst
629 ↗(On Diff #111066)

This doesn't really explain the part that matters. For LangRef, it doesn't matter how the frontend decides to generate declarations (and it might be different for frontends other than clang). And OpenCL might motivate this, but it has nothing to do with the semantics.

The key question here is what it actually means. Is it a promise the that a definition will exist for the given symbol in the given section? If so, what happens if the symbol is actually in a different section? ("Undefined behavior" is probably an acceptable answer, but it needs to be stated explicitly.)

sdardis requested changes to this revision.Aug 15 2017, 6:56 AM
sdardis added a reviewer: kparzysz.

+kparzysz as Hexagon also makes use of the small data section.

docs/LangRef.rst
629 ↗(On Diff #111066)

to enable OpenCL processes.

for targets which make use of this section information.

I would also work in a sentence that says that attaching section information to a external declaration is an assertion or promise that the definition is located in that section. If the definition is actually located in a different section the behaviour is undefined.

I'm favouring "undefined behaviour" in the error case as behaviour is so target and environment specific.

This revision now requires changes to proceed.Aug 15 2017, 6:56 AM
kparzysz added inline comments.Aug 15 2017, 7:24 AM
docs/LangRef.rst
629 ↗(On Diff #111066)

The other side of the problem is, what if the declarations don't have any section information, but the definition does? Is that also an undefined behavior? I'm in favor of considering it UB as well, but I'm not sure what impact it would have on currently valid IR.

efriedma edited edge metadata.Aug 15 2017, 11:53 AM

The other side of the problem is, what if the declarations don't have any section information, but the definition does? Is that also an undefined behavior?

I can't see how "undefined behavior" could possibly be the right answer in that case. Every definition has to end up in some section eventually, and in many cases we don't know what section that will be when a global is declared. (For example, we put constants into different sections depending on the contents of the initializer.)

But yes, it would be nice to explicitly define how this behaves.

eandrews updated this revision to Diff 111227.Aug 15 2017, 12:40 PM
eandrews edited edge metadata.

I updated the documentation to include Simon's comments. I wasn't sure whether to include the line about section information being retained in LLVM IR, since Eli mentioned it could vary for different front-ends. I have included it in this revision.

kparzysz edited edge metadata.Aug 15 2017, 12:59 PM

I can't see how "undefined behavior" could possibly be the right answer in that case. Every definition has to end up in some section eventually, and in many cases we don't know what section that will be when a global is declared. (For example, we put constants into different sections depending on the contents of the initializer.)

These constants don't have explicit section information in the IR. The typical scenario is that neither the declarations of a global, nor its definition have that. In the cases when the section is explicitly given on a definition, it was likely imposed by something like the "section" attribute in the source. I don't think it's unreasonable to expect that the declarations (in the original source as well as in the generated IR) should carry that information as well. However, since clang has apparently been ignoring that attribute on declarations, it's been generating IR where declarations may not have sections, but the corresponding definitions do.

eandrews added inline comments.Aug 15 2017, 1:16 PM
docs/LangRef.rst
629 ↗(On Diff #111066)

The changes I made in Clang assume that the definition is located in the same section specified in the declaration. If its different, the IR for the translation unit with the extern declaration will emit whatever section was specified in the declaration while the variable is actually located elsewhere. I am not sure what the consequences of this are.

eandrews marked an inline comment as done.Aug 15 2017, 1:18 PM
efriedma added inline comments.Aug 15 2017, 4:21 PM
docs/LangRef.rst
628 ↗(On Diff #111227)

"its".

eandrews updated this revision to Diff 111341.Aug 16 2017, 7:48 AM

Corrected spelling error.

eandrews marked 2 inline comments as done.Aug 16 2017, 7:49 AM

In the cases when the section is explicitly given on a definition, it was likely imposed by something like the "section" attribute in the source. I don't think it's unreasonable to expect that the declarations (in the original source as well as in the generated IR) should carry that information as well. However, since clang has apparently been ignoring that attribute on declarations, it's been generating IR where declarations may not have sections, but the corresponding definitions do.

Does this result in unexpected behavior though? Won't this just result in the global being defined in the specified section?

I can mention this case explicitly in the documentation. However I am not sure whether it's UB or if the global will be defined in specified section.

Does this result in unexpected behavior though? Won't this just result in the global being defined in the specified section?

If there is no section given explicitly, there is function SelectSectionForGlobal that will determine the section based on the properties of the global. If there is code that only sees the declaration, but needs to know the section, it will get it from that function. If the definition of that global doesn't have section either, that same function will be used, hopefully with the same result. On the other hand, if the definition of the global has an explicit section, it may differ from the one calculated for the declarations of that global. This could lead to a problem.

The problem is that the mismatch between sections does not have to lead to any undesirable behavior. Whether it does or not depends on a particular case. I think we should be consistent though---if we decide that a mismatch between the section for a declaration and the section for a definition leads to an undefined behavior, then it should be so regardless of whether the sections were given explicitly or inferred.

eandrews updated this revision to Diff 111388.Aug 16 2017, 11:42 AM

Updated the patch to include Krzysztof's comment about explicitly stating undefined behavior for section information mismatch in global variable declaration and definition. This should cover the case where section is explicitly specified in definition but not declaration.

The problem is that the mismatch between sections does not have to lead to any undesirable behavior. Whether it does or not depends on a particular case. I think we should be consistent though---if we decide that a mismatch between the section for a declaration and the section for a definition leads to an undefined behavior, then it should be so regardless of whether the sections were given explicitly or inferred.

Ok. That makes sense. I've updated the documentation to reflect this. Please let me know if it needs to be more detailed.

sdardis accepted this revision.Aug 22 2017, 5:35 AM

I agree with @kparzysz here. If there is a mis-match between the declaration and definition, there are cases where undesirable behaviour (as such) will not occur depending on how the mismatch occurs.

One suggestion inline for the wording inline, the important part is making explicit

LGTM, anyone else have more comments?

docs/LangRef.rst
582–584 ↗(On Diff #111388)

"If there is a mismatch between the explicit or inferred section information for the variable declaration and its definition the resulting behaviour is undefined."

This revision is now accepted and ready to land.Aug 22 2017, 5:35 AM

One suggestion inline for the wording inline, the important part is making explicit

The important part is making explicit that section information for a variable can be explicit or inferred.

eandrews updated this revision to Diff 112172.Aug 22 2017, 7:30 AM

Updated the documentation to explicitly state that section information for a variable can be explicit or inferred.

eandrews marked an inline comment as done.Aug 22 2017, 7:30 AM
This revision was automatically updated to reflect the committed changes.

If there is no section given explicitly, there is function SelectSectionForGlobal that will determine the section based on the properties of the global. If there is code that only sees the declaration, but needs to know the section, it will get it from that function.

SelectSectionForGlobal takes a SectionKind, and in general you can't compute the right SectionKind without knowing the initializer. (See TargetLoweringObjectFile::getKindForGlobal.)