Update IR generated to retain section information for external declarations. This is related to https://reviews.llvm.org/D36487
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
docs/LangRef.rst | ||
---|---|---|
629 ↗ | (On Diff #111066) | This doesn't really explain the part that matters. For LangRef, it doesn't matter how the frontend decides to generate declarations (and it might be different for frontends other than clang). And OpenCL might motivate this, but it has nothing to do with the semantics. The key question here is what it actually means. Is it a promise the that a definition will exist for the given symbol in the given section? If so, what happens if the symbol is actually in a different section? ("Undefined behavior" is probably an acceptable answer, but it needs to be stated explicitly.) |
+kparzysz as Hexagon also makes use of the small data section.
docs/LangRef.rst | ||
---|---|---|
629 ↗ | (On Diff #111066) |
for targets which make use of this section information. I would also work in a sentence that says that attaching section information to a external declaration is an assertion or promise that the definition is located in that section. If the definition is actually located in a different section the behaviour is undefined. I'm favouring "undefined behaviour" in the error case as behaviour is so target and environment specific. |
docs/LangRef.rst | ||
---|---|---|
629 ↗ | (On Diff #111066) | The other side of the problem is, what if the declarations don't have any section information, but the definition does? Is that also an undefined behavior? I'm in favor of considering it UB as well, but I'm not sure what impact it would have on currently valid IR. |
The other side of the problem is, what if the declarations don't have any section information, but the definition does? Is that also an undefined behavior?
I can't see how "undefined behavior" could possibly be the right answer in that case. Every definition has to end up in some section eventually, and in many cases we don't know what section that will be when a global is declared. (For example, we put constants into different sections depending on the contents of the initializer.)
But yes, it would be nice to explicitly define how this behaves.
I updated the documentation to include Simon's comments. I wasn't sure whether to include the line about section information being retained in LLVM IR, since Eli mentioned it could vary for different front-ends. I have included it in this revision.
These constants don't have explicit section information in the IR. The typical scenario is that neither the declarations of a global, nor its definition have that. In the cases when the section is explicitly given on a definition, it was likely imposed by something like the "section" attribute in the source. I don't think it's unreasonable to expect that the declarations (in the original source as well as in the generated IR) should carry that information as well. However, since clang has apparently been ignoring that attribute on declarations, it's been generating IR where declarations may not have sections, but the corresponding definitions do.
docs/LangRef.rst | ||
---|---|---|
629 ↗ | (On Diff #111066) | The changes I made in Clang assume that the definition is located in the same section specified in the declaration. If its different, the IR for the translation unit with the extern declaration will emit whatever section was specified in the declaration while the variable is actually located elsewhere. I am not sure what the consequences of this are. |
docs/LangRef.rst | ||
---|---|---|
628 ↗ | (On Diff #111227) | "its". |
Does this result in unexpected behavior though? Won't this just result in the global being defined in the specified section?
I can mention this case explicitly in the documentation. However I am not sure whether it's UB or if the global will be defined in specified section.
If there is no section given explicitly, there is function SelectSectionForGlobal that will determine the section based on the properties of the global. If there is code that only sees the declaration, but needs to know the section, it will get it from that function. If the definition of that global doesn't have section either, that same function will be used, hopefully with the same result. On the other hand, if the definition of the global has an explicit section, it may differ from the one calculated for the declarations of that global. This could lead to a problem.
The problem is that the mismatch between sections does not have to lead to any undesirable behavior. Whether it does or not depends on a particular case. I think we should be consistent though---if we decide that a mismatch between the section for a declaration and the section for a definition leads to an undefined behavior, then it should be so regardless of whether the sections were given explicitly or inferred.
Updated the patch to include Krzysztof's comment about explicitly stating undefined behavior for section information mismatch in global variable declaration and definition. This should cover the case where section is explicitly specified in definition but not declaration.
Ok. That makes sense. I've updated the documentation to reflect this. Please let me know if it needs to be more detailed.
I agree with @kparzysz here. If there is a mis-match between the declaration and definition, there are cases where undesirable behaviour (as such) will not occur depending on how the mismatch occurs.
One suggestion inline for the wording inline, the important part is making explicit
LGTM, anyone else have more comments?
docs/LangRef.rst | ||
---|---|---|
582–584 ↗ | (On Diff #111388) | "If there is a mismatch between the explicit or inferred section information for the variable declaration and its definition the resulting behaviour is undefined." |
One suggestion inline for the wording inline, the important part is making explicit
The important part is making explicit that section information for a variable can be explicit or inferred.
Updated the documentation to explicitly state that section information for a variable can be explicit or inferred.
If there is no section given explicitly, there is function SelectSectionForGlobal that will determine the section based on the properties of the global. If there is code that only sees the declaration, but needs to know the section, it will get it from that function.
SelectSectionForGlobal takes a SectionKind, and in general you can't compute the right SectionKind without knowing the initializer. (See TargetLoweringObjectFile::getKindForGlobal.)