This is an archive of the discontinued LLVM Phabricator instance.

Represent runtime preemption in the IR.
ClosedPublic

Authored by sfertile on May 12 2016, 11:43 AM.

Details

Summary

Currently we do not represent runtime preemption in the IR, which has several drawbacks:

  1. The semantics of GlobalValues differ depending on the object file format you are targeting (as well as the relocation-model and -fPIE value).
  2. We have no way of disabling inlining of run time interposable functions, since in the IR we only know if a function is link-time interposable. Because of this llvm cannot support elf-interposition semantics.
  3. In LTO builds of executables we will have extra knowledge that a symbol resolved to a local definition and can't be preemptable, but have no way to propagate that knowledge through the compiler.

This patch adds preemptability specifiers to the IR with the following meaning:
dso_local means the compiler may assume the symbol will resolve to a definition within the current linkage unit and the symbol may be accessed directly even if the definition is not within this compilation unit.
dso_preemptable means that the compiler must assume the GlobalValue may be replaced with a definition from outside the current linkage unit at runtime.

There is a question with the implementation though: should these be optional, or do we eventually want these to be required on every GlobalValue? My motivation for picking up this patch is to resolve the lto limitation, and that can be done equally well either way. I believe the original motivation though was issues 1 and 2. To support an option like '-fsemantic-interposition' then clang will need to properly mark the global values with dso_local/dso_preemptable, and we can support older bitcode files simply by treating global values as preemptable when not marked as explicitly dso_local. That will be pessimistic with old bitcode but at least have the correct semantics and not require breaking compatibility. I would appreciate some feedback in this area, how do people want to proceed?

Diff Detail

Repository
rL LLVM

Event Timeline

rafael retitled this revision from to Add direct control of whether or not a symbol is preemtable at runtime.
rafael updated this object.
rafael added reviewers: rnk, rjmccall, tmsriram.
rafael added a subscriber: llvm-commits.
tmsriram edited edge metadata.May 12 2016, 12:33 PM

Do you guys think this is a reasonable direction? If so I will add bitcode serialization support and upgrade and plug it to the other object formats.

Thanks for taking the time to do this. If you could also address the use cases of these new linkages. I can think of two ways:

  • Using new attributes to specify these linkage types

Annotating the source with this is going to be difficult and error prone as the same source file could be potentially used to link a shared object and an executable.

  • Using compile-time flags like -fvisibilty=extern-local or -fvisibility=strong-preemptable

Do you envision something like this? In that case, with -fvisiblity=extern-local, if we do find some extern symbols are not local, are you planning to emit an error?

For ELF atleast, preemptability can be inferred by looking at -fPIC/-fPIE/none. What is unknown is extern local. If I do not want to do source modifications, how would I get around this?

I agree with the general idea. I ranted about that in the past: I don't like the fact that a "default visibility" means different things depending on the object format or the platform. Whatever new attribute we come up with, it would be nice if it allows to simplifies the IR definition of "visibility" so that "default" would be "declaration is visible to other modules" and nothing more.

We should rethink the predicate we have based on the linkage in GlobalValue (things like mayBeOverridden(), isInterposable(), etc.) in light of a new attribute such as what you're proposing here.

lib/IR/Globals.cpp
55 ↗(On Diff #57076)

I'm not fond of "This bit is set when we want to mark the runtime preemption as being different from the link time one", I rather have a real flag on the global value that say "isRuntimePremptable()" and avoid doing complicated inference in shouldAssumeDSOLocal() or in LLParser.cpp. I.e. it seems to me that it is easier to think about it as an "absolute" instead of a "relative" concept.

rnk added inline comments.May 12 2016, 3:55 PM
lib/AsmParser/LLLexer.cpp
501–502 ↗(On Diff #57076)

What do you think about dso_local and dso_preemptable? The code is very clear that this is all about DSO storage, but the langref keywords are a little ambiguous.

lib/IR/Globals.cpp
44 ↗(On Diff #57076)

Some comments would be useful here, the encoding of both local and preemtable in one bit is too clever.

Or we could do as Mehdi suggested and not worry about being so clever. :)

Can you confirm that this attribute would apply to declaration as well? So that we would generate direct accesses and not go through the GOT/PLT? Even for symbol defined in different module from their uses?

rafael updated this revision to Diff 57340.May 16 2016, 5:35 AM
rafael edited edge metadata.

This now implements all the codegen cases on x86.

I will implement the suggestions (absolute flag, rename .ll attributes) and upload again.

emaste added a subscriber: emaste.Jan 24 2017, 8:57 AM
davide added a subscriber: davide.May 19 2017, 11:04 AM

@rafael Hi Rafael, I'm interested in picking this work up. I have rebased your patch to top of trunk and made the changes suggested in the review comments (absolute flag, prepend 'dso_' to the new keywords). I have a question though,

Can you confirm that this attribute would apply to declaration as well? So that we would generate direct accesses and not go through the GOT/PLT? Even for symbol defined in different module from their uses?

Is this the intention? This is my motivation for picking this up.

sfertile commandeered this revision.Jun 9 2017, 6:50 AM
sfertile added a reviewer: rafael.
sfertile updated this revision to Diff 102025.EditedJun 9 2017, 8:18 AM
sfertile added a reviewer: hfinkel.
sfertile set the repository for this revision to rL LLVM.
sfertile added subscribers: kbarton, nemanjai, echristo and 6 others.

Updated the previous patch with a couple of the suggestions (make the dso_location flag absolute, add 'dso_' to the new keywords).
Refined the PowerPC FunctionCall lowering to be less conservative when deciding if a call may need a TOC restore, and updated relevant tests.
Added a PowerPC preemption test similar to the X86 test in the previous patch.

Since we now have TargetMachine::shouldAssumeDSOLocal which takes the compile options and object file format into account I didn't think we needed shouldAssumeDSOLocal in GlobalValue anymore, instead I simply respect the dso_location if it has been set explicitly.

The X86 test posted in the previous diff had a number of failures even when I stripped out everything except the 'default' cases (ie no dso_local/dso_preemptible keyword) . I left in the X86-Darwin tests since the default tests pass if I run on trunk and all the tests pass with the new changes.

Once we decide on the direction we want to take this patch then I will follow up with:

  1. Refine the PowerPC Taill call eligibility checks to use 'shouldAssumeDifferentTOC' instead of 'resideInSameSection'
  2. Add back the rest of the changes from the X86 target.
  3. fix the removed X86 tests for linux/windows.
  4. add bitcode serialization for dso_location.
  5. update isInterposable() in GlobalValue.
  6. Add a run step with -mcmodel=small to the 'ppc64-calls.ll' test.
pcc added a subscriber: pcc.Jun 9 2017, 9:01 PM
sfertile updated this revision to Diff 102201.Jun 12 2017, 10:49 AM

Updated the Tail-Call eligibility in the PowerPC back-end to use 'shouldAssumeDifferentTOC' and updated relevant tests.
Added '-code-model small' run steps to the PowerPC tests that check for tail-calls and TOC-restore nops.
Added (and updated) a Win32 preemption test.
Added 32-bit + pic run steps for the Darwin preemption test.

rnk edited edge metadata.Jun 12 2017, 12:41 PM

Nice, let's finally do this. :)

lib/Target/PowerPC/PPCISelLowering.cpp
4249 ↗(On Diff #102201)

Please split this change out so that it can be committed, tested, and reverted if necessary independently from the IR change.

Which word is more common, preemptable or preemptible?
This patch uses preemptable, and the discussion here uses both. man elf on Linux uses preemptible.

include/llvm/IR/GlobalValue.h
89 ↗(On Diff #102201)

nit: It is better to break a line; this line seems too long.

298 ↗(On Diff #102201)

nit: Unnecessary white line?

Which word is more common, preemptable or preemptible?
This patch uses preemptable, and the discussion here uses both. man elf on Linux uses preemptible.

Google n-grams suggests the "preemptible" is on the way out and was passed by "preemptable" in the 1990s.
https://books.google.com/ngrams/graph?content=pre-emptible%2C+preemptible%2C+preemptable&year_start=1800&year_end=2000&corpus=15&smoothing=3&share=&direct_url=t1%3B%2Cpre%20-%20emptible%3B%2Cc0%3B.t1%3B%2Cpreemptible%3B%2Cc0%3B.t1%3B%2Cpreemptable%3B%2Cc0

Mirriam-Webster and American Heritage online dictionaries have no entry for either spelling, suggesting it might be a more recent coinage. I'm told OED has "preemptible," but I don't have access to a copy to verify. M-W confirms that "-able" is more common in general than "-ible."

Grammarist explains that -ible is only for older words which have historically used -ible and that all recent coinages use -able. "All accepted -ible words are listed in the dictionary." http://grammarist.com/usage/able-ible/

So I would choose "-able" as it seems more future proof.

sfertile updated this revision to Diff 102569.Jun 14 2017, 10:28 AM

Split out the PowerPC codegen changes into a separate patch, and made some minor formatting changes.

sfertile marked 3 inline comments as done.Jun 14 2017, 10:29 AM

Missing LangRef update.

include/llvm/IR/GlobalValue.h
76 ↗(On Diff #102569)

Documentation?

113 ↗(On Diff #102569)

documentation?

sfertile updated this revision to Diff 106444.Jul 13 2017, 10:01 AM

Updated patch to use only 2 values for dso_location: dso_preemptable/dso_local.
Added Bitcode Reading/Writing.
Added tests for diagnostics issued when parsing.
Add test for llvm-as/llvm-dis.

sfertile edited the summary of this revision. (Show Details)Jul 13 2017, 10:14 AM
rnk added a comment.Sep 7 2017, 11:02 AM

This should update LangRef with a description of how this is supposed to interact with our existing linkages. For example, today we can inline external function definitions because they are not interposable according to our model. LLVM basically implements the C++ model: multiple definitions of an external linkage function would be an ODR violation, widespread usage of malloc interposition notwithstanding. You seem to be proposing to change that, because dso_local defaults to false. dso_preemptible seems to be your default. Why did you remote the old tri-state dso storage in the latest patch? It seems like we could use the third state to preserve our existing behavior unless the frontend tells us otherwise.

lib/Bitcode/Reader/BitcodeReader.cpp
926 ↗(On Diff #106444)

Unnecessary whitespace change

! In D20217#863605, @rnk wrote:
This should update LangRef with a description of how this is supposed to interact with our existing linkages. For example, today we can inline external function definitions because they are not interposable according to our model. LLVM basically implements the C++ model: multiple definitions of an external linkage function would be an ODR violation, widespread usage of malloc interposition notwithstanding. You seem to be proposing to change that, because dso_local defaults to false. dso_preemptible seems to be your default.

I removed the tri-state enum after thinking about Rafaels question on what the long term goal is. If the long term goal is to have IR producers marking every GlobalValue as either local/preemptable then introducing a 'default' enum value is undesirable. I can get the same behavior by treating one of the other enum values as a default in the interim. I picked the preemptable value to treat as 'unkown' for 2 reasons:

  1. So I could use local to represent the case where we can't infer from a single compilation unit if a symbol will resolve local but do know at link time with lto Use the linker resolutions to mark global values as dso_local.
  2. If we want to support an option similar to gcc '-fsemantic-interposition' then we essentially can't mark a symbol as local by default. We must consider a function as preemptable unless we know for sure it can't be interposed.

! In D20217#863605, @rnk wrote:

It seems like we could use the third state to preserve our existing behavior unless the frontend tells us otherwise.

I don't intend to change the default behavior to start considering runtime interposition, basically if we do add '-fsemantic-interposition' then it would default to off and let users who wanted full elf interpostion semantics to opt in.

I had replied to the commits list trying to get some feedback before updating the lang-ref but never got any replies. I'll go ahead and update it with what I currently have planned and hopefully that gets the discussion moving again.

sfertile updated this revision to Diff 118033.Oct 6 2017, 11:16 AM
sfertile retitled this revision from Add direct control of whether or not a symbol is preemtable at runtime to Represent runtime preemption in the IR..
sfertile edited the summary of this revision. (Show Details)

Add a section to the langref for preemption specifiers, and updated global variables and function sections.

sfertile marked 4 inline comments as done.Oct 6 2017, 11:18 AM
sfertile updated this revision to Diff 119106.Oct 15 2017, 8:47 PM

add preemption specifier to the alias entry in the LangRef and update the comments in LLParser to add PreemptionSpecifier.

sfertile removed subscribers: pcc, mehdi_amini.
pcc edited edge metadata.Oct 17 2017, 3:40 PM

Two states makes sense to me as well. From a bitcode upgrade perspective I think it would be valid to upgrade the current meaning of dso_preemptable to mean "always use the GOT" once that is implemented.

lib/AsmParser/LLParser.cpp
1639 ↗(On Diff #119106)

Should this also be a verifier check?

lib/Bitcode/Reader/BitcodeReader.cpp
3057 ↗(On Diff #119106)

Nit: new code should probably go between lines 3055 and 3056.

lib/Bitcode/Writer/BitcodeWriter.cpp
1200 ↗(On Diff #119106)

Please update docs/BitCodeFormat.rst.

sfertile updated this revision to Diff 119541.Oct 18 2017, 7:05 PM
  1. Updated the bitcode format document.
  2. Added Verifier check for dso_local and dllimport storage class on same global value.
  3. Updated the comments in BitcodeReader to match the current formats for GlobalVar/Function/GlobalIndirectSymbol.
sfertile marked 3 inline comments as done.Oct 18 2017, 7:07 PM
sfertile added inline comments.
lib/AsmParser/LLParser.cpp
1639 ↗(On Diff #119106)

Yes, added.

sfertile marked an inline comment as done.Oct 25 2017, 10:45 AM

ping.

pcc added a comment.Oct 25 2017, 10:52 AM

Looks good, but since @rnk had comments about the two states he should also take a look.

rnk accepted this revision.Oct 25 2017, 11:56 AM
In D20217#906852, @pcc wrote:

Looks good, but since @rnk had comments about the two states he should also take a look.

Peter's suggestion of how to handle backwards compatibility makes sense to me, so two states sounds good to me.

This revision is now accepted and ready to land.Oct 25 2017, 11:56 AM

New keyword[s]? Seems it might be a nice addition to also add them to utils/vim/syntax/llvm.vim and friends if appropriate.

New keyword[s]? Seems it might be a nice addition to also add them to utils/vim/syntax/llvm.vim and friends if appropriate.

Good point. I'll have a look at those, and update in a subsequent patch.

This revision was automatically updated to reflect the committed changes.