This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Linker/
-
Linker/
3/9
IRMover.cpp
-
test/Linker/
-
Linker/
1/1
blockaddress.ll

Differential D120781

[Bitcode] materialize Functions early when BlockAddress taken
ClosedPublic

Authored by nickdesaulniers on Mar 1 2022, 3:16 PM.

Download Raw Diff

Details

Reviewers

dexonsmith
tejohnson
MaskRay
void
pcc
rickyz

Commits

rG23ec5782c3cc: [Bitcode] materialize Functions early when BlockAddress taken

Summary

IRLinker builds a work list of functions to materialize, then moves them
from a source module to a destination module one at a time.

This is a problem for blockaddress Constants, since they need not refer
to the function they are used in; IPSCCP is quite good at sinking these
constants deep into other functions when passed as arguments.

This would lead to curious errors during LTO:

ld.lld: error: Never resolved function from blockaddress ...

based on the ordering of function definitions in IR.

The problem was that IRLinker would basically do:

for function f in worklist:
  materialize f
  splice f from source module to destination module

in one pass, with Functions being lazily added to the running worklist.
This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.
This causes BitcodeReader to insert Functions into its BasicBlockFwdRefs
list incorrectly, as it will never re-materialize an already
materialized (but spliced out) function.

Because of the possibility that blockaddress Constants may appear in
Functions other than the ones they reference, this patch adds a new
bitcode function code FUNC_CODE_BLOCKADDR_USERS that is a simple list of
Functions that contain BlockAddress Constants that refer back to this
Function, rather then the Function they are scoped in. We then
materialize those functions when materializing f from the example loop
above. This might over-materialize Functions should the user of
BitcodeReader ultimately decide not to link those Functions, but we can
at least now we can avoid this ordering related issue with blockaddresses.

Fixes: https://github.com/llvm/llvm-project/issues/52787
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1215

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nickdesaulniers created this revision.Mar 1 2022, 3:16 PM

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 3:16 PM

Herald added a subscriber: hiraditya. · View Herald Transcript

nickdesaulniers requested review of this revision.Mar 1 2022, 3:16 PM

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 3:16 PM

Herald added a subscriber: llvm-commits. · View Herald Transcript

nickdesaulniers edited the summary of this revision. (Show Details)Mar 1 2022, 3:19 PM

update comment in test

Thanks for working on this; it's pretty hairy.

This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.

Another way to solve this would be to mark the function somehow (either intrusively, somehow, or by adding it to a set / map). Can you contrast with that approach?

llvm/lib/Linker/IRMover.cpp
1525–1529	Can `materialize()` add things to the Worklist? If so, I don't think the ranged-based for is safe. I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time?

Harbormaster completed remote builds in B152069: Diff 412277.Mar 1 2022, 4:43 PM

In D120781#3353144, @dexonsmith wrote:

Thanks for working on this; it's pretty hairy.

This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.

Another way to solve this would be to mark the function somehow (either intrusively, somehow, or by adding it to a set / map). Can you contrast with that approach?

BitcodeReader could maintain a set of all functions it's imported, so that it could do the right thing here (rather than assume a function being empty means not yet parsed);
https://github.com/llvm/llvm-project/blob/1e16272ba793e5a6e7308898ecf6ef0dc99e2ad3/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L3037-L3066

In D120781#3353199, @nickdesaulniers wrote:

In D120781#3353144, @dexonsmith wrote:

Thanks for working on this; it's pretty hairy.

This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.

Another way to solve this would be to mark the function somehow (either intrusively, somehow, or by adding it to a set / map). Can you contrast with that approach?

BitcodeReader could maintain a set of all functions it's imported, so that it could do the right thing here (rather than assume a function being empty means not yet parsed);
https://github.com/llvm/llvm-project/blob/1e16272ba793e5a6e7308898ecf6ef0dc99e2ad3/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L3037-L3066

Can it also check isMaterializable() to differentiate? Is the "materializable" bit reset after materializing, allowing us to differentiate between "empty becuase it's materializable" and "empty because content got moved away"?

Not entirely sure this'll work, because of the cycle case (testcase @c and @d)... but maybe if you know it's been moved, there's a way to look up the location of the new definition? (is the valuemapper available?)

(If this isn't going to materialize anything *extra*, it seems fine to materialize in an earlier loop (your patch).)

Another potential concern: what happens with weak linkage? I.e., what if you have:

define void @sink(i8*)
define linkonce_odr void @haslabel() {
  br label %label
label:
  ret void
}
define void @usesblockaddress() {
  call void @sink(i8* blockaddress(@haslabel, %label))
  ret void
}

What if this TU's @haslabel is not selected by the linker? (Maybe this is just a failed link, since there's no guarantee the other @haslabel still has a basic block, or that it means the same thing, since it could be optimized away if its TU didn't have a blockaddress pointing at it.)

In D120781#3353252, @dexonsmith wrote:

In D120781#3353199, @nickdesaulniers wrote:

In D120781#3353144, @dexonsmith wrote:

Thanks for working on this; it's pretty hairy.

This confuses BitcodeReader, which cannot disambiguate whether a
blockaddress is referring to a function which has not yet been parsed
("materialized") or is simply empty because its body was spliced out.

Another way to solve this would be to mark the function somehow (either intrusively, somehow, or by adding it to a set / map). Can you contrast with that approach?

BitcodeReader could maintain a set of all functions it's imported, so that it could do the right thing here (rather than assume a function being empty means not yet parsed);
https://github.com/llvm/llvm-project/blob/1e16272ba793e5a6e7308898ecf6ef0dc99e2ad3/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L3037-L3066

Can it also check isMaterializable() to differentiate? Is the "materializable" bit reset after materializing, allowing us to differentiate between "empty becuase it's materializable" and "empty because content got moved away"?

Playing with this idea; I don't think maintaining a set of parsed functions will work; for the @x/@y case, @x has already been moved; there's no way to find the %label in @x. (Dumping the ir prints call void @f(i8* blockaddress(@x, <badref>)) since IRMover has spliced out the body of @x, we can't find the Value %label).

Not entirely sure this'll work, because of the cycle case (testcase @c and @d)... but maybe if you know it's been moved, there's a way to look up the location of the new definition? (is the valuemapper available?)

The value mapper is not available to the bitcode reader, AFAICT. The bitcode reader doesn't know about value mapping; IRMover does. IRMover is using BitcodeReader to materialize functions then splice+map values from one source module to another destination module.

Maybe passing an optional mapper into bitcode reader than querying that is the way to go? That feels odd, since the mapper is (IIUC) mapping Values from one Module to another, while BitcodeReader AFAICT is only concerned ATM with one single Module. I'm not sure that the mapper will handle mapping a Value from the destination module (since I suspect it only works with values from the source module as inputs).

(If this isn't going to materialize anything *extra*, it seems fine to materialize in an earlier loop (your patch).)

I think your early concern "can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time" is accurate; I think more unmaterialized functions can be. Items are added to the Worklist via IRLinker::maybeAdd(). IRMover calls this in its constructor, but also via a callback declared in IRLinker::shouldLink. IRLinker::AddLazyFor is a callback, which comes from IRMover::move. This callback is only defined to do anything in ModuleLinker::run. ModuleLinker::addLazyFor seems like it is adding items to the worklist (IIUC) (possibly multiple for COMDAT). (FWIW, there are 2 other callers of IRMover::move that both use empty callbacks that don't appear to add more values to the Worklist: LTO::linkRegularLTO and FunctionImporter::importFunctions. So ModuleLinker::run is the case in which unmaterialized functions MIGHT be added to the Worklist, IIUC).

Another potential concern: what happens with weak linkage? I.e., what if you have:
declare void @sink(i8*)
define linkonce_odr void @haslabel() {
  br label %label
label:
  ret void
}
define void @usesblockaddress() {
  call void @sink(i8* blockaddress(@haslabel, %label))
  ret void
}
What if this TU's @haslabel is not selected by the linker? (Maybe this is just a failed link, since there's no guarantee the other @haslabel still has a basic block, or that it means the same thing, since it could be optimized away if its TU didn't have a blockaddress pointing at it.)

The above test case currently runs through llvm-link just fine, regardless of my patch.

nickdesaulniers added a reviewer: pcc.Mar 2 2022, 10:55 AM

I'm having a difficult time figuring out how to solve the issue in the face of new functions being added to the worklist during mapping, without either passing the value map down to the bitcode reader, or maybe it would be better to add a callback that can be used to have the IRLinker tell the bitcode reader block address parsing what to do and do the mapping if the function has been materialized?

In D120781#3355054, @nickdesaulniers wrote:
In D120781#3353252, @dexonsmith wrote:
Another potential concern: what happens with weak linkage? I.e., what if you have:
declare void @sink(i8*)
define linkonce_odr void @haslabel() {
  br label %label
label:
  ret void
}
define void @usesblockaddress() {
  call void @sink(i8* blockaddress(@haslabel, %label))
  ret void
}
What if this TU's @haslabel is not selected by the linker? (Maybe this is just a failed link, since there's no guarantee the other @haslabel still has a basic block, or that it means the same thing, since it could be optimized away if its TU didn't have a blockaddress pointing at it.)
The above test case currently runs through llvm-link just fine, regardless of my patch.

Sorry, I guess I wasn't clear. That's not a complete test case. Complete testcase needs two files so that @haslabel gets replaced by another TU. For example:

% cat a.ll
define weak_odr void @haslabel() {
  ret void
}
% cat b.ll
declare void @sink(i8*)
define linkonce_odr void @haslabel() {
  br label %label
label:
  ret void
}

define void @usesblockaddress() {
  call void @sink(i8* blockaddress(@haslabel, %label))
  ret void
}

Result of linking a.ll and b.ll looks to me like it produces a bogus blockaddress:

% ./staging/build/bin/llvm-link a.ll b.ll -S
; ModuleID = 'llvm-link'
source_filename = "llvm-link"

define weak_odr void @haslabel() {
  ret void
}

define void @usesblockaddress() {
  call void @sink(i8* inttoptr (i32 1 to i8*))
  ret void
}

declare void @sink(i8*)

Maybe it should error instead... or maybe taking a blockaddress to a weak function shouldn't be allowed at all?

In D120781#3358338, @tejohnson wrote:

I'm having a difficult time figuring out how to solve the issue in the face of new functions being added to the worklist during mapping, without either passing the value map down to the bitcode reader, or maybe it would be better to add a callback that can be used to have the IRLinker tell the bitcode reader block address parsing what to do and do the mapping if the function has been materialized?

I think the callback option would work. One concern (with either solution) is we need to be sure that the function we point the blockaddress at actually came from this bitcode file (otherwise, I think it could lead to hard-to-reason about miscompiles).

A related idea: when the IRMover moves content from one function to another, leave behind a Function* in the moved-from function that the bitcodereader knows how to find (not sure exactly where to store it, but I think the place to do the work is in IRLinker::linkFunctionBody(), something like Src.recordMovedToFunction(Dst)). Unlike ValueMapper, this would not have a mapping in the "function replaced" case where it seems more likely to lead to miscompiles.

But maybe your callback option is better. I'm not sure.

dexonsmith added inline comments.Mar 4 2022, 3:02 PM

llvm/test/Linker/blockaddress.ll
5	BTW, it'd be better to have positive tests here than the `CHECK-NOT`. You can use `llvm-link -S` to emit the IR, and then check that each `blockaddress` is correct (rather than checking that no errors were emitted).

update test to check for good blockaddresses, as per @dexonsmith

In D120781#3361107, @dexonsmith wrote:

Result of linking a.ll and b.ll looks to me like it produces a bogus blockaddress:

That's weird, but note that that case does not lazily add to the running worklist. available_externally linkage seems to, but test cases like mine but marked available_externally already seem to work regardless of my patch (and I'm happy to add tests for that to prove that).

Otherwise I don't think we've found a concrete test case for which my approach does not work.

In D120781#3361107, @dexonsmith wrote:

Result of linking a.ll and b.ll looks to me like it produces a bogus blockaddress:

A slightly simpler test, take my @x + @y test and mark @x linkonce_odr. Though I'd argue that's a different case also involving blockaddresses which is broken, and my patch doesn't fix, but that seems slightly orthogonal to the bug I'm solving here, which is a concrete LTO failure observed in the wild Never resolved function from blockaddress.

Harbormaster completed remote builds in B152692: Diff 413166.Mar 4 2022, 4:48 PM

nickdesaulniers added inline comments.Mar 7 2022, 2:36 PM

llvm/lib/Linker/IRMover.cpp
1525–1529	Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe. I don't think so (though please do triple check). `IRLinker::maybeAdd` is the only method that adds to the worklist. That is only called from `IRLinker`'s constructor and `IRLinker::shouldLink`. While `IRLinker::shouldLink` is called from `IRLinker::materialize`, the method I'm calling here is `GlobalValue::materialize` (which through some indirection is a call to `BitcodeReader::materialize`. While these methods are all called `materialize`, `BitcodeReader` doesn't (AFAICT) have any reference to `IRLinker`'s private `Worklist` member. To triple check the other calls of `IRLinker::shouldLink`, I see `IRLinker::linkAppendingVarProto` and `IRLinker::linkGlobalValueProto`. The latter may call the former. The former is only called from `IRLinker::materialize`. So check my work, but it looks to me like the answer to `Can materialize() add things to the Worklist?` is "No; `IRLinker::materialize` can, but I'm not calling that; I'm calling `GlobalValue::materialize`". So I don't think there's an iterator invalidation bug here as implied by the question. Working on a reply to the second question...stay tuned.

nickdesaulniers edited reviewers, added: rickyz; removed: ricky.Mar 7 2022, 3:14 PM

nickdesaulniers added inline comments.Mar 8 2022, 10:07 AM

llvm/lib/Linker/IRMover.cpp

1525–1529

I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time?

Here's a test case that fails, with the same error I'm trying to fix here.

declare void @f(i8*)

; 1. @x gets parsed + moved. References to @x's %label can no longer be
;    created.
define void @x() {
  br label %label

label:
  ret void
}

; 2. @y not parsed; will be parsed lazily if referenced.
define linkonce_odr void @y() {
  br label %label

label:
; 4. when lazy parsing, @x's %label is gone, resulting in a badref.
  call void @f(i8* blockaddress(@x, %label))
  ret void
}

; 3. @z parsed, call to @y adds @y to worklist.
define void @z() {
  call void @y()
  ret void
}

$ llvm-as z.ll
$ llvm-link -S z.bc
error: Never resolved function from blockaddress (Producer: 'LLVM15.0.0git' Reader: 'LLVM 15.0.0git')

Let me see if I can more thoroughly separate the parsing from the linking/remapping, then I'll add the above test case to this patch.

nickdesaulniers planned changes to this revision.Mar 8 2022, 10:07 AM

handle functions added lazily to the worklist

nickdesaulniers planned changes to this revision.Mar 9 2022, 11:25 AM

nickdesaulniers added inline comments.

llvm/lib/Linker/IRMover.cpp
1525–1529	I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time? Diff 414166 now appears to handle this. I have a TODO on the existing call to `GlobalValue::materialize()` that I'd like to remove. That causes some tests to fail, so marking this "Changes Planned" until I have more time to sort that out. But PTAL with this approach.

Harbormaster completed remote builds in B153402: Diff 414166.Mar 9 2022, 11:50 AM

dexonsmith added inline comments.Mar 10 2022, 3:43 PM

llvm/lib/Linker/IRMover.cpp
1525–1529	Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe. I don't think so (though please do triple check). Is there a way to add an assertion to that effect? If so, it'd be great to do that as a prep commit (do a local bootstrap with assertions on, see if it survives the pre-commit bots, etc.) before we starting to rely on the fact. This is also important to prevent that behaviour from showing up later. As you described, the logic here is entangled. I've been surprised before by various co-recursive things in the valuemapper and irlinker.
1605–1613	I worry this could cause a big compile-time hit for ThinLTO, depending on the exact code paths it uses. Here's part of a testcase for situation I'm worried about: ; f1.ll define weak_odr void @f1() { ret void } ; f2.ll define linkonce_odr void @f1() { ret void } define void @f2() { call @f1() ret void } The scenario (more scaffolding needed): Linker decides to take `@f1` from f1.ll and `@f2` from f2.ll. The worklist for `f2.bc` contains just `@f2`, and `@f1` will (never) be materialized from `f2.bc`. IIUC, this patch changes it so that `@f1` is eagerly materialized from `f2.bc` even though it will be ignored. In the general case, this would effectively defeat lazy-loading of templates and inline functions. @tejohnson , do you know how to set up the above scenario for ThinLTO?

tejohnson added inline comments.Mar 11 2022, 9:11 AM

llvm/lib/Linker/IRMover.cpp

1605–1613

Yes this is a good catch. But I think where it affects issues is a slight change from what you have described above. We don't use the IRMover to load the destination module for each ThinLTO backend, so it wouldn't affect f2.ll when we are importing into it - f2.ll will be parsed here: http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#1088) in the thread for its ThinLTO backend.

However, where it will affect things is when we are importing f2 into another module, when the copy of f1 was selected from f1.ll. We can use llvm-lto2 which is passed explicit linker resolutions to simulate.

; f1.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

define weak_odr void @f1() {
  ret void
}

; f2.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

define linkonce_odr void @f1() {
  ret void
}
define void @f2() {
  call void @f1()
  ret void
}

; f3.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

declare void @f2()
define void @f3() {
  call void @f2()
  ret void
}

$ opt -module-summary f1.ll -o f1.o
$ opt -module-summary f2.ll -o f2.o
$ opt -module-summary f2.ll -o f2.o
$ llvm-lto2 run f1.o f2.o f3.o -save-temps -o f -r=f1.o,f1,plx -r=f2.o,f1, -r=f2.o,f2,plx -r=f3.o,f2, -r=f3.o,f3,plx -debug-only=function-import

The end of the debug output shows what we import from where into f3.o:

Is importing function 2072045998141807037 f1 from f1.ll
Not importing function 2072045998141807037 f1 from f2.ll
Is importing function 8471399308421654326 f2 from f2.ll
Imported 2 functions for Module f3.o

So we should not materialize f1 from f2.o when importing in f3.o's ThinLTO backend thread.

However, as noted in one of the comments earlier, the function importer has an empty AddLazyFor callback, so we will never add new values to materialize during the value mapping. I wonder if we key off of this somehow to not attempt to add new functions to materialize here?

Thinking (more), I think the materializing-too-much explosion will be a problem for use cases other than LTO, and not just for weak functions.

Consider a JIT, which initially just materializes main and codegens with callback stubs for any calls. When a stub is called the JIT materializes the called function, patches the stub, and continues, pulling in functions one at a time. This pre-materialization logic would cause the JIT to eagerly materialize everything that main transitively references.

That would be a huge regression. This needs to be significantly more narrow.

I've also changed my opinion from the comment in https://github.com/llvm/llvm-project/issues/52787#issuecomment-1010314787, where I said:

This could work, but I think it should be possible to fix BitcodeReader/IRMover to do the right thing without adding backedges.

[...]

If we did need something new serialized, maybe it could be limited to a bit per basic block saying "is this basic block referenced by a blockaddress outside this function?" (instead of a backedge, just a bit saying there is an edge), and then add ValueHandles for those basic blocks when materializing them to track what they get RAUW'd to.

But I'd much prefer to avoid having the function content in bitcode depend on what references it. Seems like the IRMover and BitcodeReader should be able to track this somehow. E.g., maybe the caller of IRMover should tell BitcodeReader that the basic blocks were moved to another function.

I've realized through this review (thanks @nickdesaulniers and @tejohnson for all iteration!) that:

A function being blockaddress-able ties it tightly to anything that points at it.
The only way to avoid a materialization explosion is to record backedges.

(Sorry for the poor guidance!)

Here's what I suggest:

Add to bitcode a way to say that materializing a function requires materializing another one. Maybe this could be a new record in the FUNCTION_BLOCK that's arbitrary size, only emitted if there are non-zero of them, which lists the functions that need to be materialized when this one is.
Use this feature to add edges in both directions every time one function has a blockaddress to another one. Probably not too hard; during value enumeration of function bodies, create a map of which functions need to be materialized with each other (based on observed blockaddresses), then use that to generate the records during bitcode emission.
Change the lazy materializer to use this information somehow (maybe this is the hard part).

WDYT?

In D120781#3376069, @dexonsmith wrote:

The only way to avoid a materialization explosion is to record backedges.

Unless I can make @tejohnson 's suggestion work regarding making IRLinker::AddLazyFor optional, then at that point I agree.

(Sorry for the poor guidance!)

No, no! I appreciate the feedback; it's helpful to have folks engage on this issue.

Here's what I suggest:

Add to bitcode a way to say that materializing a function requires materializing another one. Maybe this could be a new record in the FUNCTION_BLOCK that's arbitrary size, only emitted if there are non-zero of them, which lists the functions that need to be materialized when this one is.

Use this feature to add edges in both directions every time one function has a blockaddress to another one. Probably not too hard; during value enumeration of function bodies, create a map of which functions need to be materialized with each other (based on observed blockaddresses), then use that to generate the records during bitcode emission.

Change the lazy materializer to use this information somehow (maybe this is the hard part).

WDYT?

I can give it a shot.

llvm/lib/Linker/IRMover.cpp
1605–1613	So we should not materialize f1 from f2.o when importing in f3.o's ThinLTO backend thread. Ah, thanks for the test case. My current approach (Diff 414166) does materialize f1 from f2.o, so that would be a regression. However, as noted in one of the comments earlier, the function importer has an empty AddLazyFor callback, so we will never add new values to materialize during the value mapping. I wonder if we key off of this somehow to not attempt to add new functions to materialize here? So if we made `IRLinker::AddLazyFor` an `Optional` function reference (or function pointer) then we could easily tell+guarantee that no new functions could be added lazily to `IRLinker::Worklist`. Then we could go back to an earlier+simpler diff https://reviews.llvm.org/D120781?id=412277 just wrapped in one conditional check that `AddLazyFor` was not available. That might be the smallest possible fix that might be possible to get into the clang-14 release. Let me play with that approach a little before pursuing @dexonsmith 's recommendation to encode "backedges" (references) to functions in the bitcode, and let's see what that looks like if it works.

In D120781#3380046, @nickdesaulniers wrote:

In D120781#3376069, @dexonsmith wrote:

The only way to avoid a materialization explosion is to record backedges.

Unless I can make @tejohnson 's suggestion work regarding making IRLinker::AddLazyFor optional, then at that point I agree.

Is that just going to fix ThinLTO, or will it also fix other lazy-materialization use cases as well (like the potential JIT case I described)?

@lhames, do you know of concrete JIT uses cases that rely on lazy materialization for performance? Is there someone we could add to the review to help get a testcase to check, if @nickdesaulniers is able to get the AddLazyFor fix working for ThinLTO?

nickdesaulniers mentioned this in D121630: [IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC.Mar 14 2022, 11:27 AM

In D120781#3353252, @dexonsmith wrote:

Another potential concern: what happens with weak linkage? I.e., what if you have:

Thinking about this over the weekend; I don't think we should allow blockaddresses to reference available_externally linkage functions if the blockaddress' Function differs from the Instruction's Function (for the Instruction with the blockaddress Constant operand).

For example:

; y.ll
declare void @f(i8*)

define available_externally void @x() {
  br label %label

label:
  call void @y()
  ret void
}

define void @y() {
  ; Note: blockaddress refers to @x in fn @y.
  call void @f(i8* blockaddress(@x, %label))
  ret void
}

$ llc -o - y.ll

	.text
	.file	"y.ll"
	.globl	y                               # -- Begin function y
	.p2align	4, 0x90
	.type	y,@function
y:                                      # @y
	.cfi_startproc
# %bb.0:
	pushq	%rax
	.cfi_def_cfa_offset 16
	movl	$.Ltmp0, %edi ; oops! where is .Ltmp0?
	callq	f@PLT
	popq	%rax
	.cfi_def_cfa_offset 8
	retq
.Lfunc_end0:
	.size	y, .Lfunc_end0-y
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

$ opt -verify -S y.ll -o /dev/null
$ echo $?
0

Perhaps I should think harder about additional linkages that are added lazily to IRLinkers's Worklist (like linkonce_odr and perhaps weak). If all of the cases that could be added lazily are made invalid IR, with new verifier checks added, we could update IPSCCP to not sink such blockaddress Constants that would produce such invalid IR. Let me see if a quick verifier check can find any existing test cases I'm not thinking of where that might not work.

llvm/lib/Linker/IRMover.cpp
1605–1613	So if we made IRLinker::AddLazyFor an Optional function reference (or function pointer) then we could easily tell+guarantee that no new functions could be added lazily to IRLinker::Worklist. Then we could go back to an earlier+simpler diff https://reviews.llvm.org/D120781?id=412277 just wrapped in one conditional check that AddLazyFor was not available. That might be the smallest possible fix that might be possible to get into the clang-14 release. That might work for thinLTO, but it doesn't for the simple test case using `llvm-link` added in this patch; `IRLinker::AddLazyFor` is defined. :( https://reviews.llvm.org/D121630 is still a worthwhile cleanup IMO. I'll start looking into bitcode modifications.

nickdesaulniers mentioned this in rG236695e70c41: [IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC.Mar 14 2022, 2:38 PM

In D120781#3376069, @dexonsmith wrote:

Here's what I suggest:

Add to bitcode a way to say that materializing a function requires materializing another one. Maybe this could be a new record in the FUNCTION_BLOCK that's arbitrary size, only emitted if there are non-zero of them, which lists the functions that need to be materialized when this one is.

Use this feature to add edges in both directions every time one function has a blockaddress to another one. Probably not too hard; during value enumeration of function bodies, create a map of which functions need to be materialized with each other (based on observed blockaddresses), then use that to generate the records during bitcode emission.

Change the lazy materializer to use this information somehow (maybe this is the hard part).

WDYT?

One potential issue, consider the following case:

declare void @f(i8*)

define void @x() {
  br label %label

label:
  ret void
}

define linkonce_odr void @y() {
  br label %label

label:
  call void @f(i8* blockaddress(@x, %label))
  ret void
}

So @y contains a blockaddress that refers to @x, but because of its linkage AND lack of reference from any function w/ external linkage, @y is never emitted (let alone materialized). I think in this case, if we were to record that @x had a "backedge" from @y, and thus materialized @y as a result of materializing @x, we'd wind up "over-materializing" again. This is because BitcodeReader doesn't know about IRMover::Worklist and the complex linkage type handling, and lazy additions to the worklist.

In D120781#3381029, @nickdesaulniers wrote:
One potential issue, consider the following case:
declare void @f(i8*)

define void @x() {
  br label %label

label:
  ret void
}

define linkonce_odr void @y() {
  br label %label

label:
  call void @f(i8* blockaddress(@x, %label))
  ret void
}
So @y contains a blockaddress that refers to @x, but because of its linkage AND lack of reference from any function w/ external linkage, @y is never emitted (let alone materialized). I think in this case, if we were to record that @x had a "backedge" from @y, and thus materialized @y as a result of materializing @x, we'd wind up "over-materializing" again. This is because BitcodeReader doesn't know about IRMover::Worklist and the complex linkage type handling, and lazy additions to the worklist.

I think over-materializing here is probably okay, as a penalty for @x having exposed an internal pointer via blockaddress.

Given that blockaddress is a rarely used feature, I think the goals for this patch should be (in rough priority order):

Avoid compile-time regressions when not using blockaddress. Limit any compile-time regressions to IR intrinsically tied to the feature.
Get blockaddress working for these (new) cases.
Avoid unnecessary compile-time regressions for IR intrinsically tied to the feature.

I think over-materializing @x falls into category (3).

That case with @x reminded me of my concern about miscompiles due to blockaddress to a de-refinable function.

I was wondering if we could make those illegal in the verifier. In particular, reject a reference blockaddress(@f, ...) if:

it's not a self-reference (it appears outside the body of @f) AND
@f is de-refinable (linkage is available_externally or {weak,linkonce}{,_odr}) AND
the reference is NOT in the same comdat as @f.

But I think we probably can't disallow it. Clang will generate IR that violates that for:

void sink(...);
inline void f1() {
  static void *Label = &&label;
  sink(Label);
label:
  return;
}
void f2() { f1(); }

Maybe it'd be desirable to change Clang to put Label in the same comdat as f1(), but that won't solve object formats (like Mach-O) that don't have comdats.

For this case, the linker probably *in practice* probably selects them from the same object file.

nickdesaulniers mentioned this in D121970: [Verify] check that BlockAddresses don't refer to functions with certain linkages which wont be emitted.Mar 17 2022, 3:37 PM

In D120781#3376069, @dexonsmith wrote:

Here's what I suggest:

Add to bitcode a way to say that materializing a function requires materializing another one. Maybe this could be a new record in the FUNCTION_BLOCK that's arbitrary size, only emitted if there are non-zero of them, which lists the functions that need to be materialized when this one is.

Use this feature to add edges in both directions every time one function has a blockaddress to another one. Probably not too hard; during value enumeration of function bodies, create a map of which functions need to be materialized with each other (based on observed blockaddresses), then use that to generate the records during bitcode emission.

Change the lazy materializer to use this information somehow (maybe this is the hard part).

WDYT?

So I think when BitcodeWriter is serializing a function, it can check whether each of the function's basicblocks has its address taken. If so, it can build the set of GlobalValues that are Users of that BlockAddress. I'm curious if this should be part of the record or an Abbreviation? (Not sure I fully understand what an abbreviation is, despite just reading https://llvm.org/docs/BitCodeFormat.html#abbreviations). It seems like adding a record is unconditional though? So not sure about making it optional/only emitted if necessary?

Then, when BitcodeReader is parsing function records (parseFunctionRecord), it can read back these sub-records (or abbreviations). When parsing a function body (lazily), it can check these sub-records and call DeferredFunctionInfo.find, findFunctionInStream, Stream.JumpToBit, parseFunctionBody (basically the steps of BitcodeReader::materialize).

Herald added a subscriber: StephenFan. · View Herald TranscriptMar 23 2022, 5:12 PM

In D120781#3404235, @nickdesaulniers wrote:

So I think when BitcodeWriter is serializing a function, it can check whether each of the function's basicblocks has its address taken. If so, it can build the set of GlobalValues that are Users of that BlockAddress. I'm curious if this should be part of the record or an Abbreviation? (Not sure I fully understand what an abbreviation is, despite just reading https://llvm.org/docs/BitCodeFormat.html#abbreviations). It seems like adding a record is unconditional though? So not sure about making it optional/only emitted if necessary?

You'd either want to add a new record or add something to an existing record.

An abbreviation stores "meta-information", describing the encoding of any record that refers to it.
A record stores "content". It can optionally refer to any already-emitted abbreviation, customizing the record's encoding to match the abbreviation. This doesn't affect the record's content when parsed; it just changes the binary representation.
If you change the number/order/content of fields in a record, it's possible the abbreviation it was using will no longer apply. You might need to update it, or use a different one.

The other concept is a block, which is a scope into which you can put abbreviations, records, and other blocks. Blocks are nested, and are guaranteed to be 4B-aligned (records are not even 1B-aligned IIRC; they can start on any bit). You can remember where a block is and go back to read it later.

Then, when BitcodeReader is parsing function records (parseFunctionRecord), it can read back these sub-records (or abbreviations). When parsing a function body (lazily), it can check these sub-records and call DeferredFunctionInfo.find, findFunctionInStream, Stream.JumpToBit, parseFunctionBody (basically the steps of BitcodeReader::materialize).

Instead of adding a new field to a function record, I'd suggest adding a new record to FUNCTION_BLOCK, which contains the instructions and basic blocks. Then you're tying together:

Loading basic blocks of a function
Pulling up dependents that might refer directly to the function's basic blocks

I'd suggest creating a new record kind for this.

rework to use new bitcode function code record

nickdesaulniers retitled this revision from [IRLinker] materialize Functions before moving any to [Bitcode] materialize Functions early when BlockAddress taken.Apr 7 2022, 2:58 PM

nickdesaulniers edited the summary of this revision. (Show Details)

Looks like I accidentally dropped my tests. Let me re-add those.

Also, this is incomplete; there's some tests failing where the "use list" (???) seems to be incorrect; some blockaddresses are being deserialized with the basic block name changed to a number from a string that I don't yet understand.

Harbormaster completed remote builds in B158575: Diff 421343.Apr 7 2022, 3:56 PM

In D120781#3437465, @nickdesaulniers wrote:

there's some tests failing where the "use list" (???) seems to be incorrect

I suspect these are "use-list order" tests, which confirm that we can round-trip to bitcode without changing the use-list order.

Quick summary:

Every Value has an intrusive linked list of its Uses -- i.e., where it's used as an operand.
Some algorithms walk a Value's use-list, and their output can be influenced by the order that the Uses are listed.
This order is deterministic, but arbitrary somewhat arbitrary.
Bitcode serialization has an optional feature to serialize / recover this use-list order. This is important for reproducing some (rare) bugs, or and for guaranteeing that serializing/deserializing has no side effects.
To minimize overhead, the algorithm only stores diffs vs. the "natural" use-lists you'd get by running materializeAll(). (Side note: this feature is optional because the storage overhead is a bit too high. If we could guarantee that no one looked at use-lists of pure constants like i1 0, ptr null, and i32 0, it would likely be cheap enough to store all the time.)

(Incidentally, that feature is the only reason I know anything about blockaddress...)

Anyway, if you stop calling materialize() from inside parseFunctionBody(), you'll stop changing the use-list order, and these tests should stop bothering you.

some blockaddresses are being deserialized with the basic block name changed to a number from a string that I don't yet understand.

I'm guessing it's not safe to call materialize() from parseFunctionBody(); it's maybe not re-entrant, and something is getting corrupted?

Have a look at how blockaddress forward references are handled; if you follow a similar path for these discovered back-edges the corruption might go away.

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
4324–4343 ↗	(On Diff #421343)	I think it'd be better to save these in a list to iterate through after finishing materializing whatever's already "scheduled". I think the right place to run the code is probably in `materializeForwardReferencedFunctions()`, which is already doing stuff for block addresses (in the other direction!). This is called from two places: `BitcodeModule::getModuleImpl()`, which only calls it if lazy-loading -- i.e., if NOT calling `materializeAll()`. I am guessing that lazy-loading needs this because block addresses can be referenced by something that isn't lazily loaded (not sure what that is; maybe globals or some metadata?). `BitcodeReader::materialize()`, at the end.

ychen added a subscriber: ychen.Apr 7 2022, 4:34 PM

materialize later, re-add previous test, add llvm-bcanalyzer test; all tests now green

In D120781#3437602, @dexonsmith wrote:

In D120781#3437465, @nickdesaulniers wrote:

there's some tests failing where the "use list" (???) seems to be incorrect

I suspect these are "use-list order" tests, which confirm that we can round-trip to bitcode without changing the use-list order.

Tremendous and informative feedback! Indeed, your suggestion worked. PTAL.

Harbormaster completed remote builds in B158734: Diff 421574.Apr 8 2022, 12:13 PM

LGTM, with one nitpick inline!

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
3398–3399 ↗	(On Diff #421574)	Nit: IIRC, LLVM style usually adds `{}` braces to outer control structures whenever an inner one has them.

This revision is now accepted and ready to land.Apr 11 2022, 10:50 AM

fix style nit

nickdesaulniers marked an inline comment as done.Apr 12 2022, 11:31 AM

Harbormaster completed remote builds in B159295: Diff 422303.Apr 12 2022, 12:26 PM

This revision was landed with ongoing or failed builds.Apr 13 2022, 2:31 PM

Closed by commit rG23ec5782c3cc: [Bitcode] materialize Functions early when BlockAddress taken (authored by nickdesaulniers). · Explain Why

This revision was automatically updated to reflect the committed changes.

nickdesaulniers added a commit: rG23ec5782c3cc: [Bitcode] materialize Functions early when BlockAddress taken.

skan added a subscriber: skan.May 12 2023, 10:00 PM

Revision Contents

Path

Size

llvm/

lib/

Linker/

IRMover.cpp

38 lines

test/

Linker/

blockaddress.ll

125 lines

Diff 414166

llvm/lib/Linker/IRMover.cpp

Show First 20 Lines • Show All 514 Lines • ▼ Show 20 Lines	class IRLinker {
void flushRAUWWorklist();		void flushRAUWWorklist();

/// When importing for ThinLTO, prevent importing of types listed on		/// When importing for ThinLTO, prevent importing of types listed on
/// the DICompileUnit that we don't need a copy of in the importing		/// the DICompileUnit that we don't need a copy of in the importing
/// module.		/// module.
void prepareCompileUnitsForImport();		void prepareCompileUnitsForImport();
void linkNamedMDNodes();		void linkNamedMDNodes();

		Error preMaterializeFunctions();

public:		public:
IRLinker(Module &DstM, MDMapT &SharedMDs,		IRLinker(Module &DstM, MDMapT &SharedMDs,
IRMover::IdentifiedStructTypeSet &Set, std::unique_ptr<Module> SrcM,		IRMover::IdentifiedStructTypeSet &Set, std::unique_ptr<Module> SrcM,
ArrayRef<GlobalValue *> ValuesToLink,		ArrayRef<GlobalValue *> ValuesToLink,
std::function<void(GlobalValue &, IRMover::ValueAdder)> AddLazyFor,		std::function<void(GlobalValue &, IRMover::ValueAdder)> AddLazyFor,
bool IsPerformingImport)		bool IsPerformingImport)
: DstM(DstM), SrcM(std::move(SrcM)), AddLazyFor(std::move(AddLazyFor)),		: DstM(DstM), SrcM(std::move(SrcM)), AddLazyFor(std::move(AddLazyFor)),
TypeMap(Set), GValMaterializer(this), LValMaterializer(this),		TypeMap(Set), GValMaterializer(this), LValMaterializer(this),
▲ Show 20 Lines • Show All 562 Lines • ▼ Show 20 Lines

/// Copy the source function over into the dest function and fix up references		/// Copy the source function over into the dest function and fix up references
/// to values. At this point we know that Dest is an external function, and		/// to values. At this point we know that Dest is an external function, and
/// that Src is not.		/// that Src is not.
Error IRLinker::linkFunctionBody(Function &Dst, Function &Src) {		Error IRLinker::linkFunctionBody(Function &Dst, Function &Src) {
assert(Dst.isDeclaration() && !Src.isDeclaration());		assert(Dst.isDeclaration() && !Src.isDeclaration());

// Materialize if needed.		// Materialize if needed.
		// TODO: can this be removed?
if (Error Err = Src.materialize())		if (Error Err = Src.materialize())
return Err;		return Err;

// Link in the operands without remapping.		// Link in the operands without remapping.
if (Src.hasPrefixData())		if (Src.hasPrefixData())
Dst.setPrefixData(Src.getPrefixData());		Dst.setPrefixData(Src.getPrefixData());
if (Src.hasPrologueData())		if (Src.hasPrologueData())
Dst.setPrologueData(Src.getPrologueData());		Dst.setPrologueData(Src.getPrologueData());
▲ Show 20 Lines • Show All 402 Lines • ▼ Show 20 Lines	emitWarning("Linking two modules of different target triples: '" +
"'\n");		"'\n");

DstM.setTargetTriple(SrcTriple.merge(DstTriple));		DstM.setTargetTriple(SrcTriple.merge(DstTriple));

// Loop over all of the linked values to compute type mappings.		// Loop over all of the linked values to compute type mappings.
computeTypeMapping();		computeTypeMapping();

std::reverse(Worklist.begin(), Worklist.end());		std::reverse(Worklist.begin(), Worklist.end());

		// Materialize all Functions in Worklist and that will be added Lazily, prior
		// to remapping otherwise blockaddress Constants may fail to materialize
		// during IR moving.
		if (Error E = preMaterializeFunctions())
		return E;

while (!Worklist.empty()) {		while (!Worklist.empty()) {
		dexonsmithUnsubmitted Done Reply Inline Actions Can `materialize()` add things to the Worklist? If so, I don't think the ranged-based for is safe. I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time? dexonsmith: Can `materialize()` add things to the Worklist? If so, I don't think the ranged-based for is…
		nickdesaulniersAuthorUnsubmitted Done Reply Inline Actions Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe. I don't think so (though please do triple check). `IRLinker::maybeAdd` is the only method that adds to the worklist. That is only called from `IRLinker`'s constructor and `IRLinker::shouldLink`. While `IRLinker::shouldLink` is called from `IRLinker::materialize`, the method I'm calling here is `GlobalValue::materialize` (which through some indirection is a call to `BitcodeReader::materialize`. While these methods are all called `materialize`, `BitcodeReader` doesn't (AFAICT) have any reference to `IRLinker`'s private `Worklist` member. To triple check the other calls of `IRLinker::shouldLink`, I see `IRLinker::linkAppendingVarProto` and `IRLinker::linkGlobalValueProto`. The latter may call the former. The former is only called from `IRLinker::materialize`. So check my work, but it looks to me like the answer to `Can materialize() add things to the Worklist?` is "No; `IRLinker::materialize` can, but I'm not calling that; I'm calling `GlobalValue::materialize`". So I don't think there's an iterator invalidation bug here as implied by the question. Working on a reply to the second question...stay tuned. nickdesaulniers: > Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is…
		dexonsmithUnsubmitted Done Reply Inline Actions Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe. I don't think so (though please do triple check). Is there a way to add an assertion to that effect? If so, it'd be great to do that as a prep commit (do a local bootstrap with assertions on, see if it survives the pre-commit bots, etc.) before we starting to rely on the fact. This is also important to prevent that behaviour from showing up later. As you described, the logic here is entangled. I've been surprised before by various co-recursive things in the valuemapper and irlinker. dexonsmith: >> Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is…
		nickdesaulniersAuthorUnsubmitted Not Done Reply Inline Actions I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time? Here's a test case that fails, with the same error I'm trying to fix here. declare void @f(i8) ; 1. @x gets parsed + moved. References to @x's %label can no longer be ; created. define void @x() { br label %label label: ret void } ; 2. @y not parsed; will be parsed lazily if referenced. define linkonce_odr void @y() { br label %label label: ; 4. when lazy parsing, @x's %label is gone, resulting in a badref. call void @f(i8 blockaddress(@x, %label)) ret void } ; 3. @z parsed, call to @y adds @y to worklist. define void @z() { call void @y() ret void } $ llvm-as z.ll $ llvm-link -S z.bc error: Never resolved function from blockaddress (Producer: 'LLVM15.0.0git' Reader: 'LLVM 15.0.0git') Let me see if I can more thoroughly separate the parsing from the linking/remapping, then I'll add the above test case to this patch. nickdesaulniers: > I also worry if it's sufficient; can the worklist loop below add new functions to the…
		nickdesaulniersAuthorUnsubmitted Not Done Reply Inline Actions I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time? Diff 414166 now appears to handle this. I have a TODO on the existing call to `GlobalValue::materialize()` that I'd like to remove. That causes some tests to fail, so marking this "Changes Planned" until I have more time to sort that out. But PTAL with this approach. nickdesaulniers: > I also worry if it's sufficient; can the worklist loop below add new functions to the…
GlobalValue *GV = Worklist.back();		GlobalValue *GV = Worklist.back();
Worklist.pop_back();		Worklist.pop_back();

// Already mapped.		// Already mapped.
if (ValueMap.find(GV) != ValueMap.end() \|\|		if (ValueMap.find(GV) != ValueMap.end() \|\|
IndirectSymbolValueMap.find(GV) != IndirectSymbolValueMap.end())		IndirectSymbolValueMap.find(GV) != IndirectSymbolValueMap.end())
continue;		continue;

▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	if (NewValue) {
Globals.splice(Globals.end(), Globals, NewGV->getIterator());		Globals.splice(Globals.end(), Globals, NewGV->getIterator());
}		}
}		}

// Merge the module flags into the DstM module.		// Merge the module flags into the DstM module.
return linkModuleFlagsMetadata();		return linkModuleFlagsMetadata();
}		}

		Error IRLinker::preMaterializeFunctions() {
		// Make a copy of the Worklist.
		SmallVector<GlobalValue *, 32> MaterializeList(Worklist.begin(),
		Worklist.end());
		while (!MaterializeList.empty()) {
		GlobalValue *GV = MaterializeList.pop_back_val();
		if (auto *F = dyn_cast<Function>(GV)) {
		if (!F->isDeclaration())
		if (Error E = F->materialize())
		return E;
		// Once materialized, check all operands for references to other
		// Functions that have not yet been materialized.
		for (BasicBlock &BB : *F)
		for (Instruction &I : BB)
		for (Use &U : I.operands())
		if (auto *Callee = dyn_cast<Function>(U)) {
		// TODO: shouldLink?
		if (!Callee->isMaterializable())
		continue;
		if (Error E = Callee->materialize())
		return E;
		MaterializeList.push_back(Callee);
		}
		dexonsmithUnsubmitted Not Done Reply Inline Actions I worry this could cause a big compile-time hit for ThinLTO, depending on the exact code paths it uses. Here's part of a testcase for situation I'm worried about: ; f1.ll define weak_odr void @f1() { ret void } ; f2.ll define linkonce_odr void @f1() { ret void } define void @f2() { call @f1() ret void } The scenario (more scaffolding needed): Linker decides to take `@f1` from f1.ll and `@f2` from f2.ll. The worklist for `f2.bc` contains just `@f2`, and `@f1` will (never) be materialized from `f2.bc`. IIUC, this patch changes it so that `@f1` is eagerly materialized from `f2.bc` even though it will be ignored. In the general case, this would effectively defeat lazy-loading of templates and inline functions. @tejohnson , do you know how to set up the above scenario for ThinLTO? dexonsmith: I worry this could cause a big compile-time hit for ThinLTO, depending on the exact code paths…
		tejohnsonUnsubmitted Not Done Reply Inline Actions Yes this is a good catch. But I think where it affects issues is a slight change from what you have described above. We don't use the IRMover to load the destination module for each ThinLTO backend, so it wouldn't affect f2.ll when we are importing into it - f2.ll will be parsed here: http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#1088) in the thread for its ThinLTO backend. However, where it will affect things is when we are importing f2 into another module, when the copy of f1 was selected from f1.ll. We can use llvm-lto2 which is passed explicit linker resolutions to simulate. ; f1.ll target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" define weak_odr void @f1() { ret void } ; f2.ll target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" define linkonce_odr void @f1() { ret void } define void @f2() { call void @f1() ret void } ; f3.ll target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" declare void @f2() define void @f3() { call void @f2() ret void } $ opt -module-summary f1.ll -o f1.o $ opt -module-summary f2.ll -o f2.o $ opt -module-summary f2.ll -o f2.o $ llvm-lto2 run f1.o f2.o f3.o -save-temps -o f -r=f1.o,f1,plx -r=f2.o,f1, -r=f2.o,f2,plx -r=f3.o,f2, -r=f3.o,f3,plx -debug-only=function-import The end of the debug output shows what we import from where into f3.o: Is importing function 2072045998141807037 f1 from f1.ll Not importing function 2072045998141807037 f1 from f2.ll Is importing function 8471399308421654326 f2 from f2.ll Imported 2 functions for Module f3.o So we should not materialize f1 from f2.o when importing in f3.o's ThinLTO backend thread. However, as noted in one of the comments earlier, the function importer has an empty AddLazyFor callback, so we will never add new values to materialize during the value mapping. I wonder if we key off of this somehow to not attempt to add new functions to materialize here? tejohnson: Yes this is a good catch. But I think where it affects issues is a slight change from what you…
		nickdesaulniersAuthorUnsubmitted Not Done Reply Inline Actions So we should not materialize f1 from f2.o when importing in f3.o's ThinLTO backend thread. Ah, thanks for the test case. My current approach (Diff 414166) does materialize f1 from f2.o, so that would be a regression. However, as noted in one of the comments earlier, the function importer has an empty AddLazyFor callback, so we will never add new values to materialize during the value mapping. I wonder if we key off of this somehow to not attempt to add new functions to materialize here? So if we made `IRLinker::AddLazyFor` an `Optional` function reference (or function pointer) then we could easily tell+guarantee that no new functions could be added lazily to `IRLinker::Worklist`. Then we could go back to an earlier+simpler diff https://reviews.llvm.org/D120781?id=412277 just wrapped in one conditional check that `AddLazyFor` was not available. That might be the smallest possible fix that might be possible to get into the clang-14 release. Let me play with that approach a little before pursuing @dexonsmith 's recommendation to encode "backedges" (references) to functions in the bitcode, and let's see what that looks like if it works. nickdesaulniers: > So we should not materialize f1 from f2.o when importing in f3.o's ThinLTO backend thread.
		nickdesaulniersAuthorUnsubmitted Not Done Reply Inline Actions So if we made IRLinker::AddLazyFor an Optional function reference (or function pointer) then we could easily tell+guarantee that no new functions could be added lazily to IRLinker::Worklist. Then we could go back to an earlier+simpler diff https://reviews.llvm.org/D120781?id=412277 just wrapped in one conditional check that AddLazyFor was not available. That might be the smallest possible fix that might be possible to get into the clang-14 release. That might work for thinLTO, but it doesn't for the simple test case using `llvm-link` added in this patch; `IRLinker::AddLazyFor` is defined. :( https://reviews.llvm.org/D121630 is still a worthwhile cleanup IMO. I'll start looking into bitcode modifications. nickdesaulniers: > So if we made IRLinker::AddLazyFor an Optional function reference (or function pointer) then…
		}
		}
		return Error::success();
		}

IRMover::StructTypeKeyInfo::KeyTy::KeyTy(ArrayRef<Type *> E, bool P)		IRMover::StructTypeKeyInfo::KeyTy::KeyTy(ArrayRef<Type *> E, bool P)
: ETypes(E), IsPacked(P) {}		: ETypes(E), IsPacked(P) {}

IRMover::StructTypeKeyInfo::KeyTy::KeyTy(const StructType *ST)		IRMover::StructTypeKeyInfo::KeyTy::KeyTy(const StructType *ST)
: ETypes(ST->elements()), IsPacked(ST->isPacked()) {}		: ETypes(ST->elements()), IsPacked(ST->isPacked()) {}

bool IRMover::StructTypeKeyInfo::KeyTy::operator==(const KeyTy &That) const {		bool IRMover::StructTypeKeyInfo::KeyTy::operator==(const KeyTy &That) const {
return IsPacked == That.IsPacked && ETypes == That.ETypes;		return IsPacked == That.IsPacked && ETypes == That.ETypes;
▲ Show 20 Lines • Show All 98 Lines • Show Last 20 Lines

llvm/test/Linker/blockaddress.ll

This file was added.

				; RUN: llvm-as %s -o %t.bc
				; RUN: llvm-link %t.bc -S \| FileCheck %s

				declare void @f(i8*)

				dexonsmithUnsubmitted Done Reply Inline Actions BTW, it'd be better to have positive tests here than the `CHECK-NOT`. You can use `llvm-link -S` to emit the IR, and then check that each `blockaddress` is correct (rather than checking that no errors were emitted). dexonsmith: BTW, it'd be better to have positive tests here than the `CHECK-NOT`. You can use `llvm-link…
				; Test that a blockaddress in @y referring to %label in @x can be moved when @y
				; appears after @x.
				define void @x() {
				br label %label

				label:
				call void @y()
				ret void
				}

				define void @y() {
				; CHECK: define void @y() {
				; CHECK-NEXT: call void @f(i8* blockaddress(@x, %label))
				call void @f(i8* blockaddress(@x, %label))
				ret void
				}

				; Test that a blockaddress in @a referring to %label in @b can be moved when @a
				; appears before @b.
				define void @a() {
				; CHECK: define void @a() {
				; CHECK-NEXT: call void @f(i8* blockaddress(@b, %label))
				call void @f(i8* blockaddress(@b, %label))
				ret void
				}

				define void @b() {
				br label %label

				label:
				call void @a()
				ret void
				}

				; Test that @c and @d can both have blockaddress Constants that refer to one
				; another.

				define void @c() {
				; CHECK: define void @c() {
				; CHECK-NEXT: br label %label
				; CHECK-EMPTY:
				; CHECK-NEXT: label:
				; CHECK-NEXT: call void @f(i8* blockaddress(@d, %label))
				br label %label

				label:
				call void @f(i8* blockaddress(@d, %label))
				ret void
				}

				define void @d() {
				; CHECK: define void @d() {
				; CHECK-NEXT: br label %label
				; CHECK-EMPTY:
				; CHECK-NEXT: label:
				; CHECK-NEXT: call void @f(i8* blockaddress(@c, %label))
				br label %label

				label:
				call void @f(i8* blockaddress(@c, %label))
				ret void
				}

				; Test that Functions added to IRLinker's Worklist member lazily (linkonce_odr)
				; aren't susceptible to the the same issues as @x/@y above.
				define void @parsed() {
				br label %label

				label:
				ret void
				}

				define linkonce_odr void @lazy() {
				; CHECK: define linkonce_odr void @lazy() {
				; CHECK-NEXT: br label %label
				; CHECK-EMPTY:
				; CHECK-NEXT: label:
				; CHECK-NEXT: call void @f(i8* blockaddress(@parsed, %label))
				br label %label

				label:
				call void @f(i8* blockaddress(@parsed, %label))
				ret void
				}

				define void @parsed2() {
				call void @lazy()
				ret void
				}

				; Same test as @lazy, just with one more level of lazy parsed functions.
				define void @parsed3() {
				br label %label

				label:
				ret void
				}

				define linkonce_odr void @lazy1() {
				; CHECK: define linkonce_odr void @lazy1() {
				; CHECK-NEXT: br label %label
				; CHECK-EMPTY:
				; CHECK-NEXT: label:
				; CHECK-NEXT: call void @f(i8* blockaddress(@parsed3, %label))
				br label %label

				label:
				call void @f(i8* blockaddress(@parsed3, %label))
				ret void
				}

				define linkonce_odr void @lazy2() {
				call void @lazy1()
				ret void
				}

				define void @parsed4() {
				call void @lazy2()
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[Bitcode] materialize Functions early when BlockAddress takenClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 414166

llvm/lib/Linker/IRMover.cpp

llvm/test/Linker/blockaddress.ll

[Bitcode] materialize Functions early when BlockAddress taken
ClosedPublic