This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Allow undefined symbols
AbandonedPublic

Authored by yaxunl on Feb 3 2021, 1:18 PM.

Details

Reviewers
tra

Summary

HIP managed variables need to be emitted as undefined symbols,
since the runtime needs to define them in managed memory
accessible by both device and host.

Let the HIP toolchain allow that with lld.

Event Timeline

yaxunl requested review of this revision. Feb 3 2021, 1:18 PM
yaxunl created this revision.
tra added a comment. Feb 3 2021, 1:44 PM

What's going to happen if you do have an undefined reference that's *not* to a __managed__ variable?

yaxunl added a comment. Feb 3 2021, 2:31 PM
In D95970#2540303, @tra wrote:

> What's going to happen if you do have an undefined reference that's *not* to a __managed__ variable?

By default, the HIP toolchain uses -fvisibility hidden -fapply-global-visibility-to-externs for device compilation. Undefined variable symbols have protected visibility. Undefined function symbols have hidden visibility. Since they are not allowed to be preempted, lld still emits an error if they are undefined. __managed__ variables have default visibility, therefore they are allowed to go through.

We can let the HIP runtime emit an error if there are undefined symbols other than managed variables.
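
For readers following the visibility argument, the rule described in this comment can be modeled in a few lines of plain C++. This is only an illustrative sketch of the behavior as stated above, not lld's implementation: the `Symbol` struct, the `Visibility` enum, and `rejectedSymbols` are hypothetical names, and the `noUndefined` parameter stands in for a strict link mode in which every undefined symbol is an error.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a symbol as seen at device link time.
enum class Visibility { Default, Protected, Hidden };

struct Symbol {
    std::string name;
    Visibility  visibility;
    bool        defined;
};

// Returns the names of symbols the link would reject under the rule
// described in the comment: an undefined symbol is tolerated only if it
// has default visibility (it may be preempted and defined elsewhere).
// With noUndefined set, every undefined symbol is rejected.
std::vector<std::string> rejectedSymbols(const std::vector<Symbol>& symbols,
                                         bool noUndefined) {
    std::vector<std::string> rejected;
    for (const Symbol& s : symbols) {
        if (s.defined) {
            continue; // defined symbols are never a problem
        }
        if (noUndefined || s.visibility != Visibility::Default) {
            rejected.push_back(s.name);
        }
    }
    return rejected;
}
```

Under this model an undefined managed variable with default visibility passes, while an undefined protected variable or hidden function is still reported. With the strict mode enabled, the managed variable is reported as well.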

tra added a comment. Feb 3 2021, 4:27 PM
>> In D95970#2540303, @tra wrote:
>>
>> What's going to happen if you do have an undefined reference that's *not* to a __managed__ variable?
>
> By default, the HIP toolchain uses -fvisibility hidden -fapply-global-visibility-to-externs for device compilation. Undefined variable symbols have protected visibility. Undefined function symbols have hidden visibility. Since they are not allowed to be preempted, lld still emits an error if they are undefined. __managed__ variables have default visibility, therefore they are allowed to go through.

Is there a test that verifies/demonstrates how it's handled at compile time?

> We can let the HIP runtime emit an error if there are undefined symbols other than managed variables.

Detecting compilation errors at run-time is not very useful. The end user would likely have no idea what's going on.
I wonder if we can catch the errors earlier. E.g. w/o RDC, we can probably catch it in the codegen.
With RDC, only the linker will know it, so you would need some sort of custom plugin for that.

What if we actually resolve all __managed__ references and point them to some sort of placeholder?
E.g., a weak symbol that would be redirected to the real location when the binary is loaded.
Or, maybe, introduce an indirection and let the runtime handle managed variables as a special case
and point the indirect links to the right locations.
This way you can still run with -no-undefined and find all other unresolved references that should not be there.

yaxunl abandoned this revision. Feb 5 2021, 8:10 PM
Good point. I will try using indirection. This should avoid undefined symbols in the device binary.
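
The indirection idea discussed in this thread can be sketched in host-only C++. This is a rough illustration under stated assumptions, not the code from this revision or from any later patch: `ManagedSlot`, `FakeRuntime`, `counter_slot`, and `incrementCounter` are all hypothetical names, and `std::calloc` merely stands in for a managed-memory allocation. The point is that the binary carries a defined pointer slot, so nothing is left undefined at link time, and the runtime fills the slot when the binary is loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical per-variable slot. The slot is a defined symbol in the
// binary; only the address it holds is supplied later by the runtime.
struct ManagedSlot {
    void*       address; // patched at load time
    std::size_t size;    // size of the managed variable in bytes
};

// Assumed stand-in for what `__managed__ int counter;` might lower to.
ManagedSlot counter_slot = {nullptr, sizeof(int)};

// Hypothetical runtime: records slots and points each one at real storage.
class FakeRuntime {
public:
    void registerManaged(ManagedSlot& slot) { slots_.push_back(&slot); }

    // Called once when the "device binary" is loaded.
    void load() {
        for (ManagedSlot* slot : slots_) {
            // calloc stands in for a zero-initialized managed allocation.
            slot->address = std::calloc(1, slot->size);
        }
    }

    ~FakeRuntime() {
        for (ManagedSlot* slot : slots_) {
            std::free(slot->address);
            slot->address = nullptr;
        }
    }

private:
    std::vector<ManagedSlot*> slots_;
};

// "Device" code reaches the variable through the slot, never by name.
int incrementCounter() {
    assert(counter_slot.address != nullptr && "slot not patched by runtime");
    int* counter = static_cast<int*>(counter_slot.address);
    return ++*counter;
}
```

A caller would construct a `FakeRuntime`, register `counter_slot`, call `load()`, and only then invoke `incrementCounter()`. Because every reference resolves to the slot, a link with -no-undefined would still flag any other unresolved symbol.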