This is an archive of the discontinued LLVM Phabricator instance.

[HIP-Clang] propagate -mllvm options to opt and llc
ClosedPublic

Authored by ashi1 on Mar 13 2019, 12:41 PM.

Details

Summary

I've updated the HIP ToolChain to pass OPT_mllvm options into OPT and LLC stages.

Also added a lit test to verify its properly passed down. All check-clang tests passing.

Diff Detail

Repository
rC Clang

Event Timeline

ashi1 created this revision.Mar 13 2019, 12:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 13 2019, 12:41 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
arsenm added a subscriber: arsenm.Mar 13 2019, 12:51 PM

The real solution is to stop invoking these tools separately. clang -cc1 should be used for everything

ashi1 added a comment.Mar 13 2019, 1:01 PM

Hi Matt, that solution will need refactoring and testing. Currently, HIP-Clang is following the same link flow as HCC

ashi1 added a subscriber: scchan.Mar 13 2019, 1:02 PM

Hi Matt, that solution will need refactoring and testing. Currently, HIP-Clang is following the same link flow as HCC

HCC is also an issue. I really want effort put into fixing this rather than a halfway solution that will make it easier to continue deferring a real fix. I started on step 1 a long time ago (D59321). Step 2 is to fix this to behave like the CudaToolchain handling instead of invoking the tools. Pretty much everything should be how that already does this, except for the minor details

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

yaxunl accepted this revision.Mar 14 2019, 11:02 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Mar 14 2019, 11:02 AM
This revision was automatically updated to reflect the committed changes.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't working

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't workingh

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't working

hip-clang has been used for compiling and running real ML applications without issue. We do have lld step, see line 281. We do need to link different objects for this LinkerJob.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't workingh

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't working

hip-clang has been used for compiling and running real ML applications without issue. We do have lld step, see line 281. We do need to link different objects for this LinkerJob.

ML workloads are extremely unlikely to use a call. We should have an execution tests with noinline somewhere to stress this

ML workloads are extremely unlikely to use a call. We should have an execution tests with noinline somewhere to stress this

I compiled and ran a test with noinline function and I saw function call in ISA and it passes. I agree that we should have such a test and will keep it in mind.