This is an archive of the discontinued LLVM Phabricator instance.

[clang][Driver] Handle risvc in Baremetal.cpp.
ClosedPublic

Authored by abidh on Nov 13 2020, 9:49 AM.

Details

Summary

I am working on a baremetal riscv toolchain using LLVM runtime and
LLD linker. Baremetal.cpp provides most of the things needed for such
toolchain. So I have modified it to also handle riscv64/32-unknown-elf
targets alongside arm-none-eabi.

Currently, targets like riscv64-unknown-elf are handled by RISCVToolChain
which mostly expects a gcc toolchain to be present. If you dont
want the dependency on gcc-toolchain/libgloss or want to use LLD, then
RISCVToolChain is not a good fit.

So in the toolchain selection code, I have made this dependency of
RISCVToolChain on gcc toolchain explicit. It is created if gcc-toolchain
option is present. Otherwise Baremetal toolchain is created. I will be
happy to hear if there is a better way to choose between these two
toolchains.

Diff Detail

Event Timeline

abidh created this revision.Nov 13 2020, 9:49 AM
abidh requested review of this revision.Nov 13 2020, 9:49 AM
jroelofs accepted this revision.Nov 13 2020, 10:00 AM

Seems reasonable.

I could see someone wanting to use --gcc-toolchain to point at the baremetal toolchain for their target, but that's unlikely to work out of the box anyway. I'd love to know where the Generic_GCC toolchain is used in practice, since that's always seemed quite fragile to me.

clang/lib/Driver/Driver.cpp
5150

for consistency and principle-of-least-surprise, maybe we should have the same check in the other place where toolchains::BareMetal is created?

This revision is now accepted and ready to land.Nov 13 2020, 10:00 AM

We use pure LLVM toolchains so improving support for that out of the box is good in my books. However, I do worry this is going to cause friction for a lot of people using LLVM for RISC-V; my understanding is that most use LLVM with a GNU sysroot and binutils, and so this looks like it will break their setups. Is there anything that can be done to automatically detect such cases? What does Arm do here?

We use pure LLVM toolchains so improving support for that out of the box is good in my books. However, I do worry this is going to cause friction for a lot of people using LLVM for RISC-V; my understanding is that most use LLVM with a GNU sysroot and binutils, and so this looks like it will break their setups. Is there anything that can be done to automatically detect such cases? What does Arm do here?

That was one of my concern too. Unlike riscv, Arm does not have an explicit check so it falls in the default where it picks Baremetal.

abidh updated this revision to Diff 305573.Nov 16 2020, 12:13 PM

Made the condition consistent in both places where Baremetal toolchain is instantiated as suggested in review.

abidh marked an inline comment as done.Nov 16 2020, 12:14 PM

Seems reasonable.

I could see someone wanting to use --gcc-toolchain to point at the baremetal toolchain for their target, but that's unlikely to work out of the box anyway. I'd love to know where the Generic_GCC toolchain is used in practice, since that's always seemed quite fragile to me.

Thanks for the review. I have handled the comment in updated diff. I will give it a few more days in case anyone else wants to comment before landing.

jrtc27 added inline comments.Nov 16 2020, 12:27 PM
clang/test/Driver/riscv64-toolchain.c
5

(repeated below and in riscv32-toolchain.c)

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

abidh added a comment.Nov 17 2020, 3:40 AM

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

The GCC toolchain, when given a relative path, already interprets it relative to the working directory the compiler was invoked from, not the directory the compiler is located in, iirc.

abidh added a comment.Nov 17 2020, 6:15 AM

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

The GCC toolchain, when given a relative path, already interprets it relative to the working directory the compiler was invoked from, not the directory the compiler is located in, iirc.

I dont think path is made absolute relative using working directory. What is happening is that if (!VFS.exists(Prefix)) call in the following line will succeed if the path existed relative to working directory. This seems accidental to me.

https://github.com/llvm/llvm-project/blob/5a9f3867046c4e1c97760e22a505f4d1d788417e/clang/lib/Driver/ToolChains/Gnu.cpp#L1947

I tried a small change to turn a relative --gcc-toolchain into absolute using the compiler installed path and it works ok. If this is something that works for you then I can post a patch.

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

The GCC toolchain, when given a relative path, already interprets it relative to the working directory the compiler was invoked from, not the directory the compiler is located in, iirc.

I dont think path is made absolute relative using working directory. What is happening is that if (!VFS.exists(Prefix)) call in the following line will succeed if the path existed relative to working directory. This seems accidental to me.

https://github.com/llvm/llvm-project/blob/5a9f3867046c4e1c97760e22a505f4d1d788417e/clang/lib/Driver/ToolChains/Gnu.cpp#L1947

I tried a small change to turn a relative --gcc-toolchain into absolute using the compiler installed path and it works ok. If this is something that works for you then I can post a patch.

Relative paths should be relative to the current working directory. That is what everyone expects and is what every other argument does (and see = and $SYSROOT for how GCC and compatible drivers get around that in cases like -I).

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

The GCC toolchain, when given a relative path, already interprets it relative to the working directory the compiler was invoked from, not the directory the compiler is located in, iirc.

I dont think path is made absolute relative using working directory. What is happening is that if (!VFS.exists(Prefix)) call in the following line will succeed if the path existed relative to working directory. This seems accidental to me.

https://github.com/llvm/llvm-project/blob/5a9f3867046c4e1c97760e22a505f4d1d788417e/clang/lib/Driver/ToolChains/Gnu.cpp#L1947

I tried a small change to turn a relative --gcc-toolchain into absolute using the compiler installed path and it works ok. If this is something that works for you then I can post a patch.

Relative paths should be relative to the current working directory. That is what everyone expects and is what every other argument does (and see = and $SYSROOT for how GCC and compatible drivers get around that in cases like -I).

And how is that accidental? It looks to me like the correct behaviour naturally falls out just as it normally does in that by default anything involving a relative path automatically interprets it relative to the current working directory.

I recall that there is also https://reviews.llvm.org/D68407 which iirc hoped to address using RISCVToolchain with Compiler-rt and libunwind - presumably it is not a complete solution, but it might be we want to use some checking of the GCCInstallation (like RISCVToolChain::GetDefaultRuntimeLibType does) to choose which toolchain to instantiate? I'm not sure if we can satisfy that ordering though.

abidh added a comment.Nov 17 2020, 8:31 AM

I'm worried about this change - I *think* it doesn't cover the existing behaviour of a baremetal GCC toolchain being installed into the same prefix as clang, and clang automatically picking up that baremetal gcc toolchain. What should we expect to do here? This is especially an issue if you're trying to make a relocatable toolchain tarball, where specifying --gcc-toolchain automatically is difficult.

Would it be possible to use a relative path with --gcc-toolchain then this can be checked in either RISCVToolChain.cpp or GNU.cpp and adjusted accordingly?

The GCC toolchain, when given a relative path, already interprets it relative to the working directory the compiler was invoked from, not the directory the compiler is located in, iirc.

I dont think path is made absolute relative using working directory. What is happening is that if (!VFS.exists(Prefix)) call in the following line will succeed if the path existed relative to working directory. This seems accidental to me.

https://github.com/llvm/llvm-project/blob/5a9f3867046c4e1c97760e22a505f4d1d788417e/clang/lib/Driver/ToolChains/Gnu.cpp#L1947

I tried a small change to turn a relative --gcc-toolchain into absolute using the compiler installed path and it works ok. If this is something that works for you then I can post a patch.

Relative paths should be relative to the current working directory. That is what everyone expects and is what every other argument does (and see = and $SYSROOT for how GCC and compatible drivers get around that in cases like -I).

The constructor of Driver class in clang resolves the default sysroot relative to install directory. That gave me the idea that perhaps similar thing could be done with gcc-toolchain too. I dont have a strong opinion on it either way. I am just trying to find a way where current RISCVToolChain based toolchain continue working while I can instantiate Baremetal toolchain.

abidh added a comment.Nov 17 2020, 8:54 AM

I recall that there is also https://reviews.llvm.org/D68407 which iirc hoped to address using RISCVToolchain with Compiler-rt and libunwind - presumably it is not a complete solution, but it might be we want to use some checking of the GCCInstallation (like RISCVToolChain::GetDefaultRuntimeLibType does) to choose which toolchain to instantiate? I'm not sure if we can satisfy that ordering though.

But that function is not static and you need to instantiate the RISCVToolChain to use things like GCCInstallation.isValid().

abidh updated this revision to Diff 306180.Nov 18 2020, 12:08 PM

This update contains following changes.

  1. Address issue raised by lenary. I have introduced a static function that checks for the presence of gcc toolchain first by --gcc-toolchain argument or in the same prefix where clang is installed. If found, RISCVToolChain is instantiated. I have added tests in riscv64-toolchain-extra.c and riscv32-toolchain-extra.c to check this case.
  1. Fixed some formatting things pointed out by jrtc27.
  1. In the initial revision, jroelofs commented that other place where Baremetal is intantiated should have the similar check. I put that in 2nd version of the patch. But I think it does quite make sense in the current patch so I have removed it.
  1. Rebased baremetal.cpp tests after recent changes and fixed some typos.
abidh marked an inline comment as done.Nov 18 2020, 12:09 PM
abidh added a comment.Nov 24 2020, 4:58 AM

Hi @lenary,
Do you any more comments on the patch?

Nope, no more comments from me. Thanks for updating based on my comments :)

This revision was automatically updated to reflect the committed changes.
MaskRay added a comment.EditedDec 9 2020, 5:59 PM

This change has been failing on my machine for the past two weeks.

riscv32-toolchain-extra.c and riscv64-toolchain-extra.c are not hermit. I have installed /usr/lib/gcc-cross/riscv64-linux-gnu/10 (gcc-10-riscv64-linux-gnu and its dependencies) on my machine and the test will find some directories relative to /usr/lib/gcc-cross/riscv64-linux-gnu/10. I think --sysroot is required for such tests.

%T is a deprecated lit feature. Please do something like rm -rf %t; mkdir %t instead.

abidh added a comment.Dec 10 2020, 4:09 AM

This change has been failing on my machine for the past two weeks.

riscv32-toolchain-extra.c and riscv64-toolchain-extra.c are not hermit. I have installed /usr/lib/gcc-cross/riscv64-linux-gnu/10 (gcc-10-riscv64-linux-gnu and its dependencies) on my machine and the test will find some directories relative to /usr/lib/gcc-cross/riscv64-linux-gnu/10. I think --sysroot is required for such tests.

%T is a deprecated lit feature. Please do something like rm -rf %t; mkdir %t instead.

Hi MaskRay,
Thanks for reporting this issue. I don't fully understand how test is finding directories relative to some other location. But I have opened https://reviews.llvm.org/D93023 to replace %T with %t as you suggested. Please have a look and see if this fixes your issue.

jrtc27 added inline comments.Jun 3 2021, 4:12 AM
clang/lib/Driver/Driver.cpp
5216–5220

This has broken our use downstream. We construct a normal bare-metal non-multilib baremetal sysroot, no GCC involved, passed explicitly with --sysroot. However, BareMetal's findRISCVMultilibs unconditionally appends multilib paths for anything other than IMAC (always ILP32/LP64) (well, it still appends the path, just the multilib path is curiously empty), without filtering out ones that don't exist like Gnu.cpp. I assume the empty string for IMAC is a hack to allow most normal uses of bare-metal toolchains to work, but it breaks if you want hard float support (or fewer extensions).

I suspect this is best fixed by filtering out non-existent directories from findRISCVMultilibs (and at the same time, fixing IMAC to have a non-empty path, since RISCVToolChain has one for it, and we no longer need the hacky, insufficient workaround)?

abidh added inline comments.Jun 3 2021, 5:12 AM
clang/lib/Driver/Driver.cpp
5216–5220

IIRC the empty path was probably there to keep the non-multilib toolchain working. I don't mind if we remove the empty path for IMAC and all multilibs have non-empty paths.

I am not sure on filtering out non-existing folder though. It is a bit different from GNU.cpp in that we may not be using a pre-built sysroot and use the just-built compiler for building the runtime too. Also I am wondering what should happen if user wants a multilib whose runtime bits are not present in the toolchain.

jrtc27 added inline comments.Jun 3 2021, 5:14 AM
clang/lib/Driver/Driver.cpp
5216–5220

Well we need to detect multilib vs non-multilib sysroots *somehow*, and everywhere else that gets done based on file or directory existence. If you want a multilib but don't have it, falling back on the non-multilib path and failing (either with incompatibility errors or missing files) seems fine to me.

MaskRay added inline comments.Oct 27 2023, 5:58 PM
clang/lib/Driver/ToolChains/RISCVToolchain.cpp
40

This logic (if --gcc-toolchain= is available or lib/crt0.o is present, select RISCVToolChain; otherwise BareMetal) is strange.

Can someone shed light on what this RISCVToolChain is intended and which crt0.o is used?

Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2023, 5:58 PM