This is an archive of the discontinued LLVM Phabricator instance.

[libcxx][test] Cover RISC-V in string.capacity test
ClosedPublic

Authored by ldionne on Feb 2 2023, 2:25 AM.

Details

Summary

Prior to this patch, libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp would fail for RISC-V.

The __riscv macro is defined for both RV32 and RV64.

Diff Detail

Event Timeline

asb created this revision.Feb 2 2023, 2:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2023, 2:25 AM
asb requested review of this revision.Feb 2 2023, 2:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2023, 2:25 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
asb added a comment.Feb 9 2023, 5:05 AM

Ping! Hopefully a very quick one to review. Thanks in advance.

Do you want libc++ to support RISC-V? In that case you'd have to provide a runner for the CI. Otherwise this is dead code from our perspective, which will just rot.

asb added a comment.Feb 9 2023, 6:56 AM

The runner I'm adding in D143172 runs the libc++ tests on RISC-V, which should avoid this becoming dead.

The runner I'm adding in D143172 runs the libc++ tests on RISC-V, which should avoid this becoming dead.

We generally require a runner in the libc++ precommit-CI itself. See https://libcxx.llvm.org/AddingNewCIJobs.html#addingnewcijobs for more information. Is the post-commit CI documented somewhere? If that's the case we should probably add a sentence about libc++ requiring a pre-commit CI runner. It seems that most people aren't aware of this. D139147 might me interesting in case you want to cross-compile the code and run it in an emulator.

asb added a comment.Feb 9 2023, 10:07 AM

We generally require a runner in the libc++ precommit-CI itself. See https://libcxx.llvm.org/AddingNewCIJobs.html#addingnewcijobs for more information. Is the post-commit CI documented somewhere? If that's the case we should probably add a sentence about libc++ requiring a pre-commit CI runner. It seems that most people aren't aware of this. D139147 might me interesting in case you want to cross-compile the code and run it in an emulator.

Documentation for the main CI is probably mainly here and here.

Just to check I've understood the infra properly - is the linked ADB patch an example of setting up a new CI job to run "on our existing infrastructure"? Should I expect to be able to submit qemu-based testing on top of the shared libcxx CI infra?

We generally require a runner in the libc++ precommit-CI itself. See https://libcxx.llvm.org/AddingNewCIJobs.html#addingnewcijobs for more information. Is the post-commit CI documented somewhere? If that's the case we should probably add a sentence about libc++ requiring a pre-commit CI runner. It seems that most people aren't aware of this. D139147 might me interesting in case you want to cross-compile the code and run it in an emulator.

Documentation for the main CI is probably mainly here and here.

Thanks! I'll look into making a patch to add a note about the libc++ CI.

Just to check I've understood the infra properly - is the linked ADB patch an example of setting up a new CI job to run "on our existing infrastructure"? Should I expect to be able to submit qemu-based testing on top of the shared libcxx CI infra?

The patch doesn't actually add a configuration to the CI, it just adds everything required to add a configuration. To actually add a configuration libcxx/utils/ci/buildkite-pipeline.yml has to be extended. I'm not sure what exactly our requirements are on CI infrastructure, but I think for adding a new platform some compute resources have to be provided. I think the "existing infrastructure" note is mostly about general configurations, like a new C++ version, testing the unstable ABI and similar stuff. I'm not sure though. @ldionne should be able to give a definite answer here.

asb added a comment.Feb 9 2023, 11:27 AM

Thanks for the advice so far. One thing I wanted to flag early so as to avoid wasted effort is that the requirement on the linked doc "We may be reluctant to add and support CI jobs that take a long time to finish or that are too flaky." may be challenging for runners that need to use emulation. I haven't checked what proportion of libcxx tests are execution vs compile tests, but check-libcxx on a fast x86_64 host using qemu-user for to emulate a 'native' RISC-V environment is not fast.

Thanks for the advice so far. One thing I wanted to flag early so as to avoid wasted effort is that the requirement on the linked doc "We may be reluctant to add and support CI jobs that take a long time to finish or that are too flaky." may be challenging for runners that need to use emulation. I haven't checked what proportion of libcxx tests are execution vs compile tests, but check-libcxx on a fast x86_64 host using qemu-user for to emulate a 'native' RISC-V environment is not fast.

Do you have some estimate how long a run takes? Currently the slowest runners take ~45 minutes to complete a job.

It would be an option to just have a minimal configuration running (i.e. set LIBCXX_ENABLE_RANDOM_DEVICE, LIBCXX_ENABLE_LOCALIZATION, LIBCXX_ENABLE_UNICODE, etc. to Off). That will probably reduce the runtime quite a bit (but of course also lower coverage). You can then enable these once there is native hardware available to run the tests (which should be the case soon-ish if I understand your commit message from D143172 correctly).

You could also try to use -O1 or -O2 when compiling the tests to reduce the amount of code that actually gets executed. This takes 25% or so longer when running on x64, but maybe the extra time spent compiling reduces the time executing enough that it makes sense in an environment with emulation. I think most of our tests could be removed by the compiler, since we only have very few ABI boundaries. Quite a bit of our tests are thing like assert(std::min(1, 2) == 1); or assert(std::string("string") + "whatever" == "stringwhatever");. (I should look into why the second example doesn't get optimized away)

Xeonacid added a comment.EditedFeb 19 2023, 7:03 AM

Thanks for the advice so far. One thing I wanted to flag early so as to avoid wasted effort is that the requirement on the linked doc "We may be reluctant to add and support CI jobs that take a long time to finish or that are too flaky." may be challenging for runners that need to use emulation. I haven't checked what proportion of libcxx tests are execution vs compile tests, but check-libcxx on a fast x86_64 host using qemu-user for to emulate a 'native' RISC-V environment is not fast.

Do you have some estimate how long a run takes? Currently the slowest runners take ~45 minutes to complete a job.

It would be an option to just have a minimal configuration running (i.e. set LIBCXX_ENABLE_RANDOM_DEVICE, LIBCXX_ENABLE_LOCALIZATION, LIBCXX_ENABLE_UNICODE, etc. to Off). That will probably reduce the runtime quite a bit (but of course also lower coverage). You can then enable these once there is native hardware available to run the tests (which should be the case soon-ish if I understand your commit message from D143172 correctly).

The check-cxx tests runs for ~2 hours both on 5950X with qemu-user and on SG2042 (64C RISC-V), which is likely the most powerful riscv hardware today.

asb added a comment.Feb 22 2023, 6:52 AM

I'm not sure what exactly our requirements are on CI infrastructure, but I think for adding a new platform some compute resources have to be provided. I think the "existing infrastructure" note is mostly about general configurations, like a new C++ version, testing the unstable ABI and similar stuff. I'm not sure though. @ldionne should be able to give a definite answer here.

Hi @ldionne - just a ping for your thoughts here (and on your thoughts about native RISC-V precommit being slow).

I'd imagine the ideal case here for giving reasonable coverage without taking too long would be to cross-compile everything (all the compile-time tests would be covered here at least on the fast x86 host), and to run the tests that need executing using qemu-user-static. It looks like the linked Android test scripts are doing native x86_64 testing - is there an example in the current CI of a cross-build setup?

There's also the question of what such tests would run on. If you have existing infra that's open to adding new jobs, it definitely seems easiest if that can be used - and it reduces the risk of unwanted failures due to bot-specific issues. But I can explore spinning up something myself if necessary.

I'm not sure what exactly our requirements are on CI infrastructure, but I think for adding a new platform some compute resources have to be provided. I think the "existing infrastructure" note is mostly about general configurations, like a new C++ version, testing the unstable ABI and similar stuff. I'm not sure though. @ldionne should be able to give a definite answer here.

Hi @ldionne - just a ping for your thoughts here (and on your thoughts about native RISC-V precommit being slow).

I'd imagine the ideal case here for giving reasonable coverage without taking too long would be to cross-compile everything (all the compile-time tests would be covered here at least on the fast x86 host), and to run the tests that need executing using qemu-user-static. It looks like the linked Android test scripts are doing native x86_64 testing - is there an example in the current CI of a cross-build setup?

There's also the question of what such tests would run on. If you have existing infra that's open to adding new jobs, it definitely seems easiest if that can be used - and it reduces the risk of unwanted failures due to bot-specific issues. But I can explore spinning up something myself if necessary.

What hardware does https://lab.llvm.org/staging/#/builders/241 run on? I think we should use that hardware. If executing the tests is too slow, one thing we could do is only compile the tests without running them. This is definitely possible, we do that for some internal configurations, but it does require a bit of work on the Lit side. We do not have an example of how to do that upstream right now.

For starters, what I would suggest is to set up a proper BuildKite bot that runs your RISC-V tests. Now that's going to be prohibitively slow, so I can suggest ways to make that faster and help you achieve the compile-only test suite on that Phab review once it's open. You can reach out to me privately on Discord to get a BuildKite agent token for your RISC-V runners.

asb added a comment.Mar 8 2023, 2:59 AM

I'm not sure what exactly our requirements are on CI infrastructure, but I think for adding a new platform some compute resources have to be provided. I think the "existing infrastructure" note is mostly about general configurations, like a new C++ version, testing the unstable ABI and similar stuff. I'm not sure though. @ldionne should be able to give a definite answer here.

Hi @ldionne - just a ping for your thoughts here (and on your thoughts about native RISC-V precommit being slow).

I'd imagine the ideal case here for giving reasonable coverage without taking too long would be to cross-compile everything (all the compile-time tests would be covered here at least on the fast x86 host), and to run the tests that need executing using qemu-user-static. It looks like the linked Android test scripts are doing native x86_64 testing - is there an example in the current CI of a cross-build setup?

There's also the question of what such tests would run on. If you have existing infra that's open to adding new jobs, it definitely seems easiest if that can be used - and it reduces the risk of unwanted failures due to bot-specific issues. But I can explore spinning up something myself if necessary.

What hardware does https://lab.llvm.org/staging/#/builders/241 run on? I think we should use that hardware. If executing the tests is too slow, one thing we could do is only compile the tests without running them. This is definitely possible, we do that for some internal configurations, but it does require a bit of work on the Lit side. We do not have an example of how to do that upstream right now.

For starters, what I would suggest is to set up a proper BuildKite bot that runs your RISC-V tests. Now that's going to be prohibitively slow, so I can suggest ways to make that faster and help you achieve the compile-only test suite on that Phab review once it's open. You can reach out to me privately on Discord to get a BuildKite agent token for your RISC-V runners.

It's running on a fast x86_64 host, using qemu-user within a RISC-V chroot/container. If we're going to be doing compile time tests anyway, wouldn't it make a lot more sense to do those compile-time tests using a cross compiler (i.e. running natively on a fast x86_64 host). If we do so, then it's just another configuration on the runners you already have.

ldionne commandeered this revision.Sep 7 2023, 8:42 AM
ldionne edited reviewers, added: asb; removed: ldionne.

[Github PR transition cleanup]

Commandeering to apply. I think it's better to apply this patch without proper CI than to drop it forever.

ldionne accepted this revision.Sep 7 2023, 8:45 AM
This revision is now accepted and ready to land.Sep 7 2023, 8:45 AM
This revision was automatically updated to reflect the committed changes.