This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Run modules_include.sh.cpp compiles in parallel
ClosedPublic

Authored by philnik on Feb 23 2023, 4:34 AM.

Details

Diff Detail

Event Timeline

philnik created this revision.Feb 23 2023, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2023, 4:34 AM
philnik requested review of this revision.Feb 23 2023, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2023, 4:34 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

I wonder whether bash and wait will work across different platforms but otherwise looks better than my solution.

I wonder whether bash and wait will work across different platforms but otherwise looks better than my solution.

Probably not, but it's already XFAILed for AIX, FreeBSD, Windows and Android, so it probably wasn't really portable before.

Some numbers:

RunTest old timeTest new timeall old timeall new time
C++2b853s55s15m 33s8m 58s
Modular build763s51s14m 7s13m 6s
C++03227s14s4m 54s4m 7s

I have no idea what's going on on the windows runners. That's probably unrelated, since on windows the test is marked UNSUPPORTED. On Apple the test has a timeout now, probably because of resource contention, since this test starts ~140 processes in parallel now and they are much weaker than the Linux runners. I'll try to change the script to run at most 16 processes in parallel to avoid the contention. In a perfect world we'd use the lit thread pool for this, but that would require teaching lit about multiple independent processes from a single test file, which would take a lot longer than changing the script here. Let's fix the test first and then make it better.

philnik updated this revision to Diff 499866.Feb 23 2023, 8:26 AM

Try to avoid too much resource contention

philnik updated this revision to Diff 499931.Feb 23 2023, 11:45 AM

Rebased - hopefully this fixes the Windows problems.

philnik accepted this revision.Feb 23 2023, 1:23 PM

I'll land this now, since it fixes the CI and actually lets a lot of the runs be a lot faster. Feel free to review post-commit and I'll create a patch to address any comments.

This revision is now accepted and ready to land.Feb 23 2023, 1:23 PM
This revision was landed with ongoing or failed builds.Feb 23 2023, 1:24 PM
This revision was automatically updated to reflect the committed changes.

Thanks for this. On Arm 32 bit the test is down to 5 minutes now (could be faster I expect but we have a limited core count).

I'll land this now, since it fixes the CI and actually lets a lot of the runs be a lot faster. Feel free to review post-commit and I'll create a patch to address any comments.

I agree with our reasoning. I think it would be good for @ldionne to have a look.

libcxx/test/libcxx/modules_include.sh.cpp
40

Can you limit this to the number of cores on the system, for example, by using nproc?

Hi @philnik,

here is a problem with the test during the remote runs on the windows cross builders. A lot of those messages because of broken path:

clang++: error: no such file or directory: 'C:buildbotas-builder-1x-armv7lllvm-projectlibcxxtestlibcxxmodules_include.sh.cpp'
clang++: error: no input files
...

This is the windows to arm-linux cross builders, which run the tests on the arm/aarch64 boards.

Would you fix that or revert the changes?

Hi @philnik,

here is a problem with the test during the remote runs on the windows cross builders. A lot of those messages because of broken path:

clang++: error: no such file or directory: 'C:buildbotas-builder-1x-armv7lllvm-projectlibcxxtestlibcxxmodules_include.sh.cpp'
clang++: error: no input files
...

This is the windows to arm-linux cross builders, which run the tests on the arm/aarch64 boards.

Would you fix that or revert the changes?

Since this is isn't a libc++-supported configuration AFAICT and reverting the patch would break our own CI, I don't think that's an option. I'm not actually sure why the test runs at all, since it looks like you are running a windows-arm build, which should be disabled by the // UNSUPPORTED: windows. It might be enough to drop the bash -c to avoid making the backslashes escape sequences. I can't remember why I used the in the first place, since I just use echo directly in other places. I'll make a patch to simplify this.

libcxx/test/libcxx/modules_include.sh.cpp
40

I'm not sure how I would do that without building a thread pool. Just the number of cores doesn't really help. At that point it's probably easier to integrate with lit.

@vvereschaka I think your issue with escaping should indeed be solved by https://reviews.llvm.org/D144825. I suspect you ran into this issue because your executor (which must be some flavor of libcxx/utils/ssh.py must treat escapes differently.

Could you work with us on transitioning your build bot over to BuildKite so we run it as pre-commit? This is our supported way of doing CI now and it has the benefit that we see problems before the happen, not after.

@philnik,

Since this is isn't a libc++-supported configuration AFAICT and reverting the patch would break our own CI, I don't think that's an option.

This is supported libc++ configuration.

I'm not actually sure why the test runs at all, since it looks like you are running a windows-arm build, which should be disabled by the // UNSUPPORTED: windows.

Because the target is Arm/Aarch64 Linux. This is Window-to-ARM-Linux cross builders. // UNSUPPORTED: windows works for the Windows targets.

So, the test became broken on the windows cross builders. I'm going to disable it there for awhile.

@ldionne,

@vvereschaka I think your issue with escaping should indeed be solved by https://reviews.llvm.org/D144825.

Unfortunately it wasn't solved.

I suspect you ran into this issue because your executor (which must be some flavor of libcxx/utils/ssh.py must treat escapes differently.

I use libcxx's (libcxx/utils/ssh.py) executor to run the remote tests. The problem occurs on the build host side during the test preparations, before invoking ssh.py.

Could you work with us on transitioning your build bot over to BuildKite so we run it as pre-commit?

I'll think, I'll try to find some hardware for that purpose. But currently there is no possibility to convert the existing cross builders into the buildkite agents.

@philnik,

Since this is isn't a libc++-supported configuration AFAICT and reverting the patch would break our own CI, I don't think that's an option.

This is supported libc++ configuration.

I'm not actually sure why the test runs at all, since it looks like you are running a windows-arm build, which should be disabled by the // UNSUPPORTED: windows.

Because the target is Arm/Aarch64 Linux. This is Window-to-ARM-Linux cross builders. // UNSUPPORTED: windows works for the Windows targets.

Ah, that makes sense. I thought it was compiling for windows-arm, which we don't support.

So, the test became broken on the windows cross builders. I'm going to disable it there for awhile.

@ldionne,

@vvereschaka I think your issue with escaping should indeed be solved by https://reviews.llvm.org/D144825.

Unfortunately it wasn't solved.

I suspect you ran into this issue because your executor (which must be some flavor of libcxx/utils/ssh.py must treat escapes differently.

I use libcxx's (libcxx/utils/ssh.py) executor to run the remote tests. The problem occurs on the build host side during the test preparations, before invoking ssh.py.

Could you work with us on transitioning your build bot over to BuildKite so we run it as pre-commit?

I'll think, I'll try to find some hardware for that purpose. But currently there is no possibility to convert the existing cross builders into the buildkite agents.

Unfortunately I don't think I can really try to fix it without having access to some hardware. If you could give me access to a machine I'd be happy to work on fixing this.

Unfortunately I don't think I can really try to fix it without having access to some hardware. If you could give me access to a machine I'd be happy to work on fixing this.

Sorry, I can't give an access currently. But the problem is clear - this is a handling of the windows-style pathes by the current test. lit and clang handle them correctly, but when you switch to the unix tool (such as bash) on Windows, these tools do not work properly with that kind of pathes. They process the \x characters within the windows-style path as the control sequences and break the original path. The test became too bash specific/depended and now it needs a proper processing/converting the pathes (\ -> /) to work correctly on Windows hosts. sed -e s/\\/\//g on %s may help may be, but anyway it doesn't look portable.

Mordante added inline comments.Mar 11 2023, 5:30 AM
libcxx/test/libcxx/modules_include.sh.cpp
40

Then please use a way with lit, this test now uses a lot of resources on machines with less than 16 cores.