This is an archive of the discontinued LLVM Phabricator instance.

[clang-scan-deps] Improvements to thread usage
AbandonedPublic

Authored by saudi on May 17 2021, 8:27 AM.

Details

Summary

Switching back to std::thread to simplify the logic, as ThreadPool adds a layer of task management. However, keeping the ThreadPoolStrategy use, to benefit the ThreadPool improvements from https://reviews.llvm.org/D71775

This patch partly reverts https://reviews.llvm.org/D74569

Added a few optimizations:

    • Prevent from spawning more threads than there are files to process;
    • Perform computations on the main thread too, to save one thread spawn. This way, clang-scan-deps -j1 will be really single-threaded, and the code path is the same as when LLVM threads are disabled.
  • Call reserve() to std::vectors which size is known when initializing.

Diff Detail

Event Timeline

saudi requested review of this revision.May 17 2021, 8:27 AM
saudi created this revision.

It might be good for @aganea to take a look as well.

aganea added a subscriber: mehdi_amini.EditedMay 21 2021, 6:08 AM

It might be good for @aganea to take a look as well.

Thanks! I actually work with @saudi, I already took a look at the patch before uploading. However I'm stil torn about running one of the workers on the main thread. I fear that we could have random errors because of the stack size of the "main" thread could be different from the stack size of the "satellite" threads. There's 99.99% chance that this won't happen, but I'd prefer that behavior to be explicit.

We could have:

-j0: use all hardware threads
-j1: don't use multi-threading, run all the tasks on the main thread
-jN: use the specified number of threads

The rationale is that we're using clang-scan-deps as-a-DLL in Fastbuild, to extract the dependencies. Since Fastbuild already has its own thread pool management, we call into clang-scan-deps with -j1 from different Fastbuild threads (but keeping the DependencyScanningService alive between calls). It would great if each call to clang-scan-deps wouldn't create a extra new thread.

Perhaps any of you would like to comment? +@mehdi_amini

It might be good for @aganea to take a look as well.

Thanks! I actually work with @saudi, I already took a look at the patch before uploading. However I'm stil torn about running one of the workers on the main thread. I fear that we could have random errors because of the stack size of the "main" thread could be different from the stack size of the "satellite" threads. There's 99.99% chance that this won't happen, but I'd prefer that behavior to be explicit.

We could have:

-j0: use all hardware threads
-j1: don't use multi-threading, run all the tasks on the main thread
-jN: use the specified number of threads

The rationale is that we're using clang-scan-deps as-a-DLL in Fastbuild, to extract the dependencies. Since Fastbuild already has its own thread pool management, we call into clang-scan-deps with -j1 from different Fastbuild threads (but keeping the DependencyScanningService alive between calls). It would great if each call to clang-scan-deps wouldn't create a extra new thread.

Perhaps any of you would like to comment? +@mehdi_amini

Seems like if you want/need a DLL you should be using a library interface not the tool interface. If you can't / don't want to use the C++ API in clang/lib/Tooling, then we should push out a C API. Maybe you can confirm that https://github.com/apple/llvm-project/blob/apple/main/clang/include/clang-c/Dependencies.h in libclang will work for your needs? It landed on the fork just because it was experimental and libclang has a stable API promise, but I think the names of the functions make that clear enough -- e.g., clang_experimental_DependencyScannerWorker_create_v0 seems well-enough named that no one expects permanent compatibility -- and it seems pretty awkward to be using the tool as a DLL to work around not having this in tree. Then your FastBuild can directly manage the service and the workers. WDYT?

@arphaman / @Bigcheese / @jansvoboda11 , thoughts?

aganea added a comment.EditedMay 27 2021, 8:31 PM

@dexonsmith Yes, using the Clang Tooling C++ API seems like a good option, however some logic could/would be needed from clang/tools/clang-scan-deps/ClangScanDeps.cpp. Using clang-scan-deps as-a-DLL seemed like the best option on the short term, since we're using its main() almost verbatim with only a few minor changes. We still need to solve a contention issue related to the Windows implementation of sys::fs::status(), but then once everything is sorted out we'll re-consider the usage of the DLL.

In the meanwhile, do you have any opinion on this patch and/or on https://reviews.llvm.org/D102633#2773578 ? Do you think there's value for users?

saudi abandoned this revision.Jul 27 2023, 8:44 AM

@dexonsmith Yes, using the Clang Tooling C++ API seems like a good option, however some logic could/would be needed from clang/tools/clang-scan-deps/ClangScanDeps.cpp. Using clang-scan-deps as-a-DLL seemed like the best option on the short term, since we're using its main() almost verbatim with only a few minor changes. We still need to solve a contention issue related to the Windows implementation of sys::fs::status(), but then once everything is sorted out we'll re-consider the usage of the DLL.

Abandonning this patch, since we ended up creating a separate tool based on ClangScanDeps, modified and simplified for our needs: builds as a DLL, no -j # support as the main function is itself called from different threads, etc...

Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 8:44 AM