We would like to move ThinLTO’s battle-tested file caching mechanism to the LLVM Support library so that we can use it elsewhere in LLVM.
Details
- Reviewers
MaskRay gkm phosek pcc tejohnson - Group Reviewers
Restricted Project - Commits
- rGe678c5117710: [Support][ThinLTO] Move ThinLTO caching to LLVM Support library
rG92b8cc52bbc8: [Support][ThinLTO] Move ThinLTO caching to LLVM Support library
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
24,480 ms | x64 debian > libarcher.races::task-taskwait-nested.c |
Event Timeline
Can you state your intended use case (make that patch a dependent of this one by clicking Edit Related Revisions if you have such a patch) so that others can analyze whether code sharing is justified?
@MaskRay Thank you for pointing this out. I just added this!
The use case is a Debuginfod client implementation.
The AssetCache in the debuginfod client revision will be replaced by the localCache that was implemented for ThinLTO, after this revision to move the caching to Support.
ThinLTO's caching code has advantages over our own, such as the tweaks for Windows compatibility (example).
llvm/lib/Support/Caching.cpp | ||
---|---|---|
143 | I would make this configurable via function argument and set this to Thin for the LTO uses to avoid any behavioral changes. |
This lgtm but I agree with @phosek 's suggestion to configure the file prefix and keep it as "Thin" for ThinLTO.
Make file prefix and error prefix configurable.
This sets the prefix for existing use to "Thin" to avoid chaning the behavior.
llvm/lib/Support/Caching.cpp | ||
---|---|---|
31 | Can we make this a Twine? I'd also make this argument the first one. |
Adjust cache name/prefix customization.
Move customization arguments to be first. Convert to twines, remove unnecessary copy. Separate customized file prefix from customized cache name (to avoid chaning any behavior from existing ThinLTO cache).
llvm/lib/Support/Caching.cpp | ||
---|---|---|
37 | Thanks again for this suggestion. On second thought I am concerned about lifetime issues with this approach. The localCache function returns a lambda which captures the arguments by copy. The Twine and StringRef classes just point to some temporary memory which could be modified/freed before the returned lambda is invoked. Making a copy (or just running the sys::path::append before returning the lambda) would protect from such bugs. Here is a stripped-down example of the issue I'm concerned about: #include <iostream> #include <string.h> #include "llvm/ADT/Twine.h" using namespace llvm; auto localCacheDummy(Twine FooTwine) { return [=] () { std::cout << FooTwine.str(); }; } int main() { std::string SpecialString = "hello here is a string\n"; auto myFunction = localCacheDummy(SpecialString); myFunction(); SpecialString = "ok I have replaced the string contents\n"; myFunction(); return 0; } This outputs: hello here is a string ok I have replaced the string contents This is an issue for StringRef as well, so I am concerned the existing caching code is unsafe as well. This was probably fine when it was used solely with the usage pattern in ThinLTO. As we move it to the support library should we make it more safe? In particular, we could keep the parameters Twines but only access them in the top part of the localCache body outside the lambdas. |
llvm/lib/Support/Caching.cpp | ||
---|---|---|
37 | Hmm that is a good catch. I think you are right that we should be making a copy of all of these string reference parameters outside of the lambda (which we then capture by copy) to address this issue. |
llvm/lib/Support/Caching.cpp | ||
---|---|---|
37 | Thanks! These explicit copies have been added at the top of localCache. |
Make cache dir parameter a Twine and avoid unnecessary copy in conversion to SmallString.
@tejohnson Thanks for your comments! From my perspective this patch is done. Please let me know if you have any further changes to suggest before accepting. Thanks!
llvm/include/llvm/Support/Caching.h | ||
---|---|---|
2 | There needs to be a *- C++ -*-. |
Migrate references to Caching.cpp from llvm/utils/gn/secondary/llvm/lib/LTO/BUILD.gn to [...]/Support/BUILD.gn.
There needs to be a *- C++ -*-.
Can you check whether the header has 80 characters?