This is an archive of the discontinued LLVM Phabricator instance.

[Support][ThinLTO] Move ThinLTO caching to LLVM Support library.
ClosedPublic

Authored by noajshu on Oct 7 2021, 5:43 PM.

Details

Summary

We would like to move ThinLTO’s battle-tested file caching mechanism to the LLVM Support library so that we can use it elsewhere in LLVM.

Diff Detail

Event Timeline

noajshu created this revision.Oct 7 2021, 5:43 PM
Herald added a reviewer: gkm. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
noajshu requested review of this revision.Oct 7 2021, 5:43 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 7 2021, 5:43 PM

Can you state your intended use case (make that patch a dependent of this one by clicking Edit Related Revisions if you have such a patch) so that others can analyze whether code sharing is justified?

@MaskRay Thank you for pointing this out. I just added this!
The use case is a Debuginfod client implementation.
The AssetCache in the debuginfod client revision will be replaced by the localCache that was implemented for ThinLTO, after this revision to move the caching to Support.
ThinLTO's caching code has advantages over our own, such as the tweaks for Windows compatibility (example).

phosek added inline comments.Oct 7 2021, 7:20 PM
llvm/lib/Support/Caching.cpp
139

I would make this configurable via function argument and set this to Thin for the LTO uses to avoid any behavioral changes.

This lgtm but I agree with @phosek 's suggestion to configure the file prefix and keep it as "Thin" for ThinLTO.

noajshu updated this revision to Diff 378289.Oct 8 2021, 10:23 AM

Make file prefix and error prefix configurable.
This sets the prefix for existing use to "Thin" to avoid chaning the behavior.

noajshu updated this revision to Diff 378296.Oct 8 2021, 10:35 AM

Rebase against main.

phosek added inline comments.Oct 8 2021, 10:49 AM
llvm/lib/Support/Caching.cpp
32

Can we make this a Twine? I'd also make this argument the first one.

tejohnson added inline comments.Oct 8 2021, 10:57 AM
llvm/include/llvm/Support/Caching.h
69

Nit: document new parameter.

llvm/lib/Support/Caching.cpp
36

Is this copy necessary? I believe sys::path::append takes a Twine and eventually makes a copy of it.

noajshu updated this revision to Diff 378316.Oct 8 2021, 11:28 AM

Adjust cache name/prefix customization.
Move customization arguments to be first. Convert to twines, remove unnecessary copy. Separate customized file prefix from customized cache name (to avoid chaning any behavior from existing ThinLTO cache).

noajshu marked 3 inline comments as done.Oct 8 2021, 11:30 AM
noajshu added inline comments.
llvm/lib/Support/Caching.cpp
32

Done! Also split out the customized cache name and file prefixes, to avoid changing any behavior. In ThinLTO these were "ThinLTO" and "Thin" respectively.

36

Thanks, removed!

noajshu updated this revision to Diff 378328.Oct 8 2021, 11:55 AM
noajshu marked 2 inline comments as done.

Add documentation for new parameters to localCache.

noajshu marked an inline comment as done.Oct 8 2021, 11:56 AM
noajshu marked 3 inline comments as not done.Oct 8 2021, 1:50 PM
noajshu added inline comments.
llvm/lib/Support/Caching.cpp
36

Thanks again for this suggestion. On second thought I am concerned about lifetime issues with this approach. The localCache function returns a lambda which captures the arguments by copy. The Twine and StringRef classes just point to some temporary memory which could be modified/freed before the returned lambda is invoked. Making a copy (or just running the sys::path::append before returning the lambda) would protect from such bugs.

Here is a stripped-down example of the issue I'm concerned about:

#include <iostream>
#include <string.h>
#include "llvm/ADT/Twine.h"

using namespace llvm;

auto localCacheDummy(Twine FooTwine) {
	return [=] () {
		std::cout << FooTwine.str();
	};
}

int main() {
	std::string SpecialString = "hello here is a string\n";
	auto myFunction = localCacheDummy(SpecialString);
	myFunction();
	SpecialString = "ok I have replaced the string contents\n";
	myFunction();
	return 0;
}

This outputs:

hello here is a string
ok I have replaced the string contents

This is an issue for StringRef as well, so I am concerned the existing caching code is unsafe as well. This was probably fine when it was used solely with the usage pattern in ThinLTO. As we move it to the support library should we make it more safe? In particular, we could keep the parameters Twines but only access them in the top part of the localCache body outside the lambdas.

tejohnson added inline comments.Oct 8 2021, 11:06 PM
llvm/lib/Support/Caching.cpp
36

Hmm that is a good catch. I think you are right that we should be making a copy of all of these string reference parameters outside of the lambda (which we then capture by copy) to address this issue.

noajshu updated this revision to Diff 378785.Oct 11 2021, 1:45 PM
noajshu marked an inline comment as not done.

Make copies of string reference parameters outside the returned lambda.

noajshu added inline comments.Oct 11 2021, 1:52 PM
llvm/lib/Support/Caching.cpp
36

Thanks! These explicit copies have been added at the top of localCache.

noajshu updated this revision to Diff 378787.Oct 11 2021, 1:56 PM

rebase against main

noajshu updated this revision to Diff 379095.Oct 12 2021, 10:08 AM

rebase against main

noajshu marked 2 inline comments as done.Oct 12 2021, 10:14 AM
noajshu updated this revision to Diff 379104.Oct 12 2021, 10:34 AM

Make cache dir parameter a Twine and avoid unnecessary copy in conversion to SmallString.

noajshu marked an inline comment as done.Oct 12 2021, 10:34 AM

@tejohnson Thanks for your comments! From my perspective this patch is done. Please let me know if you have any further changes to suggest before accepting. Thanks!

noajshu updated this revision to Diff 379598.Oct 13 2021, 8:17 PM

Rebase against main.

noajshu updated this revision to Diff 379787.Oct 14 2021, 11:21 AM

rebase against main

This revision is now accepted and ready to land.Oct 14 2021, 1:36 PM
MaskRay accepted this revision.Oct 14 2021, 4:10 PM
MaskRay added inline comments.
llvm/include/llvm/Support/Caching.h
2

There needs to be a *- C++ -*-.
Can you check whether the header has 80 characters?

noajshu updated this revision to Diff 379912.Oct 14 2021, 8:56 PM

Add *- C++ -*- header indicator to Caching.h

noajshu marked an inline comment as done.Oct 14 2021, 8:56 PM
noajshu added inline comments.
llvm/include/llvm/Support/Caching.h
2

Thank you, nice catch @MaskRay . I just added this indicator and can confirm that the header has 80 characters.

noajshu updated this revision to Diff 380039.Oct 15 2021, 9:34 AM
noajshu marked an inline comment as done.

rebase against main

This revision was automatically updated to reflect the committed changes.
phosek reopened this revision.Oct 18 2021, 12:34 PM
This revision is now accepted and ready to land.Oct 18 2021, 12:34 PM
noajshu updated this revision to Diff 380498.Oct 18 2021, 12:38 PM

Update references to LTO/Caching in Gold plugin and Clang docs; rebase against main.

noajshu updated this revision to Diff 380527.Oct 18 2021, 3:01 PM

Migrate references to Caching.cpp from llvm/utils/gn/secondary/llvm/lib/LTO/BUILD.gn to [...]/Support/BUILD.gn.

This revision was automatically updated to reflect the committed changes.
lld/wasm/LTO.cpp