This is an archive of the discontinued LLVM Phabricator instance.

[Support] Extend sys::path with user_cache_directory function.
ClosedPublic

Authored by chfast on Oct 16 2015, 5:17 AM.

Details

Summary

The new function sys::path::user_cache_directory tries to discover
a directory suitable for cache storage for current system user.

On Windows and Darwin it returns a path to system-specific user cache directory.

On Linux it follows XDG Base Directory Specification, what is:

  • use non-empty $XDG_CACHE_HOME env var,
  • use $HOME/.cache.

Diff Detail

Repository
rL LLVM

Event Timeline

chfast updated this revision to Diff 37571.Oct 16 2015, 5:17 AM
chfast retitled this revision from to [Support] Extend sys::path with user_cache_directory function..
chfast updated this object.
chfast added a subscriber: llvm-commits.
aaron.ballman added a subscriber: aaron.ballman.
aaron.ballman added inline comments.
include/llvm/Support/Path.h
328 ↗(On Diff #37571)

Should we be using @brief here like the rest of the file?

lib/Support/Unix/Path.inc
592 ↗(On Diff #37571)

Please use / only for doxygen comments, otherwise.

lib/Support/Windows/Path.inc
773 ↗(On Diff #37571)

How is this intended to be used? Is the caller expected to place a directory under the user_cache_directory() and use that, or are they expecting this to be a folder where they can dump cached data? I ask because putting files directly into the local AppData directory is not a particularly good thing to do on Windows.

Thanks for the review.

include/llvm/Support/Path.h
328 ↗(On Diff #37571)

@brief has become redundant some time ago as Doxygen's option JAVADOC_AUTOBRIEF has been turned on. The brief is now first line until dot.

lib/Support/Unix/Path.inc
592 ↗(On Diff #37571)

I actually wanted that to be exported as a Doxygen note. But that might not be the best idea as it is Unix specific.

lib/Support/Windows/Path.inc
773 ↗(On Diff #37571)

That's right. An application should append its name to the returned path in general case. But that issues is already present on Darwin with system_temp_directory(false) that returns a path of "~/Library/Caches".

That should be definitely clarified in the docs.

I'm thinking about adding optional param to the function with a path that will be appended to the returned path. Then usage could be `user_cache_directory(result, "vendor/appname"). What do you think?

Aaron, could you comment my comments?

Aaron, could you comment my comments?

Sorry for the delayed response; I'm in standards meetings this week, and vacation next, so I'm a bit hard to pin down.

include/llvm/Support/Path.h
328 ↗(On Diff #37571)

I was speaking more for conformity within the same file. We usually try to match the style of the file, even if it's not strictly the correct thing to do style-wise.

I don't feel strongly. I just prefer consistency. Easier on the eyes. ;-)

lib/Support/Unix/Path.inc
592 ↗(On Diff #37571)

Good to know. I would still recommend only using //.

lib/Support/Windows/Path.inc
773 ↗(On Diff #37571)

I am not certain that's a good idea because then it embeds path separators in the string literal. If there's a way to pass that information that's abstracted (I'm not certain if something in fs would suffice or not), then I think it's a good idea.

chfast updated this revision to Diff 37977.Oct 21 2015, 3:02 AM
chfast edited edge metadata.
  • Update documenting comments.
chfast marked 2 inline comments as done.Oct 21 2015, 3:07 AM
chfast added inline comments.
lib/Support/Windows/Path.inc
775 ↗(On Diff #37977)

It looks the best we can do is to repeat path::append interface that has optional path elements of type llvm::Twine. I don't think it is a good idea to do so. I've updated the doc comment of path::user_cache_directory and in my opinions it is enough.

btw, what will be using this?

btw, what will be using this?

Personally I'm using it for keeping JIT cached objects. I believe it is more suitable for that case than system_temp_directory (and in case of MacOS is just more explicit).

rafael added inline comments.Oct 28 2015, 8:16 AM
include/llvm/Support/Path.h
332 ↗(On Diff #37977)

Maybe take a twine argument and do the appending in here? It is easy to forget to read documentation :-).

lib/Support/Unix/Path.inc
636 ↗(On Diff #37977)

Should we drop the EraseOnReboot argument from this function? We would now have

user_cache_directory
system_temp_directory(true
system_temp_directory(false

When should each one be used?

chfast added inline comments.Oct 28 2015, 8:35 AM
include/llvm/Support/Path.h
332 ↗(On Diff #37977)

That can be done. I would go for 2 Twine optional args to support <vendor>/<application> convention.

But that duplicates features of path::append and you still need to append a file name. The common pattern is:

if (sys::path::user_cache_directory(Dir)) {
  sys::path::append(Dir, "LLVM", "Kaleidoscope", CacheFileName);
}

and the above would be replaced with:

if (sys::path::user_cache_directory(Dir, "LLVM", "Kaleidoscope")) {
  sys::path::append(Dir, CacheFileName);
}

Which one is better?

lib/Support/Unix/Path.inc
636 ↗(On Diff #37977)

user_cache_directory is similar to system_temp_directory(false) (level of similarity is OS-specific). We can drop system_temp_directory(false) then, but because it is the API change it's probably bad idea to do it at once.

In the end I would like to have:
user_cache_directory
system_temp_directory
and maybe user_temp_directory

I think that needs to be discussed. And we can agree on the final solution I'm happy to implement it.

and the above would be replaced with:

if (sys::path::user_cache_directory(Dir, "LLVM", "Kaleidoscope")) {
  sys::path::append(Dir, CacheFileName);
}

Which one is better?

The second one if at least the first argument is mandatory. That way
we make sure we never write directly to the top level folder.

Comment at: lib/Support/Unix/Path.inc:636
@@ -585,3 +635,3 @@

void system_temp_directory(bool ErasedOnReboot, SmallVectorImpl<char> &Result) {

Result.clear();

rafael wrote:

Should we drop the EraseOnReboot argument from this function? We would now have

user_cache_directory
system_temp_directory(true
system_temp_directory(false

When should each one be used?

user_cache_directory is similar to system_temp_directory(false) (level of similarity is OS-specific). We can drop system_temp_directory(false) then, but because it is the API change it's probably bad idea to do it at once.

In the end I would like to have:
user_cache_directory
system_temp_directory
and maybe user_temp_directory

I think that needs to be discussed. And we can agree on the final solution I'm happy to implement it.

Awesome. Yes, multiple patches are better.

Cheers,
Rafael

chfast updated this revision to Diff 38733.Oct 29 2015, 7:00 AM
chfast added a reviewer: rafael.

Function path::user_cache_directory has been extended with params of additional paths to be appended to the resulting path. First of the params is mandatory.

rafael accepted this revision.Oct 29 2015, 2:28 PM
rafael edited edge metadata.

LGTM with the variable names changed to the llvm style (Path1, Path2, etc).

This revision is now accepted and ready to land.Oct 29 2015, 2:28 PM
aaron.ballman accepted this revision.Nov 1 2015, 2:39 PM
aaron.ballman edited edge metadata.

LGTM with a testing request.

unittests/Support/Path.cpp
340 ↗(On Diff #38733)

It would be nice to have a test that ensures non-ASCII characters in the path do not get corrupted. I don't think there's a reasonable way to test that for the user_cache_directory itself, but the paths we append would at least give some peace of mind.

chfast updated this revision to Diff 38874.Nov 2 2015, 1:17 AM
chfast edited edge metadata.
chfast marked an inline comment as done.
  • Params and variables names capitalized.
  • A unittest with Unicode names added.
This revision was automatically updated to reflect the committed changes.