Noticed the mistake while reviewing the generated documentation.
- Queries
- All Stories
- Search
- Advanced Search
- Transactions
- Transaction Logs
Advanced Search
Mar 17 2023
Mar 15 2023
@aaron.ballman, can you land this revision for me please?
Mar 14 2023
LGTM!
Mar 13 2023
I have implemented the setter for the new option locally and tested it in KDevelop.
void clang_CXIndex_setStorePreamblesInMemory(CXIndex CIdx, int storePreamblesInMemory) { if (CIdx) static_cast<CIndexer *>(CIdx)->setStorePreamblesInMemory( storePreamblesInMemory); }
Works as expected: new preambles are created in memory or the temporary directory depending on the option selected in the UI. Even if the temporary storage option ends up in memory, all remaining preambles are removed from the temporary directory on exit.
Just created the follow-up StorePreamblesInMemory revision: D145974.
Mar 10 2023
Mar 9 2023
Please update the commit message (there is no clang_isFieldDeclBitWidthDependent anymore) and update the revision with arc diff --verbatim @~.
Mar 7 2023
In D143418#4175628, @aaron.ballman wrote:Hmmm, don't relaxed loads and stores still have the potential to be racey? I thought you needed a release on the store and an acquire on the load (or sequential consistency), but this is definitely not my area of expertise.
The acquire/release semantics is needed if the atomic variable guards access to another non-atomic variable or the atomic pointer guards access to its non-atomic pointed-to value. The relaxed order means no guarantees about this variable's interaction with other atomic or non-atomic variables. But even the relaxed order prevents data races on the variable itself, which is sufficient here.
The option can also be added to CXIndexOptions in order to allow setting its initial value reliably (not worrying whether it could be used before the setter gets called after index construction).
Is adding both the setter and the CXIndexOptions member OK or would you prefer only one of these two?
It's a bit odd to me that we're deprecating the global option setters to turn around and add a new global option setter. The old interfaces are not thread safe and the new one is thread safe, so I get why the desire exists and is reasonable, but (personally) I'd like to get rid of the option state setters entirely someday.
I also thought about the deprecated old interfaces today. clang_CXIndex_setGlobalOptions() could be undeprecated by similarly making CIndexer::Options atomic. Safely setting std::string members would require a mutex.
What I was envisioning was that the index creation would get global options and if we wanted per-TU options, we'd add an interface akin to clang_parseTranslationUnit() which takes another options struct for per-TU options. (Perhaps the default values for those options come from the global options -- e.g., DisplayDiagnostics is set at the global level but could be overridden for a single TU). Do you have thoughts about that approach?
This would allow to specify whether to store each individual preamble in memory, which is more specific than necessary for my use case. This would shift the burden of passing around the StorePreamblesInMemory value to libclang users. Certainly more difficult to implement both in libclang and in KDevelop.
I am implementing the StorePreamblesInMemory bool option discussed earlier. It would be nice to be able to modify it at any time, because it can be an option in an IDE's UI and requiring to restart an IDE for the option change to take effect is undesirable. In order to make the setter thread-safe, the option can be stored as std::atomic<bool> StorePreamblesInMemory in class CIndexer and stored/loaded with memory_order_relaxed. Setting this option would apply only to subsequent clang_parseTranslationUnit_Impl() calls. The option can also be added to CXIndexOptions in order to allow setting its initial value reliably (not worrying whether it could be used before the setter gets called after index construction).
Thank you for the quick build fix. KDevelop's CMakeLists.txt disables this warning by adding the -Wno-missing-field-initializers flag. That's why I haven't noticed the warning while building KDevelop. Either I missed the warning while building LLVM or it is also disabled in a default GNU/Linux x86_64 build.
A clang-ppc64le-rhel build failed:
54.897 [28/169/148] Building CXX object tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o FAILED: tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o /home/buildbots/clang.15.0.4/bin/clang++ --gcc-toolchain=/usr -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/unittests/libclang -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/third-party/unittest/googletest/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/third-party/unittest/googlemock/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++17 -MD -MT tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o -MF tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o.d -o tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o -c /home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang/LibclangTest.cpp /home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang/LibclangTest.cpp:488:50: error: missing field 'ThreadBackgroundPriorityForIndexing' initializer [-Werror,-Wmissing-field-initializers] CXIndexOptions Opts = {sizeof(CXIndexOptions)}; ^ 1 error generated.
In D143418#4172587, @aaron.ballman wrote:Thank you, this LGTM! I have to head out shortly, so I'll land this on your behalf tomorrow when I have the time to babysit the postcommit build farm. However, if you'd like to request commit access for yourself, I think that'd be reasonable as well: https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access Let me know which route you'd prefer going.
https://llvm.org/docs/Phabricator.html#committing-a-change says:
Using the Arcanist tool can simplify the process of committing reviewed code as it will retrieve reviewers, the Differential Revision, etc from the review and place it in the commit message. You may also commit an accepted change directly using git push, per the section in the getting started guide.
But how to use the Arcanist tool to push reviewed changes is not elaborated. As far as I can tell from the Arcanist documentation, if I have the commit in my local main branch which is currently checked out, I simply need to run arc land without arguments. Hopefully the Git pre-push hook I have set up will run in this case.
Mar 6 2023
Clean rebase on main branch where the base revision D143415 has already landed.
I just verified that this patch fixes a KDevelop crash reported by several users. Thank you!
Mar 4 2023
Replace clang_getDefaultGlobalOptions() with CXChoice as discussed.
A few unrelated small improvements.
Mar 2 2023
Mar 1 2023
Address review comments
In D143415#4161460, @aaron.ballman wrote:In D143415#4160550, @vedgy wrote:Can someone merge/land this review? I don't have commit access.
I'm happy to do that for you -- what name and email address would you like me to use for patch attribution?
Feb 28 2023
Can someone merge/land this review? I don't have commit access.
Feb 27 2023
Feb 26 2023
Reimplement the API as discussed in review comments
Rebase on latest main branch
Feb 25 2023
@aaron.ballman, this test fix is independent from D143418 and can be reviewed separately.
Feb 24 2023
In D143418#4150360, @aaron.ballman wrote:So my intuition is that the current behavior works well enough and I doubt anyone's going to propose changes to it in the future.
So adding the GlobalOptions member to CXIndexOptions, adding the clang_getDefaultGlobalOptions() function and deprecating clang_CXIndex_setGlobalOptions in this revision is fine, right?
Feb 23 2023
In D143418#4147578, @aaron.ballman wrote:In D143418#4131156, @vedgy wrote:On second thought, the proposed clang_getDefaultGlobalOptions() API already offers users a choice to either prefer or override each option value from the env. variables, just like the existing clang_CXIndex_[gs]etGlobalOptions API. The environment variables are binary options: an existence, not value, of an env. variable determines an initial option value. So I don't understand what are the two different ways to expose these options.
I was thinking of the env variable as determining the initial option value, so the two options I saw were "the the env variable provides the final resulting value used by the program" and "the env variable provides the default value used by the program". But you're right, the current behavior is that the presence of the env variable determines the behavior, not the value of the env variable.
The question really boils down to: if the user passes in an option which says "don't use thread background priority" to the call to create index, AND there is an environment variable saying "use thread background priority", who should win? And does the answer differ if the option says "use thread background priority" and the environment variable does not exist?
The purpose of adding the GlobalOptions member to CXIndexOptions and deprecating clang_CXIndex_setGlobalOptions is to improve libclang's thread safety. The proposed approach keeps the meaning of the environment variables and enables straightforward porting of existing uses. A good reason to add the GlobalOptions member now is to reduce the number of supported struct CXIndexOptions versions.
Feb 15 2023
In D143418#4125756, @aaron.ballman wrote:
- clang_createIndex() initializes global options based on environment variable values:
if (getenv("LIBCLANG_BGPRIO_INDEX")) CIdxr->setCXGlobalOptFlags(CIdxr->getCXGlobalOptFlags() | CXGlobalOpt_ThreadBackgroundPriorityForIndexing); if (getenv("LIBCLANG_BGPRIO_EDIT")) CIdxr->setCXGlobalOptFlags(CIdxr->getCXGlobalOptFlags() | CXGlobalOpt_ThreadBackgroundPriorityForEditing);The recommended in documentation usage of clang_CXIndex_setGlobalOptions is:
* \code * CXIndex idx = ...; * clang_CXIndex_setGlobalOptions(idx, * clang_CXIndex_getGlobalOptions(idx) | * CXGlobalOpt_ThreadBackgroundPriorityForIndexing); * \endcodeSo making these options part of struct CXIndexOptions and deprecating clang_CXIndex_setGlobalOptions requires introducing another global function that would read the environment variables:
CINDEX_LINKAGE unsigned clang_getDefaultGlobalOptions();Is this the right approach?
Hmm, to make this patch easier, I think we might want to leave the environment variable behavior alone and not shift these into the options structure (yet?). Naively, I think it makes sense for these to eventually live in the options structure, but we could expose them in a few different ways (an option to prefer the env variable over a manual value as it is today or an option to prefer the manual value over the env variable for folks who want more hermetic behavior). WDYT? My opinion here isn't super strong, so if you have a strong desire to deprecate and add a replacement API, I think that's a defensible course to take.
On second thought, the proposed clang_getDefaultGlobalOptions() API already offers users a choice to either prefer or override each option value from the env. variables, just like the existing clang_CXIndex_[gs]etGlobalOptions API. The environment variables are binary options: an existence, not value, of an env. variable determines an initial option value. So I don't understand what are the two different ways to expose these options.
Feb 14 2023
In D143418#4125756, @aaron.ballman wrote:In D143418#4125098, @vedgy wrote:uint32_t Size; // = sizeof(struct CIndexOptions), used for option versioning
- uint32_t was introduced in C99. Can/should it be used in Index.h? Only built-in [unsigned] (int|long) types are currently used in this file.
That is a really good question. I couldn't spot anything existing in the header file that requires C99 or later (no // comments, no C99 types like _Bool or uint32_t, etc). I think the common pattern would be to use unsigned, which is also variably-sized (as are all the integer types, technically, thanks to CHAR_BIT), but we do have one existing use of size_t in the file, which is probably the best type to use there given that we expect people to assign the results of sizeof into it. WDYT about using size_t?
I don't have the historical context to know whether we expect this header to be C89 compatible, so it's not clear to me how disruptive it would be to use stdint.h. One alternative would be to use stdint.h if it's available (via __has_include) and otherwise fallback on a C89 custom definition of uint32_t, but that strikes me as being more likely to introduce problems on systems we don't test on or for compilers we don't test with.
size_t indeed makes logical sense for this member as that's the return type of sizeof. size_t is two times larger than unsigned on x86_64, but I don't think the size of this struct has any chance of impacting performance. Though it wouldn't hurt to pack the size and the boolean options in a single-pointer-sized region on x86_64. After all, this struct's size will never reach UINT_MAX. I slightly prefer unsigned due to my efficiency inclinations :). What do you prefer? Is there any benefit in using a fixed-size integer type here?
Feb 13 2023
uint32_t Size; // = sizeof(struct CIndexOptions), used for option versioning
- uint32_t was introduced in C99. Can/should it be used in Index.h? Only built-in [unsigned] (int|long) types are currently used in this file.
In D143418#4122821, @aaron.ballman wrote:How about including existing options, which should be set in constructor, in the initial struct version and deprecating the corresponding setters?
I think that makes a lot of sense.
How to deprecate the setters? Add DEPRECATED in the documentations as is already done in two places in Index.h?
Feb 10 2023
In D143418#4118905, @aaron.ballman wrote:I don't think "tries hard" is a good enough guarantee for where files are placed. I'm not comfortable with the security posture of that approach as it could potentially leak sensitive information (try to override the temp directory to something that's chroot jailed, get sensitive files showing up in the system temp directory anyway).
That's why the function's documentation explicitly doesn't guarantee anything. It should be safe to assume that security-sensitive users would at least read the documentation. If this function and potential future function like it are named specifically, a responsible security use case wouldn't be better off. The only safety advantage would be to those who don't bother reading the documentation. But why should we care much about such hypothetical careless security needs?
Hi @owenpan. Thank you for fixing this bug!
Have you noticed this paragraph in my bug report?
I believe clang_getTypeSpelling(), or more likely QualType::print() used by it, should insert a tab character between such tokens to pretty-print compilable code. The tab character is preferable to the space character here, because the users may rely on the fact that pretty-printed binary operators are surrounded by spaces to distinguish them from angle brackets.
KDevelop parses the result of clang_getTypeSpelling() when libclang API is lacking. Since this recent commit KDevelop's parsing relies on the empirical fact that only operators are surrounded by spaces to distinguish them from angle brackets. Does this revision introduce angle brackets surrounded by spaces? Can tab characters be used instead? If not, do you know how else such angle brackets can be distinguished from operators?
Feb 9 2023
In D143418#4116290, @aaron.ballman wrote:So how about a more sincere general solution: setPreferredTempDirPath()? The documentation would say that libclang tries its best to place temporary files into the specified directory, but some might end up in the system temporary directory instead.
I don't think those semantics are reasonable. "Tries its best" is fine for things that are heuristic based, like deciding when to do an analysis cut point, but is too non-deterministic for something like "where do files get placed". I think it's reasonable that we either:
- expose an option to say "I prefer preambles to be put in this directory" for libclang, or
- expose an option to say "I prefer using this directory over the temp directory for all files" for libclang,
I thought we agreed that #2 is too high of risk because it requires auditing libclang for all the places it decides to put something into the temp directory, and so we were going with #1 and doing it as part of the cindex constructor so that users cannot run into the surprise issues (multithreading, files changing location mid-run) with an independent setter.
This revision implements #2, but warns about possible imperfections of the implementation. KDevelop (and other potential users of the added API that need to override the temp dir location for the same reason) is not interested in overriding each temp dir usage separately. For this only known use case an imperfectly implemented clang_CXIndex_setPreferredTempDirPath() is much more convenient than a perfectly correct clang_CXIndex_setPreambleStoragePath(). With the general API name, if another temp dir use is overridden in the future, neither libclang API nor the code that uses it would have to be adjusted.
In D140756#4115381, @aaron.ballman wrote:The spare moment came sooner than expected, I've made the changes in de4321cf2cf28f2bae7fc0f1897957e73b726f89.
Isn't the global: label needed in the new LLVM_17 block/tag?
Feb 7 2023
I hoped to avoid applying an unrelated git clang-format's include reordering, but the CI fails because of that. Will apply it along with changes requested during code review.
Feb 6 2023
Address an old inline review comment
https://reviews.llvm.org/D139774#inline-1360634 and add a comment.
The added, then reverted release note was lost when this revision landed the second time.
D143418 implements my latest idea described in the previous comment. Let us continue the discussion there.
Feb 2 2023
I meant that the commit message of https://reviews.llvm.org/rG79571aa2103c95760a07e3549d8636379e4948f0 misleadingly refers to this review's commit. But CINDEX_VERSION_MINOR == 63 is for previous commits. This commit will require incrementing CINDEX_VERSION_MINOR again to 64 in Clang 17. Hopefully my preamble-storage patches will also be included in Clang 17 and share the same minor version 64 :)
79571aa2103c95760a07e3549d8636379e4948f0 is the commit on main and https://github.com/llvm/llvm-project/issues/60478 is for the 16.x cherry-pick.
Hello Luca and Aaron.
I have noticed that you recently implemented/reviewed multiple interesting libclang features, some of which can be used in KDevelop. However, CINDEX_VERSION_MINOR was last modified in Clang 13. This constant's documentation states:
CINDEX_VERSION_MINOR should increase when there are API additions.
Another disadvantage of using CINDEX_VERSION_MINOR instead of the sizeof is that an older libclang version won't know about compatibility of newer versions with itself. But this reasoning brought me to a different thought: when a program is compiled against a libclang version X, it should not be expected to be runtime-compatible with a libclang version older than X. For example, KDevelop uses CINDEX_VERSION_MINOR to decide whether or not certain libclang API is available.
Feb 1 2023
Perhaps the struct should contain the value of CINDEX_VERSION_MINOR instead of the sizeof? This way, libclang can change the meaning of a struct member without changing the size of the struct. For example, replace PreambleStoragePath with TemporaryDirectoryPath. Such a change could even be quiet, because setting the more general temporary directory path would likely be welcome or at least acceptable to the API users.
In D139774#4096695, @aaron.ballman wrote:There's three scenarios when a field is added to the structure: 1) library and app are matching versions, 2) library is newer than app, 3) app is newer than library. #1 is the happy path most folks will hopefully be on. For #2 and #3, the app will either be sending more or less data than the library expects, but the library will know how much of the structure is valid based on the size field. If the size is too small and the library can't get data it needs, it can catch that and report an error instead of reading past a buffer. And if the size is too large, the library can ignore anything it doesn't know about or it can give an error due to not knowing how to support the semantics of the newer interface.
In D139774#4096527, @aaron.ballman wrote:That sounds like a good plan to me. I wonder if we want to name it something like clang_createIndexWithOptions (or something generic like that), and give it a versioned structure of options along the lines of:
struct CIndexOptions { uint32_t Size; // sizeof(struct CIndexOptions), used for option versioning const char *PreambleStoragePath; // This pointer can be freed after creating the index };and define the function to return an error if Size < sizeof(struct CIndexOptions). This should allow us to add additional options to the structure without having to introduce a new constructor API each time.
Jan 31 2023
In D139774#4094619, @aaron.ballman wrote:Is there a middle ground where, instead of #2 for general temporary storage, we went with #2 but with compiler-specific directories instead of system directories. e.g., don't let the caller set the temp directory, but do let the caller set the preamble directory (which defaults to the temp directory) so long as it's set before invoking the compiler? This still won't let you change options mid-run, but it also seems like it should have less risk of affecting other components while still solving the thread safety issues. I'm not certain if it's any easier to implement, but I think it does save you from modifying FileSystemOptions. As a separate item, we could then consider adding a new C API to let you toggle the in-memory vs on-disk functionality after exploring that it won't cause other problems because nobody considered the possibility that it's not a stable value for the compiler invocation.
Jan 30 2023
In D139774#4091631, @aaron.ballman wrote:My preference is still for specific API names. If we want to do something general to all temp files, I think FileSystemOptions is the way to go.
My concern with not using a constructor is that, because this is the C API, we will be supporting the functionality for a long time even if we do switch to using a global override in FileSystemOptions where you would need to set the information up before executing the compiler. To me, these paths are something that should not be possible to change once the compiler has started executing. (That also solves the multithreading problem where multiple threads are fighting over what the temp directory should be and stepping on each other's toes.)
Jan 25 2023
3 out of 4 alternatives remain:
- Add an option to store the preamble-*.pch files in RAM instead of /tmp and add a corresponding option in KDevelop configuration UI. This would work perfectly for me, provided I don't change my mind and decide to turn this option off, in which case I'll be back to square one.
- Add an option to store the preamble-*.pch files in a custom directory instead of a general temporary directory. The option could be named generally (2a: overrideTemporaryDirectory) or specially (2b: setPreambleStoragePath). If the option is named generally, other temporary files created by libclang could be identified in the future and placed in the same directory without changing the API.
- 1 and 2 - the options can be propagated between identical end points together, so this combination is natural.
Now that a consensus has been reached that this revision is to be discarded, can we please resume the discussion of which alternative should be implemented in the replacement revision?
Jan 20 2023
One idea discussed in comments to the KDevelop merge request, which I haven't mentioned here is this: remove the preamble files immediately after creating and opening them. This is safe on Unix-like OSes, because every file that is still open will not actually be deleted but its directory entry cleared. It's an old Unix trick to mark (open) files as volatile. You could do this with any opened file (that you don't want to be able to close and then reopen) immediately after creating it in which case it'll get cleaned up even in case of a hard crash. However, my analysis of clang/lib/Frontend/PrecompiledPreamble.cpp shows that such unlinking isn't straightforward for the preamble files, e.g. because of PrecompiledPreamble::getSize(), which checks the file size at the preamble file path.
Jan 19 2023
In D139774#4066593, @dblaikie wrote:(1) seems OK-ish, I guess there's some question as to what the tradeoff is for that option - does that blow out memory usage of the client/kdevelop? But I guess it's probably fine.
Do you think we should do one of the options for (2) or do you think (1) should be sufficient?
If (1) is sufficient for KDevelop's needs and already implemented in some form for clangd, if I understand correctly, that seems the cheapest/least involved?
Not sufficient for all KDevelop users. Namely it doesn't work for low-RAM systems where /tmp is on disk to save RAM.
In D139774#4066519, @dblaikie wrote:(1) seems OK-ish, I guess there's some question as to what the tradeoff is for that option - does that blow out memory usage of the client/kdevelop? But I guess it's probably fine.
At least one KDevelop user who has limited RAM and keeps /tmp on disk (not tmpfs) protested vehemently against unconditional storing of the temporary files in RAM. Several gigabytes of RAM can be valuable on some systems. Storing huge files in a temporary directory is definitely more flexible. But to users who keep /tmp on tmpfs, storing the preamble files in RAM is preferable, because memory is always freed right after a KDevelop crash, not on next start of the same KDevelop session (which can occur much later).
In D139774#4065753, @aaron.ballman wrote:I lean towards #2b over #2a due to wanting individual overrides rather than a blanket override (e.g., we should be able to put preamble files in a different location than we put, say, crash logs).
Crash logs don't really belong to a common temporary directory. Are they actually placed into a temporary directory erased on reboot? I'd prefer coarser categories, like temp dir erased-on-reboot and not-erased-on-reboot, similar to void system_temp_directory(bool erasedOnReboot in Path.h. This definitely makes more sense to the use case of removing these temporary files on next application start. Why would a libclang-using program care about various purposes of its internally-used temporary files?
Jan 18 2023
After a discussion under the corresponding KDevelop merge request, I can see 4-6 alternative ways to address the temporary directory issue:
Jan 12 2023
In D139774#4047976, @dblaikie wrote:It seemed like the places where kdevelop would want to suppress the temp dir cleanup would be enumerable/more within kdevelop's control than instances of libraries kdevelop is using wanting to create their own unexposed temp files. But, yeah, would be a bunch of work to go and touch all those places. Ah well *shrug*
That may be true. But a few small-size generated files slipping into the system temporary directory is not much of a problem. /tmp is cleared on shutdown after all. Causing bugs in a user executable run from KDevelop is much worse.
In D139774#4045361, @dblaikie wrote:
- Should clang be doing something better with these temp files anyway, no matter the directory they go in? (setting them for cleanup at process exit or the like - I think we have such a filesystem abstraction, maybe that only works on Windows? I'm not sure)
That'd be great, but appears unfeasible on GNU/Linux in case of a crash: https://stackoverflow.com/questions/471344/guaranteed-file-deletion-upon-program-termination-c-c
- If there isn't a better general way to avoid creating temp trash that's a problem, I'd have thought that KDevelop/other tools would have issues with other temp files too, and want a more general solution (I'd have thought changing the process's temp directory, and changing it back for user process launches, would be sufficient - so it can cleanup after itself for all files, not just these particular clang-created ones)
I do plan to put other temporary files KDevelop generates in the same session-specific temporary directory and clean it on start. But modifying KDevelop's temporary directory environment variable has been rejected during code review for good reason. As the bug this review aims to fix puts it:
setting the environment variables is problematic, because they are inherited by the IDE's code and all child processes it spawns (compiler, build system and user-provided executables). The IDE must then remove the temporary directory environment variable from each child process where it can cause undesirable behavior.
Jan 11 2023
In D139774#4041308, @aaron.ballman wrote:Is your concern essentially that someone will add a new use to put files in a temp directory and not have access to this information from ASTUnit? Or do you have other concerns in mind?
We should certainly investigate whether there are other uses of temp files from libclang as part of these changes. It's possible we'll find a situation that makes my suggestion untenable because we don't have easy access to the information we need.
The temporary directory could be used, now or in future, by some support code, which is used indirectly by libclang. I found the following uses potentially relevant to libclang:
- Driver::GetTemporaryDirectory(StringRef Prefix) calls llvm::sys::fs::createUniqueDirectory(Prefix, Path);
- StandardInstrumentations => DotCfgChangeReporter calls sys::fs::createUniquePath("cfgdot-%%%%%%.dot", SV, true);
- There are plenty of createTemporaryFile() uses in llvm and clang. Some of them are likely used by libclang.
I don't know how to efficiently check whether or not each of these indirect system_temp_directory() uses is in turn used by libclang. Even if they aren't know, they might be later, when libclang API grows.
Jan 10 2023
In D139774#4040705, @aaron.ballman wrote:At a point where we have a CIndexer object, we eventually call ASTUnit::LoadFromCommandLine() and ASTUnit has a member FileSystemOptions FileSystemOpts, and FileSystemOptions has a std::string for the working directory to use. I think we should store the temp directory in here as well, and when ASTUnit::getMainBufferWithPrecompiledPreamble() build the preamble, it can pass that information along to TempPCHFile to avoid needing to ask the system for the temp directory.
The path would have to be passed into several functions. TempPCHFile::create() would have to use createUniqueFile() instead of createTemporaryFile(). My greatest concern with this improved design is that libclang may use the temporary directory for other purposes - if not yet, then in the future. Another issue is with the FileSystemOptions part: this class is widely used in Clang. So adding a member to it could adversely affect performance of unrelated code.
This does mean we store two copies of the string (one in CIndexer and one in FileSystemOptions, but I think the improved ownership semantics for the C API makes that duplication somewhat reasonable.
If LLVM does not have its own shared buffer type, a std::shared_ptr<const char[]> could be used instead of std::string. Or SmallString (not shared, but avoids allocations given a not too long overriding path).
In D139774#4039975, @aaron.ballman wrote:Oh that is a good point! Apologies for not noticing that earlier -- that makes me wonder if a better approach here is to add a std::string to the CIndexer class to represent the temp path to use.
I have suggested the possibility in this review:
A copy of user-provided temporary directory path buffer can be stored in class CIndexer. But this API in llvm/Support/Path.h would stay fragile.
So the question is whether the LLVM API or CIndexer should store the buffer.
In D139774#4036496, @aaron.ballman wrote:Given that we already have other setters to set global state, I'm less concerned about adding another one. I hadn't realized we already introduced the dangers here. But we should document the expectation that the call be made before creating the index.
There is a difference: clang_CXIndex_setGlobalOptions() and clang_CXIndex_setInvocationEmissionPathOption() take a CXIndex argument and thus can only be called after creating an index. So requiring to call clang_overrideTemporaryDirectory() before creating an index, plus one more time with nullptr argument after disposing of the index, wouldn't be particularly consistent with other setters.
Jan 9 2023
In D139774#4036496, @aaron.ballman wrote:In terms of the C API, I think it'd make more sense to name in terms of "override" rather than "set", but I don't feel as strongly about it given the other setters. In terms of the C++ file system API, I think "override" makes the most sense though (we don't have setters to follow the naming convention for) because some systems do allow you to set the system directory.
Let's keep the naming in C and C++ APIs consistent: clang_overrideTemporaryDirectory() and override_system_temp_directory_erased_on_reboot().
In terms of memory ownership, WDYT of requiring the caller to handle this? e.g., calling set_system_temp_directory_erased_on_reboot will strdup a nonnull pointer and free the stored pointer when given nullptr.
I like this idea. libclang-user code would become easier to use than it is now (though less easy compared to libclang managing memory itself). The libclang API documentation can require overriding the temp directory before creating an index and un-overriding it with nullptr after calling clang_disposeIndex().
Now in order to make this libclang API harder to misuse, I lean towards passing the temporary directory in clang_createIndexWithTempDir() and letting clang_disposeIndex() handle the un-overriding (call override_system_temp_directory_erased_on_reboot(nullptr)) automatically. Makes sense? I feel that the clang_createIndexWithTempDir() name could be improved, but don't know how...
Jan 6 2023
Jan 4 2023
In D139774#4025356, @vedgy wrote:clang_disposeIndex() would have to un-override the temporary directory then. I'll have to check whether this un-overriding happens too early (before all preamble-*.pch files are removed).
Checked by calling clang_setTemporaryDirectory(nullptr); right after clang_disposeIndex(m_index);. Works correctly in KDevelop: all preamble-*.pch files are removed from the overriding temporary directory on exit. So this alternative API is viable.
In D139774#4023940, @aaron.ballman wrote:This feels like a configuration property of the libclang execution -- so it'd be nice to require it to be set up from clang_createIndex() (IIRC that's the entrypoint for CIndex functionality, but I'm not 100% sure), rather than an API that the user can call repeatedly.
In order to keep backward compatibility, this would require introducing another createIndex function, e.g. clang_createIndexWithTempDir(). clang_disposeIndex() would have to un-override the temporary directory then. I'll have to check whether this un-overriding happens too early (before all preamble-*.pch files are removed). But notice that separate setup functions already exist: clang_CXIndex_setGlobalOptions(), clang_CXIndex_setInvocationEmissionPathOption(). The documentation for clang_setTemporaryDirectory() can recommend calling it before clang_createIndex().
Either alternative works for KDevelop, because it calls clang_createIndex() once.
Dec 11 2022
Extract identical code from the two Path.inc files into Path.cpp