User Details
- User Since
- Dec 9 2022, 7:53 AM (14 w, 6 d)
Fri, Mar 17
Noticed the mistake while reviewing the generated documentation.
Wed, Mar 15
@aaron.ballman, can you land this revision for me please?
Tue, Mar 14
LGTM!
Mon, Mar 13
I have implemented the setter for the new option locally and tested it in KDevelop.
void clang_CXIndex_setStorePreamblesInMemory(CXIndex CIdx, int storePreamblesInMemory) { if (CIdx) static_cast<CIndexer *>(CIdx)->setStorePreamblesInMemory( storePreamblesInMemory); }
Works as expected: new preambles are created in memory or the temporary directory depending on the option selected in the UI. Even if the temporary storage option ends up in memory, all remaining preambles are removed from the temporary directory on exit.
Just created the follow-up StorePreamblesInMemory revision: D145974.
Fri, Mar 10
Thu, Mar 9
Please update the commit message (there is no clang_isFieldDeclBitWidthDependent anymore) and update the revision with arc diff --verbatim @~.
Tue, Mar 7
The acquire/release semantics is needed if the atomic variable guards access to another non-atomic variable or the atomic pointer guards access to its non-atomic pointed-to value. The relaxed order means no guarantees about this variable's interaction with other atomic or non-atomic variables. But even the relaxed order prevents data races on the variable itself, which is sufficient here.
The option can also be added to CXIndexOptions in order to allow setting its initial value reliably (not worrying whether it could be used before the setter gets called after index construction).
Is adding both the setter and the CXIndexOptions member OK or would you prefer only one of these two?
It's a bit odd to me that we're deprecating the global option setters to turn around and add a new global option setter. The old interfaces are not thread safe and the new one is thread safe, so I get why the desire exists and is reasonable, but (personally) I'd like to get rid of the option state setters entirely someday.
I also thought about the deprecated old interfaces today. clang_CXIndex_setGlobalOptions() could be undeprecated by similarly making CIndexer::Options atomic. Safely setting std::string members would require a mutex.
What I was envisioning was that the index creation would get global options and if we wanted per-TU options, we'd add an interface akin to clang_parseTranslationUnit() which takes another options struct for per-TU options. (Perhaps the default values for those options come from the global options -- e.g., DisplayDiagnostics is set at the global level but could be overridden for a single TU). Do you have thoughts about that approach?
This would allow to specify whether to store each individual preamble in memory, which is more specific than necessary for my use case. This would shift the burden of passing around the StorePreamblesInMemory value to libclang users. Certainly more difficult to implement both in libclang and in KDevelop.
I am implementing the StorePreamblesInMemory bool option discussed earlier. It would be nice to be able to modify it at any time, because it can be an option in an IDE's UI and requiring to restart an IDE for the option change to take effect is undesirable. In order to make the setter thread-safe, the option can be stored as std::atomic<bool> StorePreamblesInMemory in class CIndexer and stored/loaded with memory_order_relaxed. Setting this option would apply only to subsequent clang_parseTranslationUnit_Impl() calls. The option can also be added to CXIndexOptions in order to allow setting its initial value reliably (not worrying whether it could be used before the setter gets called after index construction).
Thank you for the quick build fix. KDevelop's CMakeLists.txt disables this warning by adding the -Wno-missing-field-initializers flag. That's why I haven't noticed the warning while building KDevelop. Either I missed the warning while building LLVM or it is also disabled in a default GNU/Linux x86_64 build.
A clang-ppc64le-rhel build failed:
54.897 [28/169/148] Building CXX object tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o FAILED: tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o /home/buildbots/clang.15.0.4/bin/clang++ --gcc-toolchain=/usr -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/unittests/libclang -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/third-party/unittest/googletest/include -I/home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/third-party/unittest/googlemock/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++17 -MD -MT tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o -MF tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o.d -o tools/clang/unittests/libclang/CMakeFiles/libclangTests.dir/LibclangTest.cpp.o -c /home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang/LibclangTest.cpp /home/buildbots/docker-RHEL84-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/unittests/libclang/LibclangTest.cpp:488:50: error: missing field 'ThreadBackgroundPriorityForIndexing' initializer [-Werror,-Wmissing-field-initializers] CXIndexOptions Opts = {sizeof(CXIndexOptions)}; ^ 1 error generated.
https://llvm.org/docs/Phabricator.html#committing-a-change says:
Using the Arcanist tool can simplify the process of committing reviewed code as it will retrieve reviewers, the Differential Revision, etc from the review and place it in the commit message. You may also commit an accepted change directly using git push, per the section in the getting started guide.
But how to use the Arcanist tool to push reviewed changes is not elaborated. As far as I can tell from the Arcanist documentation, if I have the commit in my local main branch which is currently checked out, I simply need to run arc land without arguments. Hopefully the Git pre-push hook I have set up will run in this case.
Mon, Mar 6
Clean rebase on main branch where the base revision D143415 has already landed.
I just verified that this patch fixes a KDevelop crash reported by several users. Thank you!
Sat, Mar 4
Replace clang_getDefaultGlobalOptions() with CXChoice as discussed.
A few unrelated small improvements.
Thu, Mar 2
Wed, Mar 1
Address review comments
Tue, Feb 28
Can someone merge/land this review? I don't have commit access.
Mon, Feb 27
Sun, Feb 26
Reimplement the API as discussed in review comments
Rebase on latest main branch
Sat, Feb 25
@aaron.ballman, this test fix is independent from D143418 and can be reviewed separately.
Fri, Feb 24
So adding the GlobalOptions member to CXIndexOptions, adding the clang_getDefaultGlobalOptions() function and deprecating clang_CXIndex_setGlobalOptions in this revision is fine, right?
Thu, Feb 23
The purpose of adding the GlobalOptions member to CXIndexOptions and deprecating clang_CXIndex_setGlobalOptions is to improve libclang's thread safety. The proposed approach keeps the meaning of the environment variables and enables straightforward porting of existing uses.
Feb 15 2023
On second thought, the proposed clang_getDefaultGlobalOptions() API already offers users a choice to either prefer or override each option value from the env. variables, just like the existing clang_CXIndex_[gs]etGlobalOptions API. The environment variables are binary options: an existence, not value, of an env. variable determines an initial option value. So I don't understand what are the two different ways to expose these options.
Feb 14 2023
size_t indeed makes logical sense for this member as that's the return type of sizeof. size_t is two times larger than unsigned on x86_64, but I don't think the size of this struct has any chance of impacting performance. Though it wouldn't hurt to pack the size and the boolean options in a single-pointer-sized region on x86_64. After all, this struct's size will never reach UINT_MAX. I slightly prefer unsigned due to my efficiency inclinations :). What do you prefer? Is there any benefit in using a fixed-size integer type here?
Feb 13 2023
uint32_t Size; // = sizeof(struct CIndexOptions), used for option versioning
- uint32_t was introduced in C99. Can/should it be used in Index.h? Only built-in [unsigned] (int|long) types are currently used in this file.
How to deprecate the setters? Add DEPRECATED in the documentations as is already done in two places in Index.h?
Feb 10 2023
That's why the function's documentation explicitly doesn't guarantee anything. It should be safe to assume that security-sensitive users would at least read the documentation. If this function and potential future function like it are named specifically, a responsible security use case wouldn't be better off. The only safety advantage would be to those who don't bother reading the documentation. But why should we care much about such hypothetical careless security needs?
Hi @owenpan. Thank you for fixing this bug!
Have you noticed this paragraph in my bug report?
I believe clang_getTypeSpelling(), or more likely QualType::print() used by it, should insert a tab character between such tokens to pretty-print compilable code. The tab character is preferable to the space character here, because the users may rely on the fact that pretty-printed binary operators are surrounded by spaces to distinguish them from angle brackets.
KDevelop parses the result of clang_getTypeSpelling() when libclang API is lacking. Since this recent commit KDevelop's parsing relies on the empirical fact that only operators are surrounded by spaces to distinguish them from angle brackets. Does this revision introduce angle brackets surrounded by spaces? Can tab characters be used instead? If not, do you know how else such angle brackets can be distinguished from operators?
Feb 9 2023
This revision implements #2, but warns about possible imperfections of the implementation. KDevelop (and other potential users of the added API that need to override the temp dir location for the same reason) is not interested in overriding each temp dir usage separately. For this only known use case an imperfectly implemented clang_CXIndex_setPreferredTempDirPath() is much more convenient than a perfectly correct clang_CXIndex_setPreambleStoragePath(). With the general API name, if another temp dir use is overridden in the future, neither libclang API nor the code that uses it would have to be adjusted.
Isn't the global: label needed in the new LLVM_17 block/tag?
Feb 7 2023
I hoped to avoid applying an unrelated git clang-format's include reordering, but the CI fails because of that. Will apply it along with changes requested during code review.
Feb 6 2023
Address an old inline review comment
https://reviews.llvm.org/D139774#inline-1360634 and add a comment.
The added, then reverted release note was lost when this revision landed the second time.
D143418 implements my latest idea described in the previous comment. Let us continue the discussion there.
Feb 2 2023
I meant that the commit message of https://reviews.llvm.org/rG79571aa2103c95760a07e3549d8636379e4948f0 misleadingly refers to this review's commit. But CINDEX_VERSION_MINOR == 63 is for previous commits. This commit will require incrementing CINDEX_VERSION_MINOR again to 64 in Clang 17. Hopefully my preamble-storage patches will also be included in Clang 17 and share the same minor version 64 :)
79571aa2103c95760a07e3549d8636379e4948f0 is the commit on main and https://github.com/llvm/llvm-project/issues/60478 is for the 16.x cherry-pick.
Hello Luca and Aaron.
I have noticed that you recently implemented/reviewed multiple interesting libclang features, some of which can be used in KDevelop. However, CINDEX_VERSION_MINOR was last modified in Clang 13. This constant's documentation states:
CINDEX_VERSION_MINOR should increase when there are API additions.
Another disadvantage of using CINDEX_VERSION_MINOR instead of the sizeof is that an older libclang version won't know about compatibility of newer versions with itself. But this reasoning brought me to a different thought: when a program is compiled against a libclang version X, it should not be expected to be runtime-compatible with a libclang version older than X. For example, KDevelop uses CINDEX_VERSION_MINOR to decide whether or not certain libclang API is available.
Feb 1 2023
Perhaps the struct should contain the value of CINDEX_VERSION_MINOR instead of the sizeof? This way, libclang can change the meaning of a struct member without changing the size of the struct. For example, replace PreambleStoragePath with TemporaryDirectoryPath. Such a change could even be quiet, because setting the more general temporary directory path would likely be welcome or at least acceptable to the API users.
Jan 31 2023
Jan 30 2023
Jan 25 2023
3 out of 4 alternatives remain:
- Add an option to store the preamble-*.pch files in RAM instead of /tmp and add a corresponding option in KDevelop configuration UI. This would work perfectly for me, provided I don't change my mind and decide to turn this option off, in which case I'll be back to square one.
- Add an option to store the preamble-*.pch files in a custom directory instead of a general temporary directory. The option could be named generally (2a: overrideTemporaryDirectory) or specially (2b: setPreambleStoragePath). If the option is named generally, other temporary files created by libclang could be identified in the future and placed in the same directory without changing the API.
- 1 and 2 - the options can be propagated between identical end points together, so this combination is natural.
Now that a consensus has been reached that this revision is to be discarded, can we please resume the discussion of which alternative should be implemented in the replacement revision?
Jan 20 2023
One idea discussed in comments to the KDevelop merge request, which I haven't mentioned here is this: remove the preamble files immediately after creating and opening them. This is safe on Unix-like OSes, because every file that is still open will not actually be deleted but its directory entry cleared. It's an old Unix trick to mark (open) files as volatile. You could do this with any opened file (that you don't want to be able to close and then reopen) immediately after creating it in which case it'll get cleaned up even in case of a hard crash. However, my analysis of clang/lib/Frontend/PrecompiledPreamble.cpp shows that such unlinking isn't straightforward for the preamble files, e.g. because of PrecompiledPreamble::getSize(), which checks the file size at the preamble file path.
Jan 19 2023
Not sufficient for all KDevelop users. Namely it doesn't work for low-RAM systems where /tmp is on disk to save RAM.
At least one KDevelop user who has limited RAM and keeps /tmp on disk (not tmpfs) protested vehemently against unconditional storing of the temporary files in RAM. Several gigabytes of RAM can be valuable on some systems. Storing huge files in a temporary directory is definitely more flexible. But to users who keep /tmp on tmpfs, storing the preamble files in RAM is preferable, because memory is always freed right after a KDevelop crash, not on next start of the same KDevelop session (which can occur much later).
Crash logs don't really belong to a common temporary directory. Are they actually placed into a temporary directory erased on reboot? I'd prefer coarser categories, like temp dir erased-on-reboot and not-erased-on-reboot, similar to void system_temp_directory(bool erasedOnReboot in Path.h. This definitely makes more sense to the use case of removing these temporary files on next application start. Why would a libclang-using program care about various purposes of its internally-used temporary files?
Jan 18 2023
After a discussion under the corresponding KDevelop merge request, I can see 4-6 alternative ways to address the temporary directory issue:
Jan 12 2023
That may be true. But a few small-size generated files slipping into the system temporary directory is not much of a problem. /tmp is cleared on shutdown after all. Causing bugs in a user executable run from KDevelop is much worse.
That'd be great, but appears unfeasible on GNU/Linux in case of a crash: https://stackoverflow.com/questions/471344/guaranteed-file-deletion-upon-program-termination-c-c
- If there isn't a better general way to avoid creating temp trash that's a problem, I'd have thought that KDevelop/other tools would have issues with other temp files too, and want a more general solution (I'd have thought changing the process's temp directory, and changing it back for user process launches, would be sufficient - so it can cleanup after itself for all files, not just these particular clang-created ones)
I do plan to put other temporary files KDevelop generates in the same session-specific temporary directory and clean it on start. But modifying KDevelop's temporary directory environment variable has been rejected during code review for good reason. As the bug this review aims to fix puts it:
setting the environment variables is problematic, because they are inherited by the IDE's code and all child processes it spawns (compiler, build system and user-provided executables). The IDE must then remove the temporary directory environment variable from each child process where it can cause undesirable behavior.
Jan 11 2023
The temporary directory could be used, now or in future, by some support code, which is used indirectly by libclang. I found the following uses potentially relevant to libclang:
- Driver::GetTemporaryDirectory(StringRef Prefix) calls llvm::sys::fs::createUniqueDirectory(Prefix, Path);
- StandardInstrumentations => DotCfgChangeReporter calls sys::fs::createUniquePath("cfgdot-%%%%%%.dot", SV, true);
- There are plenty of createTemporaryFile() uses in llvm and clang. Some of them are likely used by libclang.
I don't know how to efficiently check whether or not each of these indirect system_temp_directory() uses is in turn used by libclang. Even if they aren't know, they might be later, when libclang API grows.
Jan 10 2023
The path would have to be passed into several functions. TempPCHFile::create() would have to use createUniqueFile() instead of createTemporaryFile(). My greatest concern with this improved design is that libclang may use the temporary directory for other purposes - if not yet, then in the future. Another issue is with the FileSystemOptions part: this class is widely used in Clang. So adding a member to it could adversely affect performance of unrelated code.
This does mean we store two copies of the string (one in CIndexer and one in FileSystemOptions, but I think the improved ownership semantics for the C API makes that duplication somewhat reasonable.
If LLVM does not have its own shared buffer type, a std::shared_ptr<const char[]> could be used instead of std::string. Or SmallString (not shared, but avoids allocations given a not too long overriding path).
I have suggested the possibility in this review:
A copy of user-provided temporary directory path buffer can be stored in class CIndexer. But this API in llvm/Support/Path.h would stay fragile.
So the question is whether the LLVM API or CIndexer should store the buffer.
There is a difference: clang_CXIndex_setGlobalOptions() and clang_CXIndex_setInvocationEmissionPathOption() take a CXIndex argument and thus can only be called after creating an index. So requiring to call clang_overrideTemporaryDirectory() before creating an index, plus one more time with nullptr argument after disposing of the index, wouldn't be particularly consistent with other setters.
Jan 9 2023
Let's keep the naming in C and C++ APIs consistent: clang_overrideTemporaryDirectory() and override_system_temp_directory_erased_on_reboot().
In terms of memory ownership, WDYT of requiring the caller to handle this? e.g., calling set_system_temp_directory_erased_on_reboot will strdup a nonnull pointer and free the stored pointer when given nullptr.
I like this idea. libclang-user code would become easier to use than it is now (though less easy compared to libclang managing memory itself). The libclang API documentation can require overriding the temp directory before creating an index and un-overriding it with nullptr after calling clang_disposeIndex().
Now in order to make this libclang API harder to misuse, I lean towards passing the temporary directory in clang_createIndexWithTempDir() and letting clang_disposeIndex() handle the un-overriding (call override_system_temp_directory_erased_on_reboot(nullptr)) automatically. Makes sense? I feel that the clang_createIndexWithTempDir() name could be improved, but don't know how...
Jan 6 2023
Jan 4 2023
Checked by calling clang_setTemporaryDirectory(nullptr); right after clang_disposeIndex(m_index);. Works correctly in KDevelop: all preamble-*.pch files are removed from the overriding temporary directory on exit. So this alternative API is viable.
In order to keep backward compatibility, this would require introducing another createIndex function, e.g. clang_createIndexWithTempDir(). clang_disposeIndex() would have to un-override the temporary directory then. I'll have to check whether this un-overriding happens too early (before all preamble-*.pch files are removed). But notice that separate setup functions already exist: clang_CXIndex_setGlobalOptions(), clang_CXIndex_setInvocationEmissionPathOption(). The documentation for clang_setTemporaryDirectory() can recommend calling it before clang_createIndex().
Either alternative works for KDevelop, because it calls clang_createIndex() once.
Dec 11 2022
Extract identical code from the two Path.inc files into Path.cpp