Try to address https://github.com/clangd/clangd/issues/1293. See the link for some design ideas.
What this patch does:
- Offers an option "-experimental-modules-support" for the new feature, so that however rough the implementation is, it won't affect current users. The rest of this description assumes the option is enabled.
- When we load a compilation database, we scan every TU recorded in it, using the same process as clang-scan-deps, to find the modules-related files. From the scanning results we build a modules dependency graph for these files.
- Every time a file is updated, all the affected files (e.g., when a header file changes) are re-scanned and the modules graph is updated. This is necessary since we don't know whether the change introduces modules-related things.
- When we want to build a file, we check whether the BMIs of all its dependencies are already built (third party modules are not included). If yes, we go ahead and compile it. If not, we wait for the modules manager to try to build all the dependent BMIs. We are resumed whether or not the build succeeds. Note that this implies that the BMIs are built lazily.
- When we compile a file, all the options (-fmodule-file=<module-name>=<module-path>) that specify the location of BMIs are dropped, and the new modules global compilation database inserts a new search path pointing to the BMIs built by clangd itself. This way we're not version locked to the compiler the user uses and we won't affect the user's build.
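The last two steps can be illustrated with a minimal, standalone sketch. It is not the code in this patch: `ModuleGraph`, `requiredBMIs` and `adjustCommand` are hypothetical names, the graph is keyed by module name only, and the use of clang's `-fprebuilt-module-path=` as the inserted search path is an assumption. For simplicity it drops every argument starting with `-fmodule-file=`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical graph: module name -> names of the modules it imports.
// Only modules whose sources are in the project appear as keys.
using ModuleGraph = std::map<std::string, std::vector<std::string>>;

// Post-order walk: a module is appended after the modules it imports,
// so building BMIs in the resulting order satisfies all dependencies.
static void visit(const ModuleGraph &G, const std::string &M,
                  std::set<std::string> &Seen,
                  std::vector<std::string> &Order) {
  auto It = G.find(M);
  if (It == G.end()) // Third party module: not built by clangd.
    return;
  if (!Seen.insert(M).second) // Already scheduled (also stops on cycles).
    return;
  for (const std::string &Dep : It->second)
    visit(G, Dep, Seen, Order);
  Order.push_back(M);
}

// BMIs that must exist before a TU importing `Imports` can be compiled.
std::vector<std::string> requiredBMIs(const ModuleGraph &G,
                                      const std::vector<std::string> &Imports) {
  std::set<std::string> Seen;
  std::vector<std::string> Order;
  for (const std::string &M : Imports)
    visit(G, M, Seen, Order);
  return Order;
}

// Drop user-provided -fmodule-file=... options and point the compiler at
// the directory holding the BMIs built by clangd (assumed flag, see above).
std::vector<std::string> adjustCommand(const std::vector<std::string> &Args,
                                       const std::string &BMIDir) {
  assert(!BMIDir.empty() && "expected a clangd-owned BMI directory");
  const std::string Prefix = "-fmodule-file=";
  std::vector<std::string> Result;
  for (const std::string &Arg : Args)
    if (Arg.compare(0, Prefix.size(), Prefix) != 0)
      Result.push_back(Arg);
  Result.push_back("-fprebuilt-module-path=" + BMIDir);
  return Result;
}
```

Because `requiredBMIs` is only called for the file being built, BMIs that nothing currently open depends on are never scheduled, which matches the lazy behaviour described above.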
Missing functionalities
The major missing functionality is that when we update a module unit, its users won't update automatically. For example,
// b.cppm
export module b;
export int bb = 43;

// a.cpp
import b;
int aa = bb;
After initialization, code intelligence correctly shows the value of bb in a.cpp as 43. But if we change the value of bb to 44 in b.cppm, the value of bb displayed in a.cpp is still 43. We can get the newest result by inserting and deleting an empty line in a.cpp and saving it. (Any other change should work too.)
The reason I don't address this in the patch is that the patch is already big, and the larger it is, the harder it is to review. I also think it is usable to some degree as it is. So let's try to address this in later patches.
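To make the missing piece concrete, here is a rough sketch of how the users of a changed module unit could be found. It is an illustration only, not part of this patch and not a committed design: `Users` and `staleUsers` are hypothetical names, and the reverse graph is assumed to be derivable from the scanning results.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical reverse graph: file -> files that directly import it.
using Users = std::map<std::string, std::vector<std::string>>;

// Breadth-first walk over the reverse edges, collecting every file
// that directly or transitively depends on `Changed`.
std::set<std::string> staleUsers(const Users &U, const std::string &Changed) {
  std::set<std::string> Stale;
  std::queue<std::string> Work;
  Work.push(Changed);
  while (!Work.empty()) {
    std::string Cur = Work.front();
    Work.pop();
    auto It = U.find(Cur);
    if (It == U.end()) // Nothing imports this file.
      continue;
    for (const std::string &User : It->second)
      if (Stale.insert(User).second) // Enqueue each user only once.
        Work.push(User);
  }
  return Stale;
}
```

In the example above, changing b.cppm would mark a.cpp as stale, so its BMIs and diagnostics could be refreshed without the manual edit.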
Other problems
The major problem I see now is that we can't handle third party modules. Third party modules are modules whose source code is not in the current project. Users can still see hints from clangd if the BMIs of the third party modules are built ahead of time, and I think this is the case for a lot of use cases. But it breaks our goal of not being locked to the same compiler version.
I think we can only solve the issue after SG15 solves it. I know SG15 is discussing how to let modules communicate across library boundaries, and it is not wise to reinvent the wheel against SG15. So let's wait for that.
Unable to make the clangd built BMI persist now
In the above link, @nridge asked for the clangd built BMIs to be persisted so that we can reuse them across invocations of clangd. This is not addressed in the patch. One reason is the same as above: the current patch is already large. The other is that I find it non-trivial to do, since we need to check the consistency of the built BMIs with the source code. I know clang has similar functionality, but I feel we may need some code to adapt it. So I chose not to implement it in the first step.
Performance
I tested this with a modularized library: https://github.com/alibaba/async_simple/tree/CXX20Modules. This library has 3 modules (async_simple, std and asio) and 65 module units. (Note that a module can consist of multiple module units.) Both the std module and the asio module have 100k+ lines of code (maybe more, I didn't count), and async_simple itself has 8k lines of code. This is the scale of the project.
I opened a file at the end of the dependency chain and restarted the clangd server. The log shows that it takes 10s to get things ready.
This is not a good number. But I found the major reason is that building the BMIs is too slow, and the major potential improvements should live on the compiler side. The compiler should offer an option to build BMIs in a lightweight way. I don't feel we can do a lot on the clangd side.
I think the performance issues are currently mainly related to the number of module units and the lines of code in the module units. That is, a big project is not a problem as long as it has no module units or only small ones.
Plans
This is clearly not so good, but I feel it is basically workable. It is a little awkward to get this in at the end of the release cycle, but I still want to try to land this in clang17 as it is an important feature for modules users.
If we can land this, I plan to implement the light mode in clang first and then implement the persistent BMI feature.
these details seem unused outside the cpp file (and untested), so should be in the cpp file - this would make it easier to understand which part of this file is the interface