The code for traversing precompiled dependencies is somewhat complicated and contains a dangling iterator bug.
This patch simplifies the code and fixes the bug.
Differential D121533
[clang][deps] Fix traversal of precompiled dependencies jansvoboda11 on Mar 12 2022, 12:17 PM. Authored by
Details The code for traversing precompiled dependencies is somewhat complicated and contains a dangling iterator bug. This patch simplifies the code and fixes the bug.
Diff Detail
Event TimelineComment Actions I'm not a big fan of moving to the recursive call. Creating a separate StringMap for each iteration seems awfully wasteful. I'd prefer fixing the iterative algorithm so that it's correct and easy to verify. This can be done with a stack that points at the stable pointers into the StringMap: StringMap<std::string> Imports; SmallVector<StringMapEntry<std::string> *> Worklist; PrebuiltModuleListener Listener(Imports, Worklist, InputFiles, VisitInputFiles); while (!Worklist.empty()) { StringMapEntry<std::string> *Import = Worklist.pop_back_val(); // Deal with "Import". } and in PrebuiltModuleListener only push to the worklist for newly-discovered imports: void visitImport(StringRef ModuleName, StringRef Filename) override { auto I = PrebuiltModuleFiles.insert({ModuleName, Filename.str()}); if (I.second) Worklist.push_back(&*I.first); } Also, is it possible to add a test to exercise the (fixed) the dangling iterator bug? Comment Actions Fair enough, iterative implementation will be better. I simplified it a bit by using the in-out parameter ModuleFiles to keep track of visited files. I also switched to using SmallVector for the newly discovered (not-yet-visited) imports, which allows using the suggested pop_back_val and avoids using (potentially dangling) iterator. The test case is a bit unwieldy, since the old implementation only failed when the StringMap got rehashed (with 16 entries). Comment Actions LGTM, with another comment inline (up to you whether to do something with it).
|
These filenames will usually be longer than std::string's small storage, requiring separate heap allocations for each filename. Can the worklist just point at the stable pointer StringMapEntry<std::string>*, where a filename copy and heap allocation has already been made? (This has also has a nice side effect of handling 3x as many worklist items before needing to allocate, since the element size goes from 24B to 8B.)