Jun 27 2022
Ping.
Jun 15 2022
In D122974#3586352, @dblaikie wrote: I think it's still worthwhile/necessary to separate LLDB's use case/hashing algorithm choice from LLVM's so LLVM's code can be changed to be more change resilient in a way that LLDB's cannot (eg: random seeds will never be usable by LLDB but may be for LLVM).
Ping.
Jun 14 2022
Jun 8 2022
In D122974#3567278, @dblaikie wrote: Then I still don't know what the problem is supposed to be. If the StringMap hash implementation ever changes, the necessary LLDB rebuild will detect this, the relevant LLDB parts will get adjusted and problem solved.
What I mean is if the cache is used across statically linked versions - eg: cache is created, someone installs an update to lldb, then the cache is read back and misinterprets the hashes in the cache if the hash algorithm had changed between versions.
Ping.
May 26 2022
In D122974#3538413, @dblaikie wrote: In D122974#3536342, @llunak wrote: D124704 adds a unittest that compares StringMap::hash() to a known hardcoded value, so whenever the hash implementation changes, it won't be possible to unittest LLDB with that change, and that will be the time to change the lldb cache version.
Ah, good stuff - doesn't guarantee that any hash change necessarily breaks the test, but certainly helps/seems like a good idea, thanks!
May 24 2022
In D122974#3535669, @dblaikie wrote: It doesn't make sense to require a stable hash algorithm for an internal cache file.
It's at least a stronger stability requirement than may be provided before - like: stable across process boundaries, at least, by the sounds of it? yeah?
In D125184#3516920, @philnik wrote: The new debug mode won't break the ABI of the current debug mode AFAIK. I don't know if we would actually keep these functions in the dylib, but I fear that will be the case if we add them silently now.
Added a unittest that explicitly checks the implementation of the hash value has not changed (+ comments on what to do if that ever happens).
May 23 2022
Added documentation for StringMap::hash(), including an explicit comment saying that the implementation is not guaranteed to be stable.
In D122974#3532822, @dblaikie wrote: If we want a structure that can use a stable hash
May 22 2022
- use DataExtractor in StringTableReader
- rely on StringMap assert to check that hash algorithm has not changed
- add a function for selecting a pool in ConstString
- use uint32_t for hash in StringMap API
May 21 2022
May 16 2022
In D125184#3516573, @philnik wrote: My main problem with this approach is that we add symbols to the dylib which we know will be obsolete in a few months, but we have to keep them forever due to ABI concerns.
May 15 2022
In D125184#3514289, @Mordante wrote: I agree this is worth fixing, but I'm not convinced that this solution is safe and doesn't lead to ODR violations when different translation units are compiled with different values of _LIBCPP_DEBUG_LEVEL.
Added to the ABI check list; I hope I got it right.
May 14 2022
I'd like to give this another try after all. AFAICT using an inline namespace solves the ABI problem for non-debug mode, as it is then not affected at all, while the debug-mode std::to_string() is now properly a separate function.
May 8 2022
This is getting too hackish, so I'm abandoning this.
I see. Until then, how about at least this one? It's admittedly ugly, but still less ugly than crashing later after the function has been called.
May 4 2022
Used a temporary variable instead of repeated 'm_string_pools[h]'.
Added an assert that the passed-in hash value matches, guarded by EXPENSIVE_CHECKS. The assert will also check hashes computed by StringMap itself; checking only values passed in from outside would mean adding a number of *Impl functions and adding asserts in a number of places.
Hashes are saved directly in string table.
Changed filenames of cache files to be different from version 1.
No support for reading old format.
In D124704#3486971, @clayborg wrote: A few things we might think of for this patch to improve it a bit: we use a StringTableReader and ConstStringTable to read and write a string table to disk, so we could save the hashes before each string in the string table data itself. Then we don't need to change the format of any other sections (the symbol table or the manual DWARF index), as we would always write out string table offsets just like before; the offsets would be different since we would always write the hash + string into the string table data. It is ok for the string table data to contain the hashes for all strings, since we store only ConstString values and we always need the hashes. This might also save some space, since we would have only 1 hash per string instead of possibly many hashes for the same string offset in the string table.
May 3 2022
The prerequisites for this change have been pushed, so this one is ready.
The "Build Status" here lists a failure, but I cannot reproduce any test failure locally, and in the remote log I do not see any test failure; it looks like the testsuite itself is failing. Is that something I should ignore, or am I missing something?
May 1 2022
This makes TestDecodeCStringMaps fail. Is there an easy way for me to re-generate the data for the test?
Apr 30 2022
Changed to use std::call_once().
Updated according to review comments.
In D122974#3483647, @llunak wrote: I can measure 10% startup time saved when everything is already cached. Funnily enough, profiler consistently claims that saving caches got faster too (I already use D122975).
Added missing hash write to EncodeCStrMap().
Apr 29 2022
In D122974#3483203, @clayborg wrote: If the string pool caches the hash value, we could actually write out the hash in the cache file to speed up loading.
In D122974#3482556, @labath wrote: Interesting. I don't know if I missed this somewhere, but could you explain what kind of a map operation lldb can perform without computing the hash at least once?
In D122974#3481852, @labath wrote: In D122974#3481269, @llunak wrote: But what if I don't want StringMap to compute the hash at all? There's still that 10-15% of CPU time spent in djbHash() when debug info uses exactly[*] that (and LLDB's index cache could be modified to store it). Which means it can be useful for StringMap to allow accepting a precomputed hash, and then what purpose will that HashedStringRef have?
I think that David's point was that you would use a (probably static) StringMap method to produce the hash object, and then you use the hash object to select the correct map, and also to insert the key in the map. Something like:
...
That should only produce one hash computation, right?
Apr 28 2022
In D122974#3480686, @dblaikie wrote: In D122974#3424654, @JDevlieghere wrote:
...
struct HashedStringRef {
  const llvm::StringRef str;
  const unsigned full_hash;
  const uint8_t hash;
  HashedStringRef(llvm::StringRef s)
      : str(s), full_hash(djbHash(str)), hash(hash(str)) {}
};
...
The external code shouldn't be able to create their own (ctor private/protected, etc) - the only way to get one is to call StringMap to produce one.
Apr 20 2022
Ping.
Apr 18 2022
Added asserts that wait() will not deadlock waiting for itself.
Added more documentation about deadlocks and usage from within threadpool's threads.
Apr 16 2022
Adapted to API changes from D123225.
ThreadPool object is now created/destroyed by Debugger class ctor/dtor.
Adapted to API changes from D123225.
Small tweaks based on feedback.
Changed ThreadPool::TaskGroup to standalone ThreadPoolTaskGroup that has trivial calls forwarding to ThreadPool functions.
Apr 12 2022
The change looks good to me too, if that counts as anything from an outsider. But as an outsider I think you LLVM folks tend to overdo the perfectionism while reviewing. There's rather obviously nothing visibly wrong with the change, the chance it'll break something is extremely low, you apparently know this code, and you pushing this should be fine even according to the policy (the "likely-community-consensus" part) rather than blocking on somebody who apparently has a long enough review queue and could be instead reviewing changes that actually need it (*cough* D122974 *cough*).
Ping @dblaikie ?
Apr 11 2022
In D123020#3442434, @labath wrote: In D123020#3437246, @llunak wrote: In D123020#3426867, @JDevlieghere wrote: FWIW the official policy is outlined here: https://llvm.org/docs/CodeReview.html
I'm aware of it, but as far as I can judge I was following it. Even reading it now again I see nothing that I would understand as mandating review for everything.
It does say "patches that meet likely-community-consensus requirements can be committed prior to an explicit review" and "where there is any uncertainty, a patch should be reviewed prior to being committed".
It can be hard to judge what is a likely-community-consensus without being an active member of the community, which is why it's safer to go down the pre-commit review path. Also note that when I said that "all patches are expected to be reviewed", that included both pre-commit and post-commit review. I deliberately used passive voice because in the latter case, there's nothing for you (as the patch author) to do. It's generally up to the owners of individual components to ensure that all patches going in get reviewed by someone. Since there's no paper trail, this is very hard to verify, but I can tell you that people do that, and that it's not a good way to introduce yourself to someone.
Apr 10 2022
Changed to always disable notify and added a comment about that to LoadModuleAtAddress().
Apr 8 2022
In D123158#3438694, @urnathan wrote: Understood. Mind if I grab the PR? (Is there an actual PR to grab?)
Apr 7 2022
In D123020#3426867, @JDevlieghere wrote: FWIW the official policy is outlined here: https://llvm.org/docs/CodeReview.html
In D123158#3432365, @urnathan wrote: ETA: but it doesn't matter whether this check runs multiple times, in racing threads? I guess one wants an atomic set though.
Let's first deal with the conceptual stuff; there's no point in dealing with the small code details as long as there's no agreement that this is the way to implement it.
Apr 6 2022
Apr 5 2022
In D122975#3430613, @JDevlieghere wrote: After applying this patch I started seeing data races reported by TSan when running the shell tests (check-lldb-shell). It seems to happen to different tests on different runs but the backtraces are the same.
Ok, parallelizing only Module::PreloadSymbols() was simpler than I expected and it also works, so I've reworked the patch to do that.
In D122975#3428876, @labath wrote: OK, got it. So, for this case, I think the best approach would be to extract and parallelize the PreloadSymbols calls. They are not deep (=> relatively easy to extract), optional (they are just a performance optimization, so nothing will break if they're skipped), and completely isolated (they only access data from the single module).
Apr 4 2022
Added a test.
In D122975#3427008, @clayborg wrote: I had tried something similar with the thread pools when trying to parallelize similar stuff. The solution I made was to have a global thread pool for the entire LLDB process, but the LLVM thread pool code then needed to be modified to handle different groups of threads, where work could be added to a queue and users could then wait on the queue. The queues then need to be managed by the thread pool code. Queues could also be serial queues or concurrent queues. I never completed the patch, but just wanted to pass along the ideas I had used. So instead of adding everything to a separate pool, the main thread pool could take queues. The code for your code above would look something like:
In D123020#3426252, @labath wrote: BTW, does it make sense to get even things like this reviewed, or is it ok if I push these directly if I'm reasonably certain I know what I'm doing? I feel like I'm spamming you by now.
Generally, I would say yes. I'm not even sure that some of your other patches should have been submitted without a pre-commit review (*all* llvm patches are expected to be reviewed).
In D122975#3426575, @labath wrote: I'd like to understand what is the precise thing that you're trying to optimise. If there is like a single hot piece of code that we want to optimise, then we might be able to come up with a different approach (a'la PrefetchModuleSpecs) to achieve the same thing.