This is an archive of the discontinued LLVM Phabricator instance.

Read ObjC class names in one large read, instead of reading them individually
AbandonedPublic

Authored by jasonmolenda on Apr 21 2019, 11:19 PM.

Details

Reviewers
jingham
Summary

There's a perf problem with Objective-C programs as we add more classes to the Darwin libraries over time - when lldb goes to run its first expression on one of these systems, it will load the ObjC class names from the run time, both the shared cache libraries and any app-contributed classes. lldb has two expressions it runs in the inferior to collect this information into a buffer allocated by lldb - returning the isa pointers and a hash of each class name used to id them. lldb would then read the names of the classes out of memory, which were not localized to one region of memory, and this has become a larger and larger performance issue.

This patch modifies:

  1. The two jitted functions, g_get_dynamic_class_info_body and g_get_shared_cache_class_info_body, now take the address of an allocated string pool buffer and size. If the address of the string pool is 0, no names are copied. The functions will copy the class names in to the string pool buffer and add an offset to the (isa, hash) that were previously being returned. If an entry in the (isa, hash, stroffset) array does not have an entry in the string pool, UINT32_MAX is used. If lldb's pre-allocated string pool buffer is too small, entries that did not fit will get UINT32_MAX and lldb will read the class names the old slow way.
  1. Modifies the two methods that call these jitted functions, AppleObjCRuntimeV2::UpdateISAToDescriptorMapDynamic and AppleObjCRuntimeV2::UpdateISAToDescriptorMapSharedCache, to allocate the string pool buffer, pass the address and size to the jitted expression, scan the array of (isa, hash, stroffset) tuples to find the highest stroffset, read that part of the string pool out of the inferior with a single read [1], pass the buffer to ParseClassInfoArray, deallocate the stringpool from the inferior.
  1. Modifies AppleObjCRuntimeV2::ParseClassInfoArray to grab the class name from the stringpool buffer if the stroffset is != UINT32_MAX and is within the copied buffer size.

[1] nb: I just saw an unintended behavior on the last class name as I was writing this. Given that it uses the highest stroffset it finds to read the string pool back out of the inferior, the final class name won't be copied up. The test in ParseClassInfoArray to check that the stroffset is within the bounds of the copied buffer should mean that we read the last class name out of the inferior aka the old method. I'll double check this is handled correctly tomorrow.

This patch also includes a change that Frederic Riss wrote the other month but hadn't upstreamed yet, where we detect swift names in the shared cache and don't compute the hash in the jitted function, doing it up in lldb later.

Testing the patch shows no testsuite difference on Mac native. As an experiment, I intentionally introduced a bug where the class names in the string pool were corrupted and it caused the testsuite failures I expected. From a performance point of view, this shows the packet behavior I was aiming for.

rdar://problem/27798609

Diff Detail

Repository
rLLDB LLDB

Event Timeline

jasonmolenda created this revision.Apr 21 2019, 11:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2019, 11:19 PM
jasonmolenda edited the summary of this revision. (Show Details)Apr 21 2019, 11:20 PM
jasonmolenda edited the summary of this revision. (Show Details)

Fix one small bug in the jitted expressions. Increase the estimated size of class names that lldb uses to pre-allocate the string pool buffer. Added a method to read the string pool out of the inferior, to reduce code duplication, and have it read the last name to get the full size of the string pool correct.

jasonmolenda abandoned this revision.May 2 2019, 3:15 PM

I had some more revisions on this patch, but I finally understood that I was solving the wrong problem with this patchset. lldb should not fetch class names up front when scanning the objective c class tables, they should only be read lazily as we evaluate an expression. The fact that we're reading names during the table scan is the problem that needs to be fixed.