The goal of this patch is to speed up the code that searches for a dSYM
next to an executable.
I generated a project with 10,000 C source files and accompanying header
files using a script that I wrote. I compiled each of them to an object
file (with debug info) but intentionally did not create a dSYM for
anything. I also built a driver, and then linked everything into one
binary. The project itself is very simple, but the number of object files
involved is very large. I then measured lldb's performance debugging
this binary. (NB: I plan on putting this script into lldb's script
directory at some point after some cleanups)
Using the command lldb -x -b -o 'b main' $binary (launch lldb with no
lldbinit, place a breakpoint on main, and then quit), I measured how
long this takes and where the time was being spent.
On my machine, I used hyperfine to get some statistics about how long
this actually took:
alex@alangford many-objects % hyperfine -w 3 -- "$lldb -x -b -o 'b main' $binary"
Benchmark 1: $lldb -x -b -o 'b main' $binary
  Time (mean ± σ):      4.395 s ±  0.046 s    [User: 3.239 s, System: 1.245 s]
  Range (min … max):    4.343 s …  4.471 s    10 runs
Out of the ~4.4 seconds of work, it looks like ~630ms were being spent
on LocateDSYMInVincinityOfExecutable. After digging in further, we
were spending the majority of our time manipulating paths in FileSpec
with RemoveLastPathComponent and AppendPathComponent. There were a
lot of FileSystem operations as well, so I also tried to minimize
the number of calls to Exists and Resolve.
Given how widely used the code in FileSpec is, I opted to improve this
codepath specifically rather than refactor the internals of FileSpec.
I believe that improving FileSpec is also worth doing, but it is a
larger change with far-reaching implications.
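To illustrate the general idea of doing the path work on a plain string
rather than through repeated component removals and appends on a path
object, here is a minimal sketch. It is not the code in this patch and it
does not use LLDB's FileSpec or FileSystem APIs. DsymCandidates is a
hypothetical helper, and the search it models (check every ancestor
directory for "<name>.dSYM") is a simplifying assumption, not a
description of what the real function does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLDB): for each ancestor directory of
// exe_path, nearest first, produce "<dir>/<exe name>.dSYM".
std::vector<std::string> DsymCandidates(const std::string &exe_path) {
  std::vector<std::string> candidates;
  const std::string::size_type slash = exe_path.rfind('/');
  if (slash == std::string::npos)
    return candidates; // No directory component to walk.

  // Build the constant suffix once instead of re-appending it per directory.
  const std::string suffix = "/" + exe_path.substr(slash + 1) + ".dSYM";
  std::string dir = exe_path.substr(0, slash); // One buffer, shrunk in place.

  while (true) {
    candidates.push_back(dir + suffix);
    const std::string::size_type parent = dir.rfind('/');
    if (dir.empty() || parent == std::string::npos)
      break;
    dir.resize(parent); // Drop the last path component without reparsing.
  }
  return candidates;
}
```

For "/a/b/exe" this yields "/a/b/exe.dSYM", "/a/exe.dSYM", and "/exe.dSYM".
A caller would then make at most one existence check per candidate, which
is where the savings in filesystem calls would come from under these
assumptions.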
With this change, we shave off roughly 570ms (mean time drops from
4.395 s to 3.822 s):
alex@alangford many-objects % hyperfine -w 3 -- "$lldb -x -b -o 'b main' $binary"
Benchmark 1: $lldb -x -b -o 'b main' $binary
  Time (mean ± σ):      3.822 s ±  0.056 s    [User: 2.749 s, System: 1.167 s]
  Range (min … max):    3.750 s …  3.920 s    10 runs
Why do you need the first call to append?