If llvm-reduce is interrupted in the middle of a delta pass on a large
file and later restarted on the partially-reduced file, it can take
quite some time before the tool starts doing new work. Much of that
time is spent testing large chunks that are very unlikely to pass the
interestingness test. In cases like this, the tool completes faster if
it starts at a finer granularity. This patch therefore introduces a
command-line flag that splits the chunks into smaller subsets a fixed,
user-specified number of times before the core delta loop begins.
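The pre-splitting described above can be sketched as follows. This is a simplified Python model of the chunk bookkeeping, not the actual C++ implementation, and the `starting_granularity` parameter name is illustrative:

```python
def split_chunks(chunks):
    """Halve every chunk that still spans more than one target -- a
    simplified model of how the delta algorithm refines granularity
    between iterations."""
    out = []
    for lo, hi in chunks:
        if hi - lo <= 1:
            out.append((lo, hi))  # already minimal; cannot split further
        else:
            mid = (lo + hi) // 2
            out.append((lo, mid))
            out.append((mid, hi))
    return out

def initial_chunks(num_targets, starting_granularity=0):
    """Build the starting chunk list: one chunk covering everything,
    pre-split a fixed, user-specified number of times before the core
    delta loop begins."""
    chunks = [(0, num_targets)]
    for _ in range(starting_granularity):
        chunks = split_chunks(chunks)
    return chunks
```

With `starting_granularity=2` and eight targets, the loop begins at four chunks of two targets each rather than one chunk of eight, skipping the coarse levels that rarely pass the interestingness test on a partially-reduced file.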
Details
- Reviewers: aeubanks
- Commits: rGfbfd327fdf1e: [llvm-reduce] Add flag to start at finer granularity

Diff Detail
- Repository: rG LLVM Github Monorepo

Event Timeline
Does llvm-reduce actually take a long time for you? Even on large files it's only ever taken at most 2 minutes. What kind of llvm-reduce times are you seeing?
I imagine the time it takes depends substantially on the interestingness test you specify and on how much of the file is in fact interesting. In my use case, I have a large LLVM bitcode file that must be linked and executed to test whether it is interesting, and running just a couple of the delta passes has taken hours upon hours. I assume that's because the algorithm is O(n^2 * log n), which does not perform particularly well for large values of n.
Note that by "large" I mean the initial bitcode file originally passed to the reducer started out at roughly 25 megabytes. It has since been reduced to 1.8 megabytes, but the reduce-basic-blocks pass has now been running for hours with no sign of stopping soon.
@aeubanks is there any update on this? I have found that this flag can in some cases cut the time spent on the function-bodies delta pass by up to 50%, assuming that very few function bodies can safely be removed. For a big enough LLVM module, that is a speedup on the order of a day or two. It would be really nice if this could be upstreamed.
hmm, it still doesn't make sense to me that you'd actually attempt to use/link the IR produced by llvm-reduce, since it's definitely not semantics-preserving in any way
but anyway, if this is purely for performance then sure
the test could attempt to test the behavior a bit more, although I'm not sure how you'd do that aside from looking at logging, so this lgtm