User Details
- User Since
- Mar 19 2022, 12:06 AM (52 w, 2 d)
Wed, Mar 15
Tue, Feb 21
Feb 8 2023
Feb 7 2023
Update tests because of an API change from main
Cleanup unit test
Feb 6 2023
Feb 5 2023
@snehasish Please re-review since I introduced a major change to the code.
Jan 31 2023
use llvm::unittest::TempFile instead
cleanup
Refactored to fix bugs on non-Linux platforms
Jan 25 2023
Update unit test, which previously could crash on macOS
Jan 17 2023
Updated unit test
Jan 11 2023
Jan 10 2023
Jan 9 2023
Clarified comments
Jan 4 2023
Refactor: moved implementation to llvm lib so that it can be used by other tools
Jan 3 2023
Dec 29 2022
Add comment clarifying reset() usage
Dec 28 2022
The check also looks suspicious because it casts size_t to uint64_t. If size_t is 64-bit, the cast has no effect; if size_t is 32-bit, the condition is always false. So I don't understand the purpose of this check.
Updating D139603: [llvm-profdata] Add option to cap profile output size
Dec 15 2022
As for downsampling, a sample count of 0 and a missing sample mean different things to the compiler, so downsampling may change the branch basic block placement in hot functions. I'm not sure it is a good idea.
The biggest challenge in computing the number of functions accurately is the compression in extbinary, because the compressed size is not a linear function of the original size. Since profile samples and function names are written to different sections (and in CS profiles the names are split into two sections and the samples can also be split into two sections), there is no way to predict the offset between them ahead of time. Based on the use cases so far, the current heuristic underestimates how many functions to prune (and the last iteration typically converges to pruning 1 function), so it is unlikely to remove too many functions. (Note: I also tried a cubic equation for the heuristic, but that removes too many functions, so the optimal heuristic lies between O(n^2) and O(n^3).)
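The iterate-and-measure idea can be illustrated with a minimal sketch. Everything here is hypothetical: `pruneToFit`, the `SerializedSize` callback (standing in for "write the profile and measure the output"), and the proportional estimate are illustrative only and are not the heuristic used in D139603.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical helper: shrink the number of kept functions until the
// serialized size fits under Limit. Returns the number of functions kept.
// SerializedSize(K) is an assumed callback reporting the output size when
// only the K hottest functions are written.
std::size_t pruneToFit(
    std::size_t NumFunctions, std::uint64_t Limit,
    const std::function<std::uint64_t(std::size_t)> &SerializedSize) {
  std::size_t Kept = NumFunctions;
  while (Kept > 0) {
    std::uint64_t Size = SerializedSize(Kept);
    if (Size <= Limit)
      break; // The output fits, so stop pruning.
    // Assumed estimate: drop functions in proportion to the excess size.
    // Because the real size is not linear in the function count, the
    // estimate can be off, hence the re-measurement on every iteration.
    std::uint64_t Excess = Size - Limit;
    std::size_t Drop = static_cast<std::size_t>(Excess * Kept / Size);
    Drop = std::max<std::size_t>(Drop, 1); // Always make progress.
    Kept -= std::min(Drop, Kept);
  }
  return Kept;
}
```

The loop never adds functions back, so an estimate that drops too few functions only costs extra iterations, while one that drops too many would lose profile data permanently.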
Added output size check
Dec 14 2022
Refactored code structure
Use a string buffer to rewrite files
Added API for potential new strategy to reduce profile size
Dec 7 2022
Dec 1 2022
I need some clarification on the inbounds keyword. When an inbounds GEP of a GEP is transformed, what kinds of transformations and conditions keep the new GEP inbounds? For example, GEP inbounds (GEP inbounds P a) b is equivalent to GEP inbounds (GEP inbounds P b) a if and only if a and b have the same sign. Are there other algebraically valid transformations? This does affect D137212, since it swaps constant-indexed GEPs. Maybe I misunderstood what inbounds implies? I noticed that an arbitrary pointer arithmetic expression in C generates an inbounds GEP even when the pointer is clearly not pointing to any allocated object.
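To make the sign concern concrete, here is a toy model in C++. It is only a sketch under a simplifying assumption: "inbounds" is modeled as every intermediate index staying within [0, Size] of a single object, and the helper names are hypothetical. It is not a statement of LLVM's exact semantics.

```cpp
#include <cstdint>

// Toy model: an object has Size elements, and an index is in bounds when it
// lies in [0, Size] (one past the end is allowed).
bool inBounds(std::int64_t Index, std::int64_t Size) {
  return Index >= 0 && Index <= Size;
}

// Apply two offsets in order, starting from Base, and report whether the
// base, the intermediate index, and the final index all stay in bounds.
bool chainStaysInBounds(std::int64_t Base, std::int64_t First,
                        std::int64_t Second, std::int64_t Size) {
  std::int64_t Mid = Base + First;
  std::int64_t End = Mid + Second;
  return inBounds(Base, Size) && inBounds(Mid, Size) && inBounds(End, Size);
}
```

With Base = 0 and Size = 8, applying +5 then -3 stays in bounds at every step, but the swapped order (-3 then +5) steps outside the object at the intermediate index even though the final index is the same. With offsets of the same sign, such as +2 and +3, both orders stay in bounds in this model.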
Nov 29 2022
Nov 22 2022
Could someone reproduce the test case (and benchmarking results) showing the regression?
Nov 10 2022
Nov 9 2022
Nov 8 2022
Merged after adding baseline
See D137664 for baseline
Nov 1 2022
Oct 20 2022
merge with main
Oct 12 2022
follow up @davidxl
For the original issue, where a chain of 3 or more GEPs with the first and last being constant-indexed cannot be simplified, this patch can handle such a case while D125845 can't.
Consider the following code:
b = gep a, const_index
use(b)
c = gep b, var_index
d = gep c, const_index
ret d
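As a rough C++ analogue of the chain above, the sketch below shows one possible reading of the simplification: the two constant offsets can be combined even though `b` has another use and a variable index sits between them. The constants 2 and 3, the `use` stub, and the function names are assumptions for illustration; this is not the patch's implementation.

```cpp
#include <cstddef>

// Hypothetical stand-in for the extra use of `b` in the snippet above.
inline void use(const float *) {}

// Mirrors the chain as written: constant index, variable index, constant index.
const float *chained(const float *A, std::ptrdiff_t Var) {
  const float *B = A + 2;   // b = gep a, const_index (2 is an assumed value)
  use(B);                   // b has another use, so it cannot simply vanish
  const float *C = B + Var; // c = gep b, var_index
  const float *D = C + 3;   // d = gep c, const_index (3 is an assumed value)
  return D;
}

// Same result with the constants merged: the variable index is applied
// first, then a single combined constant offset.
const float *folded(const float *A, std::ptrdiff_t Var) {
  use(A + 2);               // the other use of b is kept as is
  const float *C = A + Var;
  return C + 5;             // 2 + 3 folded into one constant
}
```

Both functions compute the same address as long as every intermediate pointer stays within the underlying array, which is where the inbounds question comes in.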
Aug 31 2022
Consider the following practical example
struct Vec { float x, y, z; };
Aug 23 2022
It is more of a precaution, but I did come across very large AFDO profiles.
Aug 9 2022
@efriedma A limitation of the current patch is that it can only handle the case where the arguments are in matching order. After some investigation I found that the current function call lowering pass cannot handle in-place stack argument shuffling at all, as it doesn't take read/write dependencies into account. It would require extensive refactoring to fix this completely.
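The read/write dependency problem can be shown with a toy model. This is only a sketch: stack argument slots are modeled as a vector of integers, and `shuffleInPlace` and `shuffleViaCopy` are hypothetical names, not LLVM's call lowering code.

```cpp
#include <cstddef>
#include <vector>

// Source[I] is the index of the old slot whose value must end up in slot I.

// Naive in-place shuffle: writes each slot in order without checking whether
// a later read still needs the value that was just overwritten.
std::vector<int> shuffleInPlace(std::vector<int> Slots,
                                const std::vector<std::size_t> &Source) {
  for (std::size_t I = 0; I < Slots.size(); ++I)
    Slots[I] = Slots[Source[I]];
  return Slots;
}

// Dependency-free variant: reads only from the untouched original slots.
std::vector<int> shuffleViaCopy(const std::vector<int> &Slots,
                                const std::vector<std::size_t> &Source) {
  std::vector<int> Result(Slots.size());
  for (std::size_t I = 0; I < Slots.size(); ++I)
    Result[I] = Slots[Source[I]];
  return Result;
}
```

Swapping two slots holding 1 and 2 in place yields 2 and 2, because slot 0 is overwritten before slot 1 reads it, while the copying variant yields 2 and 1. When the arguments are already in matching order, both variants agree, which corresponds to the case the patch handles.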
Aug 4 2022
Added comments
Restructured logic to determine isSibcall
Aug 3 2022
Updated comments
Aug 2 2022
Jul 14 2022
- Remove now irrelevant test
Note: this is a NEW patch and the implementation is different from D125845.