Bypass of slow divs based on operand values is currently disabled for
-Os. Do the same when profile summary is available and the working set
size of the application is huge. This is similar to how loop peeling is
guarded by hasHugeWorkingSetSize. In the div bypass case, the generated
extra code (and the extra branch) tends to outweigh the benefits of the
bypass. This results in a noticeable performance improvement on an
internal application.
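
For readers unfamiliar with the transformation, here is a minimal, self-contained sketch of what the bypass does and of the guard described above. It is not the patch itself. `bypassedUDiv`, `BypassContext`, and `shouldBypassSlowDiv` are hypothetical names invented for illustration, the 64-to-32-bit narrowing is just one typical case, and the boolean fields stand in for the real queries (the function's optsize attribute, profile summary availability, hasHugeWorkingSetSize, and the target's slow-div setting).

```cpp
#include <cstdint>

// Hypothetical illustration of the code shape the bypass produces for an
// unsigned 64-bit division: a runtime check on the operand values, a cheap
// narrow divide on the fast path, and the original wide divide otherwise.
// The caller must ensure b != 0, as with any integer division.
uint64_t bypassedUDiv(uint64_t a, uint64_t b) {
  // Extra branch: do both operands fit in 32 bits?
  if (((a | b) >> 32) == 0) {
    // Fast path: 32-bit divide, much cheaper on targets with slow 64-bit div.
    return static_cast<uint32_t>(a) / static_cast<uint32_t>(b);
  }
  // Slow path: the original 64-bit divide.
  return a / b;
}

// Hypothetical stand-ins for the conditions discussed in the summary.
struct BypassContext {
  bool optForSize;        // function is compiled for size (e.g. -Os)
  bool hasProfileSummary; // a profile summary is available
  bool hugeWorkingSet;    // profile reports a huge working set size
  bool targetHasSlowDiv;  // target asks for slow divides to be bypassed
};

// Sketch of the guard: skip the bypass when optimizing for size, or when
// the profile says the working set is huge, since in both cases the extra
// code and branch are likely to cost more than the fast path saves.
bool shouldBypassSlowDiv(const BypassContext &c) {
  if (!c.targetHasSlowDiv)
    return false;
  if (c.optForSize)
    return false;
  if (c.hasProfileSummary && c.hugeWorkingSet)
    return false;
  return true;
}
```

The sketch makes the size cost visible: every bypassed division turns into a compare, a branch, and two divide instructions instead of one, which is the code growth the guard is meant to avoid in applications with a huge working set.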
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
It seems like OptSize and hasHugeWorkingSetSize overlap a lot (many large C++ codebases use -Os to help fit into the icache). Would it make sense to automatically add OptSize markings based on profile data, or something like that, rather than adding hasHugeWorkingSetSize() checks all over the compiler?
hasHugeWorkingSetSize() can be used to selectively disable size-increasing transformations, while -Os, given its mission, can be more aggressive at the cost of performance, so I don't think we should piggyback on -Os for performance purposes.