LoopFlatten improves a well-known embedded benchmark, one with widely used industry applications, by a few percentage points. But it is not restricted to optimising a single benchmark. Below are results for the LLVM test suite, with the number of loops flattened in each test:
Test                                                                        # Loops flattened
---------------------------------------------------------------------------------------------
MultiSource/Applications/JM/lencod/lencod                                   3
MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg                        1
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg                  3
MultiSource/Applications/JM/ldecod/ldecod                                   1
MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000                          3
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4                                17
SingleSource/Benchmarks/Misc/himenobmtxpa                                   2
MicroBenchmarks/ImageProcessing/AnisotropicDiffusion/AnisotropicDiffusion   2
MicroBenchmarks/ImageProcessing/BilateralFiltering/BilateralFilter          2
MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG/miniGMG                      20
MultiSource/Benchmarks/Rodinia/pathfinder/pathfinder                        1
MicroBenchmarks/ImageProcessing/Blur/blur                                   2
MicroBenchmarks/ImageProcessing/Dither/Dither                               2
MicroBenchmarks/ImageProcessing/Dilate/Dilate                               2
MultiSource/Benchmarks/DOE-ProxyApps-C++/HPCCG/HPCCG                        1
MultiSource/Benchmarks/DOE-ProxyApps-C/SimpleMOC/SimpleMOC                  1
MicroBenchmarks/ImageProcessing/Interpolation/Interpolation                 2
MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk                              2
MultiSource/Benchmarks/Rodinia/backprop/backprop                            1
---------------------------------------------------------------------------------------------
Total                                                                       68
While the implementation of LoopFlatten recognises only a few patterns and could be made more generic, I believe these numbers show that it's generic enough to trigger on a wide variety of code bases, making it worthwhile to enable it by default.
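For readers unfamiliar with the transformation, here is a minimal sketch in C of the kind of loop nest that flattening targets. It is an illustration written by hand, not output from the pass: the function names are hypothetical, and the sketch assumes that the inner index is used only in the expression i * cols + j and that rows * cols does not overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Nested form: i and j are only ever combined as i * cols + j. */
long sum_nested(const int *a, size_t rows, size_t cols) {
    long total = 0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            total += a[i * cols + j];
    return total;
}

/* Flattened form: a single loop over rows * cols iterations.
 * Assumption: the product rows * cols does not overflow size_t. */
long sum_flattened(const int *a, size_t rows, size_t cols) {
    long total = 0;
    size_t n = rows * cols;
    for (size_t k = 0; k < n; ++k)
        total += a[k];
    return total;
}

/* Hypothetical self-check: both forms must agree on the same input. */
void check_equivalent(const int *a, size_t rows, size_t cols) {
    assert(sum_nested(a, rows, cols) == sum_flattened(a, rows, cols));
}
```

The flattened version drops the inner-loop setup and the per-iteration multiply, which is where the benefit is expected to come from on targets where loop overhead matters.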
LoopFlatten is a relatively simple pass: it doesn't implement a computationally expensive algorithm and doesn't require more analysis than a typical loop pass. Compile times for the LLVM test suite (ClamAV, 7zip, tramp3d-v4, kimwitu++, sqlite3, mafft, SPASS, lencod, Bullet) show a very minor increase of ~0.04% to 0.28%. There are cases where compile times improve, but I haven't analysed those and don't want to claim that it will improve compile times in general.
We have had LoopFlatten enabled by default downstream for many years now, so it has had a lot of exposure and usage, and we are not aware of any problems.
This doesn't look like the right place for the notes.
The "Non-comprehensive list of changes in this release" section looks better.