Page MenuHomePhabricator

[BasicAA] Be more careful with modulo ops on VariableGEPIndex.
Needs ReviewPublic

Authored by fhahn on Mar 26 2021, 9:58 AM.

Details

Summary

(V * Scale) % X may not produce the same result for any possible value
of V, e.g. if the multiplication overflows. This means we currently
incorrectly determine NoAlias in some cases.

This patch adjusts the code linarizing GEPs to try to track if the
modulo semantics are preserved. There might still be GEPs and cases
where we miss, but it should address a few critical cases.

Diff Detail

Event Timeline

fhahn created this revision.Mar 26 2021, 9:58 AM
fhahn requested review of this revision.Mar 26 2021, 9:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2021, 9:58 AM

I'm somewhat confused by the implementation approach in this patch. I think there's really two orthogonal cases: The first is if all operations are non-wrapping, in which case we can optimize for arbitrary modulos. The second is that we can always use power of two factors from the GCD, because arithmetic is over a power of two field.

I think handling for these two cases should be split. The first part can be handled via flag, and the second part can be handled by only extracting the power of two part if the flag is not set. I think this should both make the semantics of the flag clear and be more precise.

fhahn updated this revision to Diff 335931.Wed, Apr 7, 2:40 PM

Rebased on top of latest changes.

I'm somewhat confused by the implementation approach in this patch. I think there's really two orthogonal cases: The first is if all operations are non-wrapping, in which case we can optimize for arbitrary modulos. The second is that we can always use power of two factors from the GCD, because arithmetic is over a power of two field.

I think handling for these two cases should be split. The first part can be handled via flag, and the second part can be handled by only extracting the power of two part if the flag is not set. I think this should both make the semantics of the flag clear and be more precise.

I'm not sure where the logic should be split? I tried to just have a may-wrap flag and then check if Scale is a power-of-2 when the scale is used for the modulo computations. But I think that wouldn't correctly handle cases like (a + x) * 2^y, if the multiply operand gets distributed. Or were you thinking about something else?

nikic added inline comments.Wed, Apr 7, 3:01 PM
llvm/test/Analysis/BasicAA/gep-modulo.ll
133

This is an example of the case I have in mind: This can be NoAlias (https://alive2.llvm.org/ce/z/yYFAAz) even though the mul+sub can wrap. Despite the wrapping, we are still guaranteed that the power-of-2 part of the GCD still holds.

nikic added inline comments.Wed, Apr 7, 3:14 PM
llvm/test/Analysis/BasicAA/gep-modulo.ll
133

Eh, maybe I'm wrong here regarding the general case. I'll have to think about it more carefully.

fhahn added a comment.Thu, Apr 8, 1:54 AM
llvm/test/Analysis/BasicAA/gep-modulo.ll
133

I think it may hold for cases where we compute % power-of-2, but not if the operand is not a power-of-2? I'm not too familiar with the code that actually uses the modulo, but if that's the case we may be able to use this? If so, we should be able to do so by adjusting the check to
if (!DecompGEP1.VarIndices[i].PreservesModulo && !Scale.isPowerOf2())

fhahn added a comment.Thu, Apr 8, 2:48 AM

The impact of this seems quite limited. Below the AA stats for MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto:

aa.NumMayAlias
Program                                        base      basicaa   diff 
 test-suite...urce/Applications/aha/aha.test   271.00    292.00     7.7%
 test-suite...rks/FreeBench/pifft/pifft.test   14990.00  15689.00   4.7%
 test-suite...CFP2006/433.milc/433.milc.test   13793.00  14243.00   3.3%
 test-suite...-dbl/LinearDependence-dbl.test   522.00    525.00     0.6%
 test-suite...T2000/300.twolf/300.twolf.test   45884.00  46038.00   0.3%
 test-suite...TimberWolfMC/timberwolfmc.test   49643.00  49798.00   0.3%
 test-suite...-flt/LinearDependence-flt.test   1216.00   1219.00    0.2%
 test-suite...CFP2000/177.mesa/177.mesa.test   67899.00  68059.00   0.2%
 test-suite.../CINT2006/429.mcf/429.mcf.test   1134.00   1136.00    0.2%
 test-suite.../CINT2000/181.mcf/181.mcf.test   1173.00   1175.00    0.2%
 test-suite...libquantum/462.libquantum.test   3498.00   3502.00    0.1%
 test-suite...ications/JM/lencod/lencod.test   293097.00 293418.00  0.1%
 test-suite...nal/skidmarks10/skidmarks.test   16112.00  16126.00   0.1%
 test-suite...006/453.povray/453.povray.test   120925.00 121011.00  0.1%
 test-suite.../CINT2000/254.gap/254.gap.test   67251.00  67293.00   0.1%
 Geomean difference                                                 0.5%

aa.NumMustAlias
Program                                        base     basicaa  diff 
 test-suite...CFP2006/433.milc/433.milc.test   1932.00  1901.00  -1.6%
 test-suite.../Benchmarks/Bullet/bullet.test   25887.00 25875.00 -0.0%
 test-suite...CFP2000/177.mesa/177.mesa.test   8657.00  8653.00  -0.0%
 test-suite...006/453.povray/453.povray.test   28386.00 28374.00 -0.0%
 Geomean difference                                              -0.4%

aa.NumNoAlias
Program                                        base       basicaa    diff 
 test-suite...urce/Applications/aha/aha.test   4913.00    4832.00    -1.6%
 test-suite...rks/FreeBench/pifft/pifft.test   96670.00   95195.00   -1.5%
 test-suite...CFP2006/433.milc/433.milc.test   96144.00   95308.00   -0.9%
 test-suite...-dbl/LinearDependence-dbl.test   3488.00    3461.00    -0.8%
 test-suite...-flt/LinearDependence-flt.test   8855.00    8828.00    -0.3%
 test-suite...TimberWolfMC/timberwolfmc.test   306639.00  305968.00  -0.2%
 test-suite...T2000/300.twolf/300.twolf.test   458839.00  458169.00  -0.1%
 test-suite...000/183.equake/183.equake.test   75059.00   75009.00   -0.1%
 test-suite...libquantum/462.libquantum.test   7910.00    7906.00    -0.1%
 test-suite...CFP2000/177.mesa/177.mesa.test   607413.00  607138.00  -0.0%
 test-suite...oxyApps-C/miniAMR/miniAMR.test   120107.00  120076.00  -0.0%
 test-suite.../CINT2000/254.gap/254.gap.test   193489.00  193447.00  -0.0%
 test-suite.../CINT2000/181.mcf/181.mcf.test   10422.00   10420.00   -0.0%
 test-suite.../CINT2006/429.mcf/429.mcf.test   12104.00   12102.00   -0.0%
 test-suite...ications/JM/lencod/lencod.test   2077915.00 2077590.00 -0.0%
 Geomean difference                                                  -0.2%

And the size impact (binary changes to 9 out of 237 binaries)

Same hash: 228 (filtered out)
Remaining: 9
Metric: size.__text

Program                                        base     basicaa  diff
 test-suite...arks/mafft/pairlocalalign.test    233639   233927   0.1%
 test-suite...-dbl/LinearDependence-dbl.test    58195    58243    0.1%
 test-suite.../Benchmarks/Bullet/bullet.test    309159   309175   0.0%
 test-suite...006/453.povray/453.povray.test    1152978  1152994  0.0%
 test-suite...CFP2000/177.mesa/177.mesa.test    694753   694753   0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test    3238353  3238353  0.0%
 test-suite.../CINT2000/176.gcc/176.gcc.test    1429678  1429678  0.0%
 test-suite...CFP2006/433.milc/433.milc.test    119989   119861  -0.1%
 test-suite...-flt/LinearDependence-flt.test    54951    54743   -0.4%

There's no difference with special handling for % power-of-2.

nikic added a comment.Thu, Apr 22, 1:06 PM

Okay, I've finally gotten around to looking into this in more detail. Because thinking about this really fries my brain, I ended up modelling this in alive2 (hopefully correctly...)

https://alive2.llvm.org/ce/z/HYBxGs This is the general case and shows that we only need nsw flags on the arithmetic, rather than both nuw and nsw. This makes sense, as all the arithmetic involved is signed.

https://alive2.llvm.org/ce/z/qXicaB This shows that it is indeed safe to always take the power-of-2 portion of the scale, even if the arithmetic is not nsw.