This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Restrict equality propagation for pointers
Needs ReviewPublic

Authored by mnadeem on Feb 1 2023, 6:46 PM.

Details

Diff Detail

Event Timeline

mnadeem created this revision.Feb 1 2023, 6:46 PM
mnadeem requested review of this revision.Feb 1 2023, 6:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2023, 6:46 PM
mnadeem updated this revision to Diff 494139.Feb 1 2023, 6:49 PM
nikic added a reviewer: nikic.Feb 2 2023, 12:48 AM
nikic added a subscriber: nikic.

This looks okay on the surface, but this needs at least a spot check of llvm-test-suite IR diffs to get an idea of how much this affects codegen. If there are no substantial regressions, let's go with it. Otherwise we may need to think harder about how we can mitigate regressions.

The one case I'm probably worried about most are checks of the form %p == null, where replacing %p with null is not generally legal because it has nullary provenance. Replacing two arbitrary pointers likely isn't super important, but replacement with null might be.

Note that it *is* possible to perform the replacement for uses that are provenance-independent, in particular inside icmp instructions. Whether we need to make that distinction depends on how the fallout looks like.

mnadeem planned changes to this revision.Feb 2 2023, 5:40 PM

This looks okay on the surface, but this needs at least a spot check of llvm-test-suite IR diffs to get an idea of how much this affects codegen. If there are no substantial regressions, let's go with it. Otherwise we may need to think harder about how we can mitigate regressions.

I'll try to collect some data with the getUnderlyingObject() change you suggested.

@nikic
I was looking at https://llvm.org/devmtg/2018-04/slides/Lopes-Sbirlea-Pointers,%20Alias%20and%20ModRef%20Analyses.pdf page 15, which says that replacement with null ptr is safe. So I did two runs, one where we do pointer replacement only when they have same underlying objects, and the another where we also replace pointers with null.

Here are the diffs for llvm-test-suite bitcode:
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/125698f591df1f8de81ad4119b0838e135a18af8
and
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/a778479334852e7f05b5a248da4473106aa986fb

here is some data:

Tests: 2432
Metric: gvn.NumGVNInstr

Program                                       gvn.NumGVNInstr
                                              base            sameunderlying-and-nullptr sameunderlying diff
MultiSource/Benchmarks/Olden/tsp/tsp            35.00           21.00                      21.00        66.7%
MultiSourc...s/Prolangs-C++/objects/objects     16.00           16.00                      14.00        14.3%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4   7857.00         6964.00                    6936.00        13.3%
MultiSource/Benchmarks/McCat/09-vor/vor         59.00           55.00                      54.00         9.3%
MicroBench...CALS/SubsetBRawLoops/lcalsBRaw   1267.00         1363.00                    1324.00         7.6%
MicroBench...ubsetBLambdaLoops/lcalsBLambda   1270.00         1366.00                    1327.00         7.6%
MicroBench...ubsetALambdaLoops/lcalsALambda   1350.00         1446.00                    1407.00         7.1%
MicroBench...CALS/SubsetARawLoops/lcalsARaw   1321.00         1414.00                    1375.00         7.0%
MicroBench...ubsetCLambdaLoops/lcalsCLambda   1370.00         1466.00                    1427.00         7.0%
MicroBench...CALS/SubsetCRawLoops/lcalsCRaw   1393.00         1489.00                    1450.00         6.9%
MicroBench...teralFiltering/BilateralFilter     69.00           72.00                      72.00         4.3%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR   6769.00         7044.00                    7038.00         4.1%
MicroBench.../ImageProcessing/Dither/Dither     75.00           78.00                      78.00         4.0%
MultiSourc...ks/FreeBench/analyzer/analyzer     29.00           29.00                      30.00         3.4%
SingleSour...ks/Misc-C++/stepanov_container    577.00          558.00                     558.00         3.4%
                           Geomean difference                                                            0.5%
      gvn.NumGVNInstr
run              base sameunderlying-and-nullptr sameunderlying        diff
count  334.000000      334.000000                 334.000000     334.000000
mean   376.790419      376.655689                 375.895210     0.005643
std    1264.258518     1254.702622                1253.774169    0.039618
min    1.000000        1.000000                   1.000000       0.000000
25%    9.000000        9.000000                   9.000000       0.000000
50%    29.000000       29.000000                  29.000000      0.000000
75%    125.000000      125.000000                 125.000000     0.000000
max    10671.000000    10671.000000               10671.000000   0.666667
Tests: 2432
Metric: gvn.NumGVNLoad

Program                                       gvn.NumGVNLoad
                                              base           sameunderlying-and-nullptr sameunderlying diff
ultiSource/Applications/SPASS/SPASS           398.00         394.00                     394.00          1.0%
ultiSource/Applications/kimwitu++/kc          204.00         202.00                     202.00          1.0%
ultiSource...-ProxyApps-C++/PENNANT/PENNANT   149.00         150.00                     150.00          0.7%
icroBenchm...ubsetBLambdaLoops/lcalsBLambda   207.00         207.00                     208.00          0.5%
icroBenchm...CALS/SubsetBRawLoops/lcalsBRaw   209.00         209.00                     210.00          0.5%
icroBenchm...CALS/SubsetARawLoops/lcalsARaw   226.00         226.00                     227.00          0.4%
icroBenchm...ubsetALambdaLoops/lcalsALambda   230.00         230.00                     231.00          0.4%
icroBenchm...ubsetCLambdaLoops/lcalsCLambda   231.00         231.00                     232.00          0.4%
icroBenchm...CALS/SubsetCRawLoops/lcalsCRaw   244.00         244.00                     245.00          0.4%
ultiSource.../Applications/JM/ldecod/ldecod   646.00         645.00                     645.00          0.2%
ultiSource.../DOE-ProxyApps-C++/CLAMR/CLAMR   934.00         933.00                     933.00          0.1%
ultiSource...nchmarks/tramp3d-v4/tramp3d-v4   978.00         977.00                     977.00          0.1%
ultiSource...ring-flt/LoopRestructuring-flt     1.00           1.00                       1.00          0.0%
ultiSource...ring-dbl/LoopRestructuring-dbl     1.00           1.00                       1.00          0.0%
ultiSource...marks/Trimaran/enc-pc1/enc-pc1     2.00           2.00                       2.00          0.0%
                           Geomean difference                                                           0.0%
      gvn.NumGVNLoad
run             base sameunderlying-and-nullptr sameunderlying        diff
count  216.000000     216.000000                 216.000000     216.000000
mean   92.689815      92.652778                  92.680556      0.000265
std    283.845653     283.785744                 283.798757     0.001286
min    1.000000       1.000000                   1.000000       0.000000
25%    2.000000       2.000000                   2.000000       0.000000
50%    10.000000      10.000000                  10.000000      0.000000
75%    44.000000      44.000000                  44.000000      0.000000
max    2516.000000    2516.000000                2516.000000    0.010152
Tests: 2432
Metric: gvn.NumGVNPRE

Program                                       gvn.NumGVNPRE
                                              base          sameunderlying-and-nullptr sameunderlying diff
SingleSour.../Benchmarks/Misc-C++-EH/spirit     5.00          2.00                       2.00         150.0%
SingleSour...ks/Misc-C++/stepanov_container    22.00         19.00                      19.00          15.8%
MultiSourc...-ProxyApps-C++/PENNANT/PENNANT    12.00         11.00                      11.00           9.1%
MultiSourc...e/Applications/ClamAV/clamscan   109.00        107.00                     107.00           1.9%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR   377.00        372.00                     372.00           1.3%
Bitcode/Be...hmarks/Halide/blur/halide_blur     4.00          4.00                       4.00           0.0%
MultiSourc...s/Prolangs-C++/objects/objects     1.00          1.00                       1.00           0.0%
MultiSourc...olangs-C/unix-smail/unix-smail     8.00          8.00                       8.00           0.0%
MultiSourc...s/Prolangs-C/football/football     6.00          6.00                       6.00           0.0%
MultiSourc...s/Prolangs-C/compiler/compiler     2.00          2.00                       2.00           0.0%
MultiSourc...marks/Prolangs-C/bison/mybison    19.00         19.00                      19.00           0.0%
MultiSourc...Prolangs-C/assembler/assembler     1.00          1.00                       1.00           0.0%
MultiSourc...rolangs-C/archie-client/archie     2.00          2.00                       2.00           0.0%
MultiSourc...chmarks/Prolangs-C/agrep/agrep    17.00         17.00                      17.00           0.0%
MultiSourc...gs-C/TimberWolfMC/timberwolfmc    49.00         49.00                      49.00           0.0%
                           Geomean difference                                                           1.1%
      gvn.NumGVNPRE
run            base sameunderlying-and-nullptr sameunderlying        diff
count  111.000000    110.000000                 110.000000     111.000000
mean   20.684685     20.654545                  20.654545      0.016044
std    46.993226     46.829857                  46.829857      0.143183
min    1.000000      1.000000                   1.000000       0.000000
25%    2.000000      2.000000                   2.000000       0.000000
50%    5.000000      5.000000                   5.000000       0.000000
75%    16.000000     16.500000                  16.500000      0.000000
max    377.000000    372.000000                 372.000000     1.500000
Tests: 2432
Metric: gvn.NumGVNBlocks

Program                                       gvn.NumGVNBlocks
                                              base             sameunderlying-and-nullptr sameunderlying diff
MultiSourc...OE-ProxyApps-C++/miniFE/miniFE    69.00            68.00                      68.00          1.5%
MultiSourc...enchmarks/mafft/pairlocalalign   578.00           578.00                     578.00          0.0%
SingleSour...enchmarks/BenchmarkGame/n-body    15.00            15.00                      15.00          0.0%
SingleSour...chmarks/BenchmarkGame/fannkuch    14.00            14.00                      14.00          0.0%
SingleSour...arks/BenchmarkGame/Large/fasta     2.00             2.00                       2.00          0.0%
SingleSour...arks/Adobe-C++/stepanov_vector     4.00             4.00                       4.00          0.0%
SingleSour...Adobe-C++/stepanov_abstraction     4.00             4.00                       4.00          0.0%
SingleSour...++/simple_types_loop_invariant     2.00             2.00                       2.00          0.0%
SingleSour.../simple_types_constant_folding   126.00           126.00                     126.00          0.0%
SingleSour...nchmarks/Adobe-C++/loop_unroll     8.00             8.00                       8.00          0.0%
SingleSour...arks/Adobe-C++/functionobjects     3.00             3.00                       3.00          0.0%
MultiSourc...++11/frame_layout/frame_layout     2.00             2.00                       2.00          0.0%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4   503.00           503.00                     503.00          0.0%
MultiSource/Benchmarks/nbench/nbench           48.00            48.00                      48.00          0.0%
MultiSourc...nch/mpeg2/mpeg2dec/mpeg2decode    53.00            53.00                      53.00          0.0%
                           Geomean difference                                                             0.0%
      gvn.NumGVNBlocks
run               base sameunderlying-and-nullptr sameunderlying        diff
count  291.000000       291.000000                 291.000000     291.000000
mean   47.079038        47.075601                  47.075601      0.000051
std    130.828991       130.828426                 130.828426     0.000862
min    1.000000         1.000000                   1.000000       0.000000
25%    2.000000         2.000000                   2.000000       0.000000
50%    8.000000         8.000000                   8.000000       0.000000
75%    50.000000        50.000000                  50.000000      0.000000
max    1476.000000      1476.000000                1476.000000    0.014706
Tests: 2432
Metric: gvn.NumGVNSimpl

Program                                       gvn.NumGVNSimpl
                                              base            sameunderlying-and-nullptr sameunderlying diff
MultiSourc...s/Prolangs-C++/objects/objects      4.00            4.00                       2.00        100.0%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4   3739.00         3002.00                    2984.00         25.3%
MicroBench.../ImageProcessing/Dither/Dither     34.00           39.00                      39.00         14.7%
MicroBench...ubsetBLambdaLoops/lcalsBLambda    415.00          474.00                     434.00         14.2%
MicroBench...CALS/SubsetBRawLoops/lcalsBRaw    415.00          474.00                     434.00         14.2%
MicroBench...CALS/SubsetCRawLoops/lcalsCRaw    421.00          480.00                     440.00         14.0%
MicroBench...ubsetCLambdaLoops/lcalsCLambda    426.00          485.00                     445.00         13.8%
MicroBench...ubsetALambdaLoops/lcalsALambda    430.00          489.00                     449.00         13.7%
MicroBench...CALS/SubsetARawLoops/lcalsARaw    412.00          468.00                     428.00         13.6%
MicroBench...teralFiltering/BilateralFilter     38.00           43.00                      43.00         13.2%
MultiSource/Applications/kimwitu++/kc          702.00          786.00                     786.00         12.0%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR   2460.00         2740.00                    2734.00         11.4%
SingleSource/Benchmarks/Misc-C++/bigfib         53.00           59.00                      59.00         11.3%
MultiSource/Benchmarks/McCat/09-vor/vor         10.00           10.00                       9.00         11.1%
MultiSourc...ks/FreeBench/analyzer/analyzer     15.00           15.00                      16.00          6.7%
                           Geomean difference                                                             0.9%
      gvn.NumGVNSimpl
run              base sameunderlying-and-nullptr sameunderlying        diff
count  311.000000      311.000000                 311.000000     311.000000
mean   155.244373      155.379421                 154.565916     0.010764
std    511.943543      502.146410                 501.382767     0.063660
min    1.000000        1.000000                   1.000000       0.000000
25%    6.000000        6.000000                   6.000000       0.000000
50%    17.000000       17.000000                  17.000000      0.000000
75%    59.000000       60.500000                  60.500000      0.000000
max    4872.000000     4871.000000                4871.000000    1.000000
Tests: 2432
Metric: gvn.NumGVNEqProp

Program                                       gvn.NumGVNEqProp
                                              base             sameunderlying-and-nullptr sameunderlying diff
MultiSourc...OE-ProxyApps-C++/miniFE/miniFE    66.00             8.00                       4.00         1550.0%
SingleSour.../Benchmarks/Misc-C++-EH/spirit    49.00             3.00                       3.00         1533.3%
SingleSour...ks/Misc-C++/stepanov_container   107.00            12.00                      11.00          872.7%
SingleSource/Benchmarks/Misc-C++/bigfib        18.00             2.00                       2.00          800.0%
MultiSourc.../DOE-ProxyApps-C++/HPCCG/HPCCG     6.00             1.00                       1.00          500.0%
MultiSourc.../Benchmarks/McCat/15-trie/trie     6.00             6.00                       1.00          500.0%
MultiSourc...hmarks/MallocBench/cfrac/cfrac     6.00             6.00                       1.00          500.0%
MultiSourc.../DOE-ProxyApps-C++/CLAMR/CLAMR   874.00           175.00                     169.00          417.2%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4   222.00            75.00                      43.00          416.3%
MultiSource/Benchmarks/McCat/09-vor/vor        14.00             3.00                                     366.7%
MultiSource/Applications/kimwitu++/kc         293.00            72.00                      63.00          365.1%
MultiSourc...-ProxyApps-C++/PENNANT/PENNANT    55.00            12.00                      12.00          358.3%
MicroBench...ubsetCLambdaLoops/lcalsCLambda   122.00            70.00                      37.00          229.7%
MicroBench...ubsetBLambdaLoops/lcalsBLambda   122.00            70.00                      37.00          229.7%
MicroBench...CALS/SubsetBRawLoops/lcalsBRaw   122.00            70.00                      37.00          229.7%
                           Geomean difference                                                              37.7%
      gvn.NumGVNEqProp
run               base sameunderlying-and-nullptr sameunderlying        diff
count  147.000000       133.000000                 126.000000     147.000000
mean   44.442177        33.511278                  30.103175      0.809993
std    104.076048       69.462614                  60.728356      2.227728
min    1.000000         1.000000                   1.000000       0.000000
25%    2.000000         2.000000                   2.000000       0.000000
50%    9.000000         9.000000                   8.000000       0.000000
75%    36.500000        34.000000                  34.000000      0.431444
max    874.000000       509.000000                 373.000000     15.500000
Tests: 2432
Metric: gvn.NumPRELoad

Program                                       gvn.NumPRELoad
                                              base           sameunderlying-and-nullptr sameunderlying diff
ultiSource/Benchmarks/Olden/tsp/tsp             11.00           7.00                       7.00        57.1%
ultiSource/Benchmarks/McCat/09-vor/vor          10.00           8.00                       8.00        25.0%
ultiSource...nchmarks/tramp3d-v4/tramp3d-v4   1133.00        1061.00                    1061.00         6.8%
ultiSource/Applications/d/make_dparser         191.00         192.00                     192.00         0.5%
ultiSource...sumer-typeset/consumer-typeset    727.00         726.00                     726.00         0.1%
ultiSource...VC/Expansion-dbl/Expansion-dbl      6.00           6.00                       6.00         0.0%
ultiSource/Benchmarks/Ptrdist/ft/ft              3.00           3.00                       3.00         0.0%
ultiSource/Benchmarks/Ptrdist/ks/ks              3.00           3.00                       3.00         0.0%
ultiSource...Benchmarks/Ptrdist/yacr2/yacr2     35.00          35.00                      35.00         0.0%
ultiSource...Benchmarks/SciMark2-C/scimark2     11.00          11.00                      11.00         0.0%
ultiSource...ontrolFlow-dbl/ControlFlow-dbl      7.00           7.00                       7.00         0.0%
ultiSource...ontrolFlow-flt/ControlFlow-flt      7.00           7.00                       7.00         0.0%
ultiSource...lds-dbl/CrossingThresholds-dbl      1.00           1.00                       1.00         0.0%
ultiSource...lds-flt/CrossingThresholds-flt      1.00           1.00                       1.00         0.0%
icroBenchmarks/Builtins/Int128/Builtins         12.00          12.00                      12.00         0.0%
                           Geomean difference                                                           0.4%
      gvn.NumPRELoad
run             base sameunderlying-and-nullptr sameunderlying        diff
count  194.000000     194.000000                 194.000000     194.000000
mean   74.097938      73.695876                  73.695876      0.004618
std    186.856476     184.798491                 184.798491     0.044925
min    1.000000       1.000000                   1.000000       0.000000
25%    2.000000       2.000000                   2.000000       0.000000
50%    7.500000       7.000000                   7.000000       0.000000
75%    42.000000      42.000000                  42.000000      0.000000
max    1133.000000    1061.000000                1061.000000    0.571429
Tests: 2432
Metric: gvn.NumPRELoad

Program                                       gvn.NumPRELoad
                                              base           sameunderlying-and-nullptr sameunderlying diff
ultiSource/Benchmarks/Olden/tsp/tsp             11.00           7.00                       7.00        57.1%
ultiSource/Benchmarks/McCat/09-vor/vor          10.00           8.00                       8.00        25.0%
ultiSource...nchmarks/tramp3d-v4/tramp3d-v4   1133.00        1061.00                    1061.00         6.8%
ultiSource/Applications/d/make_dparser         191.00         192.00                     192.00         0.5%
ultiSource...sumer-typeset/consumer-typeset    727.00         726.00                     726.00         0.1%
ultiSource...VC/Expansion-dbl/Expansion-dbl      6.00           6.00                       6.00         0.0%
ultiSource/Benchmarks/Ptrdist/ft/ft              3.00           3.00                       3.00         0.0%
ultiSource/Benchmarks/Ptrdist/ks/ks              3.00           3.00                       3.00         0.0%
ultiSource...Benchmarks/Ptrdist/yacr2/yacr2     35.00          35.00                      35.00         0.0%
ultiSource...Benchmarks/SciMark2-C/scimark2     11.00          11.00                      11.00         0.0%
ultiSource...ontrolFlow-dbl/ControlFlow-dbl      7.00           7.00                       7.00         0.0%
ultiSource...ontrolFlow-flt/ControlFlow-flt      7.00           7.00                       7.00         0.0%
ultiSource...lds-dbl/CrossingThresholds-dbl      1.00           1.00                       1.00         0.0%
ultiSource...lds-flt/CrossingThresholds-flt      1.00           1.00                       1.00         0.0%
icroBenchmarks/Builtins/Int128/Builtins         12.00          12.00                      12.00         0.0%
                           Geomean difference                                                           0.4%
      gvn.NumPRELoad
run             base sameunderlying-and-nullptr sameunderlying        diff
count  194.000000     194.000000                 194.000000     194.000000
mean   74.097938      73.695876                  73.695876      0.004618
std    186.856476     184.798491                 184.798491     0.044925
min    1.000000       1.000000                   1.000000       0.000000
25%    2.000000       2.000000                   2.000000       0.000000
50%    7.500000       7.000000                   7.000000       0.000000
75%    42.000000      42.000000                  42.000000      0.000000
max    1133.000000    1061.000000                1061.000000    0.571429
Tests: 2432
Metric: gvn.NumPRELoopLoad

Program                                       gvn.NumPRELoopLoad
                                              base               sameunderlying-and-nullptr sameunderlying diff
MicroBench...ubsetALambdaLoops/lcalsALambda     1.00               1.00                       1.00          0.0%
MultiSourc...marks/Prolangs-C++/simul/simul     1.00               1.00                       1.00          0.0%
MultiSourc...chmarks/Prolangs-C/agrep/agrep     4.00               4.00                       4.00          0.0%
MultiSourc...marks/Prolangs-C/bison/mybison    17.00              17.00                      17.00          0.0%
MultiSourc...chmarks/Prolangs-C/cdecl/cdecl     1.00               1.00                       1.00          0.0%
MultiSourc...s/Prolangs-C/compiler/compiler     1.00               1.00                       1.00          0.0%
MultiSourc...s/Prolangs-C/football/football     1.00               1.00                       1.00          0.0%
MultiSourc...Prolangs-C/simulator/simulator     1.00               1.00                       1.00          0.0%
MultiSourc...hmarks/Ptrdist/anagram/anagram     1.00               1.00                       1.00          0.0%
MultiSourc...Benchmarks/Ptrdist/yacr2/yacr2     4.00               4.00                       4.00          0.0%
MultiSourc...enchmarks/mafft/pairlocalalign     5.00               5.00                       5.00          0.0%
MultiSourc...iabench/g721/g721encode/encode     1.00               1.00                       1.00          0.0%
MultiSourc...nch/mpeg2/mpeg2dec/mpeg2decode     1.00               1.00                       1.00          0.0%
MultiSource/Benchmarks/nbench/nbench            1.00               1.00                       1.00          0.0%
SingleSour...nchmarks/Adobe-C++/loop_unroll   130.00             130.00                     130.00          0.0%
                           Geomean difference                                                               0.0%
      gvn.NumPRELoopLoad
run                 base sameunderlying-and-nullptr sameunderlying  diff
count  45.000000          45.000000                  45.000000      45.0
mean   15.711111          15.711111                  15.711111      0.0
std    37.860163          37.860163                  37.860163      0.0
min    1.000000           1.000000                   1.000000       0.0
25%    1.000000           1.000000                   1.000000       0.0
50%    1.000000           1.000000                   1.000000       0.0
75%    6.000000           6.000000                   6.000000       0.0
max    168.000000         168.000000                 168.000000     0.0
mnadeem requested review of this revision.Feb 9 2023, 7:16 PM
aqjune added a subscriber: nlopes.Feb 10 2023, 9:06 AM

The one case I'm probably worried about most are checks of the form %p == null, where replacing %p with null is not generally legal because it has nullary provenance. Replacing two arbitrary pointers likely isn't super important, but replacement with null might be.

I was looking at https://llvm.org/devmtg/2018-04/slides/Lopes-Sbirlea-Pointers,%20Alias%20and%20ModRef%20Analyses.pdf page 15, which says that replacement with null ptr is safe. So I did two runs, one where we do pointer replacement only when they have same underlying objects, and the another where we also replace pointers with null.

The validity of this transformation depends on the definition of a null pointer which is in a gray area.
The slide is based on the following memory model: (1) we can safely replace p with inttoptr (ptrtoint p). (2) null pointer is equivalent to inttoptr 0 (or inttoptr c for some special, fixed constant).
In this model, replacing p with null is fine.

I think we can talk about this in the LLVM discourse group. The numbers look informative and people would like to see them & suggest their ideas.

nikic added a comment.Feb 14 2023, 1:39 AM

Thanks a lot for the detailed analysis!

@nikic
I was looking at https://llvm.org/devmtg/2018-04/slides/Lopes-Sbirlea-Pointers,%20Alias%20and%20ModRef%20Analyses.pdf page 15, which says that replacement with null ptr is safe. So I did two runs, one where we do pointer replacement only when they have same underlying objects, and the another where we also replace pointers with null.

I believe the currently accepted interpretation of null in LLVM is that it has nullary provenance. For example, we rely on this here: https://github.com/llvm/llvm-project/blob/46cdf7f0991280a3601a777a80fbc460196fb033/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L1064 This folds load (gep null, x) to unreachable. Of course, we also fold inttoptr 0 to null, which is inconsistent with that interpretation.

As @nlopes suggested, this should probably be discussed on discourse. Giving null universal provenance seems wrong from a principled perspective, but it may well be the most pragmatic option. (I don't think we rely on nullary provenance in many places, not sure there are any beyond the one I shared, and for that one we can probably recover most practically useful cases with an inbounds requirement.)

Here are the diffs for llvm-test-suite bitcode:
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/125698f591df1f8de81ad4119b0838e135a18af8
and
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/a778479334852e7f05b5a248da4473106aa986fb

Oh, that's a nice way to analyze differences. How did you produce the output?

I looked through the diffs a bit, and while most of them are immaterial diffs (just replacing one pointer with another without further simplification), it looks like there are also many genuine regressions in optimization quality. The first one is https://github.com/UsmanNadeem/vigilant-octo-journey/commit/a778479334852e7f05b5a248da4473106aa986fb#diff-429f0aa0b366d469c8f0fe4aaded9e87e83898118f192b6741caf4100e4e7792. I think the logic here is that we have a icmp eq ptr %link.112821, %48 condition on the edge, and then check

%313 = phi ptr [ %.pre12846, %for.end11932.loopexit ], [ %link.112821, %if.end11920 ]
%cmp11936 = icmp ne ptr %313, %48

so if we replace %link.112821 with %48 here, we see that the condition is always true when coming from %if.end11920 and that allows us to thread the branch. This is a legal optimization, because icmp is provenance independent.

From a quick glance, the same basic pattern also occurs in many other IR diffs, so it seems like this is something important to preserve, though I'm not entirely sure what the best way to only propagating the equality in provenance-independent contexts would be (doing a recursive user walk seems rather iffy, but I don't have a better idea).

Here are the diffs for llvm-test-suite bitcode:
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/125698f591df1f8de81ad4119b0838e135a18af8
and
https://github.com/UsmanNadeem/vigilant-octo-journey/commit/a778479334852e7f05b5a248da4473106aa986fb

...
Oh, that's a nice way to analyze differences. How did you produce the output?
...

I compiled the test suite using -O3 with the base toolchain and made a list of files where GVN did pointer replacements by adding some debug output in that function. Copied the compile commands for those files from the make output and then used those to dump the IR for the base toolchain and toolchains with various patches.

Discourse link: https://discourse.llvm.org/t/restricting-equality-propagation-for-pointers-whats-the-best-way-how-to-deal-with-regressions/68619

aqjune added inline comments.Feb 23 2023, 7:34 AM
llvm/lib/Transforms/Scalar/GVN.cpp
2563

As @nikic suggested in the Discourse thread, do you want to fix patchAndReplaceAllUsesWith so that if it is a pointer it allows replacing uses at instructions that are unaware of provenance? I wonder how things will go.
Actually, we can count different assembly files rather than LLVM IR. If it is less than 10, then we can investigate the files more deeply. Actually, if there are diffs even after allowing this, I think it might be miscompilation bugs.

llvm/test/Transforms/GVN/condprop.ll
591

Nit: could you avoid using br i1 undef? This is UB and there is this side project: https://llvm.org/OpenProjects.html#remove_ub_tests
You can add a function argument for this branch condition and use it instead.

mnadeem updated this revision to Diff 500469.Feb 25 2023, 4:35 PM
mnadeem retitled this revision from [GVN] Do not propagate equalities for noalias pointers to [GVN] Restrict equality propagation for pointers.

Not sure about the changes in the assume-equal.ll test.

mnadeem marked an inline comment as done.
mnadeem added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
2563

Maybe in a separate patch? I tried doing assembly files but the changes were more extensive due to register assignment changes.

mnadeem updated this revision to Diff 500825.Feb 27 2023, 9:05 AM

clang format

Thanks! Yes, indeed there are still many files having diffs. But the number shrunk to ~1/5, which seems nice!

I looked at the diffs, and I could see a few replacements in the past that seem buggy...
In 82.ll:

  %call103 = tail call ptr @encodingLine(ptr noundef %messageIn) #17
  %22 = load ptr, ptr %t_next, align 8, !tbaa !30
  %cmp105 = icmp eq ptr %call103, %22
  br i1 %cmp105, label %if.then107, label %do.cond

if.then107:
  %23 = load ptr, ptr %22, align 8, !tbaa !28 // Replacing %22 with %call103 is illegal

encodingLine does not have a function body as well as meaningful attributes, so it can return any pointer. This patch fixes the bug, which is good.

In 73.ll: I think we can make an improvement.

  %call.i = tail call noalias dereferenceable_or_null(64) ptr @malloc(i64 noundef 64) #35
  %tobool.not = icmp eq ptr %call.i, null
  ...

cond.end.i.i: // At this point %call.i isn't null.
  %5 = load ptr, ptr %arrayidx6.i.i, align 8, !tbaa !5
  %cmp.i.i = icmp eq ptr %5, %call.i
  br i1 %cmp.i.i, label %_Z15yy_flush_bufferP15yy_buffer_state.exit.thread.i,....

_Z15yy_flush_bufferP15yy_buffer_state.exit.thread.i: ; preds = %cond.end.i.i
  %yy_n_chars.i.i.i = getelementptr inbounds %struct.yy_buffer_state, ptr %5, i64 0, i32 4 // <- This can be replaced with 'gep inbounds %call.i, 0, 4'.
  %6 = load i32, ptr %yy_n_chars.i.i.i, align 4, !tbaa !13

Replacing p at load (gep inbounds p, ofs) with q at is fine if (1) q points to the beginning address of a live object (e.g. unfreed malloc), and (2) ofs is non-negative.
To guarantee liveness (1), you will probably need to check whether q hasn't been escaped, because an unknown function call may deallocate it.
My reasoning is like this:

  • If p was a freed (dead) pointer, load would have already been UB, so q can be anything.
  • If p was an unfreed (live) pointer, q must also be a live object because otherwise the replacement will make load introduce UB. If p and q are both live:
    • If p and q pointed to the same object: replacement is trivially correct
    • If p and q pointed to different objects: since p == q was true, the two different objects must be adjacent in their memory layout, and one of q or p must point to the beginning address of a live object. 73.ll corresponds to q pointing to 'the beginning address of a live object' case.

I hope I didn't miss something obvious in 73.ll & this replacement helps.

llvm/lib/Analysis/Loads.cpp
699

Could you elaborate a bit about isa<PHINode>(User) condition?
I think that in the following case replacing %p with %q should fail, but this condition seems to allow it:

BB1:
  %phi1 = phi [%p, <pred>], ..
  br BB2
BB2:
  %phi2 = phi [%phi1, <BB1>], ..
  store i32 0, ptr %phi2
mnadeem added inline comments.Feb 27 2023, 11:29 AM
llvm/lib/Analysis/Loads.cpp
699

I was trying to handle something like the example below. I guess I didn't think about the cases where the second phi's use could have illegal replacements.

Not sure what the best way to handle this would be.

BB1:
  %phi1 = phi [%p, <pred>], ..
  %cond = icmp eq ptr %phi1 , %q    <<---- This case
  br BB2
BB2:
  %phi2 = phi [%phi1, <BB1>], ..
  store i32 0, ptr %phi2
mnadeem updated this revision to Diff 501008.Feb 27 2023, 8:01 PM

Fix the check for PHI nodes and add some reduced tests relevant to the case.

mnadeem marked an inline comment as done.Feb 27 2023, 8:03 PM
aqjune added a comment.Mar 7 2023, 8:23 AM

Hi, I am currently occupied with something else. I will be back by today and continue reviewing this.

aqjune added a comment.Mar 7 2023, 9:07 PM

Regarding the updated experimental results, I think people might have thoughts about them. I will leave a comment there.

llvm/include/llvm/Transforms/Utils/Local.h
416
llvm/lib/Analysis/Loads.cpp
699

I think dealing with this case is tricky.
To replace %p in the operand of %phi1, we should copy it as follows:

BB1:
  %phi1 = phi [%p, <pred>], ..
  %phi1_copy = phi [%p, <pred>], .. <<-- we can replace %p with %q here
  %cond = icmp eq ptr %phi1_copy , %q    <<---- ... because %phi1_copy's only use is this icmp.
  br BB2
BB2:
  %phi2 = phi [%phi1, <BB1>], ..
  store i32 0, ptr %phi2
arichardson added inline comments.
llvm/test/Transforms/GVN/assume-equal.ll
56

Instead of the negative checks (which may pass for other reasons) I'd use a positive one with the new output.

nikic requested changes to this revision.Mar 9 2023, 7:50 AM

Okay, the test diffs look much better now. Some interesting remaining cases:

  • 10.ll: This is another jump threading failure. The problem here is that the phi nodes are used in icmps, but not *only* in icmps, so we can't simply replace the operand there. I'm not sure what to do about this one. We could obviously duplicate the phi node and replace its non-provenance uses, but that's not going to be profitable in general. Possibly there is some specialized jump threading support one could add for this, but this is technically tricky, especially as threading can't use the dominator tree. LVI currently only tracks equality to constants.
  • 101.ll (sink) and 103.ll (hoist): Here we have something like if (p == q) { store v, p } else { store v, q } where previously store v, q could be sunk/hoisted out of it. This isn't legal anymore. This is probably unrecoverable at the IR level, but maybe the machine layer handles it (I didn't check).

Overall I'd be willing to try landing this and seeing whether any significant performance issues arise.

llvm/include/llvm/Transforms/Utils/Local.h
416

The callback doesn't need From and To -- if it needs them, it can capture them from context. Instead of User * we should pass Use &, which is more general. (E.g. if some operands are provenance-dependent, but others aren't -- I have future ptr_provenance support in mind here).

llvm/lib/Analysis/Loads.cpp
706

I think we should split this up into two APIs: One should stay as previous any just answer the question "can this always be replaced?" and the other should only take a Use and answer the question "is this provenance-independent?" We don't need to perform some of these checks for every single use, and some of the things (e.g. addition to leader table) can only be done if all uses can be replaced.

llvm/lib/Transforms/Scalar/GVN.cpp
2325

I don't think we can do this unless it's legal to unconditionally replace the pointers.

This revision now requires changes to proceed.Mar 9 2023, 7:50 AM

+1 for landing this patch if there is no significant input from the discourse thread by next Monday (Mar. 13).

mnadeem planned changes to this revision.Mar 10 2023, 9:21 AM
mnadeem updated this revision to Diff 505980.Mar 16 2023, 8:38 PM
  • Guard addToLeaderTable with the pointer replacement check.
  • Create two APIs
    • bool canReplacePointersIfEqual(const Value *From, const Value *To);
    • bool canReplacePointersIfEqualInUse(const Use &U, const Value *To);
  • For the callback in replaceDominatedUsesWithIf I changed the from argument with a Use. But I still need the to value in the callback.
  • Clang format
  • Update tests
mnadeem marked 2 inline comments as done.Mar 16 2023, 8:49 PM
mnadeem added inline comments.
llvm/include/llvm/Transforms/Utils/Local.h
416

You'll want to use clang-format. https://llvm.org/docs/Contributing.html#format-patches

It is already clang-formatted, not sure why the formatting is now different.
git diff -U0 --relative HEAD~ | clang/tools/clang-format/clang-format-diff.py -p1 -sort-includes -i

416

The callback doesn't need From and To -- if it needs them, it can capture them from context. Instead of User * we should pass Use &, which is more general. (E.g. if some operands are provenance-dependent, but others aren't -- I have future ptr_provenance support in mind here).

I replaced From with Use &, but I still needed the To param. Can you look at loads.cpp to see if it makes sense to use both the use and the to value?

llvm/lib/Analysis/Loads.cpp
699

I fixed the code and added a test case in a previous revision. I think fixing this will need more extensive changes than can be easily done in replaceDominatedUsesWith

706

I created an isPointerUseReplacable(const Use &U but made it a static function which can be exposed in the future.
I have also taken some of the checks out of the recursive function.

llvm/lib/Transforms/Scalar/GVN.cpp
2563

I have guarded addToLeaderTable with the check so IIUC this is not needed any more?

nikic requested changes to this revision.Apr 4 2023, 7:17 AM
nikic added inline comments.
llvm/lib/Analysis/Loads.cpp
704

This API should not contain any Use-based reasoning, it should be just the isPointerAlwaysReplacable() part. The leader table replacement isn't valid even if all uses of this particular pointer are fine, as we may end up performing an indirect replacement on additional users.

llvm/test/Transforms/GVN/assume-equal.ll
24

Hm, these regressions look problematic. These assumptions are emitted by clang in EmitVTableAssumptionLoad() specifically to allow these replacements. This is going to lose devirtualization support for these cases.

We probably need to carve out another exception for this case. Possibly we should just keep the existing exception for constants (i.e. allow if they are deferencable) to make this work.

This revision now requires changes to proceed.Apr 4 2023, 7:17 AM
mnadeem updated this revision to Diff 511731.EditedApr 7 2023, 10:58 AM

Added an exception for constant dereferenceable pointers and changed canReplacePointersIfEqual to not look at the Uses.

To get the dereferenceability of the pointer, Instead of passing the DL around I am trying to get it from the value, let me know if it looks okay.

mnadeem marked 2 inline comments as done.Apr 7 2023, 11:01 AM

@nikic Please feel free to take over/commandeer this revision.