⚙ D63860 [AMDGPU] Add test

piotr created this revision.Jun 27 2019, 1:31 AM

Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2019, 1:31 AM

Herald added subscribers: llvm-commits, nhaehnle, jvesely. · View Herald Transcript

Harbormaster completed remote builds in B33983: Diff 206794.Jun 27 2019, 1:31 AM

piotr added a reviewer: arsenm.Jun 27 2019, 1:32 AM

Herald added a subscriber: wdng. · View Herald TranscriptJun 27 2019, 1:32 AM

Related to D63709.

arsenm added inline comments.Jul 1 2019, 7:50 AM

lib/CodeGen/MachineCSE.cpp
482–490 ↗	(On Diff #206794)	I'm not sure the phis only part is needed. It would make sense to me to check the successor blocks if it's trivial (i.e. the successor only has the one predecessor here)
test/CodeGen/AMDGPU/cse-phi-incoming-val.ll
13 ↗	(On Diff #206794)	There should probably be additional MIR tests for this. I wouldn't trust that the control flow block layout we emit know will remain constant

arsenm added reviewers: anton-afanasyev, craig.topper, rtereshin, bogner.Jul 1 2019, 7:51 AM

V2: Removing condition that BB has to contain phis only. Added a MIR test.

Harbormaster completed remote builds in B34179: Diff 207498.Jul 2 2019, 3:20 AM

piotr marked 2 inline comments as done.Jul 2 2019, 3:24 AM

piotr added inline comments.

lib/CodeGen/MachineCSE.cpp
482–490 ↗	(On Diff #206794)	Good point, I removed the condition that BB has to contain phis only. I am checking that the BB has only one successor. However, that successor might have more than one predecessor, as the pattern I am matching here is a block containing phi nodes, and such blocks typically have more than one predecessor.
test/CodeGen/AMDGPU/cse-phi-incoming-val.ll
13 ↗	(On Diff #206794)	Added a MIR test.

V3: Updating commit message.

piotr edited the summary of this revision. (Show Details)Jul 2 2019, 3:27 AM

Harbormaster completed remote builds in B34180: Diff 207500.Jul 2 2019, 3:29 AM

Hi @piotr, have you tried regression benchmarking with this patch? For instance, test suite.

In D63860#1572882, @anton-afanasyev wrote:

Hi @piotr, have you tried regression benchmarking with this patch? For instance, test suite.

Hi Anton, Yes, I have just run regression testing on the test-suite and the tests pass ("Expected Passes : 917").

arsenm added inline comments.Jul 8 2019, 4:34 PM

lib/CodeGen/MachineCSE.cpp
482–490 ↗	(On Diff #206794)	I'm not sure if this should include the phi check or not. More sample cases stressing the different numbers of successors and other instructions would be good?
test/CodeGen/AMDGPU/cse-phi-incoming-val.mir
42–51 ↗	(On Diff #207500)	Can you trim some of these instructions?

In D63860#1574717, @piotr wrote:

In D63860#1572882, @anton-afanasyev wrote:

Hi @piotr, have you tried regression benchmarking with this patch? For instance, test suite.

Hi Anton, Yes, I have just run regression testing on the test-suite and the tests pass ("Expected Passes : 917").

Hmm, I'm talking about performance benchmarking, from llvm Test Suite: https://llvm.org/docs/TestingGuide.html#test-suite.

In D63860#1576545, @anton-afanasyev wrote:

In D63860#1574717, @piotr wrote:

In D63860#1572882, @anton-afanasyev wrote:

Hi @piotr, have you tried regression benchmarking with this patch? For instance, test suite.

Hi Anton, Yes, I have just run regression testing on the test-suite and the tests pass ("Expected Passes : 917").

Hmm, I'm talking about performance benchmarking, from llvm Test Suite: https://llvm.org/docs/TestingGuide.html#test-suite.

Sorry, I referred to the entire test-suite. For MicroBenchmarks only (llvm-lit -v -j 1 -o patch.json MicroBenchmarks) I get the following detailed results:

$python3 ../../test-suite/utils/compare.py --filter-short orig.json patch.json
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Tests: 248
Metric: exec_time

Program orig patch diff
test-suite...mbda.test:BM_PIC_2D_LAMBDA/171 3.18 0.88 -72.2%
test-suite...alsCRaw.test:BM_PIC_2D_RAW/171 1.77 0.89 -49.8%
test-suite...Raw.test:BM_MAT_X_MAT_RAW/5001 4520.23 6770.15 49.8%
test-suite...aw.test:BM_MULADDSUB_RAW/44217 47.74 69.22 45.0%
test-suite...lsBRaw.test:BM_INIT3_RAW/44217 33.57 47.99 42.9%
test-suite...ambda.test:BM_EOS_LAMBDA/44217 46.45 58.30 25.5%
test-suite...ENCHMARK_BILATERAL_FILTER/16/2 22.35 27.97 25.2%
test-suite...ENCHMARK_BILATERAL_FILTER/16/4 68.97 85.68 24.2%
test-suite...ENCHMARK_BILATERAL_FILTER/32/2 103.48 127.84 23.5%
test-suite...ENCHMARK_BILATERAL_FILTER/32/4 381.09 468.47 22.9%
test-suite...mbda.test:BM_ICCG_LAMBDA/44217 32.80 39.22 19.6%
test-suite...st:BM_INT_PREDICT_LAMBDA/44217 366.07 434.41 18.7%
test-suite...est:BM_INT_PREDICT_LAMBDA/5001 14.69 16.82 14.5%
test-suite...BRaw.test:BM_IF_QUAD_RAW/44217 148.05 168.47 13.8%
test-suite....test:BENCHMARK_HARRIS/256/256 389.53 346.90 -10.9%
Geomean difference nan%
orig patch diff
count 232.000000 232.000000 232.000000
mean 2081.547998 2091.565842 0.007603
std 11692.576779 11651.097661 0.091756
min 0.030994 0.030717 -0.722465
25% 3.198364 2.967566 -0.012795
50% 36.165086 39.408759 -0.000642
75% 318.340952 318.455341 0.009107
max 115341.083667 113080.746500 0.497745

piotr added inline comments.Jul 10 2019, 5:04 AM

lib/CodeGen/MachineCSE.cpp
482–490 ↗	(On Diff #206794)	My feeling is that without the phi check this heuristic would be too generic and will cause regressions, but can double check. The aim of this change is to target phi nodes re-using the same arguments. As for the second part, do you mean to add more tests?
test/CodeGen/AMDGPU/cse-phi-incoming-val.mir
42–51 ↗	(On Diff #207500)	Will do.

V4: Simplified the condition and test.

Harbormaster completed remote builds in B35234: Diff 210522.Jul 18 2019, 3:20 AM

piotr retitled this revision from [MachineCSE] Improve CSE on phi node incoming value to [MachineCSE] Extend CSE heuristic.Jul 18 2019, 3:20 AM

piotr edited the summary of this revision. (Show Details)

arsenm added inline comments.Jul 18 2019, 8:02 AM

lib/CodeGen/MachineCSE.cpp
482–484 ↗	(On Diff #210522)	I'm having a bit too much trouble following the heuristic with the two loops. Can you split the first loop into a function to return a possible alternative block? The second loop would then remain as-is, except checking UseMI.getParent() == OrigParent \|\| == AlternativeParent
484–489 ↗	(On Diff #210522)	Can you re-flow this loop into separate BBUses and !BBUses case?

V5: Refactored code.

Harbormaster completed remote builds in B35350: Diff 210802.Jul 19 2019, 5:05 AM

Ping

Mostly LGTM, except for the empty block case I'm worried about

lib/CodeGen/MachineCSE.cpp
422 ↗	(On Diff #210802)	Extra empty line
test/CodeGen/AMDGPU/cse-phi-incoming-val.mir
73 ↗	(On Diff #210802)	Can you add another test with a totally empty successor block? I think those should also be ignored

Any update on this?

Thank you for bearing with me, I have been away.

Now, when I re-checked the original case from D63709 this morning (cse-phi-incoming-val.ll) I noticed that the extra mov is not present, even without my patch. It turns out that D63198 (which was pushed recently) made the original issue go away. I guess, the fact that the issue was in the structurizer kind of explains the observation that other targets seemed to not suffer from similar problems.

Anyway, I intend to just push cse-phi-incoming-val.ll as part of the current review.

Reverting the patch and leaving the test only as it passes without the change in code.

Harbormaster completed remote builds in B37578: Diff 218123.Aug 30 2019, 9:58 AM

piotr retitled this revision from [MachineCSE] Extend CSE heuristic to [AMDGPU] Add test.Aug 30 2019, 9:59 AM

piotr edited the summary of this revision. (Show Details)

Herald added subscribers: t-tye, tpr, dstuttard and 2 others. · View Herald TranscriptAug 30 2019, 9:59 AM

LGTM

This revision is now accepted and ready to land.Aug 30 2019, 10:05 AM

Closed by commit rL370634: [AMDGPU] Add test (authored by piotr). · Explain WhySep 2 2019, 3:02 AM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add test
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 218320

llvm/trunk/test/CodeGen/AMDGPU/cse-phi-incoming-val.ll