This is an archive of the discontinued LLVM Phabricator instance.

[MachineCombiner] Improve debug output (NFC)
ClosedPublic

Authored by avt77 on Dec 15 2017, 2:02 AM.

Details

Summary

The suggested debug logging really improves the possibilty to check the job made by Machine Combiner.

Diff Detail

Event Timeline

avt77 created this revision.Dec 15 2017, 2:02 AM
pdouglas added inline comments.
lib/CodeGen/MachineCombiner.cpp
368

Typo: "DOES NOT improves" should be "DOES NOT improve"

avt77 updated this revision to Diff 127094.Dec 15 2017, 3:55 AM

The typo was fixed.

Adding extra MC reviewers

Thanks. I've left some minor comments. Could you add the debug output created for an example? That would be helpful to get a better picture on the new format.

Also adding Gerolf as reviewer.

lib/CodeGen/MachineCombiner.cpp
19

What is this needed for?

143

I think we should wrap that in DEBUG.

434

I think we should wrap that in DEBUG.

495

I think mentioning that for the current matched pattern, those are the instructions going to be removed and inserted instead would be clearer.

496

const pointer?

501

const pointer?

569

Usually we do not use this->.

fhahn added a comment.Dec 20 2017, 7:04 AM

nit: Something like [MachineCombiner] Improve debug output (NFC) as title seems slightly more compact

avt77 updated this revision to Diff 128129.EditedDec 25 2017, 1:22 AM
avt77 retitled this revision from NFC: The improved debug logging for Machine Combiner to [MachineCombiner] Improve debug output (NFC).

I fixed the comments raised by fhahn except the one: I can't show the new output format just now because we don't have real Machine Combiner Patterns tests at the moment. When I commit D40602 I'll be able to prepare such tests.

RKSimon added inline comments.Dec 29 2017, 2:01 PM
lib/CodeGen/MachineCombiner.cpp
143

Is STI used in this function? I couldn't see it.

578

If we're storing MF do we need to keep MRI and OptSize ?

Gerolf added inline comments.Jan 2 2018, 2:22 AM
lib/CodeGen/MachineCombiner.cpp
434

+ 1

Also, why recompute STI?

477

Duplicate info to what is displayed in the loops below?

495

The loops should be wrapped in DEBUG. Since this is a lot of info that is potentially logged I suggest to offer an option to enable this dump.

568–569

What motivates the changes in this function?

avt77 added inline comments.Jan 9 2018, 7:02 AM
lib/CodeGen/MachineCombiner.cpp
434

It seems it's better to use *STI initialized in runOnFuction - I'll change it.

477

No, here we have only one possible instr to remove (the message should be changed). Below we could have a set of instrs to remove.

495

OK, I'll create such an option.

568–569

I'd like to store MF here to be able to calculate STI reference later but it seems I should store *STI here instead of that and use this pointer instead of reference..

avt77 updated this revision to Diff 129271.Jan 10 2018, 6:32 AM

Following requirements I got rid of class attribute MF and replaced it with STI. In addition I added special hidden switch to allow dump of substituted instructions.

RKSimon added inline comments.Jan 11 2018, 3:10 AM
lib/CodeGen/MachineCombiner.cpp
42

to match the other maybe make this "machine-combiner-dump-subst-intrs"

avt77 updated this revision to Diff 129597.Jan 12 2018, 3:30 AM

The name of the new switch was changed.

@fhahn @Gerolf Any more comments?

fhahn added a comment.Jan 24 2018, 2:56 AM

Debug output with this patch for test/CodeGen/AArch64/machine-combiner.mir. I think it's an overall improvement , I just have a few more comments. I think it also would be worth explicitly stating that we applied a substitution.

Machine InstCombiner: inc_update_iterator_test
Combining MBB
Combining MBB
 Possible instr to replace
	 sched: [1:0.50]: %6:gpr32 = ADDWrr %3, killed %5Computing MinInstr trace through %bb.1
  pred for %bb.0: null
  pred for %bb.1: %bb.0
  succ for %bb.1: null

Depths for %bb.0:
      0 Instructions
0	%3:gpr32 = COPY %w2
0	%2:gpr32 = COPY %w1
0	%1:gpr32 = COPY %w0
0	%0:fpr64 = COPY %d0
0	%4:gpr32 = SUBSWrr %1:gpr32, %2:gpr32, implicit-def %nzcv
1	Bcc 13, %bb.2, implicit %nzcv
0	B %bb.1

Depths for %bb.1:
      3 Instructions
     2c @ A57UnitB (2 ops x6)
     1c @ A57UnitI (1 ops x3)
0	%5:gpr32 = MADDWrrr %1:gpr32, %2:gpr32, %wzr
3	%6:gpr32 = ADDWrr %3:gpr32, killed %5:gpr32
4	%7:fpr64 = SCVTFUWDri killed %6:gpr32
14	%8:fpr64 = FMULDrr %7:fpr64, %7:fpr64
19	%9:fpr64 = FADDDrr %0:fpr64, killed %8:fpr64
24	%d0 = COPY %9:fpr64
24	RET_ReallyLR implicit %d0
Heights for %bb.1:
      6 Instructions
     1c @ A57UnitB (1 ops x6)
     1c @ A57UnitI (1 ops x3)
     1c @ A57UnitL (1 ops x6)
     1c @ A57UnitM (1 ops x6)
     2c @ A57UnitV (3 ops x3)
24	0	RET_ReallyLR implicit %d0
24	0	%d0 = COPY %9:fpr64
24	5	%9:fpr64 = FADDDrr %0:fpr64, killed %8:fpr64
24	10	%8:fpr64 = FMULDrr %7:fpr64, %7:fpr64
24	20	%7:fpr64 = SCVTFUWDri killed %6:gpr32
24	21	%6:gpr32 = ADDWrr %3:gpr32, killed %5:gpr32
24	24	%5:gpr32 = MADDWrrr %1:gpr32, %2:gpr32, %wzr
%bb.1 Live-ins: %0@5 %3@21 %1@24 %2@24 WZR@24
Critical path: 24
  Dependence data for %6:gpr32 = ADDWrr %3:gpr32, killed %5:gpr32
	NewRootDepth: 1	RootDepth: 3
	NewRootLatency: 3	RootLatency: 4
	RootSlack: 0 SlackIsAccurate=1
	NewRootDepth + NewRootLatency = 4
	RootDepth + RootLatency + RootSlack = 7
	  It IMPROVES PathLen because
		NewCycleCount = 4, OldCycleCount = 7
		Resource length before replacement: 5 and after: 4
		  As result it IMPROVES/PRESERVES Resource Length
Invalidate %bb.1 MinInstr height.
Invalidate %bb.1 MinInstr depth.
 Possible instr to replace
	 sched: [5:0.50]: %9:fpr64 = FADDDrr %0, killed %8Computing MinInstr trace through %bb.1
  pred for %bb.1: %bb.0
  succ for %bb.1: null

Depths for %bb.1:
      3 Instructions
     2c @ A57UnitB (2 ops x6)
     1c @ A57UnitI (1 ops x3)
0	%6:gpr32 = MADDWrrr %1:gpr32, %2:gpr32, %3:gpr32
3	%7:fpr64 = SCVTFUWDri killed %6:gpr32
13	%8:fpr64 = FMULDrr %7:fpr64, %7:fpr64
18	%9:fpr64 = FADDDrr %0:fpr64, killed %8:fpr64
23	%d0 = COPY %9:fpr64
23	RET_ReallyLR implicit %d0
Heights for %bb.1:
      6 Instructions
     1c @ A57UnitB (1 ops x6)
     1c @ A57UnitI (1 ops x3)
     1c @ A57UnitL (1 ops x6)
     1c @ A57UnitM (1 ops x6)
     2c @ A57UnitV (3 ops x3)
23	0	RET_ReallyLR implicit %d0
23	0	%d0 = COPY %9:fpr64
23	5	%9:fpr64 = FADDDrr %0:fpr64, killed %8:fpr64
23	10	%8:fpr64 = FMULDrr %7:fpr64, %7:fpr64
23	20	%7:fpr64 = SCVTFUWDri killed %6:gpr32
23	23	%6:gpr32 = MADDWrrr %1:gpr32, %2:gpr32, %3:gpr32
%bb.1 Live-ins: %0@5 %1@23 %2@23 %3@23
Critical path: 23
  Dependence data for %9:fpr64 = FADDDrr %0:fpr64, killed %8:fpr64
	NewRootDepth: 13	RootDepth: 18
	NewRootLatency: 9	RootLatency: 10
	RootSlack: 0 SlackIsAccurate=1
	NewRootDepth + NewRootLatency = 22
	RootDepth + RootLatency + RootSlack = 28
	  It IMPROVES PathLen because
		NewCycleCount = 22, OldCycleCount = 28
		Resource length before replacement: 5 and after: 4
		  As result it IMPROVES/PRESERVES Resource Length
Invalidate %bb.1 MinInstr height.
Invalidate %bb.1 MinInstr depth.
Combining MBB
lib/CodeGen/MachineCombiner.cpp
437

Why did you drop this debug message? It seems helpful to know which instructions the machine combiner already visited. Although I can also see why one could argue why dropping it is ok, as we have the reference in which basic block we are at through the debug message at the beginning of the function

477

I think " Possible instr to replace" is not quite accurate. At this point we found > 0 patterns that could replace a pattern with MI as root.

478

What additional information does the SchedInfoStr add here?

479

I think we need a \n after print().

lib/CodeGen/MachineInstr.cpp
1216

The changes to MachineInstr.cpp seem unrelated to improving the debug output for the MachineCombiner?

avt77 added inline comments.Jan 26 2018, 1:42 AM
lib/CodeGen/MachineCombiner.cpp
477

OK, "Possible patterns to replace the instr:\n" ? And in this case we need loop here, right?

478

We are able to compare lattency:throughput of original instr with replaced instrs.

lib/CodeGen/MachineInstr.cpp
1216

We call MachineInstr::print from MachineCombiner and here we try to improve the output if we don't have valid TII pointer (I saw such situations if the given istr was not included in any MachineFunction yet).

avt77 added inline comments.Jan 26 2018, 1:45 AM
lib/CodeGen/MachineCombiner.cpp
437

Because we have such print below - see 479.

479

Why? print adds \n: do you like to see an empty string here?

avt77 updated this revision to Diff 132128.Jan 31 2018, 2:52 AM

I removed incorrect output about possibly removed instr: instead I added an output about Patterns used in combiner.

avt77 added a comment.Feb 12 2018, 2:38 AM

fhahn, do you have any questions here?

fhahn added a comment.Feb 13 2018, 2:00 AM

Thanks, and sorry for taking so long to come back to this. I think it looks good, there is just the missing newline issue. After that's resolved I would be happy to sign off on the machine combiner changes, but I would prefer if we move the lib/CodeGen/MachineInstr.cpp to a separate patch.

lib/CodeGen/MachineCombiner.cpp
496

I do not think we reference the pattern ID anywhere else, so I am not sure how it would be helpful here. Also, should Pattern be lowercase?

499

I think you need \n after the print calls in your patch . I get the following output (see the machine instrs all on the same line):

	For the Pattern (21) these instructions could be removed
		 sched: [5:0.50]: %8:fpr64 = FMULDrr %7, %7		 sched: [5:0.50]: %9:fpr64 = FADDDrr %0, killed %8	These instructions could replace the removed ones
		 sched: [9:0.50]: %9 = UNKNOWN %7, %7, %0Computing MinInstr trace through %bb.1
avt77 updated this revision to Diff 134186.Feb 14 2018, 3:24 AM
  1. The issue with "\n" was fixed.
  2. About pattern ID in output: in theory there could be several patterns for one Root and every pattern could have several alternative sequences, right? I think in this case the pattern ID will help to distinguish all these removed/substituted sequences from each other. Am I right or it's better to remove the IDs?
fhahn added a comment.Feb 14 2018, 7:53 AM

About pattern ID in output: in theory there could be several patterns for one Root and every pattern could have several alternative sequences, right? I think in this case the pattern ID will help to distinguish all these removed/substituted sequences from each other. Am I right or it's better to remove the IDs?

Right thanks!

MachineCombiner part LGTM now. I still would prefer if you submit it without the change in lib/CodeGen/MachineInstr.cpp and do that in a separate change.

MachineCombiner part LGTM now. I still would prefer if you submit it without the change in lib/CodeGen/MachineInstr.cpp and do that in a separate change.

OK, I'll do it as a separate patch.

MachineCombiner part LGTM now. I still would prefer if you submit it without the change in lib/CodeGen/MachineInstr.cpp and do that in a separate change.

OK, I'll do it as a separate patch.

AFAICT the MachineCombiner.cpp part landed at rL325217 - @fhahn are you happy for the MachineInstr.cpp diff just to be committed or do you want @avt77 to raise another phab?

I was going to commit it as a part of D40602 because the isue raised there. Is it OK?

Commit it by itself, no need to wait for another review, as long as its a separate commit its fine to tag it with 'Differential Revision: https://reviews.llvm.org/D41278' (like you should have done for rL325217 as well).

This revision was not accepted when it landed; it landed in state Needs Review.Feb 26 2018, 1:47 AM
This revision was automatically updated to reflect the committed changes.