This is an archive of the discontinued LLVM Phabricator instance.


Differential D72425

[OptRemark] RFC: Introduce a message table for OptRemarks
Needs Review · Public

Authored by andrew.w.kaylor on Jan 8 2020, 3:58 PM.


Details

Reviewers

thegameg
anemet
hfinkel
karthiksenthil

Summary

This is a very preliminary proposal to introduce a message table that would let us replace hard-coded strings with an identifier. The basic idea is that instead of something like this:

ORE->emit([&]() {
  return OptimizationRemark(DEBUG_TYPE, "LoadElim", LI)
         << "load of type " << NV("Type", LI->getType()) << " eliminated"
         << setExtraArgs() << " in favor of "
         << NV("InfavorOfValue", AvailableValue);
});

We'd be able to have something like this:

ORE->emit([&]() {
  return OptimizationRemark(DEBUG_TYPE, diag::remark_gvn_load_elim, LI)
         << NV("Type", LI->getType())
         << setExtraArgs() << NV("InfavorOfValue", AvailableValue);
});

I think this opens up a lot of possibilities for more compact storage of remarks and more reliable identification of remarks in the DiagHandler. It also brings up a lot of questions about how much of the information that is currently part of the OptimizationRemark class (and its siblings) should be part of the message table. I hope to discuss that in this review.

Event Timeline

andrew.w.kaylor created this revision. Jan 8 2020, 3:58 PM

Herald added a project: Restricted Project. Jan 8 2020, 3:58 PM

Herald added subscribers: llvm-commits, mgorny.

I like this general idea; couple of thoughts...

Clang has a similar system, and one disadvantage is that any time you change or add a message, it seems to trigger a large rebuild. Could we have this TG system generate separate .inc files for each category of things, so we don't have the same kind of rebuild problem?
Given that our arguments are named, can we use something like "Format string with optional specifier like %{NV}" instead of "Format string with optional specifier like %0"? I think that the former would be better.

As Francis mentioned before, it would be good to derive the pass name from the remark type (diag::remark_gvn_load_elim -> gvn). I.e., I would drop the DEBUG_TYPE argument.

In D72425#1811098, @hfinkel wrote:

I like this general idea; couple of thoughts...

Clang has a similar system, and one disadvantage is that any time you change or add a message, it seems to trigger a large rebuild. Could we have this TG system generate separate .inc files for each category of things, so we don't have the same kind of rebuild problem?

Probably. One thing I was imagining coming out of the .td file(s) is an enum somewhere that assigns values to all of the messages. I suppose we could make that multiple enums, each with a fixed starting value to avoid triggering rebuilds. In general, these should lend themselves to logical groupings. We'd probably also want a way for downstream LLVM-based products to painlessly add their own remarks and groups should help with that.

Given that our arguments are named, can we use something like "Format string with optional specifier like %{NV}" instead of "Format string with optional specifier like %0"? I think that the former would be better.

Yes, that's a great idea. So, if I understand what you're suggesting, the entry for the remark I used as an example would look like this:

def remark_gvn_load_elim: OptRemark<"LoadElim", "load of type %Type eliminated", " in favor of %InfavorOfValue">;

Is that the idea?
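
To make the named-specifier idea concrete, here is a minimal sketch of how a format string such as "load of type %Type eliminated" could be expanded from named arguments. Nothing here comes from the patch: the function name expandRemarkFormat, the use of std::map for the arguments, and the "%%" escape are all assumptions made for illustration. It accepts both the %Name spelling used in the example above and the %{Name} spelling suggested in the review.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not part of the patch): expands %Name and %{Name}
// placeholders using the values in Args. "%%" produces a literal '%'.
// Placeholders whose name is not in Args are copied through unchanged.
std::string expandRemarkFormat(const std::string &Format,
                               const std::map<std::string, std::string> &Args) {
  std::string Out;
  for (std::size_t I = 0; I < Format.size();) {
    if (Format[I] != '%') {
      Out += Format[I++];
      continue;
    }
    // Handle the "%%" escape.
    if (I + 1 < Format.size() && Format[I + 1] == '%') {
      Out += '%';
      I += 2;
      continue;
    }
    // Scan an identifier, optionally wrapped in braces.
    std::size_t Start = I + 1;
    bool Braced = Start < Format.size() && Format[Start] == '{';
    if (Braced)
      ++Start;
    std::size_t End = Start;
    while (End < Format.size() &&
           (std::isalnum(static_cast<unsigned char>(Format[End])) ||
            Format[End] == '_'))
      ++End;
    std::string Name = Format.substr(Start, End - Start);
    bool Closed = !Braced || (End < Format.size() && Format[End] == '}');
    auto It = Args.find(Name);
    if (Name.empty() || !Closed || It == Args.end()) {
      // Not a usable placeholder: keep the '%' and continue scanning.
      Out += Format[I++];
      continue;
    }
    Out += It->second;
    I = Braced ? End + 1 : End;
  }
  return Out;
}
```

One design point this sketch surfaces: with the unbraced spelling, a placeholder directly followed by identifier characters (for example %TypeX) is read as a different name, so the braced form is the safer choice whenever the specifier is not followed by whitespace or punctuation.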

In D72425#1811118, @anemet wrote:

As Francis mentioned before, it would be good to derive the pass name from the remark type (diag::remark_gvn_load_elim -> gvn). I.e., I would drop the DEBUG_TYPE argument.

This is one of the things that I thought could be a field in the OptRemark class. I think it makes sense to have this kind of property tightly coupled with the message. Francis mentioned a potential problem with keeping that synchronized with where the remarks are emitted, but I think if we think of it in terms of a category of optimization rather than a pass name that isn't a problem, because the nature of the optimization being described won't change even if, for example, you move it from InstCombine to AggressiveInstCombine. Or perhaps I'm introducing a second concept here. There's a bit of a disconnect between compiler developers who want to use this feature and compiler users who want to use the feature. The latter group is probably more interested in being able to say, for example, show me all remarks related to loop optimization rather than show me remarks from the loop rotate pass.

It also occurs to me that we could move the Optimization/Missed/Analysis hierarchy into this table.

In D72425#1811143, @andrew.w.kaylor wrote:
In D72425#1811098, @hfinkel wrote:

I like this general idea; couple of thoughts...

Clang has a similar system, and one disadvantage is that any time you change or add a message, it seems to trigger a large rebuild. Could we have this TG system generate separate .inc files for each category of things, so we don't have the same kind of rebuild problem?

Probably. One thing I was imagining coming out of the .td file(s) is an enum somewhere that assigns values to all of the messages. I suppose we could make that multiple enums, each with a fixed starting value to avoid triggering rebuilds. In general, these should lend themselves to logical groupings. We'd probably also want a way for downstream LLVM-based products to painlessly add their own remarks and groups should help with that.

Given that our arguments are named, can we use something like "Format string with optional specifier like %{NV}" instead of "Format string with optional specifier like %0"? I think that the former would be better.

Yes, that's a great idea. So, if I understand what you're suggesting, the entry for the remark I used as an example would look like this:
def remark_gvn_load_elim: OptRemark<"LoadElim", "load of type %Type eliminated", " in favor of %InfavorOfValue">;
Is that the idea?

Yep, something like that.

In D72425#1811150, @andrew.w.kaylor wrote:

In D72425#1811118, @anemet wrote:

As Francis mentioned before, it would be good to derive the pass name from the remark type (diag::remark_gvn_load_elim -> gvn). I.e., I would drop the DEBUG_TYPE argument.

This is one of the things that I thought could be a field in the OptRemark class. I think it makes sense to have this kind of property tightly coupled with the message. Francis mentioned a potential problem with keeping that synchronized with where the remarks are emitted, but I think if we think of it in terms of a category of optimization rather than a pass name that isn't a problem, because the nature of the optimization being described won't change even if, for example, you move it from InstCombine to AggressiveInstCombine. Or perhaps I'm introducing a second concept here. There's a bit of a disconnect between compiler developers who want to use this feature and compiler users who want to use the feature. The latter group is probably more interested in being able to say, for example, show me all remarks related to loop optimization rather than show me remarks from the loop rotate pass.

I think making a distinction here is actually a good idea. We could make -foptimization-record-passes accept such groups as well (or add another flag for groups specifically).

Adding it in the same way as the Group<...> in include/clang/Driver/Options.td would be great:

def fsave_optimization_record : Flag<["-"], "fsave-optimization-record">,
  Group<f_Group>, HelpText<"Generate a YAML optimization record file">;

It also occurs to me that we could move the Optimization/Missed/Analysis hierarchy into this table.

That would also be great.

llvm/utils/TableGen/OptRemarkDiagEmitter.cpp
1	OptRemarkDiagEmitter.cpp

Updated to incorporate review suggestions

ping

Any more feedback? Should I proceed in this direction?

fhahn added a subscriber: fhahn. Feb 25 2020, 3:56 AM

This is great! Sorry for the delay. More comments inline.

llvm/include/llvm/Remarks/OptRemarkDiagBase.td
23 ↗	(On Diff #239983)	I think a few places call this "Passed". Would that be better than "General"?
33 ↗	(On Diff #239983)	In this scheme, it will be hard for a remark to be part of multiple groups, right? I was imagining something where we can do things like:

def gvn_load_elim : OptRemark<"LoadElim", "...">;
def gvn_load_elim_missed : OptRemark<"LoadElimMissed", "...">, Kind<Kind_Missed>;
def licm_hoisted : OptRemark<"InstHoisted", "...">;
def loop_vectorized : OptRemark<"LoopVectorized", "...">;

def GroupGVN : OptRemarkGroup<"GVN", [ gvn_load_elim, gvn_load_elim_missed ]>;
def GroupLICM : OptRemarkGroup<"LICM", [ licm_hoisted ]>;
def GroupLV : OptRemarkGroup<"LV", [ loop_vectorized ]>;
def GroupLoopStuff : OptRemarkGroup<"LoopStuff", [ licm_hoisted, loop_vectorized ]>;

or even groups of groups to avoid listing everything:

def GroupLoopStuff : OptRemarkGroup<"LoopStuff", [ GroupLICM, GroupLV ]>;

where we would have the liberty of putting remarks in different groups:

- for compiler devs: group remarks by pass
- for users: group remarks by a higher-level concept: loop optimizations, actionability, sanitizer-added code, etc.

I don't know how hard it is to get something like this, but let me know what you think.
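
A rough sketch of what resolving "groups of groups" could look like on the consumer side, assuming the group records have already been loaded into a map from group name to member names. The GroupTable type and both function names are hypothetical and not part of the patch. A member that is not itself a group is treated as a remark, and a group that contains itself, directly or indirectly, is rejected instead of recursing forever.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory form of OptRemarkGroup records: group name mapped
// to its members, where each member names either a remark or another group.
using GroupTable = std::map<std::string, std::vector<std::string>>;

// Adds every remark reachable from Name to Out. Visiting holds the groups on
// the current recursion path; meeting one of them again means a cycle.
// Returns false if a cycle is detected.
bool collectRemarks(const GroupTable &Groups, const std::string &Name,
                    std::set<std::string> &Visiting,
                    std::set<std::string> &Out) {
  auto It = Groups.find(Name);
  if (It == Groups.end()) {
    Out.insert(Name); // Not a group, so treat it as a remark.
    return true;
  }
  if (!Visiting.insert(Name).second)
    return false; // Name is already on the current path: cyclic definition.
  for (const std::string &Member : It->second)
    if (!collectRemarks(Groups, Member, Visiting, Out))
      return false;
  Visiting.erase(Name);
  return true;
}

// Convenience wrapper: returns the flattened remark set for Group, or an
// empty set if the group definitions are cyclic.
std::set<std::string> remarksInGroup(const GroupTable &Groups,
                                     const std::string &Group) {
  std::set<std::string> Visiting, Out;
  if (!collectRemarks(Groups, Group, Visiting, Out))
    Out.clear();
  return Out;
}
```

Because the result is a set, a remark that is reachable through several groups is reported once, which is the behavior one would want when a user-facing group and a per-pass group overlap.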
llvm/test/TableGen/opt-remark-diag.td
21	What would be the use case of these enums? Can't the same be achieved by not quoting `RemarkName` and `Test1` in the `OPT_REMARK(` macros like in `clang/include/clang/Driver/Options.h`:

enum ID {
  OPT_INVALID = 0, // This is not an option ID.
#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM, \
               HELPTEXT, METAVAR, VALUES) \
  OPT_##ID,
#include "clang/Driver/Options.inc"
  LastOption
#undef OPTION
};
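
For readers unfamiliar with the pattern being suggested, the sketch below shows how the same OPT_REMARK entries could be expanded twice, once into an enum and once into a lookup table. To keep it self-contained, the generated .inc file is replaced by an inline macro list; the names OPT_REMARK_LIST, RemarkInfo, getRemarkInfo and findRemark are assumptions for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the generated file. In a real build this text would be
// produced by llvm-tblgen -gen-optremark-diags and pulled in with #include.
#define OPT_REMARK_LIST(OPT_REMARK)                                            \
  OPT_REMARK(remark_gvn_load_elim, "LoadElim",                                 \
             "load of type %Type eliminated", " in favor of %InfavorOfValue")  \
  OPT_REMARK(remark_example_remark2, "RemarkName2", "Format string", "")

namespace diag {
// First expansion: one enumerator per remark, in table order.
enum ID : unsigned {
#define OPT_REMARK_ENUM(ENUM, NAME, FORMAT, VERBOSE) ENUM,
  OPT_REMARK_LIST(OPT_REMARK_ENUM)
#undef OPT_REMARK_ENUM
  NUM_REMARKS
};
} // namespace diag

struct RemarkInfo {
  const char *Name;
  const char *Format;
  const char *VerboseFormat;
};

// Second expansion: the strings, indexed by diag::ID.
static const RemarkInfo RemarkTable[] = {
#define OPT_REMARK_ENTRY(ENUM, NAME, FORMAT, VERBOSE) {NAME, FORMAT, VERBOSE},
    OPT_REMARK_LIST(OPT_REMARK_ENTRY)
#undef OPT_REMARK_ENTRY
};

static_assert(sizeof(RemarkTable) / sizeof(RemarkTable[0]) == diag::NUM_REMARKS,
              "enum and table must come from the same remark list");

// Looks up the strings for a remark ID.
inline const RemarkInfo &getRemarkInfo(diag::ID Id) {
  assert(Id < diag::NUM_REMARKS && "invalid remark ID");
  return RemarkTable[Id];
}

// Reverse lookup by remark name; returns NUM_REMARKS if there is no match.
inline diag::ID findRemark(const char *Name) {
  for (unsigned I = 0; I != diag::NUM_REMARKS; ++I)
    if (std::strcmp(RemarkTable[I].Name, Name) == 0)
      return static_cast<diag::ID>(I);
  return diag::NUM_REMARKS;
}
```

Since the enum and the table are produced from one list, they cannot drift apart, and the emitter only has to print OPT_REMARK lines rather than an explicit enum.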

JDevlieghere added a subscriber: JDevlieghere. Feb 25 2020, 12:45 PM

a.elovikov added a subscriber: a.elovikov. Jun 8 2020, 4:21 PM

a.elovikov added inline comments. Jun 8 2020, 4:44 PM

llvm/test/TableGen/opt-remark-diag.td
21	I'm more interested in what other use-cases for OPT_REMARKs are. I believe C++ macros and tablegen serve the same purpose so I'm not sure if it's beneficial to use both at the same time.

On the other hand, I want to extend the .td description of remarks to simple statistics so that each remark will be something like "number of <smth>: X" and auto-generate most of the code for handling it. I.e., given

let Group = StatisticsFoo in {
  def remark_statistic_one : OptRemark<"StatisticOne", "Number of EventOne: %{Arg}">;
  def remark_statistic_two : OptRemark<"StatisticTwo", "Number of EventTwo: %{Arg}">;
}

I'd like to be able to generate

struct StatisticsFooStorage {
  unsigned NumStatisticOne = 0;
  unsigned NumStatisticTwo = 0;
  void emit(RemarkEmitter Emitter) {
    Emitter.emit(StatisticOneRemarkString.format(NumStatisticOne));
    Emitter.emit(StatisticTwoRemarkString.format(NumStatisticTwo));
  }
};

And in the actual optimization:

StatisticsFooStorage StatStorage;
// ...
if (something)
  ++StatStorage.NumStatisticOne;
// ...
RemarkEmitter Emitter;
StatStorage.emit(Emitter);

I think writing a direct tblgen emitter for this might be easier than using the OPT_REMARK macros (although that would probably be doable as well).

Sorry for having let this drop for so long. Some other priorities came up, but I am still interested in seeing this through.

llvm/include/llvm/Remarks/OptRemarkDiagBase.td
23 ↗	(On Diff #239983)	I don't like "Passed" but "General" isn't very helpful either. I think this will apply to "optimizations that were performed" as opposed to missed opportunities and "analysis" (which seems to mean "information"), but I'm not sure there are no cases where the base OptimizationRemark class is used to mean something else. The clang documentation describes the groups this way:

- When the pass makes a transformation (-Rpass).
- When the pass fails to make a transformation (-Rpass-missed).
- When the pass determines whether or not to make a transformation (-Rpass-analysis).

That last description seems bad.

I don't really have strong feelings about what we call it. I guess what's important is to give it a good enough name that someone won't accidentally use it for a remark that doesn't align with the intended use. "General" doesn't do that. I guess "Passed" does.
33 ↗	(On Diff #239983)	Multiple groups seems useful. I'll see what I can do with that.
llvm/test/TableGen/opt-remark-diag.td
21	@thegameg The enum ID would be used in the optimization pass to emit the remark and somewhere else (possibly the diagnostic handler) to look up the string for the remark. The reason I was generating the enum explicitly is that I wanted to establish the base ID for each group based on other information in the .td file, but I guess that could be accomplished by moving the starting ID to the header file in the way that you suggest.

@a.elovikov I'm not sure I understand what you want to accomplish with the statistics. What does this accomplish that you can't do with existing LLVM statistics other than enabling the information to be reported in a release build?
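
The "base ID for each group" idea discussed in this thread could be sketched as follows. This is only one possible layout, and every name and number in it is an assumption: each group is given a fixed-size ID range, so adding a remark to one group does not renumber the remarks in any other group, and a downstream product could claim an unused range for its own remarks.

```cpp
namespace diag {
// Assumed upper bound on the number of remarks in one group.
constexpr unsigned GroupIDRange = 1000;

// Hypothetical fixed starting value for each group.
enum GroupBase : unsigned {
  GVN_BASE = 0 * GroupIDRange,
  LICM_BASE = 1 * GroupIDRange,
  LV_BASE = 2 * GroupIDRange,
};

// One enum per group; in the scheme under discussion each of these would
// come from its own generated .inc file.
enum GVNRemark : unsigned {
  remark_gvn_load_elim = GVN_BASE,
  remark_gvn_load_elim_missed,
};
enum LICMRemark : unsigned {
  remark_licm_hoisted = LICM_BASE,
};
enum LVRemark : unsigned {
  remark_lv_vectorized = LV_BASE,
};

// Recover the group number and the index within the group from a raw ID.
constexpr unsigned groupOf(unsigned ID) { return ID / GroupIDRange; }
constexpr unsigned indexInGroup(unsigned ID) { return ID % GroupIDRange; }
} // namespace diag
```

With this layout a diagnostic handler can pick the per-group string table from groupOf(ID) and index into it with indexInGroup(ID), and a change to the GVN remarks only forces a rebuild of code that includes the GVN enum.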

thegameg added inline comments. Jun 16 2020, 6:34 PM

llvm/include/llvm/Remarks/OptRemarkDiagBase.td
23 ↗	(On Diff #239983)	No strong feelings either, I agree with everything you said.

Revision Contents

Path                                            Size
llvm/test/TableGen/opt-remark-diag.td           18 lines
llvm/utils/TableGen/CMakeLists.txt              1 line
llvm/utils/TableGen/OptRemarkDiagEmitter.cpp    40 lines
llvm/utils/TableGen/TableGen.cpp                8 lines
llvm/utils/TableGen/TableGenBackends.h          1 line

Diff 236926

llvm/test/TableGen/opt-remark-diag.td

This file was added.

// RUN: llvm-tblgen -gen-optremark-diags %s 2>&1 | FileCheck %s

// CHECK: OPT_REMARK(remark_example_remark, "RemarkName", "Format string with optional specifier like %0", "Verbose format string")
// CHECK: OPT_REMARK(remark_example_remark2, "RemarkName2", "Format string", "")

class OptRemark<string Name,
                string Format,
                string FormatVerbose = ""> {
  string RemarkName = Name;
  string FormatStr = Format;
  string VerboseFormatStr = FormatVerbose;
}

def remark_example_remark : OptRemark<"RemarkName",
                                      "Format string with optional specifier like %0",
                                      "Verbose format string">;
def remark_example_remark2 : OptRemark<"RemarkName2",
                                       "Format string">;

llvm/utils/TableGen/CMakeLists.txt

[29 lines not shown; context: add_tablegen(llvm-tblgen LLVM]
   GICombinerEmitter.cpp
   GlobalISelEmitter.cpp
   InfoByHwMode.cpp
   InstrInfoEmitter.cpp
   InstrDocsEmitter.cpp
   IntrinsicEmitter.cpp
   OptEmitter.cpp
   OptParserEmitter.cpp
+  OptRemarkDiagEmitter.cpp
   OptRSTEmitter.cpp
   PredicateExpander.cpp
   PseudoLoweringEmitter.cpp
   RISCVCompressInstEmitter.cpp
   RegisterBankEmitter.cpp
   RegisterInfoEmitter.cpp
   SDNodeProperties.cpp
   SearchableTableEmitter.cpp
[14 lines not shown]

llvm/utils/TableGen/OptRemarkDiagEmitter.cpp

This file was added.

//===- OptEmitter.cpp - Helper for emitting options.----------- -----------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "llvm/TableGen/Record.h"

/// EmitOptRemarkDiags - This tablegen backend takes an input .td file
/// describing a list of optimization remarks and emits a series of macros
/// that will associate an ID with each remark and allow the associated
/// strings to be looked up from the ID.
///
/// The expected input format for each diagnostic is:
///
///   def remark_example_remark : OptRemark<"RemarkName",
///       "Format string with optional specifier like %0",
///       "Verbose format string">;
///
/// This will be transformed into this format:
///
///   OPT_REMARK(remark_example_remark,
///              "RemarkName", "Format string with optional specifier like %0",
///              "Verbose format string")

namespace llvm {

void EmitOptRemarkDiags(RecordKeeper &Records, raw_ostream &OS) {
  std::vector<Record*> Remarks = Records.getAllDerivedDefinitions("OptRemark");
  for (Record *Remark : Remarks) {
    OS << "OPT_REMARK(" << Remark->getName() << ", "
       << "\"" << Remark->getValueAsString("RemarkName") << "\", "
       << "\"" << Remark->getValueAsString("FormatStr") << "\", "
       << "\"" << Remark->getValueAsString("VerboseFormatStr") << "\")\n";
  }
}

} // namespace llvm

llvm/utils/TableGen/TableGen.cpp

[48 lines not shown; context: enum ActionType {]
   GenSearchableTables,
   GenGlobalISel,
   GenGICombiner,
   GenX86EVEX2VEXTables,
   GenX86FoldTables,
   GenRegisterBank,
   GenExegesis,
   GenAutomata,
+  GenOptRemarkDiags,
 };

 namespace llvm {
 /// Storage for TimeRegionsOpt as a global so that backends aren't required to
 /// include CommandLine.h
 bool TimeRegions = false;
 } // end namespace llvm

[52 lines not shown; context: cl::values(]
         clEnumValN(GenX86EVEX2VEXTables, "gen-x86-EVEX2VEX-tables",
                    "Generate X86 EVEX to VEX compress tables"),
         clEnumValN(GenX86FoldTables, "gen-x86-fold-tables",
                    "Generate X86 fold tables"),
         clEnumValN(GenRegisterBank, "gen-register-bank",
                    "Generate registers bank descriptions"),
         clEnumValN(GenExegesis, "gen-exegesis",
                    "Generate llvm-exegesis tables"),
-        clEnumValN(GenAutomata, "gen-automata", "Generate generic automata")));
+        clEnumValN(GenAutomata, "gen-automata", "Generate generic automata"),
+        clEnumValN(GenOptRemarkDiags, "gen-optremark-diags",
+                   "Generate optimization remark diagnostics")));

 cl::OptionCategory PrintEnumsCat("Options for -print-enums");
 cl::opt<std::string> Class("class", cl::desc("Print Enum list for this class"),
                            cl::value_desc("class name"),
                            cl::cat(PrintEnumsCat));

 cl::opt<bool, true>
     TimeRegionsOpt("time-regions",
[108 lines not shown; context: case GenX86FoldTables:]
     EmitX86FoldTables(Records, OS);
     break;
   case GenExegesis:
     EmitExegesis(Records, OS);
     break;
   case GenAutomata:
     EmitAutomata(Records, OS);
     break;
+  case GenOptRemarkDiags:
+    EmitOptRemarkDiags(Records, OS);
+    break;
   }

   return false;
 }
 }

 int main(int argc, char **argv) {
   sys::PrintStackTraceOnErrorSignal(argv[0]);
[21 lines not shown]

llvm/utils/TableGen/TableGenBackends.h

[84 lines not shown]
 void EmitSearchableTables(RecordKeeper &RK, raw_ostream &OS);
 void EmitGlobalISel(RecordKeeper &RK, raw_ostream &OS);
 void EmitGICombiner(RecordKeeper &RK, raw_ostream &OS);
 void EmitX86EVEX2VEXTables(RecordKeeper &RK, raw_ostream &OS);
 void EmitX86FoldTables(RecordKeeper &RK, raw_ostream &OS);
 void EmitRegisterBank(RecordKeeper &RK, raw_ostream &OS);
 void EmitExegesis(RecordKeeper &RK, raw_ostream &OS);
 void EmitAutomata(RecordKeeper &RK, raw_ostream &OS);
+void EmitOptRemarkDiags(RecordKeeper &Records, raw_ostream &OS);

 } // End llvm namespace

 #endif