This is an archive of the discontinued LLVM Phabricator instance.

Differential D119899

[TableGen] Add a library-based entry point for parsing td files
Closed · Public

Authored by rriddle on Feb 15 2022, 3:20 PM.

Details

Reviewers

jpienaar
lattner

Commits

rGe865fa75308a: [TableGen] Add a library-based entry point for parsing td files

Summary

This commit adds a new TableGenParseFile entry point for tablegen
that parses an input buffer and invokes a callback function with
a record keeper (notably without an output buffer). This kind of entry
point is very useful for tablegen-consuming tools that don't create
output and want to invoke tablegen multiple times. The current way
that we interact with tablegen is via relative includes to
TGParser (not great).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rriddle created this revision. Feb 15 2022, 3:20 PM

Herald added subscribers: bollu, hiraditya, mgorny. Feb 15 2022, 3:20 PM

rriddle requested review of this revision. Feb 15 2022, 3:20 PM

Herald added a project: Restricted Project. Feb 15 2022, 3:20 PM

Herald added a subscriber: llvm-commits.

rriddle added a child revision: D119900: [PDLL] Add support for tablegen includes and importing ODS information. Feb 15 2022, 3:20 PM

Essentially the same justification as when I submitted D108934: we have a growing desire to interact with tablegen as a library (see the child revision for an example). The current situation (as noted in the description) is that we use relative includes to TGParser, but this isn't really tenable when adding usages upstream. This commit hopefully starts to rectify that by exposing a minimal interface into the TG parser.

rriddle added reviewers: jpienaar, lattner. Feb 15 2022, 3:24 PM

Harbormaster completed remote builds in B149846: Diff 409074. Feb 15 2022, 5:22 PM

I'm in favor of exposing this, I actually ran into this problem just yesterday in an external project.

However, the proposed interface using a callback is strange. How about returning an Expected<RecordKeeper> instead?

In D119899#3325071, @nhaehnle wrote:

> I'm in favor of exposing this, I actually ran into this problem just yesterday in an external project.
>
> However, the proposed interface using a callback is strange. How about returning an Expected<RecordKeeper> instead?

Ah, I think I would be fine going with something like that. The thing I like about the callback approach is that it has more explicit expectations around lifetime, which tablegen right now has a horrible notion of. Not shown here is that to actually properly invoke tablegen multiple times, you need to call llvm_shutdown/reset the global SrcMgr/etc.
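The lifetime argument for the callback form can be sketched with stand-in types. This is a minimal illustration, not the patch's code: ToyRecordKeeper, ToyParserFn, toyParse and LiveGlobalEntries are hypothetical names, the "parser" only understands a single `name=value` line, and std::function is used in place of the function_ref that the real header uses. It only shows that the records are usable inside the callback and that the global state is gone once the call returns.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for llvm::RecordKeeper: maps a def name to one value.
struct ToyRecordKeeper {
  std::map<std::string, std::string> Defs;
};

// Hypothetical stand-in for the global state the real parser relies on.
static int LiveGlobalEntries = 0;

// Same convention as the proposed API: the callback returns true on error.
using ToyParserFn = std::function<bool(const ToyRecordKeeper &)>;

// "Parses" one `name=value` line, runs the callback while the records are
// alive, then resets the global state. Returns true on failure.
bool toyParse(const std::string &Source, ToyParserFn Fn) {
  std::string::size_type Eq = Source.find('=');
  if (Eq == std::string::npos)
    return true; // malformed input, nothing was registered

  ToyRecordKeeper Records;
  Records.Defs[Source.substr(0, Eq)] = Source.substr(Eq + 1);
  LiveGlobalEntries = 1; // global state populated during the parse

  bool Failed = Fn(Records); // records are only valid inside this call

  LiveGlobalEntries = 0; // reset once the callback has finished
  return Failed;
}
```

Because the caller never receives the keeper itself, anything it wants to retain has to be copied out inside the callback, which is the explicit lifetime expectation described above.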

In D119899#3325086, @rriddle wrote:

> In D119899#3325071, @nhaehnle wrote:
>
> > I'm in favor of exposing this, I actually ran into this problem just yesterday in an external project.
> >
> > However, the proposed interface using a callback is strange. How about returning an Expected<RecordKeeper> instead?
>
> Ah, I think I would be fine going with something like that. The thing I like about the callback approach is that it has more explicit expectations around lifetime, which tablegen right now has a horrible notion of. Not shown here is that to actually properly invoke tablegen multiple times, you need to call llvm_shutdown/reset the global SrcMgr/etc.

Should all of that perhaps be happening inside this function? E.g., could we have this be "hermetic" and have the lifetime of everything tblgen side scoped to the call (e.g., you can walk over all the records inside but not keep any references to it). That way if this gets changed tblgen side, the users of this function don't need to be updated.

rriddle updated this revision to Diff 409418. Feb 16 2022, 2:42 PM

rriddle edited the summary of this revision.

In D119899#3326471, @jpienaar wrote:

> In D119899#3325086, @rriddle wrote:
>
> > In D119899#3325071, @nhaehnle wrote:
> >
> > > I'm in favor of exposing this, I actually ran into this problem just yesterday in an external project.
> > >
> > > However, the proposed interface using a callback is strange. How about returning an Expected<RecordKeeper> instead?
> >
> > Ah, I think I would be fine going with something like that. The thing I like about the callback approach is that it has more explicit expectations around lifetime, which tablegen right now has a horrible notion of. Not shown here is that to actually properly invoke tablegen multiple times, you need to call llvm_shutdown/reset the global SrcMgr/etc.
>
> Should all of that perhaps be happening inside this function? E.g., could we have this be "hermetic" and have the lifetime of everything tblgen side scoped to the call (e.g., you can walk over all the records inside but not keep any references to it). That way if this gets changed tblgen side, the users of this function don't need to be updated.

Good idea, moved things to here. I also added a helper "reset" function to remove the need to call llvm_destroy (which can likely destroy more than what we need here).
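One way to make such a call hermetic on every exit path is a scope guard. The sketch below is an assumption-laden illustration rather than what the patch does: GlobalBuffers, ToyStateResetter and toyParseHermetic are hypothetical names standing in for the global state and the entry point. In the Parser.cpp diff on this page the reset statements sit after the two early `return true` paths; the guard here shows how the same reset could also cover those paths.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for process-wide parser state.
static std::vector<std::string> GlobalBuffers;

// RAII guard: clears the global state when it goes out of scope, so every
// return path (success or failure) leaves a clean slate behind.
struct ToyStateResetter {
  ~ToyStateResetter() { GlobalBuffers.clear(); }
};

// Returns true on failure, matching the convention used in the patch.
bool toyParseHermetic(const std::string &Source) {
  ToyStateResetter Reset;
  GlobalBuffers.push_back(Source);
  if (Source.empty())
    return true; // the early exit still triggers the reset
  return false;
}
```

The design choice is simply to tie the cleanup to scope exit instead of to a specific line in the function body.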

rriddle added inline comments. Feb 16 2022, 2:48 PM

llvm/lib/TableGen/Parser.cpp
Line 36: We could do this at the beginning and allow returning the RecordKeeper, but users would have to keep in mind that any successive call to this function will destroy any previously returned RecordKeeper. The current lifetime model of TableGen makes this quite annoying (that is more easily fixable now than it used to be, but someone still needs to put in a lot of the work to plumb through a context everywhere in TableGen code).

Could the RecordKeeper object be the tablegen context? Still need to plumb it through everything.

In D119899#3327655, @craig.topper wrote:

> Could the RecordKeeper object be the tablegen context? Still need to plumb it through everything.

Conceptually yeah, we could use the RecordKeeper as the context itself (I mean that is basically what it is right now for the most part). I didn't mean to imply above that we explicitly need a new class (though we could have one), we just need something akin to LLVMContext/MLIRContext to plumb through the API.
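The "plumb a context through the API" idea can be sketched as follows. This is a toy under stated assumptions, not TableGen code: ToyContext, internName and toyDefineRecord are hypothetical, and the only state modelled is a name-to-ID table. The point is that state owned by an explicit object is independent per instance and needs no global reset.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical context object: owns what would otherwise be static data.
class ToyContext {
public:
  // Returns a stable ID for Name, creating one on first use.
  unsigned internName(const std::string &Name) {
    auto It = IDs.find(Name);
    if (It != IDs.end())
      return It->second;
    unsigned ID = NextID++;
    IDs.emplace(Name, ID);
    return ID;
  }

  unsigned size() const { return static_cast<unsigned>(IDs.size()); }

private:
  std::unordered_map<std::string, unsigned> IDs;
  unsigned NextID = 0;
};

// Every API that needs shared state takes the context explicitly.
unsigned toyDefineRecord(ToyContext &Ctx, const std::string &Name) {
  return Ctx.internName(Name);
}
```

Two contexts can coexist without interfering, and destroying a context releases its state, which is the property the discussion attributes to LLVMContext/MLIRContext-style designs.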

Harbormaster completed remote builds in B150085: Diff 409418. Feb 16 2022, 4:19 PM

nhaehnle added inline comments. Feb 17 2022, 9:41 AM

llvm/lib/TableGen/Parser.cpp
Line 36: This current version seems a reasonable compromise to me as long as all the issues around global state in TableGen haven't been addressed.

I agree with nhaehnle and others that Tblgen's internal implementation and lifetime model are really problematic. Unless and until someone feels compelled to rewrite tblgen from scratch following best practices (any takers? ;-), this seems like a reasonable step forward. This lgtm. Thanks!

This revision is now accepted and ready to land. Feb 17 2022, 9:06 PM

Nice (I've wanted to use tblgen as a library in a little rewriter I was playing around with, so this will be useful there even just for experiments, thanks :-))

This revision was landed with ongoing or failed builds. Mar 3 2022, 4:14 PM

Closed by commit rGe865fa75308a: [TableGen] Add a library-based entry point for parsing td files (authored by rriddle).

This revision was automatically updated to reflect the committed changes.

rriddle added a commit: rGe865fa75308a: [TableGen] Add a library-based entry point for parsing td files.

Herald added a project: Restricted Project. Mar 3 2022, 4:14 PM

dblaikie mentioned this in rGd60a65abb6b0: Fix for D119899. Mar 3 2022, 9:23 PM

The unit test was failing for me (in the cmake and bazel builds) due to other files already being present in the SrcMgr - so I tried this patch: d60a65abb6b050e10d9efdbc56dcb2e2e4772af1 to address that. No idea if it's right/good, but I guess if this function can reset the SrcMgr at the end, it's probably OK to reset it at the start too?

In D119899#3359094, @dblaikie wrote:

> The unit test was failing for me (in the cmake and bazel builds) due to other files already being present in the SrcMgr - so I tried this patch: d60a65abb6b050e10d9efdbc56dcb2e2e4772af1 to address that. No idea if it's right/good, but I guess if this function can reset the SrcMgr at the end, it's probably OK to reset it at the start too?

Weird. That fix looks appropriate to me, thanks for submitting it! Realistically we should be able to reset at the start, we can just plumb any initialization of the SrcMgr (e.g. diag handler) as params to this function.
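Why resetting at the start can matter is easy to show with a toy model. The sketch assumes, purely for illustration, that the parser treats the first buffer in the global source manager as its main input; the comment above only says that other files were already present, so that detail is an assumption. ToySrcMgr and toyParseMain are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a global source manager. The "main file" is
// assumed to be the first buffer.
static std::vector<std::string> ToySrcMgr;

// Returns the text the toy parser would treat as its main input.
std::string toyParseMain(const std::string &Source, bool ResetAtStart) {
  if (ResetAtStart)
    ToySrcMgr.clear(); // drop buffers left behind by earlier users

  ToySrcMgr.push_back(Source);
  std::string Main = ToySrcMgr.front();

  ToySrcMgr.clear(); // reset at the end as well
  return Main;
}
```

With a stale buffer already registered, the call without the initial reset picks up the stale text, while the call with the initial reset sees only its own input.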

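Revision Contents

Path                                                Size
llvm/include/llvm/TableGen/Parser.h                 39 lines
llvm/lib/TableGen/CMakeLists.txt                    1 line
llvm/lib/TableGen/Parser.cpp                        38 lines
llvm/lib/TableGen/Record.cpp                        3 lines
llvm/lib/TableGen/RecordContext.h                   27 lines
llvm/unittests/TableGen/CMakeLists.txt              3 lines
llvm/unittests/TableGen/ParserEntryPointTest.cpp    39 lines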

Diff 412857

llvm/include/llvm/TableGen/Parser.h

This file was added.

				//===- llvm/TableGen/Parser.h - tblgen parser entry point -------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file declares an entry point into the tablegen parser for use by tools.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TABLEGEN_PARSER_H
				#define LLVM_TABLEGEN_PARSER_H

				#include "llvm/ADT/STLExtras.h"
				#include <string>
				#include <vector>

				namespace llvm {
				class MemoryBuffer;
				class RecordKeeper;

				/// Perform the tablegen action using the given set of parsed records. Returns
				/// true on error, false otherwise.
				using TableGenParserFn = function_ref<bool(RecordKeeper &)>;

				/// Parse the given input buffer containing a tablegen file, invoking the
				/// provided parser function with the set of parsed records. All tablegen state
				/// is reset after the provided parser function is invoked, i.e., the provided
				/// parser function should not maintain references to any tablegen constructs
				/// after executing. Returns true on failure, false otherwise.
				bool TableGenParseFile(std::unique_ptr<MemoryBuffer> Buffer,
				                       std::vector<std::string> IncludeDirs,
				                       TableGenParserFn ParserFn);

				} // end namespace llvm

				#endif // LLVM_TABLEGEN_PARSER_H

llvm/lib/TableGen/CMakeLists.txt

 add_llvm_component_library(LLVMTableGen
   DetailedRecordsBackend.cpp
   Error.cpp
   JSONBackend.cpp
   Main.cpp
+  Parser.cpp
   Record.cpp
   SetTheory.cpp
   StringMatcher.cpp
   TableGenBackend.cpp
   TableGenBackendSkeleton.cpp
   TGLexer.cpp
   TGParser.cpp

   ADDITIONAL_HEADER_DIRS
   ${LLVM_MAIN_INCLUDE_DIR}/llvm/TableGen

   LINK_COMPONENTS
   Support
   )

llvm/lib/TableGen/Parser.cpp

This file was added.

				//===- Parser.cpp - Top-Level TableGen Parser implementation --------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/TableGen/Parser.h"
				#include "RecordContext.h"
				#include "TGParser.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include "llvm/TableGen/Error.h"
				#include "llvm/TableGen/Record.h"

				using namespace llvm;

				bool llvm::TableGenParseFile(std::unique_ptr<MemoryBuffer> Buffer,
				                             std::vector<std::string> IncludeDirs,
				                             TableGenParserFn ParserFn) {
				  RecordKeeper Records;
				  Records.saveInputFilename(Buffer->getBufferIdentifier().str());

				  SrcMgr.AddNewSourceBuffer(std::move(Buffer), SMLoc());
				  SrcMgr.setIncludeDirs(IncludeDirs);
				  TGParser Parser(SrcMgr, /*Macros=*/None, Records);
				  if (Parser.ParseFile())
				    return true;

				  // Invoke the provided handler function.
				  if (ParserFn(Records))
				    return true;

				  // After parsing, reset the tablegen data.
				  detail::resetTablegenRecordContext();
				  SrcMgr = SourceMgr();
				  return false;
				}

llvm/lib/TableGen/Record.cpp

 //===- Record.cpp - Record implementation ---------------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
 // Implement the tablegen record classes.
 //
 //===----------------------------------------------------------------------===//

 #include "llvm/TableGen/Record.h"
+#include "RecordContext.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/FoldingSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"

[66 lines not shown; enclosing context: struct RecordContext {]

   unsigned LastRecordID;
 };
 } // namespace detail
 } // namespace llvm

 ManagedStatic<detail::RecordContext> Context;

+void llvm::detail::resetTablegenRecordContext() { Context.destroy(); }
+
 //===----------------------------------------------------------------------===//
 // Type implementations
 //===----------------------------------------------------------------------===//

 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 LLVM_DUMP_METHOD void RecTy::dump() const { print(errs()); }
 #endif

[2,736 lines not shown]

llvm/lib/TableGen/RecordContext.h

This file was added.

				//===- RecordContext.h - RecordContext implementation ---------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file contains functions for interacting with the tablegen record
				// context.
				//
				//===----------------------------------------------------------------------===//

				namespace llvm {
				namespace detail {

				/// Resets the Tablegen record context and all currently parsed record data.
				/// Tablegen currently relies on a lot of static data to keep track of parsed
				/// records, which accumulates into static fields. This method resets all of
				/// that data to enable successive executions of the tablegen parser.
				/// FIXME: Ideally tablegen would use a properly scoped (non-static) context,
				/// which would remove any need for managing the context in this way. In that
				/// case, this method could be removed.
				void resetTablegenRecordContext();

				} // end namespace detail
				} // end namespace llvm
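The reset helper declared above amounts to destroying a lazily created global so that the next access starts from a fresh object. The sketch below illustrates that pattern with standard-library types only. It is loosely inspired by the `Context.destroy()` call in the Record.cpp diff, but it is not a description of how ManagedStatic is implemented; ToyRecordContext, ToyLazyContext, toyNextRecordID and toyResetRecordContext are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the record context; only one field is modelled.
struct ToyRecordContext {
  unsigned LastRecordID = 0;
};

// Minimal lazily constructed holder: the object is created on first access
// and created again on the first access after destroy().
class ToyLazyContext {
public:
  ToyRecordContext &get() {
    if (!Ptr)
      Ptr = std::make_unique<ToyRecordContext>();
    return *Ptr;
  }

  void destroy() { Ptr.reset(); }

private:
  std::unique_ptr<ToyRecordContext> Ptr;
};

static ToyLazyContext ToyContextStorage;

// Hands out increasing IDs from the current context.
unsigned toyNextRecordID() { return ToyContextStorage.get().LastRecordID++; }

// Drops all accumulated state so that a later parse starts from scratch.
void toyResetRecordContext() { ToyContextStorage.destroy(); }
```

After a reset the ID counter starts over, which is the kind of accumulated static data the FIXME above says a properly scoped context would make unnecessary to manage by hand.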

llvm/unittests/TableGen/CMakeLists.txt

 set(LLVM_LINK_COMPONENTS
   TableGen
   Support
   )

 set(LLVM_TARGET_DEFINITIONS Automata.td)

 tablegen(LLVM AutomataTables.inc -gen-searchable-tables)
 tablegen(LLVM AutomataAutomata.inc -gen-automata)
 add_public_tablegen_target(AutomataTestTableGen)

 add_llvm_unittest(TableGenTests DISABLE_LLVM_LINK_LLVM_DYLIB
-  CodeExpanderTest.cpp
   AutomataTest.cpp
+  CodeExpanderTest.cpp
+  ParserEntryPointTest.cpp
   )
 include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../../utils/TableGen)
 target_link_libraries(TableGenTests PRIVATE LLVMTableGenGlobalISel LLVMTableGen)

llvm/unittests/TableGen/ParserEntryPointTest.cpp

This file was added.

				//===- unittest/TableGen/ParserEntryPointTest.cpp - Parser tests ----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include "llvm/TableGen/Parser.h"
				#include "llvm/TableGen/Record.h"
				#include "gmock/gmock.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				TEST(Parser, SanityTest) {
				  // Simple TableGen source file with a single record.
				  const char *SimpleTdSource = R"td(
				    def Foo {
				      string strField = "value";
				    }
				  )td";

				  auto ProcessFn = [&](const RecordKeeper &Records) {
				    Record *Foo = Records.getDef("Foo");
				    Optional<StringRef> Field = Foo->getValueAsOptionalString("strField");
				    EXPECT_TRUE(Field.hasValue());
				    EXPECT_EQ(Field.getValue(), "value");
				    return false;
				  };

				  bool ProcessResult = TableGenParseFile(
				      MemoryBuffer::getMemBuffer(SimpleTdSource, "test_buffer"),
				      /*IncludeDirs=*/{}, ProcessFn);
				  EXPECT_FALSE(ProcessResult);
				}
