This is an archive of the discontinued LLVM Phabricator instance.

Differential D138090

[MLIR][Parser] Add `parseBase64Bytes`.
Closed, Public

Authored by bzcheeseman on Nov 15 2022, 9:23 PM.

Details

Reviewers

rriddle
nicolasvasilache

Commits

rGbf87d5ad8207: [MLIR][Parser] Add `parseBase64Bytes`.

Summary

This patch adds parseBase64Bytes to the parser. It attempts to avoid double-allocating the buffer by re-using the token's spelling directly and eliding the quotes if they exist. It also avoids extra allocations by using std::vector<char> in the API - something we should change when the llvm::decodeBase64 API changes.
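
The approach in the summary can be sketched outside of MLIR. The following is a minimal standalone illustration, not the patch itself. It assumes `std::string_view` as a stand-in for `llvm::StringRef`, and the names `trimQuotesAndSpace`, `sextet`, and `decodeBase64Sketch` are hypothetical. The hand-written decoder only stands in for `llvm::decodeBase64`, whose exact validation rules may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Strip surrounding quotes and whitespace, in the spirit of the ltrim/rtrim
// calls in the patch. The result is a view into the same buffer: no copy.
inline std::string_view trimQuotesAndSpace(std::string_view s) {
  constexpr std::string_view kTrim = "\" \t\n\v\f\r";
  const auto first = s.find_first_not_of(kTrim);
  if (first == std::string_view::npos)
    return {};
  const auto last = s.find_last_not_of(kTrim);
  return s.substr(first, last - first + 1);
}

// Map one Base64 alphabet character to its 6-bit value, or -1 if invalid.
inline int sextet(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical stand-in for llvm::decodeBase64: decodes padded, standard
// Base64 and returns std::nullopt on malformed input.
inline std::optional<std::vector<char>>
decodeBase64Sketch(std::string_view in) {
  if (in.size() % 4 != 0)
    return std::nullopt;
  std::vector<char> out;
  out.reserve(in.size() / 4 * 3);
  for (std::size_t i = 0; i < in.size(); i += 4) {
    const bool lastGroup = (i + 4 == in.size());
    int pad = 0;
    std::uint32_t bits = 0;
    for (std::size_t j = 0; j < 4; ++j) {
      const char c = in[i + j];
      int v = 0;
      if (c == '=') {
        // Padding is only valid in the last two slots of the final group.
        if (!lastGroup || j < 2)
          return std::nullopt;
        ++pad;
      } else {
        if (pad > 0) // data after padding
          return std::nullopt;
        v = sextet(c);
        if (v < 0)
          return std::nullopt;
      }
      bits = (bits << 6) | static_cast<std::uint32_t>(v);
    }
    // Each 24-bit group yields up to three bytes; padding drops the tail.
    out.push_back(static_cast<char>((bits >> 16) & 0xFF));
    if (pad < 2)
      out.push_back(static_cast<char>((bits >> 8) & 0xFF));
    if (pad < 1)
      out.push_back(static_cast<char>(bits & 0xFF));
  }
  return out;
}
```

Calling `decodeBase64Sketch(trimQuotesAndSpace("\"aGVsbG8gd29ybGQ=\""))` yields the 11 bytes of `hello world`. Because the trim step returns a view rather than a new string, the only allocation is the output vector, which is the point of reusing the token spelling.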

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

bzcheeseman created this revision. Nov 15 2022, 9:23 PM

Herald added a reviewer: rriddle. Nov 15 2022, 9:23 PM

Herald added a project: Restricted Project.

Herald added subscribers: Moerafaat, zero9178, sdasgup3 and 19 others.

bzcheeseman requested review of this revision. Nov 15 2022, 9:23 PM

Herald added a reviewer: nicolasvasilache. Nov 15 2022, 9:23 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

bzcheeseman updated this revision to Diff 475678. Nov 15 2022, 9:24 PM

Harbormaster completed remote builds in B197904: Diff 475678. Nov 15 2022, 9:52 PM

rriddle accepted this revision. Nov 18 2022, 1:48 AM

rriddle added inline comments.

mlir/lib/AsmParser/AsmParserImpl.h
262	Why does this need to trim whitespace? I'd just consume the string characters and let `decodeBase64` error out as necessary.

This revision is now accepted and ready to land. Nov 18 2022, 1:48 AM

bzcheeseman added inline comments. Nov 18 2022, 8:12 AM

mlir/lib/AsmParser/AsmParserImpl.h
262	It's because we're using the token spelling directly, so I couldn't get any example to work without trimming the whitespace and the quotes. If you use `token.getStringValue()` then it works without trimming, so it's a tradeoff between double-allocating the buffer and having to trim.

This revision was landed with ongoing or failed builds. Nov 18 2022, 8:13 AM

Closed by commit rGbf87d5ad8207: [MLIR][Parser] Add `parseBase64Bytes`. (authored by bzcheeseman).

This revision was automatically updated to reflect the committed changes.

bzcheeseman added a commit: rGbf87d5ad8207: [MLIR][Parser] Add `parseBase64Bytes`.

bzcheeseman added inline comments. Nov 18 2022, 8:14 AM

mlir/lib/AsmParser/AsmParserImpl.h
262	I'm happy to revisit this and go the other way if you'd prefer :)

Revision Contents

Path                                          Size
mlir/include/mlir/IR/OpImplementation.h       3 lines
mlir/lib/AsmParser/AsmParserImpl.h            23 lines
mlir/test/IR/parser.mlir                      7 lines
mlir/test/lib/Dialect/Test/TestDialect.cpp    15 lines
mlir/test/lib/Dialect/Test/TestOps.td         5 lines

Diff 476486

mlir/include/mlir/IR/OpImplementation.h

[571 lines not shown] ParseResult parseString(std::string *string) {
     if (parseOptionalString(string))
       return emitError(loc, "expected string");
     return success();
   }

   /// Parse a quoted string token if present.
   virtual ParseResult parseOptionalString(std::string *string) = 0;

+  /// Parses a Base64 encoded string of bytes.
+  virtual ParseResult parseBase64Bytes(std::vector<char> *bytes) = 0;
+
   /// Parse a `(` token.
   virtual ParseResult parseLParen() = 0;

   /// Parse a `(` token if present.
   virtual ParseResult parseOptionalLParen() = 0;

   /// Parse a `)` token.
   virtual ParseResult parseRParen() = 0;
[1,053 lines not shown]

mlir/lib/AsmParser/AsmParserImpl.h

 //===- AsmParserImpl.h - MLIR AsmParserImpl Class ---------------- C++ --===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #ifndef MLIR_LIB_ASMPARSER_ASMPARSERIMPL_H
 #define MLIR_LIB_ASMPARSER_ASMPARSERIMPL_H

 #include "Parser.h"
 #include "mlir/AsmParser/AsmParserState.h"
 #include "mlir/IR/Builders.h"
 #include "mlir/IR/OpImplementation.h"
+#include "llvm/Support/Base64.h"

 namespace mlir {
 namespace detail {
 //===----------------------------------------------------------------------===//
 // AsmParserImpl
 //===----------------------------------------------------------------------===//

 /// This class provides the implementation of the generic parser methods within
[216 lines not shown] if (!parser.getToken().is(Token::string))
       return failure();

     if (string)
       *string = parser.getToken().getStringValue();
     parser.consumeToken();
     return success();
   }

+  /// Parses a Base64 encoded string of bytes.
+  ParseResult parseBase64Bytes(std::vector<char> *bytes) override {
+    auto loc = getCurrentLocation();
+    if (!parser.getToken().is(Token::string))
+      return emitError(loc, "expected string");
+
+    if (bytes) {
+      // decodeBase64 doesn't modify its input so we can use the token spelling
+      // and just slice off the quotes/whitespaces if there are any. Whitespace
+      // and quotes cannot appear as part of a (standard) base64 encoded string,
+      // so this is safe to do.
+      StringRef b64QuotedString = parser.getTokenSpelling();
+      StringRef b64String =
+          b64QuotedString.ltrim("\" \t\n\v\f\r").rtrim("\" \t\n\v\f\r");
+      if (auto err = llvm::decodeBase64(b64String, *bytes))
+        return emitError(loc, toString(std::move(err)));
+    }
+
+    parser.consumeToken();
+    return success();
+  }
+
   /// Parse a floating point value from the stream.
   ParseResult parseFloat(double &result) override {
     bool isNegative = parser.consumeIf(Token::minus);
     Token curTok = parser.getToken();
     SMLoc loc = curTok.getLoc();

     // Check for a floating point value.
     if (curTok.is(Token::floatliteral)) {
[317 lines not shown]

mlir/test/IR/parser.mlir

[1,179 lines not shown]

 // CHECK-LABEL: func @parse_wrapped_keyword_test
 func.func @parse_wrapped_keyword_test() {
   // CHECK: test.parse_wrapped_keyword foo.keyword
   test.parse_wrapped_keyword foo.keyword
   return
 }

+// CHECK-LABEL: func @parse_base64_test
+func.func @parse_base64_test() {
+  // CHECK: test.parse_b64 "hello world"
+  test.parse_b64 "aGVsbG8gd29ybGQ="
+  return
+}
+
 // CHECK-LABEL: func @"\22_string_symbol_reference\22"
 func.func @"\"_string_symbol_reference\""() {
   // CHECK: ref = @"\22_string_symbol_reference\22"
   "foo.symbol_reference"() {ref = @"\"_string_symbol_reference\""} : () -> ()
   return
 }

 // CHECK-LABEL: func private @parse_opaque_attr_escape
[237 lines not shown]
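
The test above relies on `aGVsbG8gd29ybGQ=` being the Base64 encoding of `hello world`. A small encoder makes that relationship easy to check by hand. This is only a sketch: `encodeBase64Sketch` is a hypothetical helper written for illustration and is not part of the patch or of LLVM.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: encode raw bytes as padded, standard-alphabet Base64.
inline std::string encodeBase64Sketch(std::string_view bytes) {
  static constexpr char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve((bytes.size() + 2) / 3 * 4);
  for (std::size_t i = 0; i < bytes.size(); i += 3) {
    // Number of real input bytes in this 3-byte group (1, 2, or 3).
    const std::size_t n = std::min<std::size_t>(3, bytes.size() - i);
    std::uint32_t bits = 0;
    for (std::size_t j = 0; j < 3; ++j) {
      bits <<= 8;
      if (j < n)
        bits |= static_cast<unsigned char>(bytes[i + j]);
    }
    // Emit four 6-bit symbols, replacing unused trailing ones with '='.
    out.push_back(kAlphabet[(bits >> 18) & 0x3F]);
    out.push_back(kAlphabet[(bits >> 12) & 0x3F]);
    out.push_back(n > 1 ? kAlphabet[(bits >> 6) & 0x3F] : '=');
    out.push_back(n > 2 ? kAlphabet[bits & 0x3F] : '=');
  }
  return out;
}
```

With this helper, `encodeBase64Sketch("hello world")` produces `aGVsbG8gd29ybGQ=`. The input is 11 bytes, so the last group holds only two bytes and ends in a single `=`.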

mlir/test/lib/Dialect/Test/TestDialect.cpp

[856 lines not shown] ParseResult ParseWrappedKeywordOp::parse(OpAsmParser &parser,
   if (parser.parseKeyword(&keyword))
     return failure();
   result.addAttribute("keyword", parser.getBuilder().getStringAttr(keyword));
   return success();
 }

 void ParseWrappedKeywordOp::print(OpAsmPrinter &p) { p << " " << getKeyword(); }

+ParseResult ParseB64BytesOp::parse(OpAsmParser &parser,
+                                   OperationState &result) {
+  std::vector<char> bytes;
+  if (parser.parseBase64Bytes(&bytes))
+    return failure();
+  result.addAttribute("b64", parser.getBuilder().getStringAttr(
+                                 StringRef(&bytes.front(), bytes.size())));
+  return success();
+}
+
+void ParseB64BytesOp::print(OpAsmPrinter &p) {
+  // Don't print the base64 version to check that we decoded it correctly.
+  p << " \"" << getB64() << "\"";
+}
+
 //===----------------------------------------------------------------------===//
 // Test WrapRegionOp - wrapping op exercising `parseGenericOperation()`.

 ParseResult WrappingRegionOp::parse(OpAsmParser &parser,
                                     OperationState &result) {
   if (parser.parseKeyword("wraps"))
     return failure();

[724 lines not shown]

mlir/test/lib/Dialect/Test/TestOps.td

[1,760 lines not shown] def ParseIntegerLiteralOp : TEST_Op<"parse_integer_literal"> {
   let hasCustomAssemblyFormat = 1;
 }

 def ParseWrappedKeywordOp : TEST_Op<"parse_wrapped_keyword"> {
   let arguments = (ins StrAttr:$keyword);
   let hasCustomAssemblyFormat = 1;
 }

+def ParseB64BytesOp : TEST_Op<"parse_b64"> {
+  let arguments = (ins StrAttr:$b64);
+  let hasCustomAssemblyFormat = 1;
+}
+
 //===----------------------------------------------------------------------===//
 // Test region argument list parsing.

 def IsolatedRegionOp : TEST_Op<"isolated_region", [IsolatedFromAbove]> {
   let summary = "isolated region operation";
   let description = [{
     Test op with an isolated region, to test passthrough region arguments. Each
     argument is of index type.
[1,287 lines not shown]