This is an archive of the discontinued LLVM Phabricator instance.

Differential D87335

[json] Create some llvm::Expected-based accessors
Abandoned · Public

Authored by wallace on Sep 8 2020, 4:03 PM.

Details

Reviewers

clayborg
sammccall

Summary

In D85705 I found the need for such accessors in llvm::json and I
created them in that diff. I'm moving those changes to this new diff.
The idea is that these accessors return either the requested JSON value
or an error with a descriptive message that helps the user fix the
input.
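To illustrate the "value or descriptive error" idea without depending on LLVM, here is a minimal standard-library sketch. `Result`, `ErrorMessage`, `ToyObject` and the free function `getIntegerOrError` are hypothetical stand-ins for `llvm::Expected`, `llvm::Error`, `json::Object` and the proposed member accessor; only the message wording is borrowed from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for llvm::Error: just a message.
struct ErrorMessage {
  std::string Text;
};

// Hypothetical stand-in for llvm::Expected<T>: either a T or an ErrorMessage.
// The real class additionally enforces that errors are checked.
template <typename T> class Result {
  std::variant<T, ErrorMessage> Storage;

public:
  Result(T V) : Storage(std::in_place_index<0>, std::move(V)) {}
  Result(ErrorMessage E) : Storage(std::in_place_index<1>, std::move(E)) {}

  // True when a value is present, so `if (auto R = ...)` works.
  explicit operator bool() const { return Storage.index() == 0; }

  const T &get() const {
    assert(Storage.index() == 0 && "get() called on an error");
    return std::get<0>(Storage);
  }
  const std::string &message() const {
    assert(Storage.index() == 1 && "message() called on a value");
    return std::get<1>(Storage).Text;
  }
};

// Toy object: keys map to either an integer or a string.
using ToyValue = std::variant<std::int64_t, std::string>;
using ToyObject = std::map<std::string, ToyValue>;

// Returns the integer stored under K, or a message naming what went wrong.
Result<std::int64_t> getIntegerOrError(const ToyObject &O,
                                       const std::string &K) {
  auto It = O.find(K);
  if (It == O.end())
    return ErrorMessage{"JSON object is missing the \"" + K + "\" field."};
  if (const std::int64_t *I = std::get_if<std::int64_t>(&It->second))
    return *I;
  return ErrorMessage{"JSON value is expected to be \"integer\"."};
}
```

A caller can test the result with `if (auto R = getIntegerOrError(O, "a"))` and read `R.get()`, or forward `R.message()` to the user otherwise.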

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wallace created this revision. Sep 8 2020, 4:03 PM

Herald added a project: Restricted Project. Sep 8 2020, 4:03 PM

Herald added subscribers: llvm-commits, hiraditya.

wallace requested review of this revision. Sep 8 2020, 4:03 PM

wallace added a child revision: D85705: Add a "Trace" plug-in to LLDB to add process trace support in stages. Sep 8 2020, 4:04 PM

Harbormaster completed remote builds in B71011: Diff 290600. Sep 8 2020, 4:38 PM

clayborg mentioned this in D85705: Add a "Trace" plug-in to LLDB to add process trace support in stages. Sep 8 2020, 5:01 PM

ping

This is a sensible thing to want and very nice code.
I'm not sure about putting it in JSON.h - there are a few unfortunate aspects that I just don't know how to fix, but limit the benefit which has to be measured against the cost of adding more complexity to the API.

- the error messages aren't *quite* good enough to be directly user-visible in the general case. e.g. "JSON object missing the 'id' field" - some hint to which (sub)object is often needed.
- because the getAsX() method names are taken, the new accessors are stuck with awkward names
- Expected<pointer> (for object/array) is an awkward type, because -> doesn't really work and the meaning/impossibility of nullptr isn't obvious.
- this is useful when mapping objects with many fields, but it's not clear how to extend it to ObjectMapper or fromJSON (which maybe themselves don't have the best design, but...)

Argh, I hit enter too soon. One thing I'd planned to add was I'm sorry about the delay in reviewing this, too.

I think the first concern is the most important - providing an error-bearing API whose error messages aren't actually good enough for a lot of purposes is a bit of a hazard.
To give good errors you really need to pass around some context, but this cuts down on the simplicity of the API.

Happy to discuss this further if we want to try to come up with a general solution (I know clangd currently reports no errors other than "this is invalid", so I have some interest!)
Otherwise it might be best to carry these as non-member functions in the local project that can make the tradeoffs around APIs. Since they're implemented entirely on top of the public API this seems manageable, just a little ugly :-\

Thanks for the review. I'll implement this as helper functions in my patch, however it'd be interesting to figure out a way to create this API so that callers don't recreate this functionality. My general understanding is that showing some error message, even if incomplete, is better than showing almost nothing. But I can understand your point.

Right now all users of this code end up creating little extra functions that must work around this. lldb-vscode has its own methods, the new IntelPT will do its own thing.

As a user of the API from my lldb-vscode work, I think these accessors make sense for the API itself. Can we iterate on this and come up with a solution?

In D87335#2282034, @sammccall wrote:

> This is a sensible thing to want and very nice code.
> I'm not sure about putting it in JSON.h - there are a few unfortunate aspects that I just don't know how to fix, but limit the benefit which has to be measured against the cost of adding more complexity to the API.
>
> the error messages aren't *quite* good enough to be directly user-visible in the general case. e.g. "JSON object missing the 'id' field" - some hint to which (sub)object is often needed.

I believe they are good enough: they say 'id' is missing, but then print the full JSON object that it is missing from, right?

> because the getAsX() method names are taken, the new accessors are stuck with awkward names

See my "expect" suggestion in inline comments?

> Expected<pointer> (for object/array) is an awkward type, because -> doesn't really work and the meaning/impossibility of nullptr isn't obvious.

Since we are using Expected, can we change the value to be "Object &" or "Array &" instead of a pointer?

> this is useful when mapping objects with many fields, but it's not clear how to extend it to ObjectMapper or fromJSON (which maybe themselves don't have the best design, but...)
>
> I think the first concern is the most important - providing an error-bearing API whose error messages aren't actually good enough for a lot of purposes is a bit of a hazard.
> To give good errors you really need to pass around some context, but this cuts down on the simplicity of the API.

Most of these errors occur when we are trying to extract a key/value pair from an Object, so I don't really think we need to pass around more context to make things make sense.

> Happy to discuss this further if we want to try to come up with a general solution (I know clangd currently reports no errors other than "this is invalid", so I have some interest!)
> Otherwise it might be best to carry these as non-member functions in the local project that can make the tradeoffs around APIs. Since they're implemented entirely on top of the public API this seems manageable, just a little ugly :-\

I would like to see more consistent use of this API with common errors instead of everyone making up their own error messages.

llvm/include/llvm/Support/JSON.h
Lines 152–158: Maybe instead of these starting with "get" and ending with "OrError", we could just name these "expect...". getOptionalStringOrError doesn't map well to this, though.
Lines 459–462: switch over to "expect" as mentioned above?

TL;DR:

- I agree we should solve this, thanks for pushing back
- I think the path-to-error (e.g. processes[0].triple) is critical in some cases
- this really seems like a marshaling concern rather than something for Value/Object
- happy to take a stab at a design but I think it'll need some prototyping

In D87335#2282676, @wallace wrote:

> it'd be interesting to figure out a way to create this API so that callers don't recreate this functionality. My general understanding is that showing some error message, even if incomplete, is better than showing almost nothing.

In D87335#2283040, @clayborg wrote:

> Right now all users of this code end up creating little extra functions that must work around this. lldb-vscode has its own methods, the new IntelPT will do its own thing.
>
> As a user of the API from my lldb-vscode work, I think these accessors make sense for the API itself. Can we iterate on this and come up with a solution?

I agree there's value here, let's try to build something really nice :-) One advantage of the current situation is it's clear that error-reporting is on the caller, so they have to deal with the question of whether it's good enough. If we provide an opinionated option then I think it's important it's good enough for most use cases.

One disadvantage of the current situation is you have to avoid fromJSON/ObjectMapper in favor of the caller inventing something much more verbose in order to get error messages - this seems like a comparable-sized problem to me.

> In D87335#2282034, @sammccall wrote:
>
>> the error messages aren't *quite* good enough to be directly user-visible in the general case. e.g. "JSON object missing the 'id' field" - some hint to which (sub)object is often needed.
>
> I believe they are good enough: they say 'id' is missing, but then print the full JSON object that it is missing from, right?

I have two concerns about this approach:

- the full JSON object may be huge (e.g. missing key in an object where sibling keys may be giant objects)
- the object where the error is found is not unique enough to be clear ("missing key 'id' in value '{}'" in a large object). Wrapping errors could help, but has its own problems.

I think the most important information we should be aiming to capture:

- what the root object is, in the context of the application (caller must supply this of course)
- the path from the root to the local error
- some description of the local error

The second is not present here, and requires a different API to achieve. llvm::Expected is good for *exposing* errors, but both awkward and inefficient for assembling complex errors (especially in cases where failure to parse isn't a real error). So I'd expect conversion of some error description to an llvm::Error to happen at the end of parsing.
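As a rough illustration of the three pieces of information listed above, the following standard-library sketch keeps them in a plain record and builds the message only once, at the end. `ParseFailure`, `Segment`, `renderPath` and `render` are hypothetical names chosen for this example, and the output format is an assumption, not something proposed in the review.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// One step from the root: an object key or an array index.
using Segment = std::variant<std::string, unsigned>;

// Hypothetical record holding the root, the path, and the local description.
struct ParseFailure {
  std::string RootName;      // what the root is, supplied by the caller
  std::vector<Segment> Path; // steps from the root to the failing value
  std::string Description;   // the local error, e.g. "expected string"
};

// Renders a path such as processes[0].triple.
std::string renderPath(const std::vector<Segment> &Path) {
  if (Path.empty())
    return "(root)";
  std::string Out;
  for (const Segment &S : Path) {
    if (const std::string *Field = std::get_if<std::string>(&S)) {
      if (!Out.empty())
        Out += '.';
      Out += *Field;
    } else {
      Out += '[' + std::to_string(std::get<unsigned>(S)) + ']';
    }
  }
  return Out;
}

// Builds the user-visible message once, after parsing has finished.
std::string render(const ParseFailure &F) {
  assert(!F.Description.empty() && "a failure needs a description");
  return "invalid " + F.RootName + ": " + F.Description + " at " +
         renderPath(F.Path);
}
```

With the path `processes`, `0`, `triple`, a root named "trace settings" and the description "expected string", `render` yields `invalid trace settings: expected string at processes[0].triple`, which stays short however large the surrounding JSON is.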

>> because the getAsX() method names are taken, the new accessors are stuck with awkward names
>
> See my "expect" suggestion in inline comments?

Nice! I like expectString() etc a lot better. I think the answer for getOptionalStringOrError() is to drop that function - it seems fairly nonorthogonal and niche.

While IMO this solves the name problem, it would still be nice if we don't need these on the core data structure classes but rather treated this as a concern of marshaling functions, where it tends to come up. TraceSettingsParser seems to follow this pattern - obviously lack of error handling makes ObjectMapper unsuitable, but do you think it would otherwise work?

>> Expected<pointer> (for object/array) is an awkward type, because -> doesn't really work and the meaning/impossibility of nullptr isn't obvious.
>
> Since we are using Expected, can we change the value to be "Object &" or "Array &" instead of a pointer?

Indeed! I thought Expected<T&> wasn't allowed, but looks like I was hallucinating. (and I have some project code to go clean up!)

> Most of these errors occur when we are trying to extract a key/value pair from an Object, so I don't really think we need to pass around more context to make things make sense.

(I think I covered this above and don't want to bore with repetition, but can go into more examples and stuff if that's interesting)

I have some ideas about this, but this comment is long enough. I'll dump some in the next one and then try to make time to prototype.

A rough design sketch:

When parsing an object at the top level, we create an error context object. This can store e.g. the name of the schema for the top-level, as well as details of the created error. This is fixed-size with no allocations unless an actual error occurs.

While parsing a subobject, the JSON path is stored as a reverse linked list of stack objects, again with no other allocations until an error happens. (Actual string data is either constant or in the JSON object itself, I think)
When a parse-object-function wants to parse a subobject field "foo", it passes Path.derive("foo") to the corresponding parse function.

When an error is reported, the parsing function walks up the chain of path elements to find the root error context:

- storing the path elements in the root (which has a vector<Elt>), so that they are preserved once the stack objects are destroyed
- storing error semantics in the root as well (could include enum like "wrong type", custom error messages, pointer to relevant JSON data, etc)

Then the error context provides an API to check success, assemble a llvm::Error or Expected etc.
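The design sketched above could look roughly like the following, using only the standard library. All names here (`ErrorContext`, `Path`, `derive`, `report`, `describe`, and the toy `parseProcess`/`parseTriple` functions) are hypothetical and are not claimed to match what D88103 implements; array indices and richer error semantics are left out to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <vector>

class Path;

// Root error context, created once per top-level parse.
class ErrorContext {
public:
  explicit ErrorContext(const char *SchemaName) : SchemaName(SchemaName) {}

  bool ok() const { return !Failed; }

  // Assembles the message only on request, after parsing has finished.
  std::string describe() const {
    if (!Failed)
      return "";
    std::string Where;
    for (const std::string &S : Segments) {
      if (!Where.empty())
        Where += '.';
      Where += S;
    }
    if (Where.empty())
      Where = "(root)";
    return std::string("invalid ") + SchemaName + ": " + Message + " at " +
           Where;
  }

private:
  friend class Path;
  const char *SchemaName;
  bool Failed = false;
  std::string Message;               // empty until an error is reported
  std::vector<std::string> Segments; // empty until an error is reported
};

// One link of the reverse linked list. It lives on the stack of the parse
// function that created it and must not outlive its parent link.
class Path {
public:
  Path(ErrorContext &Root) : Parent(nullptr), Root(&Root), Name(nullptr) {}

  // Creates a child link for field Field; no allocation happens here.
  Path derive(const char *Field) const { return Path(this, Field); }

  // Walks up to the root and copies the path there so it survives unwinding.
  void report(const char *Msg) const {
    std::vector<const char *> LeafFirst;
    const Path *P = this;
    for (; P->Parent; P = P->Parent)
      LeafFirst.push_back(P->Name);
    assert(P->Root && "chain must end at the root link");
    ErrorContext &R = *P->Root;
    R.Segments.assign(LeafFirst.rbegin(), LeafFirst.rend());
    R.Message = Msg;
    R.Failed = true;
  }

private:
  Path(const Path *Parent, const char *Name)
      : Parent(Parent), Root(nullptr), Name(Name) {}
  const Path *Parent;
  ErrorContext *Root;
  const char *Name;
};

// Toy parse functions: a "process" must carry a non-empty "triple".
bool parseTriple(const std::string &Raw, Path P) {
  if (!Raw.empty())
    return true;
  P.report("expected non-empty string");
  return false;
}

bool parseProcess(const std::string &RawTriple, Path P) {
  return parseTriple(RawTriple, P.derive("triple"));
}
```

The successful path costs only a few pointer-sized stack objects per nesting level; strings and vectors are touched only inside `report`, which matches the "no allocations unless an actual error occurs" goal stated above.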

In D87335#2284855, @sammccall wrote:

> A rough design sketch

I've fleshed this out in D88103.
As noted there it could be split into several patches, but I'm pretty happy with the ability there to show an error in context without producing an overwhelming amount of information.
(And without paying for rendering everything upfront when it may be unused or even a nonfatal error at the top level).

Interested whether this is something you could use?

wallace abandoned this revision. Sep 22 2020, 2:20 PM

Revision Contents

Path                                    Size
llvm/include/llvm/Support/JSON.h        26 lines
llvm/lib/Support/JSON.cpp               93 lines
llvm/unittests/Support/JSONTest.cpp     113 lines

Diff 290600

llvm/include/llvm/Support/JSON.h

[...]

public:
  llvm::Optional<bool> getBoolean(StringRef K) const;
  llvm::Optional<double> getNumber(StringRef K) const;
  llvm::Optional<int64_t> getInteger(StringRef K) const;
  llvm::Optional<llvm::StringRef> getString(StringRef K) const;
  const json::Object *getObject(StringRef K) const;
  json::Object *getObject(StringRef K);
  const json::Array *getArray(StringRef K) const;
  json::Array *getArray(StringRef K);

  /// \a llvm::Expected based accessors
  /// In case of an error, they return a user-friendly message.
  /// \{
  llvm::Expected<const json::Value *> getOrError(StringRef K) const;
  llvm::Expected<int64_t> getIntegerOrError(StringRef K) const;
  llvm::Expected<llvm::StringRef> getStringOrError(StringRef K) const;
  llvm::Expected<llvm::Optional<llvm::StringRef>>
  getOptionalStringOrError(StringRef K) const;
  llvm::Expected<const json::Object *> getObjectOrError(StringRef K) const;
  llvm::Expected<const json::Array *> getArrayOrError(StringRef K) const;

clayborg (Unsubmitted, Not Done):

  /// \{
- llvm::Expected<const json::Value *> getOrError(StringRef K) const;
- llvm::Expected<int64_t> getIntegerOrError(StringRef K) const;
- llvm::Expected<llvm::StringRef> getStringOrError(StringRef K) const;
- llvm::Expected<llvm::Optional<llvm::StringRef>>
- getOptionalStringOrError(StringRef K) const;
- llvm::Expected<const json::Object *> getObjectOrError(StringRef K) const;
- llvm::Expected<const json::Array *> getArrayOrError(StringRef K) const;
+ llvm::Expected<const json::Value *> expect(StringRef K) const;
+ llvm::Expected<int64_t> expectInteger(StringRef K) const;
+ llvm::Expected<llvm::StringRef> expectString(StringRef K) const;
+ llvm::Expected<const json::Object *> expectObject(StringRef K) const;
+ llvm::Expected<const json::Array *> expectArray(StringRef K) const;
  /// \}

Maybe instead of these starting with "get" and ending with "OrError", we could just name these "expect...". getOptionalStringOrError doesn't map well to this, though.

  /// \}

private:
  llvm::Error createMissingKeyError(llvm::StringRef K) const;
};

bool operator==(const Object &LHS, const Object &RHS);
inline bool operator!=(const Object &LHS, const Object &RHS) {
  return !(LHS == RHS);
}

/// An Array is a JSON array, which contains heterogeneous JSON values.
/// It simulates std::vector<Value>.

[...]

public:
  }
  const json::Array *getAsArray() const {
    return LLVM_LIKELY(Type == T_Array) ? &as<json::Array>() : nullptr;
  }
  json::Array *getAsArray() {
    return LLVM_LIKELY(Type == T_Array) ? &as<json::Array>() : nullptr;
  }

  /// \a llvm::Expected based accessors
  /// In case of an error, they return a user-friendly message.
  /// \{
  llvm::Expected<const json::Object *> getAsObjectOrError() const;
  llvm::Expected<const json::Array *> getAsArrayOrError() const;
  llvm::Expected<llvm::StringRef> getAsStringOrError() const;
  llvm::Expected<int64_t> getAsIntegerOrError() const;

clayborg (Unsubmitted, Not Done): switch over to "expect" as mentioned above?

  /// \}

private:
  llvm::Error createWrongTypeError(llvm::StringRef Type) const;

  void destroy();
  void copyFrom(const Value &M);
  // We allow moving from *const* Values, by marking all members as mutable!
  // This hack is needed to support initializer-list syntax efficiently.
  // (std::initializer_list<T> is a container of const T).
  void moveFrom(const Value &&M);
  friend class Array;
  friend class Object;

[...]

llvm/lib/Support/JSON.cpp

[...]
    if (auto *V = get(K))
      return V->getAsArray();
    return nullptr;
  }
  json::Array *Object::getArray(StringRef K) {
    if (auto *V = get(K))
      return V->getAsArray();
    return nullptr;
  }

+ llvm::Error Object::createMissingKeyError(llvm::StringRef K) const {
+   std::string S;
+   llvm::raw_string_ostream OS(S);
+   json::Object Obj = *this;
+   OS << llvm::formatv(
+       "JSON object is missing the \"{0}\" field.\nValue:\n{1:2}", K,
+       json::Value(std::move(Obj)));
+   OS.flush();
+
+   return llvm::createStringError(std::errc::invalid_argument, OS.str().c_str());
+ }
+
+ llvm::Expected<const json::Value *> Object::getOrError(StringRef K) const {
+   if (const json::Value *V = get(K))
+     return V;
+   else
+     return createMissingKeyError(K);
+ }
+
+ llvm::Expected<int64_t> Object::getIntegerOrError(StringRef K) const {
+   if (llvm::Expected<const json::Value *> V = getOrError(K))
+     return V.get()->getAsIntegerOrError();
+   else
+     return V.takeError();
+ }
+
+ llvm::Expected<StringRef> Object::getStringOrError(StringRef K) const {
+   if (llvm::Expected<const json::Value *> V = getOrError(K))
+     return V.get()->getAsStringOrError();
+   else
+     return V.takeError();
+ }
+
+ llvm::Expected<const json::Array *> Object::getArrayOrError(StringRef K) const {
+   if (llvm::Expected<const json::Value *> V = getOrError(K))
+     return V.get()->getAsArrayOrError();
+   else
+     return V.takeError();
+ }
+
+ llvm::Expected<const json::Object *>
+ Object::getObjectOrError(StringRef K) const {
+   if (llvm::Expected<const json::Value *> V = getOrError(K))
+     return V.get()->getAsObjectOrError();
+   else
+     return V.takeError();
+ }
+
+ llvm::Expected<llvm::Optional<StringRef>>
+ Object::getOptionalStringOrError(StringRef K) const {
+   if (const json::Value *V = get(K))
+     return V->getAsStringOrError();
+   return llvm::None;
+ }

  bool operator==(const Object &LHS, const Object &RHS) {
    if (LHS.size() != RHS.size())
      return false;
    for (const auto &L : LHS) {
      auto R = RHS.find(L.first);
      if (R == RHS.end() || L.second != R->second)
        return false;
    }
[...]
    case T_Object:
      break;
    case T_Array:
      create<json::Array>(std::move(M.as<json::Array>()));
      M.Type = T_Null;
      break;
    }
  }

+ llvm::Error Value::createWrongTypeError(llvm::StringRef Type) const {
+   std::string S;
+   llvm::raw_string_ostream OS(S);
+   OS << llvm::formatv("JSON value is expected to be \"{0}\".\nValue:\n{1:2}",
+                       Type, *this);
+   OS.flush();
+
+   return llvm::createStringError(std::errc::invalid_argument, OS.str().c_str());
+ }
+
+ llvm::Expected<int64_t> Value::getAsIntegerOrError() const {
+   llvm::Optional<int64_t> V = getAsInteger();
+   if (V.hasValue())
+     return *V;
+   return createWrongTypeError("integer");
+ }
+
+ llvm::Expected<StringRef> Value::getAsStringOrError() const {
+   llvm::Optional<StringRef> V = getAsString();
+   if (V.hasValue())
+     return *V;
+   return createWrongTypeError("string");
+ }
+
+ llvm::Expected<const json::Array *> Value::getAsArrayOrError() const {
+   if (const json::Array *V = getAsArray())
+     return V;
+   return createWrongTypeError("array");
+ }
+
+ llvm::Expected<const json::Object *> Value::getAsObjectOrError() const {
+   if (const json::Object *V = getAsObject())
+     return V;
+   return createWrongTypeError("object");
+ }

  void Value::destroy() {
    switch (Type) {
    case T_Null:
    case T_Boolean:
    case T_Double:
    case T_Integer:
      break;
    case T_StringRef:
[...]

  void llvm::format_provider<llvm::json::Value>::format(
      const llvm::json::Value &E, raw_ostream &OS, StringRef Options) {
    unsigned IndentAmount = 0;
    if (!Options.empty() && Options.getAsInteger(/*Radix=*/10, IndentAmount))
      llvm_unreachable("json::Value format options should be an integer");
    json::OStream(OS, IndentAmount).value(E);
  }

llvm/unittests/Support/JSONTest.cpp

[...]
  TEST(JSONTest, Parse) {
    Compare(R"({"obj":{},"arr":[]})", Object{{"obj", Object{}}, {"arr", {}}});
    Compare(R"({"\n":{"\u0000":[[[[]]]]}})",
            Object{{"\n", Object{
                              {llvm::StringRef("\0", 1), {{{{}}}}},
                          }}});
    Compare("\r[\n\t] ", {});
  }

+ TEST(JSONTest, ExpectedGettersForObject) {
+   Object O{{"a", 1},
+            {"b", "2"},
+            {"array", {1, 2, 3}},
+            {"object", Object({{"key", "value"}})}};
+
+   // Testing valid keys
+   if (auto E = O.getOrError("a"))
+     EXPECT_EQ(E.get()->kind(), llvm::json::Value::Kind::Number);
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = O.getIntegerOrError("a"))
+     EXPECT_EQ(*E, 1);
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = O.getStringOrError("b"))
+     EXPECT_EQ(*E, "2");
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = O.getOptionalStringOrError("c"))
+     EXPECT_EQ(*E, llvm::None);
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = O.getArrayOrError("array"))
+     EXPECT_EQ(E.get()->size(), (size_t)3);
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = O.getObjectOrError("object"))
+     EXPECT_EQ(E.get()->size(), (size_t)1);
+   else
+     FAIL() << "Unexpected error";
+
+   // Testing invalid keys
+   if (auto E = O.getOrError("a2"))
+     FAIL() << "Unexpected error";
+   else
+     handleAllErrors(E.takeError(), [](const llvm::ErrorInfoBase &E) {
+       EXPECT_EQ(E.message(),
+                 std::string(R"(JSON object is missing the "a2" field.
+ Value:
+ {
+   "a": 1,
+   "array": [
+     1,
+     2,
+     3
+   ],
+   "b": "2",
+   "object": {
+     "key": "value"
+   }
+ })"));
+     });
+
+   if (auto E = O.getIntegerOrError("array"))
+     FAIL() << "Unexpected error";
+   else
+     handleAllErrors(E.takeError(), [](const llvm::ErrorInfoBase &E) {
+       EXPECT_EQ(E.message(),
+                 std::string(R"(JSON value is expected to be "integer".
+ Value:
+ [
+   1,
+   2,
+   3
+ ])"));
+     });
+ }

+ TEST(JSONTest, ExpectedConvertersForObject) {
+   Value V(Object({{"key", "value"}}));
+
+   if (auto E = V.getAsObjectOrError())
+     EXPECT_EQ(E.get()->size(), (size_t)1);
+   else
+     FAIL() << "Unexpected error";
+
+   if (auto E = V.getAsIntegerOrError())
+     FAIL() << "Unexpected error";
+   else
+     handleAllErrors(E.takeError(), [](const llvm::ErrorInfoBase &E) {
+       EXPECT_EQ(E.message(),
+                 std::string(R"(JSON value is expected to be "integer".
+ Value:
+ {
+   "key": "value"
+ })"));
+     });
+
+   Value V2("string");
+   if (auto E = V2.getAsStringOrError())
+     EXPECT_EQ(*E, "string");
+   else
+     FAIL() << "Unexpected error";
+
+   Value V3({1, 2, 3});
+   if (auto E = V3.getAsArrayOrError())
+     EXPECT_EQ(E.get()->size(), (size_t)3);
+   else
+     FAIL() << "Unexpected error";
+
+   Value V4(52);
+   if (auto E = V4.getAsIntegerOrError())
+     EXPECT_EQ(E.get(), 52);
+   else
+     FAIL() << "Unexpected error";
+ }

  TEST(JSONTest, ParseErrors) {
    auto ExpectErr = [](llvm::StringRef Msg, llvm::StringRef S) {
      if (auto E = parse(S)) {
        // Compare both string forms and with operator==, in case we have bugs.
        FAIL() << "Parsed JSON >>> " << S << " <<< but wanted error: " << Msg;
      } else {
        handleAllErrors(E.takeError(), [S, Msg](const llvm::ErrorInfoBase &E) {
          EXPECT_THAT(E.message(), testing::HasSubstr(std::string(Msg))) << S;
[...]
