Page MenuHomePhabricator

Lift JSON library from clang-tools-extra/clangd to llvm/Support.
ClosedPublic

Authored by sammccall on Apr 17 2018, 9:56 PM.

Details

Summary

This consists of four main parts:

  • an type json::Expr representing JSON values of dynamic kind, which can be composed, inspected, and modified
  • a JSON parser from string -> json::Expr
  • a JSON printer from json::Expr -> string, with optional pretty-printing
  • a convention for mapping json::Expr <=> native types (fromJSON/toJSON) Mapping functions are provided for primitives (e.g. int, vector) and the ObjectMapper helper helps implement fromJSON for struct/object types.

Based on clangd's usage, a couple of places I'd appreciate review attention:

  • fromJSON returns only bool. A richer error-signaling mechanism may be useful to provide useful messages, or let recursive fromJSONs (containers/structs) do careful error recovery.
  • should json::obj be always explicitly written (like json::ary)
  • there's no streaming parse API. I suspect there are some simple wins like a callback API where the document is a long array, and each element is small. But this can probably be bolted on easily when we see the need.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Address some review comments (but no doxygen yet)

Just my 5 cents, I feel that including this library would be useful for LLVM and Polly would be a happy user.

sammccall updated this revision to Diff 144305.Apr 27 2018, 2:49 AM
sammccall marked 3 inline comments as done.

Doxygen, comment and error message tweaks.

Looks like I forgot to clang-format before sending this :-/

To avoid adding spurious diffs during review, I'll do that before landing (if applicable!).

include/llvm/Support/JSON.h
26–27 ↗(On Diff #143588)

Used doxygen comment prefix and wrapped a long example in \code...\endcode.

Many comments apply to blocks of related functions so are not doxygenated.
For the most part no actual doxygen annotations seemed appropriate, and I think rewriting comments to make more use of doxygen often hurts the inline readability of the comments.

So I think this is done, but LMK if you disagree.

lib/Support/JSON.cpp
327 ↗(On Diff #143588)

There's a tension here between precise errors, consistent/useful errors, and parser complexity.
The "invalid number" is a false positive for elephant but a true positive for 123,00.

I've renamed these messages to be consistent but with a hint: "Invalid JSON value (number?)", "Invalid JSON value (null?)" etc, and "Invalid JSON value" respectively - WDYT?

simon_tatham added inline comments.Apr 27 2018, 5:23 AM
include/llvm/Support/JSON.h
315 ↗(On Diff #144305)

When I built this locally, I had a strange build failure involving this function with g++ 5.4.0 (i.e. the default compiler on Ubuntu 16.04). It reported, at the definition of this function in JSON.cpp, this error:

error: ‘llvm::raw_ostream& llvm::json::operator<<(llvm::raw_ostream&, const llvm::json::Value&)’ should have been declared inside ‘llvm::json’

for which, apparently, the fix was to add a repeated declaration of this function without the 'friend' keyword and outside the definition of class Value.

clang was happy with it, on the other hand. I've no idea :-)

lib/Support/JSON.cpp
87 ↗(On Diff #144305)

When I tried to build with this change locally, g++ pointed out a spurious semicolon here.

560 ↗(On Diff #144305)

Could this number formatting be changed? The default %g loses precision – you don't even get enough information to exactly reconstruct the same double you started with.

Also, over in D46054 I'm working on a JSON back end for TableGen, for which I'd find it useful to be able to pass an arbitrary 64-bit integer through this system and still have the full 64 bits of integer value visible in the JSON output file, for the benefit of JSON consumers (e.g. Python json.load) that go above the call of duty in returning it as an integer without rounding it to the nearest representable double.

So, would it be possible to have some method of constructing a json::Value that formats as a 64-bit integer literal?

sammccall updated this revision to Diff 144327.Apr 27 2018, 6:30 AM
sammccall marked 2 inline comments as done.

Fix GCC warnings/errors.
Add note about integer representation.

include/llvm/Support/JSON.h
315 ↗(On Diff #144305)

Indeed, thanks for catching this.
I think GCC is technically correct here, and most other compilers prefer to be helpful instead :-)
Added the namespace-scope declaration.

lib/Support/JSON.cpp
560 ↗(On Diff #144305)

Could this number formatting be changed? The default %g loses precision – you don't even get enough information to exactly reconstruct the same double you started with.

Yes, though is there an existing aware of round-trip safe double formatter in llvm?
I suspect this only actually matters when the values are integers, so we should consider your second suggestion first :-)

I'd find it useful to be able to pass an arbitrary 64-bit integer through this system and still have the full 64 bits of integer value visible in the JSON output file, for the benefit of JSON consumers (e.g. Python json.load) that go above the call of duty in returning it as an integer without rounding it to the nearest representable double.

What about this design:

  • Internally, a numeric value can be an integer or a double. i.e we split the internal ValueType T_Number into two, T_Integer and T_Double. public Kind remains unchanged.
  • when constructing, you get one or the other depending on the static type
  • when parsing, you get integer unless it has a nonzero decimal part or is out-of-range.
  • asDouble() always succeeds
  • asInteger() succeeds if the underlying value is integer or if it's a double that can be exactly represented as int64_t (same as now)
  • when serializing, you get %g for double and the usual representation for integers

Open questions:

  • is 1.2e3 a double or an integer? I kind of want the former, which complicates our heuristic.
  • int64_t leaves anyone who wants uint64_t out in the cold. But adding more options for types is going to lead to madness. Can we live with this limitation?

If this sounds good I can start on the changes, but I'd like to defer adding new features to another patch if that's OK. This one is largely moving mostly-battleworn code from clangd, and new features need closer review of the implementation.

simon_tatham added inline comments.Apr 27 2018, 7:20 AM
lib/Support/JSON.cpp
560 ↗(On Diff #144305)

That design certainly sounds as if it would do what I need, and a great deal more besides.

Another option that would be fine for me personally would be to have a means of constructing a ValueType called, say, T_Custom, which internally holds a string value, and serializes as exactly that string, unquoted. I could imagine that being used for other unusual purposes as well, such as controlling which of \uXXXX and UTF-8 was used to represent a non-ASCII character in a string literal.

(And that possibility is simple enough that I could add it myself as part of my patch.)

Oh yes, nearly forgot – you might be interested to know that when I ported my TableGen JSON back end to use this library in place of my previous serialization code, it reduced the running time to 60% of what it previously was and the client code ended up more legible. So in spite of posting nitpicks, I like this patch :-)

Meinersbur added inline comments.Apr 27 2018, 8:31 AM
include/llvm/Support/JSON.h
355 ↗(On Diff #143588)

std::initializer_list acting like a container of const elements is probably for a reason. I'd prefer no such hacks, but also see that endless copying of elements might justify cheating.

cppreference.com mentions that its elements must be copy-initialized, but only since C++17.

does anyone else have an opinion on this?

sammccall added inline comments.Apr 27 2018, 2:13 PM
include/llvm/Support/JSON.h
355 ↗(On Diff #143588)

I would also prefer no such hacks, but I can't see a way to get good syntax without the recursive copy, and without the hacks :(
Any ideas? Also interested in hearing more opinions on whether this is too hairy to rely on.

That said, I believe this is valid, per all relevant versions of the standard.

cppreference.com mentions that its elements must be copy-initialized, but only since C++17.

I think I'm missing your point here - can you explain why this is good or bad, and what the implication is? It also seems to be a cppreference mistake, the copy-initialized requirement is older.

C++11 says

as if the implementation allocated an array of N elements of type E [...] Each element of that array is copy-initialized

i.e. it creates a T[N]. We could probably even const_cast!

C++14 says

as if the implementation allocated a temporary array of N elements of type const E [...]. Each element of that array is copy-initialized [...] The implementation is free to allocate the array in read-only memory if an explicit array with the same initializer could be so allocated

(emphasis mine). So now the type changes to const T[N] (so const_cast would be invalid) but mutating mutable members of const objects is allowed, so read-only memory can't be used.

The C++17 language is more obscure

as if the implementation generated and materialized (7.4) a prvalue of type “array of N const E” [...] Each element of that array is copy-initialized

but seems to be the same for these purposes.

lib/Support/JSON.cpp
560 ↗(On Diff #144305)

D46209 adds int64 support, and fixes use of %g to retain full precision.

T_Custom is a cool idea and I definitely don't want to rule it out, but large integers in particular seems like something common that should "just work" if possible.
(It's also unclear how a T_Custom could solve the problem on the *parse* side, which is nice to have)

@chandlerc This is ready for you again whenever you have cycles. To summarize:

  • I think everything above was resolved except for the question of whether the mutable/initializer_list tricks are too gross to use.
  • a few more potential users seem pleased about this idea, and @simon_tatham has verified that it works for tablegen in D46054.
  • I've had a couple of suggestions which have been moved into D46209 (numeric precision) and D46274 (unicode validation), I don't think you personally need to review those.
Meinersbur added inline comments.May 2 2018, 9:46 AM
include/llvm/Support/JSON.h
355 ↗(On Diff #143588)

Thanks for looking into the standard (note that cppreference mentions C++14, not C++17 as I claimed).

I understood the standard the following way: If the element is copy-initialized, it should call the copy-constructor. In your implementation, it calls copyFrom, i.e. it still makes a recursive copy on each level, meaning the problem is not solved (But you save one by not recursively copying again when constructing from initializer_list).

However, I looked into the implementation of initializer_list in libc++ and msvc. It is a list of pointers-to-elements, rather than a flat array of objects. That is, no copy-initialization is not happening, it just points to the already existing elements. Not sure whether this is mandated by the standard.

sammccall added inline comments.May 2 2018, 11:52 PM
include/llvm/Support/JSON.h
355 ↗(On Diff #143588)

I understood the standard the following way: If the element is copy-initialized, it should call the copy-constructor

Ah, this is just the standard being confusing. copy-initialization is basically unrelated to copy-constructors. Certain syntaxes trigger copy-initialization, and others trigger direct-initialization, which behaves slightly differently (copy-initialization won't call explicit constructors).

http://en.cppreference.com/w/cpp/language/copy_initialization gives this example, which is relevant here:

std::string s2 = std::move(s); // this copy-initialization performs a move

@chandlerc Ping - I think you're the best person to decide whether a JSON-specific library should be added.
It'd be great to make some progress on that, even if actual review would take longer or be left for others.
(I think simon_tatham has work blocked on this, and there are some clangd changes I'm holding off to avoid diverging)
Are the arguments above compelling? Any more info I can provide?

(I'm going to be sporadically available over the next two weeks, but will keep an eye on this)

Sorry for delays circling back to this.

I think the primary concerns about putting this into the Support library is that we already have one API that is quite similar there: YAMLIO. However, that API is clearly not serving all of the needs of users given that there are mutliple JSON-parsing code paths in the wider project that are already using other libraries (this one, but also Polly). So I think adding this makes lots of sense at this point.

However, I'd like to take this opportunity to try to iterate a bit on the API since it is going into a new, more widely visible home and will almost certainly grow new users as soon as it lands. I'm fairly uncomfortable with the inheritance approach, and I think some of the additional APIs would benefit from (hopefully minor) adjustment. None of this is intended to change the fundamental design in any way though.

include/llvm/Support/JSON.h
62 ↗(On Diff #144327)

I understand the pragmatic reason for this, but I am pretty uncomfortable deriving from standard types like this. I'm worried it will hurt portability due to subtle variations in standard libraries, or subtle changes in standard libraries as we switch to C++N+1 or whatever.

I would be much more comfortable using something internal and owning the API you export.

Unrelatedly (and not blocking anything), I struggle to believe that std::map is the correct tool for the job... Is it just that DenseMap is a pain to use currently? Is it that these are typically small and not worth a DenseMap? I'm sorry to ask this as I suspect you've already explained this in the original review, but I must admit I'm curious.

69–71 ↗(On Diff #144327)

This does more than what it says. It hides std::map's operator[] overloads, no matter what they are. Is that what you intended?

This is perhaps a good example of why I find inheritance challenging here...

100 ↗(On Diff #144327)

Do you expect users to primarily use these typed accessors? Or the underlying vector?

Should they take iterators instead of indices?

These seem to make the API somewhat hostile to range based for loops. I'm surprised this isn't more a facility provided by the iterator...

I find the name somewhat confusing too -- see my comment below for some clarity of why.

But what's the advantage here? Why not just arr[i].asNull()?

266 ↗(On Diff #144327)

A more conventional name would be getAsFoo matching getAs<T>.

Also, is it really worth having all of these? getAs<bool> and getAs<double> seem just as nice as the non-templated versions to me... But I guess getAsString and getAsInteger do interesting validation and such.

Thanks for the thoughts, I'll take another pass and try to incorporate
them. But I'm out until Monday, so some answers/questions while this is in
your cache...

(Looks like Phabricator swallowed most of my email reply, please excuse some repetition)

Sorry for delays circling back to this.

I think the primary concerns about putting this into the Support library is that we already have one API that is quite similar there: YAMLIO. However, that API is clearly not serving all of the needs of users given that there are mutliple JSON-parsing code paths in the wider project that are already using other libraries (this one, but also Polly). So I think adding this makes lots of sense at this point.

However, I'd like to take this opportunity to try to iterate a bit on the API since it is going into a new, more widely visible home and will almost certainly grow new users as soon as it lands. I'm fairly uncomfortable with the inheritance approach, and I think some of the additional APIs would benefit from (hopefully minor) adjustment. None of this is intended to change the fundamental design in any way though.

That sounds good. Any pushback below is to clarify the reasons for decisions, not to resist changes - happy to change whatever you don't find convincing.

include/llvm/Support/JSON.h
62 ↗(On Diff #144327)

Are we worried about inheriting the interface of *llvm* types, or just std types?

I've switched to inheriting from DenseMap here, but I can wrap it if you prefer.
(std::map's ordered-ness made operator== and print() simple, so there's a few more lines now).

Array is more complicated: even SmallVector<Value, 0> needs Value, which sets up a cycle that's hard to break. So I've resorted to wrapping std::vector and exposing a hopefully-sane subset of the API.
(Aside: I suspect we *do* want to track e.g. the C++17 changes to vector, but that's a burden that shouldn't fall on whoever does the upgrade)

69–71 ↗(On Diff #144327)

There are already no such overloads available, as Value is not default-constructible. I reworded the comment a bit.

Agree that inheritance makes this a little subtle to reason about (though obviously if wrapping, this particular operator would look just the same).

100 ↗(On Diff #144327)

I expect the most common access patterns to be:

  1. mapping to a vector<Something> using fromJSON
  2. iterating over Values using ranged-for (followed by asFoo() on each element)
  3. these getFoo(I) accessors

They're not nearly as useful/important as the typed accessors on Object. The benefit is:

  • symmetry with Object makes the API easier to follow (thus size_t rather than iterator)
  • convenient when particular indices have semantics (like executeCommand parameter list in LSP), rather than being a homogeneous collection
  • more readable than operator[] when calling on a pointer, which is common due to the if (Array* A = V.asArray()) idiom - (*A)[i].asNull() vs A->getNull(i).

These seem to make the API somewhat hostile to range based for loops. I'm surprised this isn't more a facility provided by the iterator...

Can you elaborate on what kind of loop you'd like to write? I tend to find clever iterator APIs fairly undiscoverable/obscure, but I'm not sure what I'm missing.

266 ↗(On Diff #144327)

A more conventional name would be getAsFoo matching getAs<T>.

Sure, in Java ;-)
Adding "get" because a function must have a verb seems like cargo culting to me - we're trying to signal the effects, but there aren't any! Maybe it's my exposure, but I don't see any ambiguity as to what asBoolean() might do.

I'd like to propose a style guide change to be more like https://swift.org/documentation/api-design-guidelines/#strive-for-fluent-usage here. Meanwhile, I can change this to match the style guide as it stands if you think this is important. I do think it hurts the signal/noise.

Also, is it really worth having all of these? getAs<bool> and getAs<double> seem just as nice as the non-templated versions to me...

I do prefer the non-templated version:

  • getAs<bool> etc isn't fewer things to understand, and it's not shorter, it's just fewer distinct tokens. I'm not sure what we're trying to conserve here.
  • the non-templated functions are more discoverable: easier to read in the header, and work better with code completion, and easier to search for
  • getAsNumber is a better name than getAs<double>, as the relevant concept here is JSON's "Number" not the types used to represent them in C++. (this is "interesting validation and such" I think)
  • I don't want people to think they can write getAs<int>(), and if they do I want the error message to be easy to understand. Templates make both of these harder.
sammccall updated this revision to Diff 146855.May 15 2018, 9:27 AM

Array wraps vector instead of inheriting it.
Object inherits DenseMap instead of std::map
Change some Object accessors to use StringRef instead of ObjectKey.
clang-format, because it was getting out of control.

sammccall added inline comments.May 15 2018, 9:32 AM
include/llvm/Support/JSON.h
62 ↗(On Diff #144327)

(And I forgot to mention a new dirty trick here: the explicit use of DenseMapInfo<StringRef> works since ObjectKey can already be implicitly converted back and forth to StringRef. This saves a bunch of unreadable boilerplate which has to go at the *top* of the file, but is admittedly a bit fragile - WDYT?)

@chandlerc: Back-from-vacation ping.
(A short response like "yes, do change X" is fine, keeps me busy :-)

Thanks for the ping!

include/llvm/Support/JSON.h
62 ↗(On Diff #144327)

I'm somewhat concerned with inheritance generally. I feel like wrapping is just a better pattern and more easily understood and debugged.

The additional code doesn't worry me much.

100 ↗(On Diff #144327)

Definitely not advocating for more clever iterator API. Mostly pointing out that index-oriented APIs are particularly hard to use with range based for loops. Whereas, using foo[i].asNull() is fine because the indexing is orthogonal to the query and can be replaced with something more range-based or iterator based if useful.

If the issue is just that operator[] is annoying with pointers, how about just a method for indexing so that you can use -> to call it? A->at(i).asNull() or A->get(i).asNull() seem fine?

266 ↗(On Diff #144327)

I find the arguments against templates compelling FWIW.

That said, while I actually support not using get superfluously in many cases, I don't actually like it here. There is an interesting and potentially complex conversion happening, and I personally much prefer getAsFoo() for that. The only time I'm really quite happy omitting the get is when it is truly just an accessor and not providing any "interesting" logic (a higher bar than the Swift convention you cite in the llvm-dev thread).

Anyways, for now, I suggest the getAsNumber pattern.

sammccall marked an inline comment as done.May 25 2018, 10:32 AM
sammccall added inline comments.
include/llvm/Support/JSON.h
62 ↗(On Diff #144327)

Object now wraps DenseMap instead of inheriting it.

I've removed some less-frequently used methods (resize, reserve, etc).
I'm reluctant to simplify commonly used interfaces in ways that might be surprising (like fewer insert overloads where it may add more copies).

100 ↗(On Diff #144327)

The Object case is different and more important, and worth resolving first.

Object::getNumber(StringRef) returns a number if the property exists and is a number. Dropping this accessor and writing if (auto N = O->get("foo").getAsNumber()) ... isn't viable as it crashes if the property is missing. Instead you'd have to write if (auto MN = O->get("foo")) if (auto N = MN->getAsNumber()) .... This is the overwhelmingly common pattern for parsing objects, and I do think we should support this in one expression and preferably one call.

If you disagree with that, let's talk about that first :-)

If the issue is just that operator[] is annoying with pointers, how about just a method for indexing so that you can use -> to call it? A->at(i).asNull() or A->get(i).asNull() seem fine?

So assuming we're going to have these methods for object, I think the biggest issue is consistency with object. I do also think A->getNumber(I) is a lot nicer than A->get(I).getAsNumber(). But neither of these are hard objections and I'm happy to just drop all of the Array::get*.

266 ↗(On Diff #144327)

OK. I'd really like to avoid three-word names here so I'll have a think about this over the weekend.

I do think get is always superfluous - if you want to signal that something tricky is happening, get doesn't do that.

This is logically just accessing a member of a discriminated union, which I don't think is anything tricky at all, so I'm not sure what the verb should be. But I'll find something.

sammccall marked an inline comment as done.

Wrap DenseMap instead of inheriting, drop some container member functions.

sammccall updated this revision to Diff 148916.May 29 2018, 8:17 AM

Rename Value accessors asNumber() -> getAsNumber().
Remove typed element accessors on Array.

OK, i'm done being recalcitrant, I think everything is addressed now :-)

include/llvm/Support/JSON.h
100 ↗(On Diff #144327)

I've dropped the typed Array::get*, but kept Object::get* as explained above.

We now have this glorious code in the test: (*(*A)[4].getAsArray())[1].getAsInteger() but that's not terribly representative of likely real-world use.

266 ↗(On Diff #144327)

I tried a few out with a small survey (N=4). From most to least preferred

  • asNumber() - this was most people's favorite
  • getAsNumber() - everyone could live with this; one person liked it as much as asNumber
  • getNumber() - an awkward compromise
  • toNumber() - probably sounds too much like a conversion
  • number() - I really like this, but nobody else does

I've changed them to getAsNumber, but I'm still a bit conflicted here.

@chandlerc Ping. Hope you had fun in committee :-)

It feels a bit strange to be pinging a code review that's not even my own :-), but is this still being reviewed, or is it now entirely stalled?

If the latter, I can rework my JSON tablegen patch in D46054 to reintroduce the (much more trivial) JSON output code from its first draft. But I'd prefer to use this library if it still looks like landing.

@chandlerc Ping again :-)
I'm going on a month's leave fairly soon (a week-ish exact date TBD), it'd be great to get this wrapped up before then. Or a realistic estimate - a few things depending on this now.

@simon_tatham The plan is still for this to land. Chandler's schedule has been busy and also disrupted, so I'm hoping to catch him at a weak moment :-)

chandlerc accepted this revision.Jul 6 2018, 2:21 PM

Ok, I generally think this is looking good to go in... As mentioned previously, I think there is a good rationale around supporting this despite the existing other/similar libraries.

The code quality is crazy high. Awesome job there.

I have a bunch of pretty minor questions or suggestions below. But I think they're mostly optional stuff. Even if you decide to make the changes, I'm even fine with those being follow-up patches and such as they seem unlikely to wildly change the API surface here, and none of them seem to represent serious bugs or anything.

So don't block on any of this to land the patch and move to a more iterative model. The patch as a whole is LGTM, and I'd just suggest folding the suggestions below that make sense to you to fold into the initial one you land, and follow-up on anything else.

include/llvm/Support/JSON.h
292 ↗(On Diff #148916)

Does some compiler reject this as just std::enable_if<!std::is_same<T, bool>::value>::type? I'd guess MSVC might complain about the use of ! but that would make me sad.

353 ↗(On Diff #148916)

I would write this as std::modf(D, &D) == 0.0. The return is a double, and this way it looks less like you're checking for the absence of an error code and more like actually checking the fractional component is zero.

354–355 ↗(On Diff #148916)

I would explicitly convert the RHS of these two to a double to make it clear that this comparison takes place is double precision floating point, not in an integral type. I think that would help the reader out, because otherwise it could look a bit like a tautology.

396 ↗(On Diff #148916)

Sadly, I think this is invalid by the pedantic wording of the spec.

Because as<T>() binds a reference to the pointer, I think that the pointer has to point to a valid object. And it doesn't yet...

I would just inline the reinterpret_cast here despite that repetition because I think all other users of as<T>() are correct.

However, since as<T>() is a private implementation detail, you could alternatively just return the pointer. Really up to you.

458–460 ↗(On Diff #148916)

Sink this to be a non-friend operator like operator == and !=?

Or hoist those to be friend operators?

I don't have a strong opinion about one pattern or the other, mostly just advocating for consistency whichever way you want to go.

463–464 ↗(On Diff #148916)

Are there likely to be a lot of these? Might make sense to leave a note for the future that this could be optimized a lot by having a custom StringRef like implementation that encodes whether the data is owned w/o spending an extra pointer on it. Clearly not needed in this patch, just a thought for the future if it comes up.

474–477 ↗(On Diff #148916)

Consider inlining this above? I actually think it makes the std::initializer_list<KV> used in a public API much easier to read if this trivial wrapper is immediately visible.

Oh, I guess this is kept out-of-line in order to define ObjectKey after Object, Array, and Value?

Not sure this ordering is buying that much in terms of readability. Given that you can define ObjectKey first, I might just do that and avoid the need to make this out-of-line... Anyways, this is just an optional suggestion, I'm fine with whichever way you end up.

490–491 ↗(On Diff #148916)

Given that these have JSON in the name, does it make sense to move them out of the json namespace?

554–565 ↗(On Diff #148916)

This is really nice btw. =D

596–599 ↗(On Diff #148916)

Would it be a readability improvement to have this be llvm::parseJSON rather than llvm::json::parse?

This revision is now accepted and ready to land.Jul 6 2018, 2:21 PM
sammccall marked 8 inline comments as done.Jul 9 2018, 3:09 AM

Thanks a lot for the review!
Made most of the changes. The one I'm least certain about: I didn't move json::parse --> parseJSON.
As you suggest, going to land this patch and happy to make any followup changes, I don't expect much API churn (even parse doesn't tend to have a lot of callsites).

include/llvm/Support/JSON.h
292 ↗(On Diff #148916)

That works fine. I just didn't look critically enough at this once it compiled :-)

474–477 ↗(On Diff #148916)

I wish this was just readability :-(

The problem is a circular dependency: defining KV needs Value to be complete, Value requires Object to be complete (because of Union). So the Object -> Value -> KV ordering is necessary, I think.

ObjectKey could indeed be hoisted higher, but that alone doesn't buy anything, I think.

490–491 ↗(On Diff #148916)

I'm not sure. The intent is these are part of a family of functions found via ADL on the second argument.
When used in that way, they are always available (because of ADL on the first argument), and they don't particularly "belong "in namespace llvm (they overload for builtin and std types).

Most of these particular functions are for generic code (like ObjectMapper, or fromJSON(Value, std::vector) itself) . e.g. calling Value.getAsNumber() is generally nicer than fromJSON(Value, double). So moving these particular overloads into namespace llvm seems to give them more prominence compared to the rest of the API than seems warranted.

c.f. llvm::json::parse vs llvm::parseJSON, which *is* a main entrypoint.

554–565 ↗(On Diff #148916)

:-) We had plenty of this code in clangd to test the API on.

The one thing that's pretty awful is that when parsing fails there's no detailed representation of the error. But worse seems to be better for us so far.

596–599 ↗(On Diff #148916)

Hmm, this seems like a wash to me. parseJSON is slightly shorter than json::parse, while the latter hints (accurately) at a library boundary.

Given the *other* entrypoints (Value, ObjectMapper etc) are in namespace json, I lean towards keeping this there too - among other things, it makes the spelling/capitalization of json::parse, json::Value etc easier to remember I think.

Certainly happy to change this if you have a strong opinion here though.

This revision was automatically updated to reflect the committed changes.
sammccall marked 3 inline comments as done.