- User Since
- Aug 26 2016, 6:53 AM (90 w, 6 d)
Nice, simple and will admit refinements later.
Thanks, this looks a lot better!
There do seem to be a lot of little classes that exist exactly one-per-TU (ASTWorker, ASTBuilder, CachedAST, to a lesser extent ParsedAST) and I'm not sure the balance of responsibilities is quite right. Some comments below.
@chandlerc: Back-from-vacation ping.
(A short response like "yes, do change X" is fine, keeps me busy :-)
Having taken a closer look, I think the cache can be simplified/separated a bit more cleanly by returning shared pointers and not allowing lookups, instead restoring limited ownership in CppFile...
(Haven't reviewed the code yet, just design stuff)
I'm tempted to push back on the idea that we need to store any - "if it's fast enough for code complete, it's fast enough for GTD". But that's probably oversimplifying - we don't force reuse of a stable preamble for GTD I think. So we probably do want some cache.
Thanks for chasing this, good detective work!
Changes look plausible and tests are nice, so LG assuming you know what you're doing. If you're unsure, Manuel will give you better advice than me :)
Wed, May 16
Tue, May 15
I think a flag doesn't solve much of the problem, and adds new ones:
- users have to find the flag, and work out how to turn it on in their editor, and (other than embedders) few will bother. And each flag hurts the usability of all the other flags.
- if this heuristic is usable only sometimes, that's at codebase granularity, not user granularity. Flags don't work that way. (Static index currently has this problem...)
- these flags end up in config files, so if we later remove the flag we'll *completely* break clangd for such users
Array wraps vector instead of inheriting it.
Object inherits DenseMap instead of std::map
Change some Object accessors to use StringRef instead of ObjectKey.
clang-format, because it was getting out of control.
(Looks like Phabricator swallowed most of my email reply, please excuse some repetition)
@malaperle to expand on what Eric said, adding proto hacks with false positives and no way to turn them off is indeed not the way to go here!
There's probably going to be other places we want to filter symbols too, and it should probably be extensible/customizable in some way.
We don't yet have enough examples to know what the structure should be (something regex based, a code-plugin system based on Registry, or something in between), thus the simplest/least invasive option for now (it's important for our internal rollout to have *some* mitigation in place).
Nice, ship it!
Mon, May 14
This is pretty small and seems unlikely to be controversial. I think it's OK to commit at this point, it's been a few weeks.
Looks good, nits as always and I think you want a new test case.
Thu, May 10
Thanks for the thoughts, I'll take another pass and try to incorporate
them. But I'm out until Monday, so some answers/questions while this is in
Mon, May 7
I'm concerned about the scope of this patch - it does too many things and touches too many files for me to feel comfortable that I understand it well enough to review.
Is it possible to split it up? (You mention 6 distinct things in the description).
@chandlerc Ping - I think you're the best person to decide whether a JSON-specific library should be added.
It'd be great to make some progress on that, even if actual review would take longer or be left for others.
(I think simon_tatham has work blocked on this, and there are some clangd changes I'm holding off to avoid diverging)
Are the arguments above compelling? Any more info I can provide?
Thu, May 3
Added a test - it's kind of perfunctory, I really want to organize symbol scoring more thoroughly which will provide a better place to test well.
Add completion test.
Wed, May 2
Use existing library code from ConvertUTF for UTF-8 validation.
Reject invalid codepoints (surrogates, high codepoints).
Add more comments around UTF8 validation.
Tue, May 1
Mon, Apr 30
Thanks for the comments!
Not validating codepoints was deliberate but in hindsight arbitrary, I'll
fix that. (Both surrogates and high codepoints).
@chandlerc This is ready for you again whenever you have cycles. To summarize:
- I think everything above was resolved except for the question of whether the mutable/initializer_list tricks are too gross to use.
- a few more potential users seem pleased about this idea, and @simon_tatham has verified that it works for tablegen in D46054.
- I've had a couple of suggestions which have been moved into D46209 (numeric precision) and D46274 (unicode validation), I don't think you personally need to review those.
Negative number test case, fix std:: qualification.
Thanks for verifying this! (Still waiting for a decision on the master patch, I hope this doesn't block you too long!)
A few more nits (and some ideas for further restructuring the merge logic).
Otherwise LG, let's land this!
(oops, meant to accept this - please do add a test if you can)
Oops, I missed this, thanks!
Possible to add a testcase? Any character that's not ASCII and is legal in an identifier should do it.
Fri, Apr 27
Clarify docs, add deserializer for int64_t
Fix GCC warnings/errors.
Add note about integer representation.
LG (once the dependencies are done!)
My main question/concern is: if these APIs extract/format based on CodeCompletionString and friends, what should our plan be using this this in AST-based features, such as hover?
Looks like I forgot to clang-format before sending this :-/
Doxygen, comment and error message tweaks.
Re: locations in parameter USRs:
Wed, Apr 25
Code looks good/simple, just a couple of design questions
Tue, Apr 24
Address some review comments (but no doxygen yet)
Thank you! Other than "how does this relate to YAML", I don't think I'm familiar with the concerns so any pointers you have would be appreciated.
Remove some debugging junk, tweak a comment.
Apr 24 2018
Nice! I'm a little surprised that this wasn't fixed before, so I'm worried I missing something.
But nobody else has reviewed it, and it really seems right (and the test cases you fixed did seem broken). We can always revert :)
Apr 23 2018
@chandlerc: any interest in reviewing this JSON library (either at a high level for suitability in llvm/Support, or for detailed review)?
Address review comments, add a few more docs.
Go back to custom UTF transcoding because of failing test cases :-(
Thanks for the changes! I still find the logic quite confusing. Happy to chat offline if useful.
Apr 20 2018
I'm also not sure, but I really appreciate your feedback, and wanted to give you as a likely user a chance to shape the API.
Thanks for the great advice, don't feel any pressure to accept or examine everything if you don't feel like the right person.
Address review comments.
Apr 19 2018
I think this implementation will have problems in a "real" multi-machine MR framework. The lifetime of the Merger is the whole program, with output only coming at the end.
With N workers, each will store >1/N of the symbols (more because of overlap), and the streaming nature of a MR is important to its scaling.