This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Generate ConfigFragment/YAML/docs from one tablegen source
Needs ReviewPublic

Authored by sammccall on Dec 8 2021, 10:17 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

The idea is to reduce the number of places that need to be updated as we add
more config settings, encourage regular patterns, and make cross-cutting changes
easier (e.g. reformat docs).

Diff Detail

Event Timeline

sammccall created this revision.Dec 8 2021, 10:17 PM
sammccall requested review of this revision.Dec 8 2021, 10:17 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 8 2021, 10:17 PM
nridge added a subscriber: nridge.Dec 13 2021, 1:08 PM

The patch is failing to build for me locally, with clang-tools-extra/clangd/ConfigFragment.h:100:10: fatal error: 'ConfigFragment.gen.h' file not found. I did re-run cmake and the error persists. Is there another step I'm missing?

The patch is failing to build for me locally, with clang-tools-extra/clangd/ConfigFragment.h:100:10: fatal error: 'ConfigFragment.gen.h' file not found. I did re-run cmake and the error persists. Is there another step I'm missing?

You're not, sorry. It should be #include "ConfigFragment.inc". (I renamed the generated file and it kept working for me locally because cmake doesn't clean up the old one).

I had a discussion with @kadircet about this, we're not sure whether it factors out enough to be better/simpler overall. (Hand-written ConfigFragment + ConfigYAML + docs, vs table + generator + stubs of ConfigFragment + ConfigYAML + docs).
Any thoughts?

I had a discussion with @kadircet about this, we're not sure whether it factors out enough to be better/simpler overall. (Hand-written ConfigFragment + ConfigYAML + docs, vs table + generator + stubs of ConfigFragment + ConfigYAML + docs).
Any thoughts?

I feel like clangd's lack of support for non-self-contained files (https://github.com/clangd/clangd/issues/45) is relevant here.

With this patch, opening up the .inc files gives you a bunch of errors, and refs in those files do not appear to be found. So, while the patch makes it easier to add new config options, reading / understanding the relevant code becomes harder. Since code is read more often than it's written, that may not be a win.

Additionally, there's perhaps a matter of principle where clangd should try to avoid using code patterns (here, non-self-contained files) in its implementation code that it does not support operating on as a tool.

These make me lean towards not doing this.

That bothers me too.
I'm sure it's possible to factor the generated files so that they're self-contained, I didn't work hard on that yet. So I don't think it's central to the question.

there's perhaps a matter of principle where clangd should try to avoid using code patterns (here, non-self-contained files) in its implementation code that it does not support

Agree. However the counterargument is putting such patterns in front of our face all the time might encourage us to add some level of support!

there's perhaps a matter of principle where clangd should try to avoid using code patterns (here, non-self-contained files) in its implementation code that it does not support

Agree. However the counterargument is putting such patterns in front of our face all the time might encourage us to add some level of support!

Apologies if this is veering off topic (we can continue in a Github Discussion if you'd like), but I got the impression that clangd's lack of support for non-self-contained files was at least partly an opinionated statement that code patterns involving such files are not an important part of modern C++ development.

If that impression is mistaken, and it's just a matter of no one having gotten around to implementing such support, perhaps we could clarify that in https://github.com/clangd/clangd/issues/45 (and maybe put out a "help wanted" / "contributions welcome" announcement somewhere)?

there's perhaps a matter of principle where clangd should try to avoid using code patterns (here, non-self-contained files) in its implementation code that it does not support

Agree. However the counterargument is putting such patterns in front of our face all the time might encourage us to add some level of support!

Apologies if this is veering off topic (we can continue in a Github Discussion if you'd like), but I got the impression that clangd's lack of support for non-self-contained files was at least partly an opinionated statement that code patterns involving such files are not an important part of modern C++ development.

I think "at least partly" is fair. But this is more a statement of priorities and effort - I think if the feature were free or cheap, everyone would be happy to have it.

(FWIW I do find LLVM's reliance on preprocessor tricks to be not great. Between wanting to fit in with existing infrastructure, and not being able to easily add dependencies, it doesn't seem worth fighting. But this is just another way to say "sure, there are circumstances where it's reasonable to use such patterns")

If that impression is mistaken, and it's just a matter of no one having gotten around to implementing such support

It's definitely more than implementation, the design needs to solve some hard constraints:

  • There needs to be a clear scope and idea of how it'll solve the problems. For instance:
    • does it support files with parse context (e.g. snippets of class bodies like ConfigFragment.inc), or just PP context? If parse context, how is this persisted or recreated? (PCH can't do this today). How does this interact with preambles?
    • how is the file-that-provides-context identified? Is the existing index actually good/available enough for this?
  • it needs to not add _too much_ complexity, or at least isolate the complexity to a few places. "Priorities and effort" also means this can't become a maintenance timesink.
  • it needs to be useful/powerful enough to justify the complexity that adds. (This ~certainly means it needs to be safe to turn on by default eventually)

I'm not certain it's possible to satisfy all these, but it seems like it might be possible.

perhaps we could clarify that in https://github.com/clangd/clangd/issues/45 (and maybe put out a "help wanted" / "contributions welcome" announcement somewhere)?

Agreed. I'll post a version of this here.
I'm a little worried that people will expect that if someone makes an honest attempt and gets anywhere then it'll be merged. But I should be less paranoid.
I'm also aware that I & others haven't been a model of working on designs collaboratively in public, we're pretty reliant on getting in a room. So it may be unfair to expect others to do better.

For the website, we use tags to specify the clangd version that the option was first supported in.
I'd suggest that we also add a Version field to the Field class.
This could also cause problem down the line if we ever wanted to remove a config option.

sammccall added a comment.EditedJan 5 2022, 10:08 AM

For the website, we use tags to specify the clangd version that the option was first supported in.
I'd suggest that we also add a Version field to the Field class.
This could also cause problem down the line if we ever wanted to remove a config option.

I definitely agree.
FWIW, I didn't add all features or send this for review because I wasn't sure about the idea.

Talked to @kadircet offline a bit, and I hope he doesn't mind me trying to summarize...

  • avoiding a couple of duplicated parts is definitely good
  • there's still config.h and the mapping thereto, which I don't think can be tablegenerated
  • tablegen syntax is really bad and also hard to browse with the code (I agree)
  • I don't see a better representation within the tablegen language (which is dumb, because this is a simple tree, but tablegen isn't good at trees)
  • We couldn't think of a representation that would be *nice* to edit/maintain that's easy to have in-tree (apart from the current C++ structs, but parsing C++ at build time is terrible)

Actually we didn't discuss the option of using YAML itself as the format.
The downsides I can think of (vs current C++):

  • no type system/checking/assistance for our weird DSL (tablegen a bit)
  • have to use strings as comments rather than using comments as comments (though at least YAML has nice strings for this purpose)
  • YAML is a pretty quirky language
  • the inherent level-confusion of using YAML to describe a YAML schema

Maybe I'll mock this up, but I have some more pressing things to do so if anyone wants to shoot the idea first that'd be nice :-)