This is an archive of the discontinued LLVM Phabricator instance.

clang-tools-extra/clangd/schema/config.json
49	Technically `string` OR `enum [ "Ancestors", "None" ]` would be more accurate. Not sure if it's worth specifying that... from a validation point of view, `string` is a superset of the other one, but maybe some people will look at the schema for documentation, in which case it could be useful to have those "keywords" listed?
73	I don't think this is quite right: if `File` is specified, `Server` and `MountPoint` should not be; if `Server` is specified, `MountPoint` can optionally be specified but `File` should not be.
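
For illustration, the rule in the comment on line 73 can be written as a plain check. This is only a sketch of the rule as stated in that comment, not clangd's own parsing logic, and the function name `check_external` is made up for the example.

```python
def check_external(external):
    """Return a list of problems with an Index.External mapping.

    Encodes only the rule from the review comment: File excludes both
    Server and MountPoint, while Server may be paired with MountPoint.
    """
    problems = []
    if "File" in external:
        for key in ("Server", "MountPoint"):
            if key in external:
                problems.append(f"File cannot be combined with {key}")
    return problems
```

In the schema itself, the same rule is typically expressed as separate `oneOf` branches with `additionalProperties` set to `false`, one branch per allowed combination.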

In D140462#4012451, @nridge wrote:

Thanks, this is pretty neat!

cc @sammccall as a heads up, just to make sure you're cool with having this in-tree

This has been discussed before, unfortunately while I *thought* it was on a bug, it was actually in a PR: https://github.com/clangd/vscode-clangd/pull/140

I don't think it's a good idea to add a third (arguably, fourth/fifth) place that the config structure and documentation must be maintained by hand, especially one that constrains the structure in non-obvious ways. This needs to be automatically generated.

This implies moving the source of truth to something we can generate {fragment struct, website, parsing code, YAML schema} from.

I made an attempt at this in https://reviews.llvm.org/D115425 using TableGen, which is the closest LLVM has to a standard way to do this, but the language is pretty bad (for this purpose).
YAML itself is the best alternative we came up with to express the schema in. Using the JSON-schema itself as the source of truth is another interesting option, though it might be simpler to have a format we control.
My recollection was that I thought this was valuable to solve in one way or other, and @kadircet (whose call it is now, really) didn't see it as a high priority.

In D140462#4012996, @sammccall wrote:

This has been discussed before, unfortunately while I *thought* it was on a bug, it was actually in a PR: https://github.com/clangd/vscode-clangd/pull/140

I don't think it's a good idea to add a third (arguably, fourth/fifth) place that the config structure and documentation must be maintained by hand, especially one that constrains the structure in non-obvious ways. This needs to be automatically generated.

This implies moving the source of truth to something we can generate {fragment struct, website, parsing code, YAML schema} from.

I made an attempt at this in https://reviews.llvm.org/D115425 [...]

Thanks for reminding me about the previous work on this! I definitely appreciate the value in generating these various files / pieces of code from a single source of truth.

That said, getting there requires what seems like a non-trivial refactoring that doesn't seem to be in anyone's immediate plans. Meanwhile, it's clear that there is demand in the user community for a schema, and this schema is already out there, today, at https://json.schemastore.org/clangd.json, automatically in effect for anyone who uses VSCode's YAML extension (or comparable functionality in another editor).

The practical implication of allowing the schema to be stored in the clangd repo seems to be:

(1) there is yet another place where the config structure is recorded, but at least that place is in the clangd repo so we're likely to remember to update it at the same time we add new config features

vs.

(2) there is yet another place where the config structure is recorded, but it's on a third-party site and it's probably going to get out of date and only updated when someone periodically remembers to update it, and that person then has to identify and document a batch of changes to the config file format that others have made in the intervening time

Given this choice, I think there's a clear benefit to having the file in the repo.

If/when the described refactoring is implemented, the schema file can start benefiting from also being generated from a single source of truth.

I think there's a clear benefit, but my feeling is it's outweighed by the costs, which are fairly high given lack of any automation/tests.
If clangd contributors should maintain this, how should contributors/reviewers know whether changes are correct?
If they needn't maintain it, then having it on a third-party site seems appropriate. Maybe a /contrib/ directory is a useful compromise?

This is definitely informed by my biases: I have to make/review changes that touch config, don't have a good understanding of JSON-schema, and I don't use VSCode.

One practical question: AIUI the main point of having this is so it can be provided to the VSCode YAML extension so it understands .clangd files. How does this work mechanically? Does it need to be part of the vscode-clangd repo instead?

In D140462#4014944, @sammccall wrote:

given lack of any automation/tests

That's a fair point, it would definitely help validate the correctness of the schema if we had tests in the form of example config files that pass or fail validation.

I would be supportive of adding such tests, but I'm also hesitant to require them as a condition of accepting the addition of the schema file since it does add a fair amount of effort and complexity.

If clangd contributors should maintain this, how should contributors/reviewers know whether changes are correct?

Is "look at existing sections of our schema file as a guide, or barring that, the JSON-schema specification" too unsatisfactory an answer?

My feeling is that, at the end of the day, the schema format is relatively straightforward compared to other things that a clangd reviewer needs to know (like C++, or the Clang AST API), and the stakes of letting a mistake slip in are quite low (i.e. maybe a user will fail to get code completion for a config file key, as opposed to say clangd crashing on your code).

If they needn't maintain it, then having it on a third-party site seems appropriate. Maybe a /contrib/ directory is a useful compromise?

My original thinking was that reviewers should ask for patches that change config to make corresponding changes to the schema, which I guess would fall under "clangd contributors should maintain this".

But I'm definitely happy to compromise on this -- perhaps we could keep it in a /contrib/ directory like you say, and encourage but not require that config changes keep it up to date? And perhaps, if someone contributes automated tests for the schema in the future, we could consider upgrading its status to "maintained" at that time?

I don't use VSCode

(Note, there's nothing tying this to VSCode in particular. For example, lsp-mode seems to have support for consuming schemas from SchemaStore based on https://emacs-lsp.github.io/lsp-mode/page/lsp-yaml/#lsp-yaml-schema-store-uri.)

One practical question: AIUI the main point of having this is so it can be provided to the VSCode YAML extension so it understands .clangd files. How does this work mechanically?

My understanding, based on the discussion in the issue and my experimentation is:

If the YAML extension's "yaml.schemaStore.enable" setting is enabled (which is the default), the extension downloads a catalog of schemas from schemastore.org
Clangd's entry in that catalog, which you can see if you load https://www.schemastore.org/api/json/catalog.json and filter for "clangd" [aside: I recommend Firefox's built-in JSON viewer for such tasks, the builtin JSON visualization and filtering is really neat; if you have any pull with the Chrome team, maybe put in a good word for adding a similar JSON viewer], declares that the schema applies to files named .clangd
The extension automatically applies the schema to the .clangd file in your workspace
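
The lookup described in these steps can be sketched roughly as follows. The catalog excerpt is a hand-written stand-in shaped like SchemaStore's `catalog.json` (a `schemas` list whose entries carry `fileMatch` and `url`); the second entry and the helper name `schemas_for` are invented for the example, and the real extension's glob matching is likely more involved than `fnmatch`.

```python
import fnmatch

# Hand-written excerpt shaped like SchemaStore's catalog.json.
# The "example-ci" entry is purely illustrative.
CATALOG = {
    "schemas": [
        {
            "name": "clangd",
            "fileMatch": [".clangd"],
            "url": "https://json.schemastore.org/clangd.json",
        },
        {
            "name": "example-ci",
            "fileMatch": ["*.ci.yml"],
            "url": "https://example.invalid/ci.json",
        },
    ]
}


def schemas_for(filename, catalog):
    """Return the schema URLs whose fileMatch globs match the file name."""
    return [
        entry["url"]
        for entry in catalog["schemas"]
        if any(fnmatch.fnmatch(filename, pat) for pat in entry.get("fileMatch", []))
    ]
```

With this excerpt, `schemas_for(".clangd", CATALOG)` yields the clangd schema URL, which is the association the extension relies on.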

Does it need to be part of the vscode-clangd repo instead?

I think being in the vscode-clangd repo would have the small advantage that the vscode-clangd extension could use the YAML extension's contribution point to install the schema locally even if "yaml.schemaStore.enable" is set to false. Not sure whether that's worth putting in place.

Improve schema correctness

Apply clang-format

Harbormaster completed remote builds in B204963: Diff 485352. (Dec 26 2022, 11:29 PM)

Thank you for the suggestions - I applied all of the fixes!

As someone who helps maintain a lot of the schemas on schemastore, it's nice when schemas are in-tree with their respective project, but I totally understand that this can increase complexity, especially for the reasons stated.

In D140462#4014944, @sammccall wrote:

given lack of any automation/tests.

In D140462#4014997, @nridge wrote:

And perhaps, if someone contributes automated tests for the schema in the future, we could consider upgrading its status to "maintained" at that time?

If you would like me to add tests to verify the schema (for now or later?), is there a utility within LLVM to help do so? I saw TableGen was mentioned, but it sounds like the schema validation use case is a different issue.

In D140462#4014944, @sammccall wrote:

One practical question: AIUI the main point of having this is so it can be provided to the VSCode YAML extension so it understands .clangd files. How does this work mechanically? Does it need to be part of the vscode-clangd repo instead?

I think it's worth mentioning that the JSON schema can be hosted anywhere. As already mentioned, the YAML extension automatically downloads and uses the schema if it finds a .clangd file, and the schema can point to an external URL. One example of that in practice is the xunit-2.2.json schema; it simply contains a $ref to the actual schema on xunit.net. Whether this schema goes in clangd/schema/config.json, clangd/contrib/schema/config.json, or in vscode-clangd - on the SchemaStore end, all I have to do is update $ref.
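
As a rough picture of what such a redirecting entry looks like, here is a sketch that builds a stub schema containing only a `$ref`. The upstream URL is a placeholder, not a real location, and `make_stub` is a name chosen for the example; the exact contents of the xunit stub are not reproduced here.

```python
import json


def make_stub(upstream_url):
    """Build a schema document that only points at the real schema."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "$ref": upstream_url,
    }


# Placeholder URL: wherever the in-tree schema ends up being served from.
stub = make_stub("https://example.invalid/clangd/schema/config.json")
stub_text = json.dumps(stub, indent=2)
```

Moving the schema between repositories would then only require changing the URL passed to `make_stub`.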

In D140462#4017237, @hyperupcall wrote:

If you would like me to add tests to verify the schema (for now or later?), is there a utility within LLVM to help do so? I saw TableGen was mentioned, but it sounds like the schema validation use case is a different issue.

Is there a command-line tool that can perform JSON schema validation which is included in a commonly used package that's available in the default package repositories of major Linux distros? If so, we may be able to get away with just requiring that the buildbots have that package installed.

If not, then we'd need to check in such a tool into the LLVM repo (and if it needs building, integrate its build into the build system).

In either case, the tests themselves can take the form of simple lit tests that invoke that tool on some test config files.
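
To make the idea of "example configs that pass or fail validation" concrete, here is a small sketch. It uses a hand-rolled checker covering only the handful of keywords this schema relies on (`type`, `enum`, `oneOf`, `properties`, `required`, `additionalProperties`, `items`); it is not a substitute for a real JSON-schema validator, and the configs are written as Python dicts rather than parsed from YAML files. The schema fragment mirrors the `CompileFlags.Add` entry of the patch.

```python
TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def validate(instance, schema):
    """Return a list of error strings; an empty list means valid."""
    expected = schema.get("type")
    if expected is not None and not isinstance(instance, TYPES[expected]):
        return [f"expected {expected}, got {type(instance).__name__}"]
    errors = []
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{instance!r} is not one of {schema['enum']}")
    if "oneOf" in schema:
        # oneOf requires exactly one branch to accept the instance.
        matches = sum(1 for sub in schema["oneOf"] if not validate(instance, sub))
        if matches != 1:
            errors.append(f"matched {matches} oneOf branches, expected 1")
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required key {key!r}")
        for key, value in instance.items():
            if key in props:
                errors.extend(f"{key}: {e}" for e in validate(value, props[key]))
            elif schema.get("additionalProperties") is False:
                errors.append(f"unknown key {key!r}")
    if isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            errors.extend(f"[{i}]: {e}" for e in validate(item, schema["items"]))
    return errors


STRING_OR_LIST = {"oneOf": [{"type": "string"},
                            {"type": "array", "items": {"type": "string"}}]}
SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "CompileFlags": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"Add": STRING_OR_LIST},
        }
    },
}

GOOD = {"CompileFlags": {"Add": ["-Wall", "-std=c++20"]}}
BAD = {"CompileFlags": {"Addd": "-Wall"}}  # misspelled key
```

A lit test would do the equivalent with real files: run the chosen tool on a known-good and a known-bad `.clangd` file and check the exit status or diagnostics.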

In D140462#4017240, @nridge wrote:

Is there a command-line tool that can perform JSON schema validation

(well, more specifically that can validate YAML files against a JSON schema)

The difficulty in testing is that:

JSON-schema is totally unknown in llvm-project, so there are no tools but also no expectation that contributors understand it
llvm-project doesn't like dependencies, particularly those that aren't in C++ or python
having done some digging, interop seems really poor...

I tried 3 consumers (ajv-cli, yaml-language-server, and yajsv) and hit different blocking bugs in all of them when using obvious, spec-compliant approaches to writing schemas.

My feeling is that, at the end of the day, the schema format is relatively straightforward compared to other things that a clangd reviewer needs to know

But this is additional, it's obscure (unrelated to all the other knowledge), and as it's untested, the reviewer needs to reliably catch problems, including subtle interop problems.
It has the "python problem", where it's readable and you can see at a glance why the code is correct - even if it isn't.
(Having spent a couple of days with json-schema now, I don't think it's straightforward. That file seems simple so far, but there are simple C++ programs too! And it carefully dodges the interop issues, which is *subtle*).

Meanwhile, changing config already is a sea of boilerplate that's hard to review and bugs/divergences have crept in. (Config settings that are compiled but not parsed, things missing from the web docs, web docs being reworded but internal docs staying the same, etc).

Fixing this is an unreasonable burden to place on a new contributor, I've taken a shot at this and have something almost working: https://reviews.llvm.org/D140745.

It's pretty draft-ish:

That patch just checks in the generated files {Fragment.inc, YAML.inc, doc.md, schema.json}, (maybe) that belongs in the build system
the schema is correct but yaml-language-server chokes on it due to https://github.com/redhat-developer/yaml-language-server/issues/823, so need to restructure
files would need to be moved around etc
tests need to be added/updated
i'm not sure whether having a json-schema for schema.yaml is actually a good idea, it was mostly a learning exercise

clang-tools-extra/clangd/schema/config.json
21	disabling `additionalProperties` probably yields useful diagnostics
26	Unfortunately it's not valid to specify description together with $ref (implementations are required to ignore it) https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3
82	external can also be the case-insensitive string "None"
116	this is rather a list of strings
173	hmm, there are two more categories here - were they missing somewhere?
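
On the case-insensitive "None" from the comment on line 82: one way to express this in a schema is a character-class `pattern`. The sketch below illustrates a detail worth keeping in mind when doing so, namely that JSON-schema patterns are not implicitly anchored, so an unanchored pattern also accepts longer strings that merely contain "none". Python's `re.search` is used here as a stand-in for the ECMA-262 regex matching that validators use; the constant and function names are made up for the example.

```python
import re

UNANCHORED = "[nN][oO][nN][eE]"
ANCHORED = "^[nN][oO][nN][eE]$"


def pattern_accepts(pattern, value):
    """Mimic the "pattern" keyword: a search anywhere in the string."""
    return re.search(pattern, value) is not None
```

Here `pattern_accepts(UNANCHORED, "NoneOfTheAbove")` is true while `pattern_accepts(ANCHORED, "NoneOfTheAbove")` is false; both accept `"nOnE"`.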

In D140462#4018983, @sammccall wrote:

I tried 3 consumers (ajv-cli, yaml-language-server, and yajsv) and hit different blocking bugs in all of them when using obvious, spec-compliant approaches to writing schemas.

Thanks for experimenting. Happy to defer to your judgment on the best course of action given this state of affairs (and also the mentioned difficulty in accessing these tools from an LLVM test suite).

I've taken a shot at this and have something almost working: https://reviews.llvm.org/D140745.

Neat!

clang-tools-extra/clangd/schema/config.json
21	One question here is, do we want diagnostics from the schema validation to duplicate or replace clangd's own diagnostics for the config file?
26	Hmmm... the spec draft at https://json-schema.org/draft/2020-12/json-schema-core.html#section-8.2.3.1, which is newer (2022 rather than 2018) does not seem to have this wording, and the example at https://json-schema.org/learn/getting-started-step-by-step.html#references uses `$ref` together with `description`.
173	`Designators` is missing from https://clangd.llvm.org/config.html#inlayhints What's the other one?

sammccall added inline comments. (Dec 28 2022, 6:03 PM)

clang-tools-extra/clangd/schema/config.json
21	I'm not totally sure. It's tempting to call these a failed experiment. The UX for clangd-emitted diags is pretty sad (since they only generate/update when we happen to process the config file again). However there are definitely going to be things you can get wrong that the schema won't catch, and we should make them visible somehow.
26	Yeah, one of the complexities here is the version skew :-( Different versions are (partially) supported by different tools. Some of them pay attention to what the file is self-declared as, others don't. draft-07 looks like the lowest-common-denominator (and is what this file claims to be), and in particular `yaml-language-server` always validates against draft-07 regardless of what the file claims.
173	There isn't one, I was confused. (I think I was thinking of type-param hints, but we never added it :-()
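
Regarding the `$ref` plus `description` issue on line 26: under draft-07, siblings of `$ref` are ignored, and a commonly used workaround is to move the reference into a single-element `allOf` so the siblings remain meaningful. The sketch below applies that rewrite to a schema held as Python dicts. It is deliberately naive (it walks every nested dict, so a config property literally named `$ref` would confuse it), and both `wrap_ref_siblings` and the `#/definitions/stringOrList` reference are hypothetical names for the example.

```python
def wrap_ref_siblings(node):
    """Rewrite {"$ref": X, ...siblings} into {...siblings, "allOf": [{"$ref": X}]}."""
    if isinstance(node, list):
        return [wrap_ref_siblings(item) for item in node]
    if not isinstance(node, dict):
        return node
    # Recurse into everything except the reference itself.
    out = {k: wrap_ref_siblings(v) for k, v in node.items() if k != "$ref"}
    if "$ref" in node:
        if out:
            # Siblings exist: keep them and move the reference into allOf.
            out["allOf"] = [{"$ref": node["$ref"]}] + out.get("allOf", [])
        else:
            # A bare reference needs no rewriting.
            out["$ref"] = node["$ref"]
    return out


before = {
    "description": "The file being processed must not fully match a regular expression.",
    "$ref": "#/definitions/stringOrList",
}
after = wrap_ref_siblings(before)
```

Whether a given consumer handles the `allOf` form well is something that would still need checking per tool, given the interop problems described in this thread.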

In D140462#4018983, @sammccall wrote:

The difficulty in testing is that:

JSON-schema is totally unknown in llvm-project, so there are no tools but also no expectation that contributors understand it

llvm-project doesn't like dependencies, particularly those that aren't in C++ or python

having done some digging, interop seems really poor...

I tried 3 consumers (ajv-cli, yaml-language-server, and yajsv) and hit different blocking bugs in all of them when using obvious, spec-compliant approaches to writing schemas.

I only have experience with ajv, so I suppose it means

It has the "python problem", where it's readable and you can see at a glance why the code is correct - even if it isn't.
(Having spent a couple of days with json-schema now, I don't think it's straightforward. That file seems simple so far, but there are simple C++ programs too! And it carefully dodges the interop issues, which is *subtle*).

I appreciate that you took the time to experiment with those three implementations of JSON-schema - perhaps I was naive in thinking that the different implementations were more mature. On my side, this makes me want to validate the various schemas in SchemaStore with more implementations than just ajv to improve the experience for everyone.

Fixing this is an unreasonable burden to place on a new contributor, I've taken a shot at this and have something almost working: https://reviews.llvm.org/D140745.

It's pretty draft-ish:

That patch just checks in the generated files {Fragment.inc, YAML.inc, doc.md, schema.json}, (maybe) that belongs in the build system

the schema is correct but yaml-language-server chokes on it due to https://github.com/redhat-developer/yaml-language-server/issues/823, so need to restructure

files would need to be moved around etc

tests need to be added/updated

i'm not sure whether having a json-schema for schema.yaml is actually a good idea, it was mostly a learning exercise

This looks pretty neat! I think something like that revision would be a better way forward. I will go ahead and upload a new Diff, I suppose just for completeness, but I don't expect anything to change much since it doesn't address your points about syncing with documentation or the obscurity of the specification.

Improve schema correctness and disable additionalProperties

I think I shall now abandon this revision for the reasons stated above - there is an improved path forward at https://reviews.llvm.org/D140745 with autogenerated files.

Harbormaster completed remote builds in B206729: Diff 487706. (Jan 10 2023, 12:36 AM)

Revision Contents

Path: clang-tools-extra/clangd/schema/config.json
Size: 272 lines

Diff 487706

clang-tools-extra/clangd/schema/config.json

This file was added.

				{
				"$schema": "http://json-schema.org/draft-07/schema",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"If": {
				"description": "Conditions in the If block restrict when a fragment applies.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"PathMatch": {
				"description": "The file being processed must fully match a regular expression.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"PathExclude": {
				"description": "The file being processed must not fully match a regular expression.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				}
				}
				},
				"CompileFlags": {
				"description": "Affects how a source file is parsed.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Add": {
				"description": "List of flags to append to the compile command.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"Remove": {
				"description": "List of flags to remove from the compile command",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"CompilationDatabase": {
				"description": "Directory to search for compilation database (compile_commands.json etc).",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"enum": [
				"Ancestors",
				"None"
				]
				}
				]
				},
				"Compiler": {
				"description": "String to replace the executable name in the compile flags. The name controls flag parsing (clang vs clang-cl), target inference (gcc-arm-noneabi) etc.",
				"type": "string"
				}
				}
				},
				"Index": {
				"description": "Controls how clangd understands code outside the current file.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Background": {
				"description": "Whether files are built in the background to produce a project index. This is checked for translation units only, not headers they include.",
				"type": "string",
				"enum": [
				"Build",
				"Skip"
				],
				"default": "Build"
				},
				"External": {
				"description": "Used to define an external index source",
				"oneOf": [
				{
				"type": "string",
				"pattern": "[nN][oO][nN][eE]"
				},
				{
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"File": {
				"type": "string"
				}
				},
				"required": [
				"File"
				]
				},
				{
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Server": {
				"type": "string"
				},
				"MountPoint": {
				"type": "string"
				}
				},
				"required": [
				"Server"
				]
				}
				]
				}
				}
				},
				"Style": {
				"description": "Describes the style of the codebase, beyond formatting",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"FullyQualifiedNamespaces": {
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				}
				},
				"Diagnostics": {
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Suppress": {
				"description": "Diagnostic codes that should be suppressed.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"ClangTidy": {
				"description": "Configure how clang-tidy runs over your files. The settings are merged with any settings found in .clang-tidy configuration files with the ones from clangd configs taking precedence.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Add": {
				"description": "List of checks to enable, can be globs.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"Remove": {
				"description": "List of checks to disable, can be globs.",
				"oneOf": [
				{
				"type": "string"
				},
				{
				"type": "array",
				"items": {
				"type": "string"
				}
				}
				]
				},
				"CheckOptions": {
				"description": "Key-value pairs of options for clang-tidy checks",
				"type": "object"
				},
				"UnusedIncludes": {
				"description": "Enables Include Cleaner's unused includes diagnostics.",
				"type": "string",
				"enum": [
				"None",
				"Strict"
				],
				"default": "None"
				}
				}
				}
				}
				},
				"Completion": {
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"AllScopes": {
				"description": "Whether code completion should include suggestions from scopes that are not visible. The required scope prefix will be inserted.",
				"type": "boolean"
				}
				}
				},
				"InlayHints": {
				"description": "Configures the behaviour of the inlay-hints feature.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"Enabled": {
				"description": "A boolean that enables/disables the inlay-hints feature for all kinds, when disabled, configuration for specific kinds are ignored.",
				"type": "boolean"
				},
				"ParameterNames": {
				"description": "A boolean that enables/disables inlay-hints for parameter names in function calls.",
				"type": "boolean"
				},
				"DeducedTypes": {
				"description": "A boolean that enables/disables inlay-hints for deduced types.",
				"type": "boolean"
				},
				"Designators": {
				"description": "A boolean that enables/disables inlay-hints for designators in aggregate initialization.",
				"type": "boolean"
				}
				}
				},
				"Hover": {
				"description": "Configures contents of the hover cards.",
				"type": "object",
				"additionalProperties": false,
				"properties": {
				"ShowAKA": {
				"description": "A boolean that controls printing of desugared types e.g: `vector<int>::value_type (aka int)`",
				"type": "boolean"
				}
				}
				}
				}
				}


[clangd] Add schema for `.clangd` config (Abandoned, Public)
