This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Add index export to dexp
ClosedPublic

Authored by mnauw on Apr 3 2020, 5:57 AM.

Details

Summary

Add a command to dexp that exports index data in chosen format (e.g. YAML).

Diff Detail

Event Timeline

mnauw created this revision.Apr 3 2020, 5:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2020, 5:57 AM

Thanks for the patch!

We already have one introspection tool called dexp have you give it a try? It can read both RIFF and YAML and allows you to run queries on the index.

If it is not enough for your use case, can you describe it a little more? Maybe it would be easier to extend dexp to accommodate your use case instead of introducing a new binary.

mnauw added a comment.Apr 3 2020, 11:03 AM

Thanks for having a look at this patch. My use-case/goal would be to have a way to run/do a command that reads some index and dumps it (in e.g. YAML) to (e.g.) stdout. That way it can be inspected/viewed all at once by plain viewer. And not only symbols, refs, but also recorded paths (that form the include-graph). The latter in particular was/is useful when investigating some not-entirely-consistent symlink-resolution issues. The goal is not necessarily a separate binary, so if it's ok to extend dexp to achieve the former (e.g. by adding option(s)) then I can also update to patch that way accordingly.

Yeah, I think adding a dump/export command to dexp would be nice.
We'd have to make dexp keep the parsed input file around, or just the path and reopen it when the command is issued, but that seems OK to me.

(One of the costs of adding a new binary is you have to at least build it to keep the code from rotting, and linking binaries is often the bottleneck for iterating with check-clangd)

mnauw updated this revision to Diff 255019.Apr 4 2020, 4:00 AM
mnauw retitled this revision from [clangd] Add index inspection helper tool to [clangd] Add index export to dexp.
mnauw edited the summary of this revision. (Show Details)

Thanks, this looks pretty good!

clang-tools-extra/clangd/index/YAMLSerialization.cpp
36 ↗(On Diff #255019)

this is independent of adding the dump command - can we split this into a separate patch?
Partly because it will need to have some basic testing :-(

355 ↗(On Diff #255019)

this + 2 helpers seems a bit verbose/indirect.

FileDigest D;
if (HexString.size() == D.size() && llvm::all_of(HexString, llvm:isHexDigit)) {
  memcpy(Digest.data(), llvm::fromHex(HexString).data(), Digest.size());
} else {
  I.setError("Bad hex file digest");
}
return D;
380 ↗(On Diff #255019)

Unfortunately we don't own the CompileCommand type, so we shouldn't specialize traits for it (risk ODR violation if someone else does the same).
Wrap or inherit it in a trivial struct struct CompileCommandYAML : tooling::CompileCommand {} to make this safe.

clang-tools-extra/clangd/index/dex/dexp/Dexp.cpp
291

Writing it to stdout doesn't seem unreasonable but maybe not the right default.
What about requiring a filename arg, and creating a raw_fd_ostream - then you can pass - to get stdout.

365

I wonder if we should generalize this instead to running an arbitrary command non-interactively: dexp -c "dump"

No need to do that in this patch but maybe leave a TODO

Typical use is to read RIFF data to dump to YAML for inspection of indexed data. The YAML (de)serialization has also been extended to aid in this regard.

Just curious: what is the typical usecase for YAML index inspection? Maybe if there is something missing in Dexp, it would be worth adding the corresponding inspection tooling (not suggesting it for this patch, this is basically for me to understand how to improve Dexp).

sammccall added inline comments.Apr 8 2020, 5:03 AM
clang-tools-extra/clangd/index/dex/dexp/Dexp.cpp
365

Done in 49268a678c2f0233f71363b0988d4b159496b036 - I think we can drop this and fold ExportImpl and ExportCmd together.

mnauw updated this revision to Diff 256557.Apr 10 2020, 6:12 AM
mnauw edited the summary of this revision. (Show Details)
mnauw added a comment.Apr 10 2020, 6:16 AM

As suggested, I will provide the YAMLSerialization changes in a separate patch. Regarding test for that, I was thinking of some running some "simple/mock YAML data" through some load and save cycle(s). I presume a separate YAML file (in e.g. test/Inputs) is best for that?

This revision was not accepted when it landed; it landed in state Needs Review.Apr 19 2020, 5:52 AM
This revision was automatically updated to reflect the committed changes.