This is an archive of the discontinued LLVM Phabricator instance.

[clang-doc] Refactoring mapper to map by scope
ClosedPublic

Authored by juliehockett on Jun 19 2018, 4:54 PM.

Details

Summary

The result of this adjusted mapper pass is that all Function and Enum infos are absorbed into the info of their enclosing scope (i.e. the class or namespace in which they are defined). Namespace and Record infos are passed along to the final output, but the second pass creates a reference to each in its parent scope. As a result, the top-level final outputs are Namespaces and Records.

Diff Detail

Event Timeline

juliehockett created this revision.Jun 19 2018, 4:54 PM
ioeric added inline comments.Jun 21 2018, 1:36 AM
clang-tools-extra/clang-doc/Representation.h
173 ↗(On Diff #151999)

Comment? What is an enclong scope?

264 ↗(On Diff #151999)

"A standalone function" is redundant.

It's unclear to me what "wrap a reference in an Info of the appropriate enclosing scope" mean. How does this change the input, and what are the output?

clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
121 ↗(On Diff #151999)

This deserves a comment. It's hard to tell what this does without looking at the implementation.

147 ↗(On Diff #151999)

Output is a bit vague here. It's usually used for output stream. Maybe addToInfos?

238 ↗(On Diff #151999)

Could you add a comment explaining what the second reduction is?

238 ↗(On Diff #151999)

I think a lot of code here is too specific to be in main function. Can they live in a library?

249 ↗(On Diff #151999)

Are symbols declared in functions indexed by the tool? If not, this should probably assert or at least emit a warning message.

252 ↗(On Diff #151999)

What scopes would have empty USR? Could you add a comment?

277 ↗(On Diff #151999)

This does exactly the same thing as the case above except for the doc info type... could the code be shared?

304 ↗(On Diff #151999)

How severe is this? Should the tool bail out?

309 ↗(On Diff #151999)

It would be helpful to print more messages. What kind of reduction is this?

clang-tools-extra/test/clang-doc/yaml-comments.cpp
30 ↗(On Diff #151999)

I wouldn't check the exact value of USR in the tests. It would be very painful to fix tests when USR semantics are updated. Maybe just check that this has the expected number of characters?

Same below.

juliehockett marked 12 inline comments as done.

Refactoring second reduce code into the library, and adding bitcode reader/writer support

juliehockett added inline comments.Jul 2 2018, 2:15 PM
clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
249 ↗(On Diff #151999)

They are currently, but I'm planning to eliminate that in a future patch and will add the appropriate assertion then.

Thanks for the refactoring and the comments! They made it a lot easier to understand the changes.

I'm focusing on how the changes would fit into the ToolExecutor framework in the review and will leave tool-specific logics to another reviewer who I assume would know the tool better than me :)

clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
121 ↗(On Diff #151999)

So it would probably be clearer to call this getInfoOutputFile?

Given how path is created, it might make more sense if the parameter order is (Root, Namespaces, Name, Ext)

124 ↗(On Diff #153787)

nit: s/yaml/<Ext>/?

160 ↗(On Diff #153787)

Now that this only has one call site, it might be clearer to just inline this.

167 ↗(On Diff #153787)

mapResultsInMemory doesn't seem to describe what this does. Maybe bitcodeResultsToInfos? Please also add comment.

238 ↗(On Diff #153787)

nit: maybe USRToInfos?

244 ↗(On Diff #153787)

The ToolResults abstraction is intended to be used by ToolExecutor framework only. It seems that what you want is just a StringMap?

262 ↗(On Diff #153787)

If what we want is a mapping from enclosing scope to infos, wouldn't it be easier if we make the mapper (the ToolAction in execute) collect Scope->Infos (instead of USR->Infos) in the first place?

276 ↗(On Diff #153787)

Is there any reason why this needs to be a separate reduce pass? It seems that it should be possible to merge two reduce passes e.g. by making mergeInfos incremental. Having two reduce passes would diverge further from the potential MapReduce execution.

clang-tools-extra/test/clang-doc/yaml-comments.cpp
30 ↗(On Diff #153787)

Hmm, is this the same as [0-9A-Z]{n} ?

juliehockett marked 10 inline comments as done.
juliehockett retitled this revision from [clang-doc] Adding a second reduce pass to [clang-doc] Refactoring mapper to map by scope.
juliehockett edited the summary of this revision. (Show Details)

Refactoring the map-by-scope into the initial mapping phase.

Also updating tests to match the output of my test generation script (see D49268).

ping

clang-tools-extra/test/clang-doc/yaml-comments.cpp
30 ↗(On Diff #153787)

Yes, in most regex worlds, but it doesn't work in FileCheck regex (complains about unbalanced braces).

Map/Reduce change looks good to me.

clang-tools-extra/test/clang-doc/yaml-record.cpp
44 ↗(On Diff #155278)

Yes, in most regex worlds, but it doesn't work in FileCheck regex (complains about unbalanced braces).

Forgot this was FileCheck ;) Have you tried [0-9A-Z]{{n}}? If nothing works, I'd probably just check length of one USR and use {{.*}} to match the rest of USRs in all tests.

juliehockett added inline comments.Jul 19 2018, 7:28 AM
clang-tools-extra/test/clang-doc/yaml-record.cpp
44 ↗(On Diff #155278)

I can't wholly eliminate USRs from the tests (the bitcode ones use them as filenames), and so would the script I'm going to land from D49268 be sufficient? That way if the USR spec ever changes, it's fairly trivial to regen them all.

ioeric added inline comments.Jul 19 2018, 7:41 AM
clang-tools-extra/test/clang-doc/yaml-record.cpp
44 ↗(On Diff #155278)

Sure.

For script-generated tests, please make it's clear to other people, who might not be familiar with the tool, to figure out how to regenerate the tests, in case they make a change somewhere in clang and break your tests. E.g. a big comment on top of file indicating it's generated and pointing to the script. The script should also have clear instructions.

Updating tests

juliehockett marked 3 inline comments as done.Jul 23 2018, 11:48 AM

any further thoughts on this?

ioeric added inline comments.Jul 31 2018, 9:29 AM
clang-tools-extra/clang-doc/Representation.cpp
49 ↗(On Diff #156851)

The function name doesn't describe what it does. Maybe getChildIndexIfExists?

clang-tools-extra/clang-doc/Representation.h
190 ↗(On Diff #156851)

Why are namespaces and records References while others are actual infos. Might worth a comment.

246 ↗(On Diff #156851)

Would a record ever have namespace children? Maybe we should assert that this doesn't happen?

clang-tools-extra/clang-doc/Serialize.cpp
330 ↗(On Diff #156851)

auto IPtr = llvm::make_unique<NamespaceInfo>(); should work well I think? Converting the pointer back and forth seems unnecessary.

clang-tools-extra/test/clang-doc/bc-linkage.cpp
106 ↗(On Diff #156851)

I'm still a bit concerned about hardcoding a lot of USRs in tests. They are not interpretable and generally not interesting for testing. Also as they are auto-generated, it's hard to tell whether they are actually the desired USRs. I'm concerned because the maintenance is getting higher as number of tests grows - everyone changing USR semantics in the future has to know to regenerate clang-doc tests, this can be annoying and potentially unmanageable when a small change in clang USR requires changes to many test files in clang-tools-extra :( Comparing to the value it brings to test USRs in all tests, I'd still suggest simply matching them with a {{.*}}and only test USRs in few tests where you are actually interested in them.

juliehockett marked 5 inline comments as done.

Removing explicit USRs from tests & addressing comments

clang-tools-extra/clang-doc/Representation.h
246 ↗(On Diff #156851)

You're right, that's actually not valid in C++. Removed the field.

clang-tools-extra/test/clang-doc/bc-linkage.cpp
106 ↗(On Diff #156851)

Okay, I updated it to only check the length -- is that reasonable?

ioeric accepted this revision.Aug 2 2018, 3:37 AM
ioeric added inline comments.
clang-tools-extra/test/clang-doc/bc-linkage.cpp
106 ↗(On Diff #156851)

Thanks!

FWIW, I wouldn't check the length either as it seems to add too much overhead; I think checking length/USR in one test should get it well covered.

This revision is now accepted and ready to land.Aug 2 2018, 3:37 AM
juliehockett marked 2 inline comments as done.Aug 2 2018, 10:01 AM
juliehockett added inline comments.
clang-tools-extra/test/clang-doc/bc-linkage.cpp
106 ↗(On Diff #156851)

Possibly for the YAML tests, but for the bitcode ones I also want to check the ops, so it's not just the USR text. It's also autogenerated, so the overhead is minimal.

Also, just to reiterate, you'll still likely have to regenerate these if you dramatically change the USR spec. The bitcode tests all rely on USRs to write to files, and if those USRs change the name of the file the test needs to read will also change.

This revision was automatically updated to reflect the committed changes.
juliehockett marked an inline comment as done.
test/clang-doc/public-record.cpp