This is an archive of the discontinued LLVM Phabricator instance.

Add Metadata to Transformer tooling
ClosedPublic

Authored by asoffer on Jun 19 2020, 12:09 PM.

Details

Summary

This change adds a Metadata field to ASTEdit, Edit, and AtomicChange so that edits can have associated metadata and that metadata can be constructed with Transformer-based RewriteRules. Metadata is ignored when applying edits to source, but other consumers of AtomicChange can use this metadata to direct how they want to consume each edit.

Diff Detail

Event Timeline

asoffer created this revision.Jun 19 2020, 12:09 PM
asoffer changed the repository for this revision from rC Clang to rG LLVM Github Monorepo.Jun 19 2020, 12:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2020, 12:36 PM

Looks good! Only real question is one of design -- should we consider the (deeper) change of templating the various types rather than using dynamic typing? For that matter, the decision doesn't even have to be the same for both AtomicChange and the Transformer types. Transformer could be templated while AtomicChange uses any.

I'm inclined towards what you have, but that may just be laziness on part (since this requires minimal changes). I'm curious to hear what Dmitri thinks.

clang/include/clang/Tooling/Refactoring/AtomicChange.h
144

Please add a field comment. Something like what you have in the revision description would be good.

clang/lib/Tooling/Refactoring/AtomicChange.cpp
210

I assume that

: AtomicChange(SM, KeyPosition), Metadata(std::move(M)) {}

didn't work?

clang/unittests/Tooling/RefactoringTest.cpp
1299

any reason not to use int directly as the type carried in the metadata?

clang/unittests/Tooling/TransformerTest.cpp
460

Why expect vs assert?

asoffer marked 2 inline comments as done.Jun 25 2020, 2:03 PM

Looks good! Only real question is one of design -- should we consider the (deeper) change of templating the various types rather than using dynamic typing? For that matter, the decision doesn't even have to be the same for both AtomicChange and the Transformer types. Transformer could be templated while AtomicChange uses any.

I think the tradeoff here is
Dynamic typing -- faster compile times, type safety checked at run-time (in tests), lower maintenance cost
Templates -- Faster runtime, type safety checked at compile-time, better user expereience

I don't think the runtime speed is particularly important here in the sense that it's unlikely to be the bottleneck. Given the crash-on-invalid-type semantics, we're essentially trading off usability for maintenance cost. In this case I don't think asking users to do the extra cast is a significant enough burden to warrant the cost of templating everything.

I also don't strongly object to it. The nice thing about llvm::Any, is that this can be mostly changed incrementally if we later decide that's appropriate (only having to change the one getMetadata call at the call-site).

clang/lib/Tooling/Refactoring/AtomicChange.cpp
210

Correct. You can't use a forwarding constructor followed by a field initializer (which I learned by attempt it on this PR).

clang/unittests/Tooling/TransformerTest.cpp
460

EXPECT when the test can reasonably continue to produce useful output without crashing. So here, if the tool fails, we can still safely access Changes, but if Changes.size() != 1, we can't safely index into it.

asoffer updated this revision to Diff 273516.Jun 25 2020, 2:24 PM
asoffer marked an inline comment as done and an inline comment as not done.

Commenting fields and simplifying test.

asoffer updated this revision to Diff 273520.Jun 25 2020, 2:27 PM
asoffer marked an inline comment as done.

Updating other test.

asoffer updated this revision to Diff 273522.Jun 25 2020, 2:30 PM

Last diff was a mistake.

asoffer marked 3 inline comments as done.Jun 25 2020, 2:31 PM
Harbormaster failed remote builds in B61827: Diff 273522!
Harbormaster failed remote builds in B61825: Diff 273520!

I think the tradeoff here is
Dynamic typing -- faster compile times, type safety checked at run-time (in tests), lower maintenance cost
Templates -- Faster runtime, type safety checked at compile-time, better user expereience

For me, the more important part of the tradeoff is whether we will have one type or multiple. If we use Any, then all AtomicChange regardless of what produced them, will have the same type. If we use templates, then AtomicChange<T> is a different type from AtomicChange<U>. So based on that I think using Any is a better choice, since infrastructure code would want to handle with AtomicChange objects and not have to be implemented as a template over arbitrary metadata type.

Regarding the metadata itself, WDYT about using a map from string to any instead of just one any? This way multiple layers of the system would be able to attach metadata without interfering.

I think the tradeoff here is
Dynamic typing -- faster compile times, type safety checked at run-time (in tests), lower maintenance cost
Templates -- Faster runtime, type safety checked at compile-time, better user expereience

For me, the more important part of the tradeoff is whether we will have one type or multiple. If we use Any, then all AtomicChange regardless of what produced them, will have the same type. If we use templates, then AtomicChange<T> is a different type from AtomicChange<U>. So based on that I think using Any is a better choice, since infrastructure code would want to handle with AtomicChange objects and not have to be implemented as a template over arbitrary metadata type.

I'm convinced (both yours and Andy's arguments). Andy, please consider whether any of this belongs in comments. (I don't have much opinion).

Regarding the metadata itself, WDYT about using a map from string to any instead of just one any? This way multiple layers of the system would be able to attach metadata without interfering.

Sounds good, but I'm fine either way.

I'm not against a map, but I don't see the need for it right now. We don't have multiple stages that write to the same AtomicChange in any tool I'm aware of. Given that these are immutable after construction, I think just the Any is simpler. So long as we're willing to break things later (Does LLVM have a policy on API compatibility?) I think it's better to start with the simpler thing.

asoffer updated this revision to Diff 274183.Jun 29 2020, 10:55 AM

Signed-comparison issue in test fixed.

gribozavr2 accepted this revision.Jun 30 2020, 6:54 AM
This revision is now accepted and ready to land.Jun 30 2020, 6:54 AM
ymandel accepted this revision.Jun 30 2020, 6:55 AM
This revision was automatically updated to reflect the committed changes.