By converting Replacements by AtomicChange, clang-apply-replacements is able like clang-tidy to automatically cleanup and format changes.
This should permits to close this ticket: https://bugs.llvm.org/show_bug.cgi?id=35051 and attempt to follow hints from https://reviews.llvm.org/D43500 comments.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Thanks for the cleanup! This looks really nice!
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h | ||
---|---|---|
45 ↗ | (On Diff #135903) | I would expect this to be a map from files to a group of AtomicChanges (e.g. std::vector<AtomicChange>). In general, a single AtomicChange contains changes around a single code location which are likely to conflict with each other, and either all changes or no change is applied. A file usually corresponds to a set of atomic changes. Intuitively, I think clang-apply-replacements should simple gather a set of atomic changes for each file and let applyAtomicChanges handle the conflicts, but its current behavior is to skip conflicting replacements and keep applying other replacements. This is not ideal, but I guess I understand where this came from, and we might not be able to fix this in this patch since most tools produce replacements instead of AtomicChange. I would still suggest make this a map something like llvm::DenseMap<const clang::FileEntry *, std::vector<tooling::AtomicChange>>. To preserve the current behavior (i.e. skip conflicts), when you group all replacements for a single file, you would still put all replacements into a single AtomicChange, but when you actually put the change into the map, you put it as a vector of a single change. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
207 ↗ | (On Diff #135903) | I don't think we need to do the deduplication here anymore. AtomicChange handles duplicates for you. I think all you need to do here is to group replacements by files and later convert replacements to atomic changes. |
248 ↗ | (On Diff #135903) | (Related to my comment on FileToChangeMap) We should add FIXME here explaining why we are putting all replacements for a single file into one AtomicChange (e.g. we want to be able to detect and report each individual conflict) instead of a set of changes. |
259 ↗ | (On Diff #135903) | (Also related to my comment on FileToChangeMap) This would be something like Changes.insert(std::make_pair(Entry, {Change})); |
279 ↗ | (On Diff #135903) | You should handle the error in llvm::Expected. You could convert it to string and add to the error message with llvm::toString(NewCode.takeError()). It would be nice if we could have a test case for such cases. |
Thanks for your feedback, they are very precious to me!
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h | ||
---|---|---|
45 ↗ | (On Diff #135903) | I got your point. I will update the patch this way. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
207 ↗ | (On Diff #135903) | I think I wrongly use AtomicChange somewhere because it doesn't deduplicate same replacement automatically. |
279 ↗ | (On Diff #135903) | I will use llvm::Expected as you suggest. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
207 ↗ | (On Diff #135903) | AtomicChange does deduplicate identical replacements but not insertions, because it wouldn't know whether users really want the insertions to be deduplicated or not (e.g. imagine a tool wants to insert two ) at the same location). So it doesn't support that test case intentionally. In general, users (i.e. tools generating changes) are responsible for ensuring changes are deduplicated/applied in the expected way by using the correct interface (e.g. replace, insert etc). I think it would probably make more sense to change the test to deduplicate identical non-insertion replacements. It would also make sense to add another test case where identical insertions are both applied. For more semantics of conflicting/duplicated replacements, see https://github.com/llvm-mirror/clang/blob/master/include/clang/Tooling/Core/Replacement.h#L217 |
279 ↗ | (On Diff #135903) |
I think NewCode is already llvm::Expected<std::string>. You would only need to explicitly handle the error.
That's a good question. I think we would really need unit tests for the ApplyReplacements library in order to get better coverage. But it's probably out of the scope of this patch, so I'd also be fine without the test. Up to you :) |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
207 ↗ | (On Diff #135903) | That's make sense indeed. My confusion comes from I do not want to break existing tests. I will update them the way you describe. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
188 ↗ | (On Diff #136944) | Add a brief comment describe what this function does, specifically about TUs and TUDs. |
190 ↗ | (On Diff #136944) | nit: SM can be const. And put SM (input arg) before FileChanges (output arg). |
213–221 ↗ | (On Diff #136944) | This code block is the same as the one in the above loop. Consider pulling it into a lambda. |
276–279 ↗ | (On Diff #136944) | AtomicChange::replace also handles insertions, so I think there is no need for the distinction here. (See my comment in where ReplacementsToAtomicChanges is used; I think the class might not be needed.) |
290 ↗ | (On Diff #136944) | I might be missing something, but why do we need to pull replacements from the result change into a set of changes? |
317 ↗ | (On Diff #136944) | By default, this inserts R after the existing insertion in the same location (if there is any). But it's impossible for apply-all-replacements to know in which order they should be applied, so I would suggest treating this as conflict. Sorry that I might have confused you by asking to add a test case where two identical insertions are both applied (because they are order *independent*), but AtomicChange::replace also supports that and would report error when two insertions are order *dependent*. AtomicChange::replace and AtomicChange::insert have almost the same semantic except that you could specify the expected order of insertions on the same location with insert, but in our case, we wouldn't know which order is desired. |
353 ↗ | (On Diff #136944) | Sorry that I didn't make myself clear... what you had in the previous revision was correct (for the current use case of apply-all-replacements) i.e. store all replacements in one AtomicChange, which allows you to detect conflicts on the fly. So the code here can be simplified as: ... Entry = ...; AtomicChange FileChange; for (const auto &R : FileAndReplacements.second) { auto Err = FileChange.replace( <args from R> ); if (Err) reportConflict(Entry, std::move(Err)); // reportConflict as we go } FileChanges.emplace(Entry, {FileChange}); ... I think with this set up, you wouldn't need ReplacementsToAtomicChanges, ConflictError and reportConflicts. |
test/clang-apply-replacements/Inputs/basic/file2.yaml | ||
1 ↗ | (On Diff #136944) | Could you please add two test cases for insertions: 1) identical insertions (e.g. "AB" and "AB") at the same location are applied sucessfully and 2) order-dependent insertions (e.g. "AB" and "BA") are detected. (It might make sense to do this in a different file) |
test/clang-apply-replacements/Inputs/conflict/expected.txt | ||
8 ↗ | (On Diff #136944) | How could one replacement (Remove 12:3-12:14) conflict with itself? |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
213–221 ↗ | (On Diff #136944) | Yes, for sure. I was not confident about this part. |
290 ↗ | (On Diff #136944) | I interpret you suggestion of set of AtomicChange a little bit too much. |
353 ↗ | (On Diff #136944) | I think I have over-interpreting your previous comment :-) There are conflicting changes to XXX: The following changes conflict: Replace 8:8-8:33 with "AAA" Replace 8:8-8:33 with "BBB" Remove 8:10-8:15 Insert at 8:14 CCC With this patch, conflict are reported by pair (first replacement/conflicted one) and I am able to print the corresponding file only once (thanks to the defered reporting). There are conflicting changes to XXX: The following changes conflict: Replace 8:8-8:33 with "AAA" Replace 8:8-8:33 with "BBB" The following changes conflict: Replace 8:8-8:33 with "AAA" Remove 8:10-8:15 The following changes conflict: Replace 8:8-8:33 with "AAA" Insert at 8:14 CCC I prefer the way you suggest to report conflict but we will loose or print conflicting file at each conflict detected. The new replacement overlaps with an existing replacement. New replacement: XXX: 106:+26:"AAA" Existing replacement: XXX: 106:+26:"BBB" The new replacement overlaps with an existing replacement. New replacement: XXX: 106:+26:"AAA" Existing replacement: XXX: 112:+0:"CCC" The new replacement overlaps with an existing replacement. New replacement: XXX: 106:+26:"AAA" Existing replacement: XXX: 108:+12:"" |
test/clang-apply-replacements/Inputs/basic/file2.yaml | ||
1 ↗ | (On Diff #136944) | I will. |
test/clang-apply-replacements/Inputs/conflict/expected.txt | ||
8 ↗ | (On Diff #136944) | It is not the case, they are reported by pair now. |
Update after review.
Add tests to check identical insertions and order dependent insertions.
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
353 ↗ | (On Diff #136944) | I am inclined to changing the current behavior and reporting by pairs because
I would vote for the last approach (with llvm::toString(std::move(Err))) which is easy to implement and doesn't really regress the UI that much. WDYT? |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
353 ↗ | (On Diff #136944) | I think if we can use llvm::toString(std::move(Err)) for this patch, it will simplify the code and reuse already existing error message. Even if I found file+offset conflict location less readable than file+line+column. I have made some prototype to "enhance" error reporting but I think they should be put in dedicated patches and need further discutions.
/src/basic.h @@ 12:23-12:23 @@ <<<<<<< Existing replacement virtual void func() noexcept {} ======= virtual void func() override {} >>>>>>> New replacement
/src/basic.h @@ 12:23-12:23 @@ - virtual void func() {} + virtual void func() noexcept {} + virtual void func() override {}
/src/basic.h @@ 12:23-12:23 @@ virtual void func() {} ^ noexcept virtual void func() {} ^ override |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
353 ↗ | (On Diff #136944) | The conflict marker format looks promising! You are right regarding offset vs line+column location. I agree offset is less readable. I think we could probably build offset -> line+column translation into ReplacementError if necessary. For example, we could add a pretty-print method/helper that takes in source code and outputs readable error messages with offset translated to line+columns according to the source code, which is basically what the current replaceConflict function does. Ideally, that would also handle/consume the underlying errors so that we could switch to use that instead of llvm::toString. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
353 ↗ | (On Diff #136944) | Do you prefer to add the transformation offset -> line+column in this patch or in a future one (with an option to switch between line report and conflict marker for example)? |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
---|---|---|
353 ↗ | (On Diff #136944) | I think it would be easier to have those in a different patch. For this patch, reporting conflict pairs with either offsets or line+column (by using the transformation that is already there) would be fine (with a FIXME maybe). (These are all great improvements! Thanks for doing this!) |
This looks great! Just a few nits left.
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h | ||
---|---|---|
75 ↗ | (On Diff #139082) | If conflicts occur, no Replacements are applied. This doesn't seem to be accurate; non-conflicting replacements are still added. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
173 ↗ | (On Diff #139082) | nit: I'd probably use auto GroupedReplacements = ...;. |
179 ↗ | (On Diff #139082) | I think we are only skipping conflicts; non-conflicting replacements are still added even if the file contains other conflicts? This doesn't seem to have caused any problem because the current caller simply drops all changes if conflict is detected in any file, but we should still make it clear what the behavior of mergeAndDeduplicate is (in the API doc) e.g. what would FileChanges contain if there is conflict in some but not all files? |
218 ↗ | (On Diff #139082) | nit: redundant empty line. |
227 ↗ | (On Diff #139082) | nit: consider inline this to save one variable. |
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h | ||
---|---|---|
75 ↗ | (On Diff #139082) | You are perfectly right the description does not match the behavior. I think I will update this part to "Deduplicate, check for conflicts, and convert all Replacements stored in TUs to a single AtomicChange per file. Conflicting replacement are skipped. However, the order of converting Replacements extracted from TUs and TUDs to AtomicChange is undefined." And amend FileChanges with "Only non conflicting replacements are kept into FileChanges." |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
179 ↗ | (On Diff #139082) | I will update this comment to: |
lgtm
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h | ||
---|---|---|
75 ↗ | (On Diff #139082) |
As FileChanges are a FileToChangesMap, it's already clear that it's a set of changes per file. We don't want to say "single AtomicChange" here because this is just an implementation details. |
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp | ||
179 ↗ | (On Diff #139082) | I would drop the "To keep the current behavior" part because it's only relevant in code review. Future readers wouldn't know what the "current" behavior is. |
Thanks a lot!
Can I ask you if I need to do something as I do not have commit access?
I applied the patch and ran the tests.
There was one unexpected failure:
[100%] Running the Clang extra tools' regression tests FAIL: Clang Tools :: clang-apply-replacements/conflict.cpp (10 of 896) ******************** TEST 'Clang Tools :: clang-apply-replacements/conflict.cpp' FAILED ******************** Script: -- mkdir -p /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file1.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file1.yaml sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file2.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file2.yaml sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file3.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file3.yaml sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/expected.txt > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/expected.txt not clang-apply-replacements /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt 2>&1 diff -b /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/expected.txt ls -1 /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict | FileCheck /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/conflict.cpp --check-prefix=YAML not clang-apply-replacements /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict -remove-change-desc-files > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt 2>&1 ls -1 /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict | FileCheck /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/conflict.cpp --check-prefix=NO_YAML -- Exit Code: 1 Command Output (stdout): -- 2,4d1 < New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)" < Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:"" < The new replacement overlaps with an existing replacement. 11a9,11 > Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:"" > The new replacement overlaps with an existing replacement. > New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)" -- ******************** Testing Time: 21.09s ******************** Failing Tests (1): Clang Tools :: clang-apply-replacements/conflict.cpp Expected Passes : 893 Unsupported Tests : 2 Unexpected Failures: 1
Thank you for applying the patch.
I suspect some undefined order when applying replacements as directly dependant in which order OS reads file1.yaml, file2.yamland file3.yaml.
As I am not able to reproduce it on my machine, can you share with me the content of output.txt?
$ ls -f /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/
file1.yaml .. output.txt . file3.yaml file2.yaml expected.txt
As I am not able to reproduce it on my machine, can you share with me the content of output.txt?
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 106:+26:"int & elem : ints"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 106:+26:"auto & i : ints"
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 140:+7:"elem"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 140:+7:"i"
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+1:"nullptr"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""
Ok, I understand the problem.
Previously, during the deduplication step, replacements per file where sorted and it is not the case anymore.
That means now, clang-apply-replacement is highly dependant of reading file order during error reporting.
I mainly see 2 options (I can miss others), rewrite the test (e.g merging yaml file together) or sort the replacements per file to fix the order of application.
Can I ask you what solution you prefer?
I'd prefer sorting, so that results will be consistent when the tool is used on real data.