This is an archive of the discontinued LLVM Phabricator instance.

[clang-apply-replacements] Convert tooling::Replacements to tooling::AtomicChange for conflict resolving of changes, code cleanup, and code formatting.
ClosedPublic

Authored by jdemeule on Feb 26 2018, 7:39 AM.

Details

Summary

By converting Replacements by AtomicChange, clang-apply-replacements is able like clang-tidy to automatically cleanup and format changes.
This should permits to close this ticket: https://bugs.llvm.org/show_bug.cgi?id=35051 and attempt to follow hints from https://reviews.llvm.org/D43500 comments.

Diff Detail

Event Timeline

jdemeule created this revision.Feb 26 2018, 7:39 AM

Thanks for the cleanup! This looks really nice!

clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h
47

I would expect this to be a map from files to a group of AtomicChanges (e.g. std::vector<AtomicChange>). In general, a single AtomicChange contains changes around a single code location which are likely to conflict with each other, and either all changes or no change is applied. A file usually corresponds to a set of atomic changes.

Intuitively, I think clang-apply-replacements should simple gather a set of atomic changes for each file and let applyAtomicChanges handle the conflicts, but its current behavior is to skip conflicting replacements and keep applying other replacements. This is not ideal, but I guess I understand where this came from, and we might not be able to fix this in this patch since most tools produce replacements instead of AtomicChange.

I would still suggest make this a map something like llvm::DenseMap<const clang::FileEntry *, std::vector<tooling::AtomicChange>>. To preserve the current behavior (i.e. skip conflicts), when you group all replacements for a single file, you would still put all replacements into a single AtomicChange, but when you actually put the change into the map, you put it as a vector of a single change.

clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
193

I don't think we need to do the deduplication here anymore. AtomicChange handles duplicates for you. I think all you need to do here is to group replacements by files and later convert replacements to atomic changes.

234

(Related to my comment on FileToChangeMap)

We should add FIXME here explaining why we are putting all replacements for a single file into one AtomicChange (e.g. we want to be able to detect and report each individual conflict) instead of a set of changes.

245

(Also related to my comment on FileToChangeMap)

This would be something like

Changes.insert(std::make_pair(Entry, {Change}));
369

You should handle the error in llvm::Expected. You could convert it to string and add to the error message with llvm::toString(NewCode.takeError()). It would be nice if we could have a test case for such cases.

Thanks for your feedback, they are very precious to me!

clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h
47

I got your point. I will update the patch this way.

clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
193

I think I wrongly use AtomicChange somewhere because it doesn't deduplicate same replacement automatically.
For exemple, in the test suite, basic test defines 2 time the same replacement (adding 'override ' at offset 148) and I do not manage to avoid AtomicChange to add 'override override '. This is why I have kept the deduplicate step.

369

I will use llvm::Expected as you suggest.
Do you have some ideas to made a test failed at this level?

ioeric added inline comments.Mar 2 2018, 2:49 PM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
193

AtomicChange does deduplicate identical replacements but not insertions, because it wouldn't know whether users really want the insertions to be deduplicated or not (e.g. imagine a tool wants to insert two ) at the same location). So it doesn't support that test case intentionally. In general, users (i.e. tools generating changes) are responsible for ensuring changes are deduplicated/applied in the expected way by using the correct interface (e.g. replace, insert etc). I think it would probably make more sense to change the test to deduplicate identical non-insertion replacements. It would also make sense to add another test case where identical insertions are both applied.

For more semantics of conflicting/duplicated replacements, see https://github.com/llvm-mirror/clang/blob/master/include/clang/Tooling/Core/Replacement.h#L217

369

I will use llvm::Expected as you suggest.

I think NewCode is already llvm::Expected<std::string>. You would only need to explicitly handle the error.

Do you have some ideas to made a test failed at this level?

That's a good question. I think we would really need unit tests for the ApplyReplacements library in order to get better coverage. But it's probably out of the scope of this patch, so I'd also be fine without the test. Up to you :)

jdemeule added inline comments.Mar 4 2018, 9:00 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
193

That's make sense indeed. My confusion comes from I do not want to break existing tests. I will update them the way you describe.

jdemeule updated this revision to Diff 136944.Mar 4 2018, 2:27 PM

Update patch after reviewer comments.

ioeric added inline comments.Mar 9 2018, 8:44 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
188

Add a brief comment describe what this function does, specifically about TUs and TUDs.

190

nit: SM can be const. And put SM (input arg) before FileChanges (output arg).

213–221

This code block is the same as the one in the above loop. Consider pulling it into a lambda.

276–279

AtomicChange::replace also handles insertions, so I think there is no need for the distinction here.

(See my comment in where ReplacementsToAtomicChanges is used; I think the class might not be needed.)

290

I might be missing something, but why do we need to pull replacements from the result change into a set of changes?

317

By default, this inserts R after the existing insertion in the same location (if there is any). But it's impossible for apply-all-replacements to know in which order they should be applied, so I would suggest treating this as conflict. Sorry that I might have confused you by asking to add a test case where two identical insertions are both applied (because they are order *independent*), but AtomicChange::replace also supports that and would report error when two insertions are order *dependent*.

AtomicChange::replace and AtomicChange::insert have almost the same semantic except that you could specify the expected order of insertions on the same location with insert, but in our case, we wouldn't know which order is desired.

353

Sorry that I didn't make myself clear... what you had in the previous revision was correct (for the current use case of apply-all-replacements) i.e. store all replacements in one AtomicChange, which allows you to detect conflicts on the fly. So the code here can be simplified as:

...
Entry = ...;
AtomicChange FileChange;
for (const auto &R : FileAndReplacements.second) {
  auto Err = FileChange.replace( <args from R> );
  if (Err)
    reportConflict(Entry, std::move(Err));  // reportConflict as we go
}
FileChanges.emplace(Entry, {FileChange});
...

I think with this set up, you wouldn't need ReplacementsToAtomicChanges, ConflictError and reportConflicts.

test/clang-apply-replacements/Inputs/basic/file2.yaml
1

Could you please add two test cases for insertions: 1) identical insertions (e.g. "AB" and "AB") at the same location are applied sucessfully and 2) order-dependent insertions (e.g. "AB" and "BA") are detected.

(It might make sense to do this in a different file)

test/clang-apply-replacements/Inputs/conflict/expected.txt
8

How could one replacement (Remove 12:3-12:14) conflict with itself?

jdemeule added inline comments.Mar 12 2018, 11:48 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
213–221

Yes, for sure. I was not confident about this part.

290

I interpret you suggestion of set of AtomicChange a little bit too much.

353

I think I have over-interpreting your previous comment :-)
However, I am still confused about error reporting.
Currently, clang-apply-replacements reports conflicting replacement per file and per conflict.
For example:

There are conflicting changes to XXX:
The following changes conflict:
  Replace 8:8-8:33 with "AAA"
  Replace 8:8-8:33 with "BBB"
  Remove 8:10-8:15
  Insert at 8:14 CCC

With this patch, conflict are reported by pair (first replacement/conflicted one) and I am able to print the corresponding file only once (thanks to the defered reporting).

There are conflicting changes to XXX:
The following changes conflict:
  Replace 8:8-8:33 with "AAA"
  Replace 8:8-8:33 with "BBB"
The following changes conflict:
  Replace 8:8-8:33 with "AAA"
  Remove 8:10-8:15
The following changes conflict:
  Replace 8:8-8:33 with "AAA"
  Insert at 8:14 CCC

I prefer the way you suggest to report conflict but we will loose or print conflicting file at each conflict detected.
I even more prefer to use llvm::toString(std::move(Err)) but this will fully change the reporting and will also be reported by pair.

The new replacement overlaps with an existing replacement.
New replacement: XXX: 106:+26:"AAA"
Existing replacement: XXX: 106:+26:"BBB"
The new replacement overlaps with an existing replacement.
New replacement: XXX: 106:+26:"AAA"
Existing replacement: XXX: 112:+0:"CCC"
The new replacement overlaps with an existing replacement.
New replacement: XXX: 106:+26:"AAA"
Existing replacement: XXX: 108:+12:""
test/clang-apply-replacements/Inputs/basic/file2.yaml
1

I will.

test/clang-apply-replacements/Inputs/conflict/expected.txt
8

It is not the case, they are reported by pair now.

jdemeule updated this revision to Diff 138504.Mar 15 2018, 1:51 AM

Update after review.
Add tests to check identical insertions and order dependent insertions.

ioeric added inline comments.Mar 15 2018, 3:07 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
353

I am inclined to changing the current behavior and reporting by pairs because

  1. this would simplify the code a lot.because we could basically reuse the conflict detection from replacement library.
  2. IMO, pairs are easier to read than grouping multiple conflicts - users would still need to interpret the conflicts and figure out how all replacements conflict in a group, while reporting as pairs already gives you this information.
  3. in practice, it's uncommon to see more than two conflicting replacements at the same code location.

I would vote for the last approach (with llvm::toString(std::move(Err))) which is easy to implement and doesn't really regress the UI that much. WDYT?

jdemeule added inline comments.Mar 15 2018, 4:06 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
353

I think if we can use llvm::toString(std::move(Err)) for this patch, it will simplify the code and reuse already existing error message. Even if I found file+offset conflict location less readable than file+line+column.

I have made some prototype to "enhance" error reporting but I think they should be put in dedicated patches and need further discutions.
During my "research" I found we can use:

  • Conflict marker format.
/src/basic.h
@@ 12:23-12:23 @@
<<<<<<< Existing replacement
  virtual void func() noexcept {}
=======
  virtual void func() override {}
>>>>>>> New replacement
  • A unified diff like.
/src/basic.h
@@ 12:23-12:23 @@
-   virtual void func() {}
+   virtual void func() noexcept {}
+   virtual void func() override {}
  • A clang like diagnostic message.
/src/basic.h
@@ 12:23-12:23 @@
  virtual void func() {}
                      ^
                      noexcept 
  virtual void func() {}
                      ^
                      override
ioeric added inline comments.Mar 15 2018, 4:20 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
353

The conflict marker format looks promising!

You are right regarding offset vs line+column location. I agree offset is less readable. I think we could probably build offset -> line+column translation into ReplacementError if necessary. For example, we could add a pretty-print method/helper that takes in source code and outputs readable error messages with offset translated to line+columns according to the source code, which is basically what the current replaceConflict function does. Ideally, that would also handle/consume the underlying errors so that we could switch to use that instead of llvm::toString.

jdemeule added inline comments.Mar 16 2018, 6:15 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
353

Do you prefer to add the transformation offset -> line+column in this patch or in a future one (with an option to switch between line report and conflict marker for example)?

ioeric added inline comments.Mar 16 2018, 6:50 AM
clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
353

I think it would be easier to have those in a different patch. For this patch, reporting conflict pairs with either offsets or line+column (by using the transformation that is already there) would be fine (with a FIXME maybe).

(These are all great improvements! Thanks for doing this!)

jdemeule updated this revision to Diff 139082.Mar 20 2018, 12:31 AM

Update after review.
Change error reporting to use ReplacementError message.

jdemeule marked 19 inline comments as done.Mar 20 2018, 12:33 AM

This looks great! Just a few nits left.

clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h
75

If conflicts occur, no Replacements are applied. This doesn't seem to be accurate; non-conflicting replacements are still added.

clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
207

nit: I'd probably use auto GroupedReplacements = ...;.

223

I think we are only skipping conflicts; non-conflicting replacements are still added even if the file contains other conflicts?

This doesn't seem to have caused any problem because the current caller simply drops all changes if conflict is detected in any file, but we should still make it clear what the behavior of mergeAndDeduplicate is (in the API doc) e.g. what would FileChanges contain if there is conflict in some but not all files?

230–248

nit: redundant empty line.

400

nit: consider inline this to save one variable.

jdemeule added inline comments.Mar 20 2018, 2:04 PM
clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h
75

You are perfectly right the description does not match the behavior.

I think I will update this part to "Deduplicate, check for conflicts, and convert all Replacements stored in TUs to a single AtomicChange per file. Conflicting replacement are skipped. However, the order of converting Replacements extracted from TUs and TUDs to AtomicChange is undefined."

And amend FileChanges with "Only non conflicting replacements are kept into FileChanges."

clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
223

I will update this comment to:
"To keep current clang-apply-replacement behavior, report conflicting replacements on corresponding file, all replacements are stored into 1 big AtomicChange."

jdemeule updated this revision to Diff 139889.Mar 26 2018, 11:26 PM

Update after review (cleanup and improve comments).

jdemeule marked 2 inline comments as done.Mar 26 2018, 11:26 PM
ioeric accepted this revision.Mar 27 2018, 7:25 PM

lgtm

clang-apply-replacements/include/clang-apply-replacements/Tooling/ApplyReplacements.h
75

a single AtomicChange per file

As FileChanges are a FileToChangesMap, it's already clear that it's a set of changes per file. We don't want to say "single AtomicChange" here because this is just an implementation details.

clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp
223

I would drop the "To keep the current behavior" part because it's only relevant in code review. Future readers wouldn't know what the "current" behavior is.

This revision is now accepted and ready to land.Mar 27 2018, 7:25 PM

Thanks a lot!
Can I ask you if I need to do something as I do not have commit access?

Can I help in some way to close this PR?

Can I help in some way to close this PR?

I applied the patch and ran the tests.

There was one unexpected failure:

[100%] Running the Clang extra tools' regression tests
FAIL: Clang Tools :: clang-apply-replacements/conflict.cpp (10 of 896)
******************** TEST 'Clang Tools :: clang-apply-replacements/conflict.cpp' FAILED ********************
Script:
--
mkdir -p /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict
sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file1.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file1.yaml
sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file2.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file2.yaml
sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/file3.yaml > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/file3.yaml
sed "s#\$(path)#/mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict#" /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/expected.txt > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/expected.txt
not clang-apply-replacements /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt 2>&1
diff -b /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/expected.txt
ls -1 /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict | FileCheck /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/conflict.cpp --check-prefix=YAML
not clang-apply-replacements /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict -remove-change-desc-files > /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/output.txt 2>&1
ls -1 /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict | FileCheck /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/conflict.cpp --check-prefix=NO_YAML
--
Exit Code: 1

Command Output (stdout):
--
2,4d1
< New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)"
< Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""
< The new replacement overlaps with an existing replacement.
11a9,11
> Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""
> The new replacement overlaps with an existing replacement.
> New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)"

--

********************
Testing Time: 21.09s
********************
Failing Tests (1):
    Clang Tools :: clang-apply-replacements/conflict.cpp

  Expected Passes    : 893
  Unsupported Tests  : 2
  Unexpected Failures: 1

Thank you for applying the patch.

I suspect some undefined order when applying replacements as directly dependant in which order OS reads file1.yaml, file2.yamland file3.yaml.
As I am not able to reproduce it on my machine, can you share with me the content of output.txt?

I suspect some undefined order when applying replacements as directly dependent on which order OS reads file1.yaml, file2.yamland file3.yaml.

$ ls -f /mparsons/llvm/llvm-build/tools/clang/tools/extra/test/clang-apply-replacements/Output/Inputs/conflict/
file1.yaml .. output.txt . file3.yaml file2.yaml expected.txt

As I am not able to reproduce it on my machine, can you share with me the content of output.txt?

The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+0:"(int*)"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 106:+26:"int & elem : ints"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 106:+26:"auto & i : ints"
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 140:+7:"elem"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 140:+7:"i"
The new replacement overlaps with an existing replacement.
New replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 169:+1:"nullptr"
Existing replacement: /mparsons/llvm/llvm/tools/clang/tools/extra/test/clang-apply-replacements/Inputs/conflict/common.h: 160:+12:""

Ok, I understand the problem.
Previously, during the deduplication step, replacements per file where sorted and it is not the case anymore.
That means now, clang-apply-replacement is highly dependant of reading file order during error reporting.
I mainly see 2 options (I can miss others), rewrite the test (e.g merging yaml file together) or sort the replacements per file to fix the order of application.

Can I ask you what solution you prefer?

Ok, I understand the problem.
Previously, during the deduplication step, replacements per file where sorted and it is not the case anymore.
That means now, clang-apply-replacement is highly dependant of reading file order during error reporting.
I mainly see 2 options (I can miss others), rewrite the test (e.g merging yaml file together) or sort the replacements per file to fix the order of application.

Can I ask you what solution you prefer?

I'd prefer sorting, so that results will be consistent when the tool is used on real data.

jdemeule updated this revision to Diff 141969.Apr 11 2018, 2:51 AM
jdemeule marked 3 inline comments as done.

Sort replacements before converting them to AtomicChange.

This revision was automatically updated to reflect the committed changes.

Third commit introducing linking errors.
Did no buildbot catch those?

Third commit introducing linking errors.
Did no buildbot catch those?

2 buildbots sent me an email, but they were unrelated failures.

unittests/clang-apply-replacements/ApplyReplacementsTest.cpp