This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Minimize the damage caused by AlignArrayOfStructures when working on non square arrays
ClosedPublic

Authored by MyDeveloperDay on Mar 6 2022, 1:59 AM.

Details

Summary

I have lost count of the number of times this has been reported, but it fundamentally comes down to the fact that the "AlignArrayLeft/Right" function is fundamentally broken for non-square arrays.

As a result, a pointer can end up running off the end of the array structure, I've spent the last 2 weekends trying to rewrite this algorithm but I've struggled to get it aligned correctly.

This is an interim fix, that ignores all array that are non-square and leaves them alone. I think this can allow us to close out most of the crashes (if not all).

I think this can help reduce the number of bugs coming in that are duplicates.

Fixes https://github.com/llvm/llvm-project/issues/53748
Fixes https://github.com/llvm/llvm-project/issues/51767
Fixes https://github.com/llvm/llvm-project/issues/51277

Diff Detail

Event Timeline

MyDeveloperDay created this revision.Mar 6 2022, 1:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 6 2022, 1:59 AM
MyDeveloperDay requested review of this revision.Mar 6 2022, 1:59 AM

Update to always use the first row number of columns or unit tests crash

Add more producers

MyDeveloperDay edited the summary of this revision. (Show Details)
curdeius added inline comments.Mar 6 2022, 5:29 AM
clang/lib/Format/WhitespaceManager.h
204

Don't really like the name...
Some suggestions:

  • HasEqualRows
  • HasEqualRowLengths
  • HasSameRowLengths
  • IsRectangular
clang/unittests/Format/FormatTest.cpp
25346

Not sure I understand the comment here, this formatting looks ok, no?

25377

This one is "square", isn't it?
I think you wanted to omit 3 in the first row

curdeius edited the summary of this revision. (Show Details)Mar 6 2022, 5:33 AM
MyDeveloperDay updated this revision to Diff 413298.EditedMar 6 2022, 7:12 AM

Fix review comments and clang-format

NOTE: part of me wants to remove this feature completely, it doesn't work correctly in a number of corner cases.

This is really a compromise to stabilize its use.

There should also be a warning or at least note on the option description about the current limitations.

clang/lib/Format/WhitespaceManager.h
202–203

This should also be adapted.
And a full stop at the end.

MyDeveloperDay marked 4 inline comments as done.

Lets get the comment aligned with the code

MyDeveloperDay added a comment.EditedMar 6 2022, 7:42 AM
NOTE: I've tried to collate all the reported crashing examples and run this fix through them (both Left and Right) all pass except this one below
void foo()
{
auto thing = test{
    {
       {13},
       {something}, //A
    }
  };
}

This is not related to it being non-square because it's not.

clang/unittests/Format/FormatTest.cpp
25346

The test is testing that it's not changing it.. with the old code it would (but incorrectly or crash)

curdeius accepted this revision.Mar 6 2022, 8:08 AM

Nits only.
+1 for adding a note in the option documentation.

clang/lib/Format/WhitespaceManager.h
203
clang/unittests/Format/FormatTest.cpp
25396

Nit: should be updated as above I guess.

This revision is now accepted and ready to land.Mar 6 2022, 8:08 AM
MyDeveloperDay marked 2 inline comments as done.

There were no checks for if the cell would run off the end of the CellDescs structure, this is part of the reason for the crashes that the PrevIter->Index becomes invalid and is then used to index into the Changes.

The code just assumed it knew how many rows and columns there were, but in some circumstances it gets it wrong and just keeps going. This MaxRows check should at least prevent that by ensuring we just don't keep going off the end of the rows. (this was the final crash)

Added a warning to the documentation.

MyDeveloperDay added inline comments.Mar 6 2022, 8:41 AM
clang/lib/Format/WhitespaceManager.cpp
1117

I don't understand in Left alignment why we ignore the first cell, but in right alignment, we don't!

clang/lib/Format/WhitespaceManager.h
319

Ultimately these calculations are incorrect unless every CellCount for every row is the same, and for non-square they are not, this always had me confused as to why it starts at 1?

feg208 accepted this revision.Mar 6 2022, 8:44 AM

I have been watching the issues pile up around this but work has prevented me from focusing on them. This seems like a reasonable compromise but the note of disabling this feature is also reasonable. Given where it had to be placed in the formatting flow I suspect it will be challenging to resolve most of the troublesome corner cases. I did investigate one bug I picked up but ran into the problems you described earlier and was going down the same compromise path.

MyDeveloperDay added a comment.EditedMar 6 2022, 8:51 AM

@feg208 I could do with some clarity on the algorithm, am I correct in thinking it requires that the first row, to contain at least the maximum number of columns for the rest of the array. I sort of noticed it was fine if the first row was larger

{
    {1,2,3 }
    {4,5}
    {6}
}

This seems like a reasonable compromise

I kind of feel this might be the best initial option, I don't want to disable it completely because it looks like people are trying to use it, (and it seems to work for the square array case).

At present, I want to stop the "crash" haemorrhaging.

curdeius added inline comments.Mar 6 2022, 10:44 AM
clang/lib/Format/WhitespaceManager.h
317

You can elide braces. Personally I use RemoveBraces locally.

feg208 added a comment.Mar 6 2022, 2:32 PM

@feg208 I could do with some clarity on the algorithm, am I correct in thinking it requires that the first row, to contain at least the maximum number of columns for the rest of the array. I sort of noticed it was fine if the first row was larger

{
    {1,2,3 }
    {4,5}
    {6}
}

Honestly it really is best with the square option. The assumption is an array of simple initializers though default args to constructors complicate that in the case of an array of single constructors. I think that you are seeing it be okay with the maximum as the first row is just an accident of the implementation. The direction I was heading in in the bug I picked up was to start restricting the set of things it would consider as array of initializers. Does that answer your question?

owenpan added inline comments.Mar 6 2022, 2:49 PM
clang/lib/Format/WhitespaceManager.cpp
1073

Can we use CellCounts[0] instead? The outer parentheses are superfluous. Same in alignArrayInitializersLeftJustified below.

1117

Because the first cell is already left-aligned by default?

clang/lib/Format/WhitespaceManager.h
204

My top choice by far is isRectangular.

211

We can omit the braces here.

317

Why do we need this check? Is it because CellStop might not be null terminated? Or is it because CellCount may be zero? If the latter, we can check CellCount before the loop instead of this check in the loop.

clang/unittests/Format/FormatTest.cpp
25353

Please remove \n here and in other places below when it's the last line of a test case.

NOTE: I've tried to collate all the reported crashing examples and run this fix through them (both Left and Right) all pass except this one below
void foo()
{
auto thing = test{
    {
       {13},
       {something}, //A
    }
  };
}

This is not related to it being non-square because it's not.

I think had a simple fix for this. I will revisit it after you land this patch.

clang/lib/Format/WhitespaceManager.h
202–203

Do they fit in one line?

owenpan added inline comments.Mar 6 2022, 3:05 PM
clang/lib/Format/WhitespaceManager.h
317

Ok, you already explained why you added the check, but I still wonder what causes it.

owenpan added inline comments.Mar 6 2022, 3:12 PM
clang/lib/Format/WhitespaceManager.h
319

Ultimately these calculations are incorrect unless every CellCount for every row is the same, and for non-square they are not, this always had me confused as to why it starts at 1?

Because the loop starts at CellStop->NextColumnElement, the 2nd row of the column?

MyDeveloperDay marked 10 inline comments as done.

Fixing final review comments prior to commit

This revision was landed with ongoing or failed builds.Mar 12 2022, 9:24 AM
This revision was automatically updated to reflect the committed changes.