This is an archive of the discontinued LLVM Phabricator instance.

Add support for foreach macros to clang-format
Needs ReviewPublic

Authored by briangreenery on Mar 2 2014, 1:48 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

I added a new configuration option to clang-format called "ForEachMacros". This
allows macros that work like a foreach loop, like BOOST_FOREACH, to be formatted
correctly. Currently they're formatted like a function call.

This would fix:

http://llvm.org/bugs/show_bug.cgi?id=18534

Which these bugs could then be duped on:

http://llvm.org/bugs/show_bug.cgi?id=17518
http://llvm.org/bugs/show_bug.cgi?id=17242

Diff Detail

Event Timeline

Would be neat if the option could take regular expressions. The linux kernel contains many foreach macros and it would be nice to concisely match them. :)

lib/Format/FormatToken.cpp
67 ↗(On Diff #7460)

Can't this be std::find(std::begin(Style.ForEachMacros), std::end(Style.ForEachMacros), Name) != std::end(Style.ForEachMacros); ?

briangreenery updated this revision to Unknown Object (????).Mar 4 2014, 1:17 AM

Change ForEachMacros to be a regex instead of a list of strings

Using a regex instead of a list of strings sounds like a much better idea.

I also added another new configuration option SBPO_ControlStatementsAndForEachMacros since it seems like some people put spaces after foreach macros (like the example in bug 17242, bug 18534, and the code I normally work on), and some people don't (like the linux kernel).

Is there something I can do to help get this reviewed?

I don't think using regular expressions is the right way to go here:

  • We'll want to precisely control which macros are a "foreach". Otherwise, it is too likely to accidentally come up with a method name that matches the regex.
  • Regular expressions have a performance impact, especially in the way they are used here (not precomputed once per token).

I would argue for making this a (e.g. comma-separated) list of macros, which gets parsed into the identifier table used to lookup most of the other identifiers.

docs/ClangFormatStyleOptions.rst
311

Do we really want to treat those separate from control statements? I don't see a good/valid reason for doing so...

@djasper, there are over 500 such macros in the linux kernel. I think a regex is more appropriate than a complete list of such macros.

  1. I don't.
  2. How would you verify that your regex addresses all of those 500 macros.
  3. If you are loose enough, how do you guarantee that you don't also include a function that is not a foreach macro?
  4. Are there really 500 different foreach macros? I couldn't come up with that many different ways of spelling it.

Here is a small subset of such macros to give you an example:
list_for_each_cookie
list_for_each_entry
list_for_each_entry_continue
list_for_each_entry_continue_rcu
list_for_each_entry_continue_reverse
list_for_each_entry_from
list_for_each_entry_rcu
list_for_each_entry_reverse
list_for_each_entry_safe
list_for_each_entry_safe_continue
list_for_each_entry_safe_from
list_for_each_entry_safe_reverse
list_for_each_from
list_for_each_prev
list_for_each_prev_safe
list_for_each_safe
list_for_each_xattr
radix_tree_for_each_chunk
radix_tree_for_each_chunk_slot
radix_tree_for_each_contig
radix_tree_for_each_slot
radix_tree_for_each_tagged

I used the following to find these:
git grep for_each | grep define | grep -P "\w+_for_each\w+" -o | sort -u

That macro gave me no false positives.

  • Not all of those qualify for what this CL implements. They don't expect a variable declaration as the first parameter and don't expect users to add a space between macro name and opening parenthesis. These are false positives (I think actually almost all of the ones you are mentioning).
  • The regex has false positives, e.g. in the git codebase, which defines functions with such names.
briangreenery updated this revision to Unknown Object (????).Mar 30 2014, 11:30 PM

I tried to go with the approach that djasper outlined: support a static list of macros parsed into an identifier table used for lookup.

I'm sorry for the delay in updating this.

A few comments, but otherwise this looks good.

Thank you for working on this. Do you have commit access?

lib/Format/UnwrappedLineParser.cpp
663

Don't fall through implicitly, add "break;".

1051

Nit: 'for', 'while' or foreach macro expected.

lib/Format/UnwrappedLineParser.h
88

I don't see any reason to rename this. ForEachLoops are for loops (which also include range-based for loops).

briangreenery updated this revision to Unknown Object (????).Mar 31 2014, 11:54 PM

Okay, I made those changes. I don't have commit access.

It looks like phabricator only picked up one of the changes to FormatTest.cpp. Here's the diff generated by 'svn diff', which has all changes in it.{F51395}

Added basic testing and minor tweaks and submitted as r205307.

Thank you!