This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Enable FormatTokenSource to insert tokens.

Authored by klimek on Feb 1 2023, 7:17 AM.



In preparation for configured macro replacements in formatting,
add the ability to insert tokens to FormatTokenSource, and implement
token insertion in IndexedTokenSource.

Diff Detail

Event Timeline

klimek created this revision.Feb 1 2023, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2023, 7:17 AM
klimek requested review of this revision.Feb 1 2023, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2023, 7:17 AM

Only serious concern is getPreviousToken().


As I understand it, this parameter is used:

  • to provide the initial set of input tokens the source will iterate over
  • as a common address space for input + synthesized tokens, to allow the jump mechanism to work
  • to make the caller responsible for ownership/lifetime of the synthesized tokens too

This simplifies the implementation, my only problem with this is it seems unusual and confusing.
A comment explaining the roles of this Tokens param would help a bit.

Alternatively you could consider slightly different data structures just for the purpose of making interfaces more obvious: e.g. pass in a BumpPtrAllocator&, allocate scratch tokens there, and use pointers instead of indices (jumps become a ptr->ptr map etc)


this no longer seems valid, immediately after calling insertTokens(), Position is at the start of a jump range, and Tokens[Position - 1] will be the EOF at the end of the main stream or previous jump range.

if this is never going to happen, you can detect it (I don't think Tokens[Position - 1] can be EOF in any other way)


maybe add a comment that positions don't have meaningful order and this is only useful to restore the position?

(All the callsites look good, it just feels like a trap)


nit: move into the LLVM_DEBUG block below? or are you worried about a crash?


nit: const

I wasn't sure if this method advanced or not. "successor" might be clearer in this respect



klimek updated this revision to Diff 497415.Feb 14 2023, 12:57 PM
klimek marked 3 inline comments as done.

Address reviewer comments.


Noticing none of this makes sense. We should really just copy the tokens, given that we modify the vector.


Added comment in the interface and assert.


Added a comment.

klimek updated this revision to Diff 497416.Feb 14 2023, 1:03 PM

Undo changes to ownership of initial set of FormatTokens.

sammccall accepted this revision.Feb 15 2023, 2:36 AM
sammccall added inline comments.

Oops, somehow I missed that this was an array of *pointers* and assumed that copying it was no good. This is much better.


I'm not really sure how to read the combination of these two FIXMEs... does it mean "we wanted to do this differently one day, but now we never can"?

Maybe either delete both, or if you think it's still potentially actionable, something like FIXME: it would be better to have tokens created and owned outside because XYZ, but it's hard because macro expansion mutates the stream

(I don't really understand what the prev comment is about: the tokens *are* handed into the UnwrappedLineParser constructor. So I may be off base here)


nit: parens on (Tok) are redundant given Tok is a local rather than the macro arg

This revision is now accepted and ready to land.Feb 15 2023, 2:36 AM
klimek updated this revision to Diff 497637.Feb 15 2023, 4:38 AM
klimek marked an inline comment as done.

Address review comments.


Yeah, I think I don't fully understand what I wanted to fix, so deleted both.

This revision was landed with ongoing or failed builds.Feb 15 2023, 4:43 AM
This revision was automatically updated to reflect the committed changes.
owenpan added a project: Restricted Project.Mar 29 2023, 5:32 PM