This is an archive of the discontinued LLVM Phabricator instance.

[libc++][format] Fixes UTF-8 continuation.
ClosedPublic

Authored by Mordante on May 2 2023, 10:59 AM.

Details

Reviewers
ldionne
tahonermann
Group Reviewers
Restricted Project
Commits
rG09addf9cbe0a: [libc++][format] Fixes UTF-8 continuation.
Summary

The mask used to check whether a code unit is a valid continuation was
incorrect and accepts non-continuation code points. This fixes the
issue.

Diff Detail

Event Timeline

Mordante created this revision.May 2 2023, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2023, 10:59 AM
Mordante requested review of this revision.May 2 2023, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2023, 10:59 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript

@tahonermann when you have time can you have a look?

tahonermann accepted this revision.Jun 2 2023, 2:40 PM

The change looks right to me and the added tests look good. I left a couple of suggested edits with comments for you to do with as you please.

libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
506–511

I'm not sure what purpose is served by the reference to PR-121 here. The test appears to demonstrate that code unit values are preserved as escaped values which, I think, is the right behavior according to the C++ standard, but is not a match for any of the PR-121 policies.

libcxx/test/std/utilities/format/format.functions/unicode.pass.cpp
272–274

I eventually figured out how the mask in the comment correlated with the format string and why each was relevant, but I wouldn't expect many people to understand the intended relevance without a more descriptive comment. I suggest just removing these comments.

Mordante marked 2 inline comments as done.Jun 3 2023, 6:24 AM

Thanks for the review!

libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
506–511

That is a left over debug message for me, removed it.

Mordante updated this revision to Diff 528102.Jun 3 2023, 6:25 AM
Mordante marked an inline comment as done.

Rebased and address review comments.

tahonermann accepted this revision.Jun 5 2023, 8:39 AM

Looks good!

Mordante updated this revision to Diff 532198.Jun 16 2023, 9:25 AM

Rebased so D150044 can be rebased.

Mordante updated this revision to Diff 532249.Jun 16 2023, 11:33 AM

Rebased to fix CI.

ldionne accepted this revision.Jun 20 2023, 8:42 AM
This revision is now accepted and ready to land.Jun 20 2023, 8:42 AM
This revision was automatically updated to reflect the committed changes.