This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Improve updating data files.
ClosedPublic

Authored by Mordante on Jul 13 2022, 11:04 AM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project
Commits
rG130b1816c54d: [libc++] Improve updating data files.
Summary

This changes makes it easier to update the Unicode data files used for
the Extended Graphme Clustering as added in D126971.

Diff Detail

Event Timeline

Mordante created this revision.Jul 13 2022, 11:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2022, 11:04 AM
Mordante added inline comments.
libcxx/utils/CMakeLists.txt
2

For developers and the CI a hard requirement is fine. But this might not be good for vendors.
We could make the target using formatting conditional on whether or not the binary is found.
That would however mean that if clang-format isn't found the CI will pass.

So I'm considering to make all these targets to depend on clang-format.
@ldionne Let's discuss what the best approach is.

Mordante updated this revision to Diff 444369.Jul 13 2022, 11:53 AM

Changed the properties without updating the tables.
This should result in a failing CI run.

Mordante updated this revision to Diff 444377.Jul 13 2022, 12:13 PM

Finalize the patch for review.

Mordante published this revision for review.Jul 15 2022, 8:40 AM
Mordante added a reviewer: ldionne.
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2022, 8:40 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne requested changes to this revision.Jul 26 2022, 9:38 AM

I really like this except for the clang-format requirement -- thanks!

libcxx/utils/CMakeLists.txt
2

I don't think we want to add this hard requirement. Is it possible to tweak the generating scripts a tiny bit to obtain slightly (even though not perfect) output?

16
19–21

I would instead use the style that we use in other scripts and hardcode the path to the file that we generate in generate_extended_grapheme_cluster_table.py. I know it's not as modular as outputting to stdout and I would never recommend that for a general purpose tool, however I'd like this to work on Windows, and I'm pretty sure the current > in the CMake target won't. If you can confirm that redirection will work on Windows (I wasn't able to after a quick search), then I'd be fine with this approach as well.

28
libcxx/utils/data/unicode/README.txt
1 ↗(On Diff #444377)

Also, can you please include a copy-pasteable command to update all of those files at once? Something like a curl invocation.

This is helpful to quickly update those files without having to think too much.

This revision now requires changes to proceed.Jul 26 2022, 9:38 AM
Mordante marked 4 inline comments as done.Aug 3 2022, 10:20 AM

Thanks for the review comments!

libcxx/utils/CMakeLists.txt
2

The generating script's output has been updated in D126971.

19–21

I really dislike hard-coding the output path in this script. Neither do I have access to Windows to validate whether > works.

Instead I will change the script, when it's called with one argument it will use that argument as filename to write to. Otherwise it will still write to stdout.

Mordante updated this revision to Diff 449700.Aug 3 2022, 10:23 AM
Mordante marked an inline comment as done.

Rebased and addresses review comments.

ldionne accepted this revision.Aug 16 2022, 9:21 AM
This revision is now accepted and ready to land.Aug 16 2022, 9:21 AM
This revision was automatically updated to reflect the committed changes.