This is an archive of the discontinued LLVM Phabricator instance.

[flang] Initial UTF-8 support in runtime I/O
Closed, Public

Authored by klausler on Mar 18 2022, 1:48 PM.

Details

Summary

Implements UTF-8 encoding and decoding for external units
with OPEN(ENCODING='UTF-8'). This encoding applies to default
CHARACTER values that are not 7-bit ASCII as well as to
the wide CHARACTER kinds 2 and 4. Basic testing is in place
via direct calls to the runtime I/O APIs, but serious checkout
awaits lowering support of the wide CHARACTER kinds.
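For readers unfamiliar with the byte layout involved, here is a minimal sketch of standard UTF-8 encoding (RFC 3629) for a single code point. It is not the code from this revision: the helper name AppendUTF8 and its interface are hypothetical, and the runtime's actual handling of invalid or out-of-range values may differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the flang runtime API): appends the UTF-8
// encoding of one code point to `out`. Returns the number of bytes
// appended, or 0 if `cp` is a surrogate or lies beyond U+10FFFF.
std::size_t AppendUTF8(std::string &out, char32_t cp) {
  if (cp < 0x80) {
    // 7-bit ASCII is emitted unchanged as a single byte.
    out += static_cast<char>(cp);
    return 1;
  }
  if (cp < 0x800) {
    // 110xxxxx 10xxxxxx
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
    return 2;
  }
  if (cp < 0x10000) {
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      return 0; // UTF-16 surrogates are not valid scalar values
    }
    // 1110xxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
    return 3;
  }
  if (cp <= 0x10FFFF) {
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0;
}
```

Under this scheme a value below 0x80 passes through as one byte, which is why only CHARACTER values that are not 7-bit ASCII are affected by the encoding.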

Event Timeline

klausler created this revision. Mar 18 2022, 1:48 PM
klausler requested review of this revision. Mar 18 2022, 1:48 PM

The code looks good to me, but something might be wrong since the added test is failing.

flang/unittests/Runtime/ExternalIOTest.cpp (line 844)

This test is failing on the build bots.

klausler added inline comments. Mar 21 2022, 9:19 AM
flang/unittests/Runtime/ExternalIOTest.cpp (line 844)

Yes, I see; will figure it out.

klausler updated this revision to Diff 417114. Mar 21 2022, 3:07 PM

Fix bugs exposed by testing in other environments

jeanPerier accepted this revision. Mar 22 2022, 2:49 AM

LGTM

flang/runtime/utf.cpp (line 14)

You could put "// clang-format off" and "// clang-format on" around UTF8FirstByteTable so that running clang-format cannot destroy the table's human-readable layout.
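To illustrate the suggestion, here is a minimal sketch of guarding a hand-aligned table with clang-format's off/on comments. The table below is hypothetical and deliberately simplified; it is not the contents of UTF8FirstByteTable, and the names lengthByHighNibble and SequenceLength are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// clang-format off
// Hypothetical table (not UTF8FirstByteTable): expected sequence length,
// indexed by the high nibble of the first byte. 0 marks a continuation
// byte, which cannot start a sequence. Simplification: every 0xF_ byte is
// treated as a 4-byte lead, including the invalid 0xF8..0xFF range.
static constexpr std::uint8_t lengthByHighNibble[16]{
  /* 0x0_ - 0x7_ */ 1, 1, 1, 1, 1, 1, 1, 1,
  /* 0x8_ - 0xB_ */ 0, 0, 0, 0,
  /* 0xC_ - 0xD_ */ 2, 2,
  /* 0xE_        */ 3,
  /* 0xF_        */ 4,
};
// clang-format on

// Looks up the expected length of the sequence that starts with `first`.
inline std::size_t SequenceLength(unsigned char first) {
  return lengthByHighNibble[first >> 4];
}
```

Without the two guard comments, clang-format would typically reflow the initializer to fill each line, and the row-per-range grouping would be lost.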

This revision is now accepted and ready to land. Mar 22 2022, 2:49 AM
klausler updated this revision to Diff 417349. Mar 22 2022, 11:15 AM

Silence messages about clang-format for a table that shouldn't be automatically reformatted.

This revision was automatically updated to reflect the committed changes.