This is an archive of the discontinued LLVM Phabricator instance.

[YAMLTraits] Add trait for char
ClosedPublic

Authored by JDevlieghere on May 11 2020, 1:56 PM.

Details

Summary

Add a YAML trait for char.

Diff Detail

Event Timeline

JDevlieghere created this revision.May 11 2020, 1:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2020, 1:56 PM

Test formatting

thegameg accepted this revision.May 11 2020, 3:06 PM

LGTM.

This revision is now accepted and ready to land.May 11 2020, 3:06 PM
This revision was automatically updated to reflect the committed changes.
ro added a subscriber: ro.May 13 2020, 4:40 AM

This change broke the Solaris buildbots:

[2/2416] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/AMDGPUMetadata.cpp.o
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/AMDGPUMetadata.cpp.o 
/usr/gcc/7/bin/c++  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/lib/Support -Iinclude -I/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/include -I/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/include/llvm/Support/Solaris -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -O3    -UNDEBUG -std=c++14  -fno-exceptions -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/AMDGPUMetadata.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/AMDGPUMetadata.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/AMDGPUMetadata.cpp.o -c /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/lib/Support/AMDGPUMetadata.cpp
In file included from /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/lib/Support/AMDGPUMetadata.cpp:17:0:
/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/include/llvm/Support/YAMLTraits.h:1211:8: error: redefinition of ‘struct llvm::yaml::ScalarTraits<char>’
 struct ScalarTraits<int8_t> {
        ^~~~~~~~~~~~~~~~~~~~
/opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/llvm/include/llvm/Support/YAMLTraits.h:1162:20: note: previous definition of ‘struct llvm::yaml::ScalarTraits<char>’
 template <> struct ScalarTraits<char> {
                    ^~~~~~~~~~~~~~~~~~

since int8_t is char here.

ro added a comment.May 15 2020, 12:25 AM

I haven't even received a response in two days, the buildbot breakage remains for four days now with no action whatsoever.

What's the way forward here? Reversal?

ro added a comment.May 21 2020, 8:44 AM

Given the Solaris buildbots had been broken for a week and a half, I had reverted the patch.

Unfortunately, this breaks the lldb-x86_64-debian buildbot:

FAILED: tools/lldb/source/Utility/CMakeFiles/lldbUtility.dir/Args.cpp.o 
/usr/bin/clang++  -DGTEST_HAS_RTTI=0 -DHAVE_ROUND -DLLDB_CONFIGURATION_RELEASE -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/lldb/source/Utility -I/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/source/Utility -Itools/lldb/source -I/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/include -Itools/lldb/include -Iinclude -I/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/include -I/usr/include/python3.7m -I/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/../clang/include -Itools/lldb/../clang/include -I/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/source/. -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-declarations -Wno-unknown-pragmas -Wno-strict-aliasing -Wno-deprecated-register -Wno-vla-extension -O3     -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT tools/lldb/source/Utility/CMakeFiles/lldbUtility.dir/Args.cpp.o -MF tools/lldb/source/Utility/CMakeFiles/lldbUtility.dir/Args.cpp.o.d -o tools/lldb/source/Utility/CMakeFiles/lldbUtility.dir/Args.cpp.o -c /home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/source/Utility/Args.cpp
In file included from /home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/source/Utility/Args.cpp:9:
In file included from /home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/include/lldb/Utility/Args.h:17:
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:1123:36: error: implicit instantiation of undefined template 'llvm::yaml::MissingTrait<char>'
  char missing_yaml_trait_for_type[sizeof(MissingTrait<T>)];
                                   ^
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:945:7: note: in instantiation of function template specialization 'llvm::yaml::yamlize<char>' requested here
      yamlize(*this, Val, Required, Ctx);
      ^
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:851:11: note: in instantiation of function template specialization 'llvm::yaml::IO::processKey<char, llvm::yaml::EmptyContext>' requested here
    this->processKey(Key, Val, true, Ctx);
          ^
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/source/Utility/Args.cpp:692:6: note: in instantiation of function template specialization 'llvm::yaml::IO::mapRequired<char>' requested here
  io.mapRequired("quote", keys->quote);
     ^
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/llvm/include/llvm/Support/YAMLTraits.h:307:8: note: template is declared here
struct MissingTrait;
       ^
1 error generated.

Any suggestions on how to move forward from here?

Hey Rainer,

I'm sorry I didn't see your comment earlier. Because of your e-mail address your replies skipped my inbox and ended up in a GCC folder... I'll need to fix that!

If the Solaris build bot is public then reverting is indeed fair game. To the correctness of this patch, if it's allowed for char and uint8_t to be the same type then this patch is wrong but I'd have to double check this stuff. Regardless, it seems like there's at least one platform where that's the case. Let me see if I can simply work around it and get lldb to build by casting to uint8_t.

ro added a comment.May 21 2020, 9:19 AM

Hey Rainer,

I'm sorry I didn't see your comment earlier. Because of your e-mail address your replies skipped my inbox and ended up in a GCC folder... I'll need to fix that!

Ah, that explains a lot ;-)

If the Solaris build bot is public then reverting is indeed fair game. To the correctness of this patch, if it's allowed for char and uint8_t to be the same type then this patch is wrong but I'd have to double check this stuff. Regardless, it seems like there's at least one platform where that's the case. Let me see if I can simply work around it and get lldb to build by casting to uint8_t.

The builders are at Builder clang-solaris11-sparcv9 and Builder clang-solaris11-amd64, so yes, they are public, although currently on
the silent buildmaster.

I've checked the gcc configs and of those, only on Solaris and BPF is int8_t plain char. Elsewhere they use explicit signed char or
unsigned char. Haven't checked what the C standard requires/allows here, but AFAIK Sun had a couple of guys on the C committee at
the time, so they should know what they're doing...

In D79745#2049287, @ro wrote:

Hey Rainer,

I'm sorry I didn't see your comment earlier. Because of your e-mail address your replies skipped my inbox and ended up in a GCC folder... I'll need to fix that!

Ah, that explains a lot ;-)

If the Solaris build bot is public then reverting is indeed fair game. To the correctness of this patch, if it's allowed for char and uint8_t to be the same type then this patch is wrong but I'd have to double check this stuff. Regardless, it seems like there's at least one platform where that's the case. Let me see if I can simply work around it and get lldb to build by casting to uint8_t.

The builders are at Builder clang-solaris11-sparcv9 and Builder clang-solaris11-amd64, so yes, they are public, although currently on
the silent buildmaster.

Okay, I was worried I had missed the build failure e-mails too.

I've checked the gcc configs and of those, only on Solaris and BPF is int8_t plain char. Elsewhere they use explicit signed char or
unsigned char. Haven't checked what the C standard requires/allows here, but AFAIK Sun had a couple of guys on the C committee at
the time, so they should know what they're doing...

The C++ standard [1] says that char, unsigned char and signed char are distinctive types. However, I doesn't specify how uint8_t is implemented. I think maybe the right thing to do is wrap the YAML trait in a ifdef that excludes Solaris and BPF. What do you think?

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf ([basic.fundamental])

ro added a comment.May 21 2020, 2:31 PM

I've checked the gcc configs and of those, only on Solaris and BPF is int8_t plain char. Elsewhere they use explicit signed char or
unsigned char. Haven't checked what the C standard requires/allows here, but AFAIK Sun had a couple of guys on the C committee at
the time, so they should know what they're doing...

The C++ standard [1] says that char, unsigned char and signed char are distinctive types. However, I doesn't specify how uint8_t is implemented. I think maybe the right thing to do is wrap the YAML trait in a ifdef that excludes Solaris and BPF. What do you think?

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf ([basic.fundamental])

TBH, feels like a hack to me: if int8_t can be either char (if signed) or signed char, the code should deal with both possibilities.

That said, it would certainly unbreak the Solaris builds which for now is my primary concern. AFAIK there's no lldb port to Solaris, though
I vaguely recall that there was an attempt in the Illumos (OpenSolaris derivative) community. On top of that, those static lists are quite
fragile IMO: something like this should be feature-based instead, but that is an all too common problem all over LLVM.

What about your idea of simply casting to int8_t in lldb instead?

In D79745#2050040, @ro wrote:

I've checked the gcc configs and of those, only on Solaris and BPF is int8_t plain char. Elsewhere they use explicit signed char or
unsigned char. Haven't checked what the C standard requires/allows here, but AFAIK Sun had a couple of guys on the C committee at
the time, so they should know what they're doing...

The C++ standard [1] says that char, unsigned char and signed char are distinctive types. However, I doesn't specify how uint8_t is implemented. I think maybe the right thing to do is wrap the YAML trait in a ifdef that excludes Solaris and BPF. What do you think?

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf ([basic.fundamental])

TBH, feels like a hack to me: if int8_t can be either char (if signed) or signed char, the code should deal with both possibilities.

It just seems like such a weird decision, if you must have to have 3 distinct types according to the standard, why would you choose char (for which it's implementation defined whether it can be negative or not) instead of unsigned char (which is guaranteed to be unsigned). Maybe I'm missing something?

That said, it would certainly unbreak the Solaris builds which for now is my primary concern. AFAIK there's no lldb port to Solaris, though
I vaguely recall that there was an attempt in the Illumos (OpenSolaris derivative) community. On top of that, those static lists are quite
fragile IMO: something like this should be feature-based instead, but that is an all too common problem all over LLVM.

What about your idea of simply casting to int8_t in lldb instead?

Given that the types are distinct this would require a reinterpret_cast which feels more hacky than ifdef'ing the trait for the "odd" choice in solaris.

ro added a comment.May 22 2020, 1:24 AM

The C++ standard [1] says that char, unsigned char and signed char are distinctive types. However, I doesn't specify how uint8_t is implemented. I think maybe the right thing to do is wrap the YAML trait in a ifdef that excludes Solaris and BPF. What do you think?

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf ([basic.fundamental])

TBH, feels like a hack to me: if int8_t can be either char (if signed) or signed char, the code should deal with both possibilities.

It just seems like such a weird decision, if you must have to have 3 distinct types according to the standard, why would you choose char (for which it's implementation defined whether it can be negative or not) instead of unsigned char (which is guaranteed to be unsigned). Maybe I'm missing something?

I've done some digging: <sys/int_types.h> has

/*
 * Basic / Extended integer types
 *
 * The following defines the basic fixed-size integer types.
 *
 * Implementations are free to typedef them to Standard C integer types or
 * extensions that they support. If an implementation does not support one
 * of the particular integer data types below, then it should not define the
 * typedefs and macros corresponding to that data type.  Note that int8_t
 * is not defined in -Xs mode on ISAs for which the ABI specifies "char"
 * as an unsigned entity because there is no way to define an eight bit
 * signed integral.
 */
#if defined(_CHAR_IS_SIGNED)
typedef char                    int8_t;
#else
#if defined(__STDC__)
typedef signed char             int8_t;
#endif
#endif

Both sparc and x86 define _CHAR_IS_SIGNED in accordance with the respective psABIs, and when those definitions were made,
Solaris needed to support both K&C and ANSI C. Once the definition was made this way, it had to stick, otherwise e.g. C++ mangling
changes.

That's what you get on a system that has been around for almost 3 decades :-)

That said, it would certainly unbreak the Solaris builds which for now is my primary concern. AFAIK there's no lldb port to Solaris, though
I vaguely recall that there was an attempt in the Illumos (OpenSolaris derivative) community. On top of that, those static lists are quite
fragile IMO: something like this should be feature-based instead, but that is an all too common problem all over LLVM.

What about your idea of simply casting to int8_t in lldb instead?

Given that the types are distinct this would require a reinterpret_cast which feels more hacky than ifdef'ing the trait for the "odd" choice in solaris.

However, it would work everywhere instead of leaving systems with char for int8_t in the cold.

Your choice, of course.

Btw., I happened to notice some formatting inconsistencies in your patch that you may want to correct.

I don't believe any solution based on conditional compilation (ifdefs, feature detection, etc.) would help here. The reason for that is that it's impossible to assign consistent semantics to char and uint8_t everywhere.

  • Currently, the "desired" semantics for uint8_t is to treat it as a number. In fact the implementation contains already contains a small hack to achieve it:
void ScalarTraits<uint8_t>::output(const uint8_t &Val, void *,
                                   raw_ostream &Out) {
  // use temp uin32_t because ostream thinks uint8_t is a character
  uint32_t Num = Val;
  Out << Num;
}
  • The proposed desired semantics for char is to treat it as a character.

Both of these semantics are "reasonable", but (as we've seen) they are not actually implementable in C(++). Conditionally compiling ScalarTraits<char> will make things build on solaris, but it will mean that the yaml serialization of char variables will be different there. That does not really matter for the current lldb use case, but generally speaking, it's not a good situation to be in.

For the time being, I think it's perfectly reasonable to work around this problem in lldb. If you don't like reinterpret_casts (which are fairly safe in this case, as one of the types is char), you can use a solution like:

uint8_t TheInt = TheChar;
IO.mapRequired("whatever", TheInt);
TheChar = TheInt;

If we wanted to define a global yaml serialization for char, then I think the first step should be to change the existing specializations to be defined in terms of primitive types ((un)signed (char|short|int|...) and char), because mixing primitive types and the stdint typedefs just doesn't work. Then we can discuss the appropriate serialization for each primitive type. I think it would be reasonable if all flavours of char printed out as a character, as it would e.g. match what raw_ostream does (whose operator<<s are also defined in terms of fundamental types), but I'm not sure how many dependencies we have on the current serialization of (u)int8_t (though given that the yaml code is mainly used for tests, I wouldn't expect any of those dependencies to be too critical)...

For the time being, I think it's perfectly reasonable to work around this problem in lldb.

Alright, let's go with that then.

ro added a comment.May 26 2020, 2:21 PM

Both Solaris buildbots are working again now. Thanks a lot.