This is an archive of the discontinued LLVM Phabricator instance.

lldb/include/lldb/lldb-enumerations.h
170	add to the end, or all LLDBRPC.framework calls that use this enumeration will fail for you.
lldb/source/Symbol/ClangASTContext.cpp
1383–1384	Are "const(T)" and "immutable(T)" the actual type names, or are they layers on top of a base "wchar" type? If they are layers, these shouldn't be needed, as the base "wchar" type should end up handling the base type correctly. Remove?
1388–1389	ditto
1393–1394	ditto
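
The request on `lldb-enumerations.h` to add the new value at the end of the enumeration can be illustrated with a small sketch. It assumes the concern is that clients built against the old header keep using the old numeric values; the enum below is a hypothetical, shortened stand-in and not lldb's real `Format` list.

```python
from enum import IntEnum


# Hypothetical, shortened stand-in for the existing enumeration.
class FormatV1(IntEnum):
    OS_TYPE = 0
    UNICODE16 = 1
    UNICODE32 = 2
    UNSIGNED = 3


# New member inserted in the middle: every later member is renumbered.
class FormatInserted(IntEnum):
    OS_TYPE = 0
    UNICODE8 = 1
    UNICODE16 = 2
    UNICODE32 = 3
    UNSIGNED = 4


# New member appended: every existing member keeps its value.
class FormatAppended(IntEnum):
    OS_TYPE = 0
    UNICODE16 = 1
    UNICODE32 = 2
    UNSIGNED = 3
    UNICODE8 = 4


def shifted_members(old, new):
    """Names whose numeric value differs between two versions of the enum."""
    return [m.name for m in old if new[m.name].value != m.value]


print(shifted_members(FormatV1, FormatInserted))  # ['UNICODE16', 'UNICODE32', 'UNSIGNED']
print(shifted_members(FormatV1, FormatAppended))  # []
```

Any value that crosses a process or library boundary as a plain integer is only safe in the appended variant.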

This revision now requires changes to proceed. Aug 19 2019, 3:19 PM

Address Greg's feedback.
Remove dlang types.

JDevlieghere edited the summary of this revision. Aug 19 2019, 3:46 PM

jblachly added a subscriber: jblachly. Aug 19 2019, 6:07 PM

Thank you for creating a revision and reviewing this.

I made inline comments on the test harness and Dlang types / qualifiers.

With removal of the Dlang types, where is the appropriate place to put them?
It is not clear to me whether language plugins can replace the functionality in ClangASTContext::GetBuiltinTypeForDWARFEncodingAndBitSize.

Thanks again

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/TestCxxChar8_t.py
26	I believe clang 7 requires -fchar8_t, whereas the test harness here passes -std=c++2a; char8_t is enabled via -std=c++2a beginning in clang 8: "(11): Prior to Clang 8, this feature is not enabled by -std=c++2a, but can be enabled with -fchar8_t." https://clang.llvm.org/cxx_status.html#p0482
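
The version rule in this comment can be written down as a small helper. This is only a sketch: `char8_flags` is a hypothetical function, not part of the lldb test suite, and the cut-offs (no support before clang 7, `-fchar8_t` needed on clang 7, `-std=c++2a` sufficient from clang 8) are taken from the review comments and the quoted status-page note.

```python
def char8_flags(clang_major):
    """Return compiler flags that enable char8_t, or None if unsupported.

    Hypothetical helper; version cut-offs follow the review discussion.
    """
    if clang_major < 7:
        return None  # assumed: char8_t is not available at all
    flags = ["-std=c++2a"]
    if clang_major < 8:
        # Prior to clang 8, -std=c++2a alone does not enable char8_t.
        flags.append("-fchar8_t")
    return flags


for major in (6, 7, 8):
    print(major, char8_flags(major))
```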
lldb/source/Symbol/ClangASTContext.cpp
1383–1384	In LDC (the LLVM D compiler), applying the type qualifier immutable or const to a type T defines a new type. I am not a DWARF expert, but running a sample program through lldb seems to confirm this:
error: need to add support for DW_TAG_base_type 'char' encoded with DW_ATE = 0x10, bit_size = 8
error: need to add support for DW_TAG_base_type 'const(char)' encoded with DW_ATE = 0x10, bit_size = 8
error: need to add support for DW_TAG_base_type 'immutable(char)' encoded with DW_ATE = 0x10, bit_size = 8
Whereas, interestingly, the reference compiler DMD encodes them differently (as char, const char, and char, respectively, since the DWARF spec, I guess, has no qualifier for immutable). The LDC behavior is IMO more true to the language spec.
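
To make the errors quoted above concrete, here is a rough sketch of a name-based lookup for `DW_ATE_UTF` (0x10) base types. The table and the function `builtin_type_for` are hypothetical and only mimic the idea behind `GetBuiltinTypeForDWARFEncodingAndBitSize`; the real function works on Clang AST types, not strings.

```python
DW_ATE_UTF = 0x10  # DWARF base type encoding for UTF code units

# Hypothetical table: (bit size, DWARF type name) -> C++ builtin type name.
UTF_TYPES = {
    (8, "char8_t"): "char8_t",
    (16, "char16_t"): "char16_t",
    (32, "char32_t"): "char32_t",
}


def builtin_type_for(encoding, bit_size, type_name):
    """Return a builtin type name, or None when the base type is unknown."""
    if encoding != DW_ATE_UTF:
        return None
    return UTF_TYPES.get((bit_size, type_name))


# The three base types LDC emits, per the errors above: none of them match
# unless each qualified name is listed separately.
for name in ("char", "const(char)", "immutable(char)"):
    print(name, "->", builtin_type_for(0x10, 8, name))
```

Because the match is on the spelled name, each qualified D name would need its own entry, which is what the removed D-specific part of the patch did.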

labath added a subscriber: labath. Aug 20 2019, 1:43 AM

labath added inline comments.

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/Makefile
5–7	Replace with `CFLAGS_EXTRAS+=-std=c++2a`
lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/TestCxxChar8_t.py
47	This comment looks misplaced.
lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/main.cpp
2–8	It doesn't look like you actually need a running process for this at all. I think you could just make `c8` a global variable and inspect it from the executable image directly (`target variable c8`).
lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
857–862	It looks like you're also adding formatters for `char8_t *` and `char8_t[N]`. I guess those should be tested too...

Address Pavel's comments.

In D66447#1636456, @jblachly wrote:

Thank you for creating a revision and reviewing this.

I made inline comments on the test harness and Dlang types / qualifiers.

With removal of the Dlang types, where is the appropriate place to put them?
It is not clear to me whether language plugins can replace the functionality in ClangASTContext::GetBuiltinTypeForDWARFEncodingAndBitSize.

Thanks again

Thanks James. I removed the DLang part from this patch because it's orthogonal to the char8_t support. The changes might be fine but they really should have their own differential and test. I'll try to put up another patch later today if I find the time.

This looks good to me, but why are we using a nul character to test UTF-8 support? Shouldn't we insert some funnier characters too? I mean, one of the advantages of Unicode is that it should not be affected by the system code pages and such, so hopefully this would not cause problems even on some more exotic setups. (And I am pretty sure I remember already seeing some Chinese chars in some of our data formatter tests.)

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/Makefile
7	I'm pretty sure this part isn't needed too, particularly as we now don't even run the clean actions.

Remove clean rule
Use more readable names

In D66447#1637640, @labath wrote:

This looks good to me, but why are we using a nul character to test UTF-8 support? Shouldn't we insert some funnier characters too? I mean, one of the advantages of Unicode is that it should not be affected by the system code pages and such, so hopefully this would not cause problems even on some more exotic setups. (And I am pretty sure I remember already seeing some Chinese chars in some of our data formatter tests.)

I only glanced at the proposal, but unless I misunderstand, the type only fits UTF-8 characters representable in 1 byte, which are basically just ASCII.

In D66447#1638047, @JDevlieghere wrote:

I only glanced at the proposal, but unless I misunderstand, the type only fits UTF-8 characters representable in 1 byte, which are basically just ASCII.

I have now too glanced at the proposal (just the cppreference page, really :) ). I think I understand where you got this impression from, but I don't think that is fully correct. It is true that a *single* char8_t variable can hold only 8 bit UTF8 code units (*not* characters), but that is not surprising since UTF8 is a variable length encoding, so you can't have a type that matches one character exactly. However, an *array* of char8_t is a completely different thing, and I am pretty sure that these are intended to hold utf8 strings containing any utf8 characters (otherwise, it wouldn't really deserve to call itself a utf8 type), and so we should print (and test) it as regular utf8.

However, this actually surfaces the question of how should we format single char8_t variables. It makes sense to display the character value if the value happens to be ASCII, but I guess we shouldn't print something like "invalid utf8 character" if it does contain one unit of the multibyte characters.
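
The distinction between code units and characters made here can be checked with a few lines of Python. This is a minimal sketch; `code_units` and `classify` are names made up for the illustration.

```python
def code_units(text):
    """UTF-8 code units (bytes) of a string, as integers."""
    return list(text.encode("utf-8"))


def classify(unit):
    """Classify a single UTF-8 code unit by its leading bits."""
    if unit < 0x80:
        return "ascii"
    if unit < 0xC0:
        return "continuation"
    if unit in (0xC0, 0xC1) or unit > 0xF4:
        return "invalid"  # byte values that never occur in valid UTF-8
    return "lead"


# "a", U+00E9 (e with acute accent) and U+4F60 (a CJK character).
for ch in ("a", "\u00e9", "\u4f60"):
    units = code_units(ch)
    print(len(units), [hex(u) for u in units], [classify(u) for u in units])
```

Only the first character fits in a single `char8_t`; the other two need an array of two and three code units respectively.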

In D66447#1638783, @labath wrote:

I have now too glanced at the proposal (just the cppreference page, really :) ). I think I understand where you got this impression from, but I don't think that is fully correct. It is true that a *single* char8_t variable can hold only 8 bit UTF8 code units (*not* characters), but that is not surprising since UTF8 is a variable length encoding, so you can't have a type that matches one character exactly. However, an *array* of char8_t is a completely different thing, and I am pretty sure that these are intended to hold utf8 strings containing any utf8 characters (otherwise, it wouldn't really deserve to call itself a utf8 type), and so we should print (and test) it as regular utf8.

Sounds like I simply misunderstood your earlier comment. I thought you meant putting a full UTF-8 *character* in a `char8_t`.

However, this actually surfaces the question of how should we format single char8_t variables. It makes sense to display the character value if the value happens to be ASCII, but I guess we shouldn't print something like "invalid utf8 character" if it does contain one unit of the multibyte characters.

What about the current implementation that prints both the hex and the ASCII value?
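
A sketch of that "hex plus ASCII" rendering for a single code unit follows. `format_char8` is a hypothetical helper, and leaving the quotes empty for non-ASCII units is a choice made for this sketch, not a statement about what lldb does in every case.

```python
def format_char8(unit):
    """Render one code unit as hex plus a u8 character literal."""
    if not 0 <= unit <= 0xFF:
        raise ValueError("a char8_t holds exactly one 8-bit code unit")
    if unit == 0:
        shown = "\\0"
    elif unit in (0x27, 0x5C):
        shown = "\\" + chr(unit)  # escape the quote and the backslash
    elif 0x20 <= unit < 0x7F:
        shown = chr(unit)  # printable ASCII
    else:
        shown = ""  # part of a multibyte sequence, or a control byte
    return "0x%02x u8'%s'" % (unit, shown)


for unit in (0x00, 0x61, 0xE4):
    print(format_char8(unit))
```

The hex value is always present, so a lone lead or continuation unit is still shown without being reported as an invalid character.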

Use UTF8 string

In D66447#1638783, @labath wrote:

I have now too glanced at the proposal (just the cppreference page, really :) ). I think I understand where you got this impression from, but I don't think that is fully correct. It is true that a *single* char8_t variable can hold only 8 bit UTF8 code units (*not* characters), but that is not surprising since UTF8 is a variable length encoding, so you can't have a type that matches one character exactly. However, an *array* of char8_t is a completely different thing, and I am pretty sure that these are intended to hold utf8 strings containing any utf8 characters (otherwise, it wouldn't really deserve to call itself a utf8 type), and so we should print (and test) it as regular utf8.

However, this actually surfaces the question of how should we format single char8_t variables. It makes sense to display the character value if the value happens to be ASCII, but I guess we shouldn't print something like "invalid utf8 character" if it does contain one unit of the multibyte characters.

You may find the C++ Evolution Working Group's entry on [N4197 Adding u8 character literals, [tiny] Why no u8 character literals?](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4540.html#119) and the proposal that adds char8_t helpful in understanding the rationale; the proposal for char8_t runs through a lot of examples.

I changed the test to use frame variable again. With target variable the UTF-8 formatting doesn't work. Given that this patch just copies the Char16 and Char32 implementation, I think that's something for a different patch. I'll file a PR when this gets in.

(lldb) target variable ab
(const char8_t *) ab = 0x0000000100000fa6
(lldb) target variable abc
(char8_t [9]) abc = {
  [0] = 0xe4 u8''
  [1] = 0xbd u8''
  [2] = 0xa0 u8''
  [3] = 0xe5 u8''
  [4] = 0xa5 u8''
  [5] = 0xbd u8''
  [6] = 0x00 u8'\0'
  [7] = 0x00 u8'\0'
  [8] = 0x00 u8'\0'
}
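
For reference, the hex values in the `abc` dump above form a valid UTF-8 string when read together. The sketch below decodes them; `decode_char8_array` is a made-up helper, and the byte list is copied from the dump.

```python
# Code units copied from the `target variable abc` dump above.
dumped = [0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD, 0x00, 0x00, 0x00]


def decode_char8_array(units):
    """Decode a NUL-terminated UTF-8 buffer into a Python string."""
    raw = bytes(units)
    end = raw.find(b"\x00")
    if end != -1:
        raw = raw[:end]  # drop the terminator and any padding after it
    return raw.decode("utf-8")


text = decode_char8_array(dumped)
print(len(text), [hex(ord(c)) for c in text])  # 2 ['0x4f60', '0x597d']
```

Six code units decode to two characters, which is the output a string summary would be expected to show instead of the element-by-element dump.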

In D66447#1639490, @JDevlieghere wrote:

Sounds like I simply misunderstood your earlier comment. I thought you meant putting a full UTF-8 *character* in a `char8_t`.

Ah yes, I can see how that request could have been interpreted this way. I'm glad that we understand each other.

However, this actually surfaces the question of how should we format single char8_t variables. It makes sense to display the character value if the value happens to be ASCII, but I guess we shouldn't print something like "invalid utf8 character" if it does contain one unit of the multibyte characters.

What about the current implementation that prints both the hex and the ASCII value?

I think that's fine. lgtm.

This revision was not accepted when it landed; it landed in state Needs Review. Aug 21 2019, 2:34 PM

Closed by commit rL369582: Add char8_t support (C++20) (authored by JDevlieghere).

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Aug 21 2019, 2:34 PM

Herald added a subscriber: llvm-commits.

A couple inline comments. I think this is looking pretty good.

lldb/trunk/packages/Python/lldbsuite/test/lang/cpp/char8_t/main.cpp
1 ↗	(On Diff #216481)	Is this include necessary?
lldb/trunk/source/Symbol/ClangASTContext.cpp
1391 ↗	(On Diff #216481)	I think the current style is to omit the `else` keywords on these, since each `if` returns. That would at least be consistent with several cases above.

ljmf00 mentioned this in D112564: [lldb] Add support for UTF-8 unicode formatting. Oct 26 2021, 11:02 AM

ljmf00 mentioned this in rG46cdcf087300: [lldb] Add support for UTF-8 unicode formatting. Dec 25 2021, 12:21 PM

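Revision Contents

Path	Size
lldb/include/lldb/lldb-enumerations.h	1 line
lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/Makefile	8 lines
lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/TestCxxChar8_t.py	48 lines
lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/main.cpp	7 lines
lldb/source/Commands/CommandObjectMemory.cpp	2 lines
lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp	11 lines
lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.h	6 lines
lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.cpp	51 lines
lldb/source/Symbol/ClangASTContext.cpp	15 lines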

Diff 216002

lldb/include/lldb/lldb-enumerations.h

@@ enum Format {
   eFormatDecimal,
   eFormatEnum,
   eFormatHex,
   eFormatHexUppercase,
   eFormatFloat,
   eFormatOctal,
   eFormatOSType, // OS character codes encoded into an integer 'PICT' 'text'
                  // etc...
+  eFormatUnicode8,
[inline comment] clayborg: add to the end, or all LLDBRPC.framework calls that use this enumeration will fail for you.
   eFormatUnicode16,
   eFormatUnicode32,
   eFormatUnsigned,
   eFormatPointer,
   eFormatVectorOfChar,
   eFormatVectorOfSInt8,
   eFormatVectorOfUInt8,
   eFormatVectorOfSInt16,

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/Makefile

This file was added.

LEVEL = ../../../make

CXX_SOURCES := main.cpp
CFLAGS := -g -O0 -std=c++2a

clean: OBJECTS+=$(wildcard main.d.*)

# [inline comment] labath: Replace with `CFLAGS_EXTRAS+=-std=c++2a`
# [inline comment] labath: I'm pretty sure this part isn't needed too, particularly as we now don't even run the clean actions.
include $(LEVEL)/Makefile.rules

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/TestCxxChar8_t.py

This file was added.

# coding=utf8
"""
Test that C++ supports char8_t correctly.
"""

from __future__ import print_function

import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
import lldbsuite.test.lldbutil as lldbutil


class CxxChar8_tTestCase(TestBase):

    mydir = TestBase.compute_mydir(__file__)

    def setUp(self):
        # Call super's setUp().
        TestBase.setUp(self)
        # Find the line number to break for main.cpp.
        self.source = 'main.cpp'
        self.line = line_number(self.source,
                                '// Set break point at this line.')

    @skipIf(compiler="clang", compiler_version=['<', '5.0'])
    # [inline comment] JDevlieghere: This should be 7
    # [inline comment] jblachly: I believe clang7 requires -fchar8_t, whereas the test harness here passes -std=c++2a ; char8_t is enabled via -std=c++2a beginning in clang-8 "(11): Prior to Clang 8, this feature is not enabled by -std=c++2a, but can be enabled with -fchar8_t. " https://clang.llvm.org/cxx_status.html#p0482
    def test(self):
        """Test that C++ supports wchar_t correctly."""
        self.build()
        exe = self.getBuildArtifact("a.out")

        # Create a target by the debugger.
        target = self.dbg.CreateTarget(exe)
        self.assertTrue(target, VALID_TARGET)

        # Break on the struct declration statement in main.cpp.
        lldbutil.run_break_set_by_file_and_line(self, "main.cpp", self.line)

        # Now launch the process, and do not stop at entry point.
        process = target.LaunchSimple(None, None,
                                      self.get_process_working_directory())

        if not process:
            self.fail("SBTarget.Launch() failed")

        # Check that we correctly report templates on wchar_t
        self.expect(
            # [inline comment] labath: This comment looks misplaced.
            "frame variable c8", substrs=["(char8_t) c8 = 0x00 u8'\\0'"])

lldb/packages/Python/lldbsuite/test/lang/cpp/char8_t/main.cpp

This file was added.

#include <cstring>

int main (int argc, char const *argv[])
{
  char8_t c8 = u8'\0';
  return 0; // Set break point at this line.
}

lldb/source/Commands/CommandObjectMemory.cpp

@@ case eFormatPointer:
       format_options.GetCountValue() = 8;
       break;

     case eFormatBinary:
     case eFormatFloat:
     case eFormatOctal:
     case eFormatDecimal:
     case eFormatEnum:
+    case eFormatUnicode8:
     case eFormatUnicode16:
     case eFormatUnicode32:
     case eFormatUnsigned:
     case eFormatHexFloat:
       if (!byte_size_option_set)
         byte_size_value = 4;
       if (!num_per_line_option_set)
         m_num_per_line = 1;
@@ bool DoExecute(Args &command, CommandReturnObject &result) override {
     for (auto &entry : command) {
       switch (m_format_options.GetFormat()) {
       case kNumFormats:
       case eFormatFloat: // TODO: add support for floats soon
       case eFormatCharPrintable:
       case eFormatBytesWithASCII:
       case eFormatComplex:
       case eFormatEnum:
+      case eFormatUnicode8:
       case eFormatUnicode16:
       case eFormatUnicode32:
       case eFormatVectorOfChar:
       case eFormatVectorOfSInt8:
       case eFormatVectorOfUInt8:
       case eFormatVectorOfSInt16:
       case eFormatVectorOfUInt16:
       case eFormatVectorOfSInt32:

lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp

@@ string_array_flags.SetCascades(true)
       .SetDontShowChildren(true)
       .SetDontShowValue(true)
       .SetShowMembersOneLiner(false)
       .SetHideItemNames(false);

   // FIXME because of a bug in the FormattersContainer we need to add a summary
   // for both X* and const X* (<rdar://problem/12717717>)
   AddCXXSummary(
+      cpp_category_sp, lldb_private::formatters::Char8StringSummaryProvider,
+      "char8_t * summary provider", ConstString("char8_t *"), string_flags);
+  AddCXXSummary(cpp_category_sp,
+                lldb_private::formatters::Char8StringSummaryProvider,
+                "char8_t [] summary provider",
+                ConstString("char8_t \\[[0-9]+\\]"), string_array_flags, true);
[inline comment] labath: It looks like you're also adding formatters for `char8_t *` and `char8_t[N]`. I guess those should be tested too...
+
+  AddCXXSummary(
       cpp_category_sp, lldb_private::formatters::Char16StringSummaryProvider,
       "char16_t * summary provider", ConstString("char16_t *"), string_flags);
   AddCXXSummary(cpp_category_sp,
                 lldb_private::formatters::Char16StringSummaryProvider,
                 "char16_t [] summary provider",
                 ConstString("char16_t \\[[0-9]+\\]"), string_array_flags, true);

   AddCXXSummary(
@@ static void LoadSystemFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {
   widechar_flags.SetDontShowValue(true)
       .SetSkipPointers(true)
       .SetSkipReferences(false)
       .SetCascades(true)
       .SetDontShowChildren(true)
       .SetHideItemNames(true)
       .SetShowMembersOneLiner(false);

+  AddCXXSummary(cpp_category_sp, lldb_private::formatters::Char8SummaryProvider,
+                "char8_t summary provider", ConstString("char8_t"),
+                widechar_flags);
   AddCXXSummary(
       cpp_category_sp, lldb_private::formatters::Char16SummaryProvider,
       "char16_t summary provider", ConstString("char16_t"), widechar_flags);
   AddCXXSummary(
       cpp_category_sp, lldb_private::formatters::Char32SummaryProvider,
       "char32_t summary provider", ConstString("char32_t"), widechar_flags);
   AddCXXSummary(cpp_category_sp, lldb_private::formatters::WCharSummaryProvider,
                 "wchar_t summary provider", ConstString("wchar_t"),

lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.h

@@
 #define liblldb_CxxStringTypes_h_

 #include "lldb/Core/ValueObject.h"
 #include "lldb/DataFormatters/TypeSummary.h"
 #include "lldb/Utility/Stream.h"

 namespace lldb_private {
 namespace formatters {
+bool Char8StringSummaryProvider(ValueObject &valobj, Stream &stream,
+                                const TypeSummaryOptions &options); // char8_t*
+
 bool Char16StringSummaryProvider(
     ValueObject &valobj, Stream &stream,
     const TypeSummaryOptions &options); // char16_t* and unichar*

 bool Char32StringSummaryProvider(
     ValueObject &valobj, Stream &stream,
     const TypeSummaryOptions &options); // char32_t*

 bool WCharStringSummaryProvider(ValueObject &valobj, Stream &stream,
                                 const TypeSummaryOptions &options); // wchar_t*

+bool Char8SummaryProvider(ValueObject &valobj, Stream &stream,
+                          const TypeSummaryOptions &options); // char8_t
+
 bool Char16SummaryProvider(
     ValueObject &valobj, Stream &stream,
     const TypeSummaryOptions &options); // char16_t and unichar

 bool Char32SummaryProvider(ValueObject &valobj, Stream &stream,
                            const TypeSummaryOptions &options); // char32_t

 bool WCharSummaryProvider(ValueObject &valobj, Stream &stream,
                           const TypeSummaryOptions &options); // wchar_t

 } // namespace formatters
 } // namespace lldb_private

 #endif // liblldb_CxxStringTypes_h_

lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.cpp

@@
 #include "lldb/Utility/Stream.h"

 #include <algorithm>

 using namespace lldb;
 using namespace lldb_private;
 using namespace lldb_private::formatters;

+bool lldb_private::formatters::Char8StringSummaryProvider(
+    ValueObject &valobj, Stream &stream, const TypeSummaryOptions &) {
+  ProcessSP process_sp = valobj.GetProcessSP();
+  if (!process_sp)
+    return false;
+
+  lldb::addr_t valobj_addr = GetArrayAddressOrPointerValue(valobj);
+  if (valobj_addr == 0 || valobj_addr == LLDB_INVALID_ADDRESS)
+    return false;
+
+  StringPrinter::ReadStringAndDumpToStreamOptions options(valobj);
+  options.SetLocation(valobj_addr);
+  options.SetProcessSP(process_sp);
+  options.SetStream(&stream);
+  options.SetPrefixToken("u8");
+
+  if (!StringPrinter::ReadStringAndDumpToStream<
+          StringPrinter::StringElementType::UTF8>(options)) {
+    stream.Printf("Summary Unavailable");
+    return true;
+  }
+
+  return true;
+}
+
 bool lldb_private::formatters::Char16StringSummaryProvider(
     ValueObject &valobj, Stream &stream, const TypeSummaryOptions &) {
   ProcessSP process_sp = valobj.GetProcessSP();
   if (!process_sp)
     return false;

   lldb::addr_t valobj_addr = GetArrayAddressOrPointerValue(valobj);
   if (valobj_addr == 0 || valobj_addr == LLDB_INVALID_ADDRESS)
@@ return StringPrinter::ReadStringAndDumpToStream<
         StringPrinter::StringElementType::UTF32>(options);
   default:
     stream.Printf("size for wchar_t is not valid");
     return true;
   }
   return true;
 }

+bool lldb_private::formatters::Char8SummaryProvider(
+    ValueObject &valobj, Stream &stream, const TypeSummaryOptions &) {
+  DataExtractor data;
+  Status error;
+  valobj.GetData(data, error);
+
+  if (error.Fail())
+    return false;
+
+  std::string value;
+  valobj.GetValueAsCString(lldb::eFormatUnicode8, value);
+  if (!value.empty())
+    stream.Printf("%s ", value.c_str());
+
+  StringPrinter::ReadBufferAndDumpToStreamOptions options(valobj);
+  options.SetData(data);
+  options.SetStream(&stream);
+  options.SetPrefixToken("u8");
+  options.SetQuote('\'');
+  options.SetSourceSize(1);
+  options.SetBinaryZeroIsTerminator(false);
+
+  return StringPrinter::ReadBufferAndDumpToStream<
+      StringPrinter::StringElementType::UTF8>(options);
+}
+
 bool lldb_private::formatters::Char16SummaryProvider(
     ValueObject &valobj, Stream &stream, const TypeSummaryOptions &) {
   DataExtractor data;
   Status error;
   valobj.GetData(data, error);

   if (error.Fail())
     return false;

lldb/source/Symbol/ClangASTContext.cpp

@@ case DW_ATE_unsigned_char:
         return CompilerType(this, ast->UnsignedShortTy.getAsOpaquePtr());
       break;

     case DW_ATE_imaginary_float:
       break;

     case DW_ATE_UTF:
       if (type_name) {
-        if (streq(type_name, "char16_t")) {
+        if (streq(type_name, "char16_t") ||
+            streq(type_name, "wchar") || // dlang
+            streq(type_name, "const(wchar)") ||
+            streq(type_name, "immutable(wchar)")) {
[inline comment] clayborg: are "const(T)" and "immutable(T)" the actual type names or are they layers on top of a base "wchar" type? These shouldn't be needed if so as the base "wchar" type should end up handling the base type correctly. Remove?
[inline comment] jblachly: In LDC (the LLVM D compiler), application of type qualifier immutable or const to a type T defines a new type. I am not a DWARF expert, but running a sample program through lldb seems to confirm this: error: need to add support for DW_TAG_base_type 'char' encoded with DW_ATE = 0x10, bit_size = 8 error: need to add support for DW_TAG_base_type 'const(char)' encoded with DW_ATE = 0x10, bit_size = 8 error: need to add support for DW_TAG_base_type 'immutable(char)' encoded with DW_ATE = 0x10, bit_size = 8 Whereas, interestingly, the reference compiler DMD encodes them differently (as char, const char, and char; respectively as the DWARF spec I guess has no qualifier for immutable). The LDC behavior is IMO more true to language spec.
           return CompilerType(this, ast->Char16Ty.getAsOpaquePtr());
-        } else if (streq(type_name, "char32_t")) {
+        } else if (streq(type_name, "char32_t") ||
+                   streq(type_name, "dchar") || // dlang
+                   streq(type_name, "const(dchar") ||
+                   streq(type_name, "immutable(dchar)")) {
[inline comment] clayborg: ditto
           return CompilerType(this, ast->Char32Ty.getAsOpaquePtr());
+        } else if (streq(type_name, "char8_t") || // C++20
+                   streq(type_name, "char") || // dlang
+                   streq(type_name, "const(char)") ||
+                   streq(type_name, "immutable(char)")) {
[inline comment] clayborg: ditto
+          return CompilerType(this, ast->Char8Ty.getAsOpaquePtr());
         }
       }
       break;
     }
   }
   // This assert should fire for anything that we don't catch above so we know
   // to fix any issues we run into.
   if (type_name) {