
Details

Reviewers

muiez
abhina.sreeskantharajan
SeanP
zibi
Mordante
• Quuxplusone

Group Reviewers

Restricted Project

Summary

This patch fixes an issue in charconv. In the EBCDIC table on z/OS, the letters are not contiguous: they are split into three groups (a-i, j-r, s-z) of at most 9 characters each. New logic is therefore needed in EBCDIC mode to return the correct integer value for the nth letter.
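
To make the gap concrete, here is a minimal sketch that works directly on raw EBCDIC code points, so it behaves the same on any host. The helper names `ebcdic_lower_index` and `naive_index` are hypothetical and are not part of the patch; the code point ranges (0x81-0x89, 0x91-0x99, 0xA2-0xA9) are the standard EBCDIC positions of the lowercase letters.

```cpp
// Hypothetical helper (not part of the patch): map a raw EBCDIC code point
// for a lowercase letter to its zero-based position in the alphabet.
// Returns -1 when the code point is not a lowercase letter.
inline int ebcdic_lower_index(unsigned char code) {
    if (code >= 0x81 && code <= 0x89) return code - 0x81;       // a..i
    if (code >= 0x91 && code <= 0x99) return code - 0x91 + 9;   // j..r
    if (code >= 0xA2 && code <= 0xA9) return code - 0xA2 + 18;  // s..z
    return -1;
}

// The ASCII-style computation (c - 'a'), applied to EBCDIC code points.
// It is only correct for the first group, a..i.
inline int naive_index(unsigned char code) {
    return code - 0x81;
}
```

For example, `j` is code point 0x91 in EBCDIC: `naive_index(0x91)` gives 16, while `ebcdic_lower_index(0x91)` gives 9, the position a base-36 parser needs.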

Diff Detail

Event Timeline

NancyWang2222 requested review of this revision.Feb 2 2022, 2:28 PM

NancyWang2222 created this revision.

NancyWang2222 edited the summary of this revision.

NancyWang2222 retitled this revision from [SystemZ]:[libcxx]: fix nthLetter issue for charconv header to [SystemZ]:[z/OS]:[libcxx]: fix nthLetter issue for charconv header .Feb 2 2022, 2:51 PM

Harbormaster completed remote builds in B147259: Diff 405449.Feb 2 2022, 3:59 PM

move #include <locale> up one line according to alphabetical order

Harbormaster completed remote builds in B147305: Diff 405512.Feb 2 2022, 8:38 PM

only include <locale> when _LIBCPP_HAS_NO_LOCALIZATION is not defined

Harbormaster completed remote builds in B147405: Diff 405648.Feb 3 2022, 9:54 AM

put guard in __locale when include locale.h

Harbormaster completed remote builds in B147434: Diff 405694.Feb 3 2022, 12:45 PM

abhina.sreeskantharajan added inline comments.Feb 3 2022, 12:59 PM

libcxx/include/charconv
92	I think we need a guard here as well to fix the CI failure.

need a guard in libcxx/include/charconv as well.

Harbormaster completed remote builds in B147488: Diff 405768.Feb 3 2022, 3:35 PM

include <cctype> so tolower function can be recognized when localization is disabled

Harbormaster completed remote builds in B147538: Diff 405840.Feb 3 2022, 6:57 PM

ping -:)

LGTM

This revision is now accepted and ready to land.Feb 4 2022, 7:21 AM

LGTM modulo some nits. Please wait for a libc++ approval before committing.

libcxx/include/charconv
436	Since none of the code used in this header is `constexpr`, it has no effect here.
437	Please use the style already used. I also think the name isn't that clear, maybe `__letter_value`?
443	I'm not familiar with EBCDIC so I just assume this is correct.
448	Please remove one blank line.

This revision now requires review to proceed.Feb 9 2022, 11:02 AM

Mordante mentioned this in D118851: [SystemZ]:[z/OS]:[libcxx]: fix isascii function to work for z/OS.Feb 9 2022, 11:22 AM

NancyWang2222 added inline comments.Feb 9 2022, 6:39 PM

libcxx/include/charconv
436	Thanks will remove it
437	OK. I will use __letter_value
443	it was tested on z/OS which has both ASCII and EBCDIC
448	will remove blank line. thanks

Address comments: change the function name from LetterNum to letter_value, remove the extra blank line, and remove constexpr.

Harbormaster completed remote builds in B148641: Diff 407371.Feb 9 2022, 7:10 PM

I rebased onto the latest master. Hopefully that fixes the error.

NancyWang2222 marked 2 inline comments as done.Feb 10 2022, 9:47 AM

NancyWang2222 marked 2 inline comments as done.

Harbormaster completed remote builds in B148784: Diff 407586.Feb 10 2022, 2:47 PM

abhina.sreeskantharajan added a project: Restricted Project.Feb 11 2022, 6:25 AM

match the format suggested by git-clang-format

ldionne added a subscriber: ldionne.Feb 11 2022, 9:50 AM

ldionne added inline comments.

libcxx/include/__locale
18–19	I don't understand why that's necessary. The intent is for `__locale` not to be includable when `_LIBCPP_HAS_NO_LOCALIZATION` is enabled.
libcxx/include/charconv
436	I'm not sure I understand what that function is actually doing. Is it taking a character and returning an integer that represents the position of that character in a usual human-understandable ordering from `a` to `z`? It doesn't handle uppercase characters and that's why you use `tolower` below, right? In that case, I think it would make sense to rename it to something like `__alphabetical_index` or something like that, and it should return a real integer, not the character type taken as an input. In other words, if that's a function from "characters" to "integers", let's be explicit about it -- I suspect it might be useful in other places. We might also want to make it support uppercase letters.
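
As an illustration of the "characters to integers" function described in this comment, here is a minimal sketch that returns a real `int` and accepts both cases. The name `alphabetical_index` follows the suggestion above, but the signature and the -1 sentinel for non-letters are assumptions, not what the patch implements. It relies only on each of the six ranges being contiguous, which holds in both ASCII and EBCDIC.

```cpp
// Hypothetical sketch, not libc++ code: zero-based alphabet position of c,
// or -1 if c is not a letter. Uses character literals so the same source
// is valid whether the execution character set is ASCII or EBCDIC.
inline int alphabetical_index(char c) {
    if ('a' <= c && c <= 'i') return c - 'a';
    if ('j' <= c && c <= 'r') return c - 'j' + 9;
    if ('s' <= c && c <= 'z') return c - 's' + 18;
    if ('A' <= c && c <= 'I') return c - 'A';
    if ('J' <= c && c <= 'R') return c - 'J' + 9;
    if ('S' <= c && c <= 'Z') return c - 'S' + 18;
    return -1;  // assumed sentinel for "not a letter"
}
```

Handling uppercase directly means the caller would not need `tolower`, and so would not need `<locale>` or `<cctype>` for this purpose.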

Harbormaster completed remote builds in B148992: Diff 407889.Feb 11 2022, 9:56 AM

remove extra space

NancyWang2222 added inline comments.Feb 11 2022, 1:15 PM

libcxx/include/charconv
436	The function returns a re-ordered index. Unlike ASCII, where the letters are contiguous, EBCDIC splits the letters into 3 groups for both lower case and upper case. Upper-case and lower-case letters have different values in the EBCDIC table but follow the same pattern within [a-z] and [A-Z]. I will change the name to __alphabetical_index and change the return type to int.

NancyWang2222 added inline comments.Feb 11 2022, 1:20 PM

libcxx/include/charconv
436	Forgot to mention: tolower is used to reduce duplicate code, since both cases follow the same pattern, just with different letter values.

• Quuxplusone added a subscriber: • Quuxplusone.Feb 11 2022, 1:31 PM

• Quuxplusone added inline comments.

libcxx/include/charconv
437	FWIW, I found `__letter_value` confusing on first reading, because I would think that the "value" of `a` should be 10 and the "value" of `f` should be 15. In fact, this function returns 0 for `a` and 5 for `f`; i.e., it's returning the letter's zero-indexed index in the alphabet. It might be too cutesy, but if this were my code, I'd name this function simply `__letter_minus_a`. It basically returns `(__c - 'a')`, modulo any relativistic effects due to EBCDIC.

NancyWang2222 added inline comments.Feb 11 2022, 2:18 PM

libcxx/include/charconv
437	@ldionne suggested __alphabetical_index, which matches what the function does. I will use that name. We re-order the letters to form a new alphabetical index.

change function name letter_value to alphabetical_index

Harbormaster completed remote builds in B149109: Diff 408052.Feb 11 2022, 3:57 PM

@ldionne @Quuxplusone I have addressed the comments. Can you help review again? Thanks for the feedback.

NancyWang2222 set the repository for this revision to rG LLVM Github Monorepo.Feb 14 2022, 8:59 AM

Herald added a subscriber: libcxx-commits. Feb 14 2022, 8:59 AM

NancyWang2222 marked an inline comment as done.Feb 14 2022, 6:51 PM

SeanP added inline comments.Feb 15 2022, 10:46 AM

libcxx/include/charconv
443	fyi ... if you want a reference to EBCDIC code points. http://en.wikipedia.org/wiki/Ebcdic. Note EBCDIC has these "variant" code points that change value in different code pages (eg. 1047 vs others). Some characters that change code point are '[', ']', '{', '}', '#' which is just awesome for C/C++ programming.

ping :)

libcxx/include/charconv
443	Thanks Sean. Good information for anyone who wants to know about EBCDIC.

Can I have 2nd review from libcxx? Thanks

please kindly review this patch again. thanks.

• Quuxplusone requested changes to this revision.Mar 1 2022, 8:31 AM

• Quuxplusone added inline comments.

libcxx/include/__locale

18–19

@NancyWang2222: I don't think this comment from Louis has been addressed/answered. What goes wrong if you remove this diff?
(But I fully expect this will be moot and you can remove this diff, after adopting my suggested rewrite below.)

libcxx/include/charconv

472–473

Is this the only reason you #include <locale> at the top of this file? I don't think we should do that. We should keep the nice fast arithmetic (the old code) for ASCII platforms, at least. Is there any way to express the EBCDIC codepath in speedy terms, or must we call out to tolower when #ifdef __MVS__?
IOW, I suggest lines 474-476 become this instead:

#if defined(__MVS__) && !defined(__NATIVE_ASCII_F)
    if ('a' <= __c && __c <= 'i')
        return {__c - 'a' + 10 < __base, __c - 'a' + 10};
    else if ('j' <= __c && __c <= 'r')
        return {__c - 'j' + 19 < __base, __c - 'j' + 19};
    else if ('s' <= __c && __c <= 'z') 
        return {__c - 's' + 28 < __base, __c - 's' + 28};
    else if ('A' <= __c && __c <= 'I')
        return {__c - 'A' + 10 < __base, __c - 'A' + 10};
    else if ('J' <= __c && __c <= 'R')
        return {__c - 'J' + 19 < __base, __c - 'J' + 19};
    else if ('S' <= __c && __c <= 'Z') 
        return {__c - 'S' + 28 < __base, __c - 'S' + 28};
    else
        return {false, 0};
#else
    else if ('a' <= __c && __c < 'a' + __base - 10)
        return {true, __c - 'a' + 10};
    else
        return {'A' <= __c && __c < 'A' + __base - 10, __c - 'A' + 10};
#endif
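
The suggestion above can be tried out as a standalone function. The following is a sketch under stated assumptions, not the libc++ implementation: `digit_result` and `classify_digit` are hypothetical names, the digit branch is folded in so the function is self-contained, and the grouped comparisons are used unconditionally (they are also correct on ASCII, just slower than the single-range arithmetic the `#else` branch keeps).

```cpp
// Hypothetical stand-in for __in_pattern_result.
struct digit_result {
    bool ok;   // true if c is a valid digit in the given base
    int val;   // numeric value of c (0 when c is not a digit character)
};

// Hypothetical stand-in for __in_pattern(__c, __base), valid for bases 2..36.
// Every range compared here is contiguous in both ASCII and EBCDIC.
inline digit_result classify_digit(char c, int base) {
    int v = -1;
    if ('0' <= c && c <= '9')      v = c - '0';
    else if ('a' <= c && c <= 'i') v = c - 'a' + 10;
    else if ('j' <= c && c <= 'r') v = c - 'j' + 19;
    else if ('s' <= c && c <= 'z') v = c - 's' + 28;
    else if ('A' <= c && c <= 'I') v = c - 'A' + 10;
    else if ('J' <= c && c <= 'R') v = c - 'J' + 19;
    else if ('S' <= c && c <= 'Z') v = c - 'S' + 28;

    if (v < 0)
        return {false, 0};
    return {v < base, v};
}
```

With this shape, `classify_digit('f', 16)` is accepted with value 15 and `classify_digit('g', 16)` is rejected, with no call to `tolower` and no locale headers.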

This revision now requires changes to proceed.Mar 1 2022, 8:31 AM

Diff 407371

libcxx/include/__locale

// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef _LIBCPP___LOCALE
#define _LIBCPP___LOCALE

#include <__availability>
#include <__config>
#include <cctype>
#include <cstdint>
+ #if !defined(_LIBCPP_HAS_NO_LOCALIZATION)
#include <locale.h>
+ #endif

ldionne (Not Done): I don't understand why that's necessary. The intent is for `__locale` not to be includable when `_LIBCPP_HAS_NO_LOCALIZATION` is enabled.

Quuxplusone (Not Done): @NancyWang2222: I don't think this comment from Louis has been addressed/answered. What goes wrong if you remove this diff? (But I fully expect this will be moot and you can remove this diff, after adopting my suggested rewrite below.)

#include <memory>
#include <mutex>
#include <string>
#include <utility>

#if defined(_LIBCPP_MSVCRT_LIKE)
# include <cstring>
# include <__support/win32/locale_win32.h>

libcxx/include/charconv

#include <__charconv/to_chars_result.h>
#include <__config>
#include <__errc>
#include <cmath> // for log2f
#include <cstdint>
#include <cstdlib> // for _LIBCPP_UNREACHABLE
#include <cstring>
#include <limits>
#include <type_traits>

abhina.sreeskantharajan (Done): I think we need a guard here as well to fix the CI failure.

#include <__debug>
+ #if defined(_LIBCPP_HAS_NO_LOCALIZATION)
+ #include <cctype>
+ #else
+ #include <locale>
+ #endif

#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#pragma GCC system_header
#endif

_LIBCPP_PUSH_MACROS
#include <__undef_macros>

_LIBCPP_BEGIN_NAMESPACE_STD

    else
        return __r;
    }
    return {__r.ptr, errc::result_out_of_range};
}

template <typename _Tp>
+ inline _LIBCPP_INLINE_VISIBILITY _Tp

Mordante (Done):
template <typename _Tp>
- inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR _Tp
+ inline _LIBCPP_INLINE_VISIBILITY _Tp
__LetterNum(_Tp __c)
Since none of the code used in this header is `constexpr`, it has no effect here.

NancyWang2222 (Author, Done): Thanks, will remove it.

ldionne (Not Done): I'm not sure I understand what that function is actually doing. Is it taking a character and returning an integer that represents the position of that character in a usual human-understandable ordering from a to z? It doesn't handle uppercase characters and that's why you use tolower below, right?

In that case, I think it would make sense to rename it to something like __alphabetical_index or something like that, and it should return a real integer, not the character type taken as an input. In other words, if that's a function from "characters" to "integers", let's be explicit about it -- I suspect it might be useful in other places. We might also want to make it support uppercase letters.

NancyWang2222 (Author, Done): The function returns a re-ordered index. Unlike ASCII, where the letters are contiguous, EBCDIC splits the letters into 3 groups for both lower case and upper case. Upper-case and lower-case letters have different values in the EBCDIC table but follow the same pattern within [a-z] and [A-Z]. I will change the name to __alphabetical_index and change the return type to int.

NancyWang2222 (Author, Done): Forgot to mention: tolower is used to reduce duplicate code, since both cases follow the same pattern, just with different letter values.

+ __letter_value(_Tp __c)

Mordante (Done):
inline _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR _Tp
- __LetterNum(_Tp __c)
+ __letter_num(_Tp __c)
{
#if defined(__MVS__) && !defined(__NATIVE_ASCII_F)
Please use the style already used. I also think the name isn't that clear, maybe `__letter_value`?

NancyWang2222 (Author, Done): OK. I will use __letter_value.

Quuxplusone (Done): FWIW, I found `__letter_value` confusing on first reading, because I would think that the "value" of a should be 10 and the "value" of f should be 15. In fact, this function returns 0 for a and 5 for f; i.e., it's returning the letter's zero-indexed index in the alphabet.
It might be too cutesy, but if this were my code, I'd name this function simply `__letter_minus_a`. It basically returns (__c - 'a'), modulo any relativistic effects due to EBCDIC.

NancyWang2222 (Author, Done): @ldionne suggested __alphabetical_index, which matches what the function does. I will use that name. We re-order the letters to form a new alphabetical index.

+ {
+ #if defined(__MVS__) && !defined(__NATIVE_ASCII_F)
+     if ('a' <= __c && __c <= 'i') return __c - 'a';
+     else if ('j' <= __c && __c <= 'r') return __c - 'j' + 9;
+     else if ('s' <= __c && __c <= 'z') return __c - 's' + 18;
+     else return 27;

Mordante (Done): I'm not familiar with EBCDIC so I just assume this is correct.

NancyWang2222 (Author, Done): It was tested on z/OS, which has both ASCII and EBCDIC.

SeanP (Done): fyi ... if you want a reference to EBCDIC code points. http://en.wikipedia.org/wiki/Ebcdic. Note EBCDIC has these "variant" code points that change value in different code pages (eg. 1047 vs others). Some characters that change code point are '[', ']', '{', '}', '#' which is just awesome for C/C++ programming.

NancyWang2222 (Author, Done): Thanks Sean. Good information for anyone who wants to know about EBCDIC.

+ #else
+     return __c - 'a';
+ #endif
+ }

Mordante (Done): Please remove one blank line.

NancyWang2222 (Author, Done): Will remove the blank line. Thanks.

+ template <typename _Tp>
inline _LIBCPP_INLINE_VISIBILITY bool
__in_pattern(_Tp __c)
{
    return '0' <= __c && __c <= '9';
}

struct _LIBCPP_HIDDEN __in_pattern_result
{
    bool __ok;
    int __val;

    explicit _LIBCPP_INLINE_VISIBILITY operator bool() const { return __ok; }
};

template <typename _Tp>
inline _LIBCPP_INLINE_VISIBILITY __in_pattern_result
__in_pattern(_Tp __c, int __base)
{
    if (__base <= 10)
        return {'0' <= __c && __c < '0' + __base, __c - '0'};
    else if (__in_pattern(__c))
        return {true, __c - '0'};
-     else if ('a' <= __c && __c < 'a' + __base - 10)
-         return {true, __c - 'a' + 10};
+     __c = _VSTD::tolower(__c);
+     return {'a' <= __c && __letter_value(__c) < __base - 10, __letter_value(__c) + 10};

Quuxplusone (Not Done): Is this the only reason you `#include <locale>` at the top of this file? I don't think we…

#if defined(__MVS__) && !defined(__NATIVE_ASCII_F)
    if ('a' <= __c && __c <= 'i')
        return {__c - 'a' + 10 < __base, __c - 'a' + 10};
    else if ('j' <= __c && __c <= 'r')
        return {__c - 'j' + 19 < __base, __c - 'j' + 19};
    else if ('s' <= __c && __c <= 'z')
        return {__c - 's' + 28 < __base, __c - 's' + 28};
    else if ('A' <= __c && __c <= 'I')
        return {__c - 'A' + 10 < __base, __c - 'A' + 10};
    else if ('J' <= __c && __c <= 'R')
        return {__c - 'J' + 19 < __base, __c - 'J' + 19};
    else if ('S' <= __c && __c <= 'Z')
        return {__c - 'S' + 28 < __base, __c - 'S' + 28};
    else
        return {false, 0};
#else
    else if ('a' <= __c && __c < 'a' + __base - 10)
        return {true, __c - 'a' + 10};
    else
        return {'A' <= __c && __c < 'A' + __base - 10, __c - 'A' + 10};
#endif

-     else
-         return {'A' <= __c && __c < 'A' + __base - 10, __c - 'A' + 10};
}

template <typename _It, typename _Tp, typename _Fn, typename... _Ts>
inline _LIBCPP_INLINE_VISIBILITY from_chars_result
__subject_seq_combinator(_It __first, _It __last, _Tp& __value, _Fn __f,
                         _Ts... __args)
{
    auto __find_non_zero = [](_It __first, _It __last) {

This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ]:[z/OS]:[libcxx]: fix nthLetter issue for charconv header
Needs Revision · Public
