This is an archive of the discontinued LLVM Phabricator instance.

[support] Provide overload to PrintNumber that use C++ types
ClosedPublic

Authored by paulkirth on Mar 23 2023, 4:48 PM.

Details

Summary

This attempts to address an issue with overload resolution for PrintNumber
with size_t parameters on Darwin, brought up in
https://reviews.llvm.org/D146492.

On Aarch64 Darwin, uint64_t has a different typedef than size_t
(e.g., unsigned long long vs. unsigned long), whereas on Linux and
Windows they are the same.

This commit also reverts the static_cast's added in
064e2497e2ebe9ac30ac96923a26a52484300fdf, since they are no longer
needed.

Diff Detail

Event Timeline

paulkirth created this revision.Mar 23 2023, 4:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 4:48 PM
paulkirth requested review of this revision.Mar 23 2023, 4:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 4:48 PM
paulkirth added inline comments.Mar 23 2023, 4:58 PM
llvm/include/llvm/Support/JSON.h
351–352

I didn't love that I had to change these, but the non-narrowing issue forced me to add unsigned long long. I looked into just pushing all of the unsigned integers into the uint64_t version, but ran into some strange behavior and test failures that I couldn't explain.

For example an MCA test failed, but the JSON appeared equivalent. The difference was that despite being equivalent "numbers", the values were not integers that the getInteger() API for json::Objects could resolve (i.e. it relies on them being int64_t, and some were now uint64_t).

Some of that likely is tied to the JSON rules for numeric types, but someone with more knowledge here should probably weigh in. Also template metaprogramming isn't something I do often, so there may be better ways to express these constraints.

paulkirth added inline comments.Mar 23 2023, 5:02 PM
llvm/include/llvm/Support/ScopedPrinter.h
214

Another potential approach would be to rely on conversion, and just pic a select few overloads for integer types. I usually try to avoid things like that, but 1) this is a lot of duplicated code, and 2) for many of these types the conversion is likely happening when writing to the stream.

I don't have a strong opinion on relying on conversions versus the duplicate code, so I'm happy for you to do what you prefer.

llvm/include/llvm/Support/JSON.h
351–352

I don't think the changes you've made will work in a cross-platform manner for the same sort of reasoning as why you need the unsigned long overloads of printNumber: an unsigned long is a 64-bit value on some platforms, but uint64_t might be defined to unsigned long long. In such cases, unsigned long will be neither uin64_t, nor unsigned long long, and so the enable_ifs won't work.

A 64-bit integral will have the following traits: std::is_integral, and sizeof(T) == sizeof(uint64_t) (you could hard code the latter to 8). So the current behaviour of comparing the type against uint64_t isn't really correct. I think you want to change the unsigned case to:

template <
    typename T,
    typename = std::enable_if_t<std::is_integral<T>::value &&
                                std::is_unsigned<T>::value &&
                                !std::is_same<T, bool>::value &&
                                sizeof(T) == sizeof(uint64_t)>>

The signed one should look like the following:

template <
    typename T,
    typename = std::enable_if_t<std::is_integral<T>::value &&
                                std::is_signed<T>::value &&
                                sizeof(T) == sizeof(int64_t)>>

Honestly, I'm not sure why the second one is listed as multiple enable_if_t args: I would have that that would be unnecessary, but if it is necessary (my own template metaprogramming sometimes needs a little bit of trial and error!), you could split the above into three separate ones.

Note that the comparison against bool is moved because is_signed explicitly excludes bool, but is_unsigned includes it.

You could, if you wanted, defined your own type trait that folds together the is_integral and sizeof terms, to avoid the duplication, but I'm not sure it would add enough to be worth it.

Is there any form of testing that shows that the different fundamental types create the correct underlying type?

llvm/include/llvm/Support/ScopedPrinter.h
214

Looking at the Value class above, it only has explicit overloads for uint64_t and int64_t (effectively). Other integer types happen to work because these two overloads have ended up being instantiated and so the types can convert. I might be wrong, but I suspect if you tried to use that class with e.g. an int value, but without any usage with an int64_t, you'd get a missing overload error, or something stranger (like it converting to a pointer or something)! I'm a little unsure which one char will use, and I suspect it's platform dependent.

In your case, I think you could get away with long long and unsigned long long, but in your testing, I'd test with every fundamental integer type to make sure it works without a problem.

paulkirth added inline comments.Mar 24 2023, 9:47 AM
llvm/include/llvm/Support/JSON.h
351–352

I don't think the changes you've made will work in a cross-platform manner for the same sort of reasoning as why you need the unsigned long overloads of printNumber: an unsigned long is a 64-bit value on some platforms, but uint64_t might be defined to unsigned long long. In such cases, unsigned long will be neither uin64_t, nor unsigned long long, and so the enable_ifs won't work.

A 64-bit integral will have the following traits: std::is_integral, and sizeof(T) == sizeof(uint64_t) (you could hard code the latter to 8). So the current behaviour of comparing the type against uint64_t isn't really correct. I think you want to change the unsigned case to:

template <
    typename T,
    typename = std::enable_if_t<std::is_integral<T>::value &&
                                std::is_unsigned<T>::value &&
                                !std::is_same<T, bool>::value &&
                                sizeof(T) == sizeof(uint64_t)>>

The signed one should look like the following:

template <
    typename T,
    typename = std::enable_if_t<std::is_integral<T>::value &&
                                std::is_signed<T>::value &&
                                sizeof(T) == sizeof(int64_t)>>

Huh, I forgot you could use sizeof in template expressions. That certainly cleans things up. I think for the signed case we will want: sizeof(T) <= sizeof(int64_t), since this code assumes everything smaller than uint64 is signed. It also seems that some dependent code makes similar assumptions.

Honestly, I'm not sure why the second one is listed as multiple enable_if_t args: I would have that that would be unnecessary, but if it is necessary (my own template metaprogramming sometimes needs a little bit of trial and error!), you could split the above into three separate ones.

I just went with the existing convention. I think your suggestion should work, but I haven't tried it yet.

Note that the comparison against bool is moved because is_signed explicitly excludes bool, but is_unsigned includes it.

You could, if you wanted, defined your own type trait that folds together the is_integral and sizeof terms, to avoid the duplication, but I'm not sure it would add enough to be worth it.

Is there any form of testing that shows that the different fundamental types create the correct underlying type?

The existing tests in JSONTest.cpp all seem to use int64_t/uint64_t. That makes sense to an extent, since they more or less forced anything that isn't uint64_t into int64_t. But there isn't any testing of the underlying C++ types.

llvm/include/llvm/Support/ScopedPrinter.h
214

Yeah, I the more I think about this, the more I feel like this will be simpler to implement. let me give this a shot and see if that's actually the case. And you're right, char is _probably_ going to need some kind of special handling.

Fix enable_if and update getInteger to allow more unsigned types

I may want to revisit how we split signed/unsigned.
I also still need to investigate implicit conversion.

after looking at this for a bit, I think this is probably the way to go. Using a template in ScopedPrinter.h has some problems, because there are types like ulittle32_t that won't satisfy is_integral. Using conversion also has its issues, like making some calls ambiguous.

jhenderson added inline comments.Mar 27 2023, 12:43 AM
llvm/include/llvm/Support/JSON.h
343–344

You may need to update these comments.

344–346

It's probably worth calling out that you've changed the behaviour somewhat. I fully expect the behaviour change to be harmless, but it's worth pointing out in the commit message.

351–352

I think there's a lot to be said for having testing for the fundamental types, because ultimately those are what are actually used by the compiler.

427

Are there existing unit tests for this code? I'm slightly concerned that we might accidentally break the intended behaviour without them.

llvm/include/llvm/Support/ScopedPrinter.h
215

I think C++-style casts are generally preferred.

junhee-yoo added inline comments.Mar 27 2023, 8:39 PM
llvm/include/llvm/Support/JSON.h
355

I guess this sizeof(T) <= sizeof(int64_t) and sizeof(T) <= sizeof(uint64_t) are not affect because all types are equal or smaller than 64bits types.

In my humble opinion, how about keep the intention of original code:

  • for unsigned 64bits types - uint64_t:
template <typename T,
          typename = std::enable_if_t<!std::is_same_v<T, bool>&&
                                      std::is_integral_v<T> &&
                                      std::is_unsigned_v<T> &&
                                      sizeof(T) == sizeof(uint64_t)>>
  • for rests - int64_t:
template <typename T,
          typename = std::enable_if_t<std::is_integral_v<T>>,
          typename = std::enable_if_t<!std::is_same_v<T, bool>>,
          typename = std::enable_if_t<std::is_integral_v<T> &&
                                      !(std::is_unsigned_v<T> &&
                                        sizeof(T) == sizeof(uint64_t))>>

If it looks too verbose, I recommend to use:

template<typenameT>
constexpr bool is_uint_64_bits_v = 
  std::is_integral_v<T> && std::is_unsigned_v<T> && sizeof(T) == sizeof(uint64_t);
paulkirth updated this revision to Diff 511767.Apr 7 2023, 12:20 PM
paulkirth marked 4 inline comments as done.

Rebase and address comments.

  • Add test for C++ types in JSONTest.cpp
  • Restore original behavior in converting all integers representable as int64_t
  • Simplify template expression.
  • git clang-format
paulkirth marked 4 inline comments as done.Apr 7 2023, 12:35 PM

Sorry of the delay on this, I was a bit under the weather and had some other work take up more time than I planned.

I think this version makes reasonable tradeoffs between the old semantics and making use of the underlying C++ types. If there is something else that needs to be addressed, please let me know.

llvm/include/llvm/Support/JSON.h
77

I'm not 100% on using a namespace here (or naming it internal), but I didn't want to introduce this into llvm::json by accident.

If there is a preference, or an llvm coding preference I've missed, please let me know. https://llvm.org/docs/CodingStandards.html#anonymous-namespaces didn't seem to be relevant from my reading, but I'm happy to adopt something else here that is more in line w/ the rest of the codebase.

344–346

Since I've restored the behavior, I think its best to leave them as is.

427

Yes, these are tested in https://github.com/llvm/llvm-project/blob/0f9842c21c76a1c36866589770f92cc63619d36a/llvm/unittests/Support/JSONTest.cpp#L352.

There are other relevant tests in the file too, for JSON Numbers, the asUINT64() API, and conversions between them.

jhenderson added inline comments.Apr 11 2023, 2:08 AM
llvm/include/llvm/Support/JSON.h
77

I'm not sure it matters if this ends up in the llvm::json namespace, although perhaps it should just be in the llvm namespace, since it's not JSON specific.

I don't know of any specific rules here. Generally, I'd avoid anonymous namespaces (and static functions/variables) in header files, because that would result in a copy of it in every object file that uses the header, but in this particular case, I suspect it wouldn't since this is a constexpr value, so using an anonymous namespace (or possibly static - not sure if that can be used in this context), should be okay, I think?

FWIW, I'm not convinced about using an "internal" namespace.

343–344

Can we drop "long" from the comment, please?

llvm/include/llvm/Support/ScopedPrinter.h
214

I think you need a specific char overload too? char is a separate fundamental type to signed char and unsigned char.

228

Nit: missing blank line between methods.

llvm/unittests/Support/JSONTest.cpp
459

Nits. (I'm pretty sure "C++" is the "correct" way of writing it, but I could be wrong).

462

You need a signed char test case too.

paulkirth updated this revision to Diff 512476.Apr 11 2023, 8:39 AM
paulkirth marked 7 inline comments as done.

Address comments

  • update code comments
  • add overload for char
  • add missing test case for signed char
  • remove internal namespace
jhenderson accepted this revision.Apr 12 2023, 1:04 AM

Looks good to me.

This revision is now accepted and ready to land.Apr 12 2023, 1:04 AM