This is an archive of the discontinued LLVM Phabricator instance.

[LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed.
ClosedPublic

Authored by bzcheeseman on Apr 16 2022, 11:38 AM.

Details

Summary

This patch expands the expressive capability of the casting utilities in LLVM by introducing several levels of configurability. By creating modular CastInfo classes we can enable projects like MLIR that need more fine-grained control over how a cast is actually performed to retain that control, while making it easy to express the easy cases (like a checked pointer to pointer cast).

The current implementation of Casting.h doesn't make it clear where the entry points for customizing the cast behavior are, so part of the motivation for this patch is adding that documentation. Another part of the motivation is to support using LLVM RTTI with a wider set of use cases, such as nullable value to value casts, or pointer to value casts (as in MLIR).

Diff Detail

Event Timeline

bzcheeseman created this revision.Apr 16 2022, 11:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 16 2022, 11:38 AM
Herald added a subscriber: dexonsmith. · View Herald Transcript
bzcheeseman requested review of this revision.Apr 16 2022, 11:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 16 2022, 11:38 AM

It may be helpful to also show the patch with the intended use-case.

nikic added a subscriber: nikic.Apr 16 2022, 12:14 PM

Not a big fan of this overload. dyn_cast is such a core operation, I'd rather not muddy things by making it return Optional<> -- but only sometimes. I expect this is going to cause a lot of confusion when people work on an Instruction &, but really want to be working on Instruction *.

Not a big fan of this overload. dyn_cast is such a core operation, I'd rather not muddy things by making it return Optional<> -- but only sometimes.

I agree with you that we need to tread lightly here, but dyn_cast being such a core operation means that we need to continue to invest in it. There are a bunch of cases that it doesn't handle well, which means LLVM projects using this (e.g. MLIR) end up resorting to similar-but-different things, like x.dyn_cast<foo>() which have similar but different behavior on null. It would be great for the project to pull this together.

These three functions provide forwarding for non-optional types passed into
dyn_cast_or_none.

I don't understand the addition of the "or_none" helper. I get what it "does": it allows none propagation for Optional arguments (but it also works with nullptr propagation from pointers as well). In practice, this seems strictly stronger than the existing dyn_cast_or_null function - can they be merged into a single null propagating thing? Maybe we should rename it to "dyn_cast_if_present" or some other name that is more generic?

Channeling the comments above, we need to be careful with such a core API like this, it would be good to send a "head's up" note to the llvm forum when the design point settles.

-Chris

It may be helpful to also show the patch with the intended use-case.

The intended use case is to be used for projects like MLIR that have t.dyn_cast<IntegerType>() so they can instead have dyn_cast<IntegerType>(t) for consistency, and to reduce the RTTI burden.

I expect this is going to cause a lot of confusion when people work on an Instruction &, but really want to be working on Instruction *.

I'm maybe missing something here, but my understanding is that if you have Value *v; auto instruction = dyn_cast<Instruction>(v); you're going to end up with an Instruction *. Am I reading the enable_if/conditional rules incorrectly?

can they be merged into a single null propagating thing?

I like this idea - I originally had dyn_cast_or_null with an overload for Optional but it looked goofy. I like dyn_cast_if_present with maybe some forwarding dyn_cast_or_null to avoid breaking everyone.

dyn_cast is such a core operation <>
...
we need to be careful with such a core API like this, it would be good to send a "head's up" note to the llvm forum when the design point settles

Agreed 100%. Everyone uses dyn_cast and that's why I wanted to put up the patch, so we can talk about it!

Going forward I see 2 options (would love to hear about it if someone has other ideas):

  1. Keep the overload, maybe swap the dyn_cast_or_none for dyn_cast_if_present
  2. Have a new function optional_dyn_cast/optional_dyn_cast_if_present (or something, not enamored with that name) so that folks can opt-in to the new behavior with optionals

I would personally prefer option 1 here because it makes the most sense to me right now.

I'd prefer we have one thing. Re:

I like this idea - I originally had dyn_cast_or_null with an overload for Optional but it looked goofy. I like dyn_cast_if_present with maybe some forwarding dyn_cast_or_null to avoid breaking everyone.

it is super awesome to stage things in and give people a migration window, but if it is the right thing to rename this, then we should drop the old name, not carry it around forever.

I want to +1 a general need for something better than what we currently have. The current casting infra kind of assumes a very specific use case, which means that anything outside of that needs to duplicate a bunch of logic and/or handle two different ways of casting. An easy example of this is TypeSwitch(https://github.com/llvm/llvm-project/blob/bdabe505f417a9a6636913e8ea253ff9dc42f32e/llvm/include/llvm/ADT/TypeSwitch.h#L77), which has to SFINAE which cast it's using. TypeSwitch definitely isn't alone here, this kind of logic permeates throughout the codebase and makes some things more annoying than they need to be. In MLIR we use "value-typed" classes
for many things, and that effectively breaks the current casting infras assumption that everything goes through pointer/reference.

All of that being said, I don't think that Optional helps for the MLIR case (or other cases in the codebase that are like it). In most of the cases we have today, the result of the dyn_cast is already "nullable" so we wouldn't want to reroute through Optional. IMO I think one of the best paths forward would be to completely rework the abstractions
of how types opt-in to support casting. The current combination of isa_impl + cast_retty + cast_convert_val is quite clunky, and it feels like an evolution of that would both be cleaner than what we have today, and more extensible for "exotic" use cases. Having suffered from this a lot whenever trying to expand beyond the traditional LLVM
use case, I would love to see movement in this area.

yurai007 added inline comments.
llvm/unittests/Support/Casting.cpp
285

I guess braces are redundant here.

291

Missing test cases for pointer overloads?

Working on updating this in accordance with comments (so folks know I'm not ignoring anyone). Thanks all for the reviews, hope to have an updated patch soon.

bzcheeseman retitled this revision from [LLVM][Casting.h] Allow dyn_cast to return Optional for types that are not constructble from nullptr. to [LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed..
bzcheeseman edited the summary of this revision. (Show Details)
bzcheeseman edited the summary of this revision. (Show Details)Apr 27 2022, 4:54 PM

Overall, this looks really great to me. Please get River's review, then (when any feedback is incorporated) ping an LLVM forum for visibility. This isn't normally needed but casting.h is very pervasive, so extra discussion is helpful.

llvm/include/llvm/Support/Casting.h
18

Is there any reasonable way to avoid STLExtras.h? It is a heavy weight header that pulls in tons of stuff, and Casting.h is used roughly everywhere.

Optional is nearly but not quite as bad, and is more core to what you're doing here, so ok.

bzcheeseman added inline comments.Apr 27 2022, 5:03 PM
llvm/include/llvm/Support/Casting.h
18

I was using is_detected in an earlier iteration, but it's now been removed, so I think I can remove this too. Thanks for catching it!

bzcheeseman edited the summary of this revision. (Show Details)

Sorry for all the traffic on this last night - my machine is a bit slow so I was partly using CI to help run all the llvm-project tests. Should be ready for review at this point, all passing :)

bzcheeseman edited the summary of this revision. (Show Details)
bzcheeseman marked 3 inline comments as done.
lattner accepted this revision.May 10 2022, 11:54 PM

This looks good to me, the community discussion seems quiet. Feel free to land when the CI tests pass, thanks!

This revision is now accepted and ready to land.May 10 2022, 11:54 PM
rriddle accepted this revision.May 11 2022, 9:12 AM

Ran through and this LGTM as an iteration point. It would be nice to add improvements to the user facing documentation on how to add RTTI to a class hierarchy as well: https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html, but I'm fine with that being a follow up.

llvm/include/llvm/Support/Casting.h
404

Can we put this in detail? or a cast_impl namespace?

468

/// here please.

501

Can you indent these? It's hard to pick them out when reading the comment. Maybe also add a newline between each customization.

bzcheeseman marked 3 inline comments as done.May 11 2022, 9:42 AM
bzcheeseman added inline comments.
llvm/include/llvm/Support/Casting.h
404

Yes nice catch, thanks!

bzcheeseman edited the summary of this revision. (Show Details)
bzcheeseman marked an inline comment as done.

I'd prefer we have one thing. Re:

I like this idea - I originally had dyn_cast_or_null with an overload for Optional but it looked goofy. I like dyn_cast_if_present with maybe some forwarding dyn_cast_or_null to avoid breaking everyone.

it is super awesome to stage things in and give people a migration window, but if it is the right thing to rename this, then we should drop the old name, not carry it around forever.

FWIW, having a migration window makes a massive difference for downstream projects. Especially with the currently ongoing discussions about just how big the monorepo should be, this is a consideration to take seriously.

In many cases a migration window helps even if it is "only a microsecond long", i.e. if you have two patches, one to add the new name (and make the old name a plain forward), and a separate follow-up to remove the old name, that can help migrations even if the two commits are added back-to-back.

Which the current version does, so thank you very much for that :)

I'd prefer we have one thing. Re:

I like this idea - I originally had dyn_cast_or_null with an overload for Optional but it looked goofy. I like dyn_cast_if_present with maybe some forwarding dyn_cast_or_null to avoid breaking everyone.

it is super awesome to stage things in and give people a migration window, but if it is the right thing to rename this, then we should drop the old name, not carry it around forever.

FWIW, having a migration window makes a massive difference for downstream projects. Especially with the currently ongoing discussions about just how big the monorepo should be, this is a consideration to take seriously.

In many cases a migration window helps even if it is "only a microsecond long", i.e. if you have two patches, one to add the new name (and make the old name a plain forward), and a separate follow-up to remove the old name, that can help migrations even if the two commits are added back-to-back.

Which the current version does, so thank you very much for that :)

Oh definitely, removing *_or_null will be an ENORMOUS patch so we can do it piecemeal and stage it in NFC patches :) I don't like breaking folks when we don't have to!

Did you notice this:

FAILED: unittests/Support/CMakeFiles/SupportTests.dir/Casting.cpp.o 
/home/buildbots/clang.11.0.0/bin/clang++ --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iunittests/Support -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/unittests/Support -Iinclude -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG    -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT unittests/Support/CMakeFiles/SupportTests.dir/Casting.cpp.o -MF unittests/Support/CMakeFiles/SupportTests.dir/Casting.cpp.o.d -o unittests/Support/CMakeFiles/SupportTests.dir/Casting.cpp.o -c /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/unittests/Support/Casting.cpp
In file included from /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/unittests/Support/Casting.cpp:9:
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Casting.h:216:12: error: returning reference to local temporary object [-Werror,-Wreturn-stack-address]
    return Res2;
           ^~~~
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Casting.h:453:72: note: in instantiation of member function 'llvm::cast_convert_val<llvm::foo, const llvm::bar, const llvm::bar>::doit' requested here
                            typename simplify_type<From>::SimpleType>::doit(f);
                                                                       ^
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Casting.h:564:36: note: in instantiation of member function 'llvm::CastInfo<llvm::foo, const llvm::bar, void>::doCast' requested here
  return CastInfo<To, const From>::doCast(Val);
                                   ^
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Casting.h:214:47: note: binding reference variable 'Res2' here
    typename cast_retty<To, FromTy>::ret_type Res2 =
                                              ^
1 error generated.

from
https://lab.llvm.org/buildbot/#/builders/57/builds/17829/steps/6/logs/stdio

I see the same thing compiling locally with clang 8.

from
https://lab.llvm.org/buildbot/#/builders/57/builds/17829/steps/6/logs/stdio

I see the same thing compiling locally with clang 8.

We are seeing the same error in our internal upstream ppc RHEL buildbot as well. @bzcheeseman could you take a look? If you need access to a PPC machine we should be able to arrange that. FYI @nemanjai

Thanks so much!

I'm just now seeing this (just woke up) so I'll take a look and hopefully resolve it shortly! Thanks for bringing this to my attention.

https://reviews.llvm.org/D125482 doesn't exhibit the problem in the unit test, but now I have to wait for all the other tests to complete so hopefully nothing else breaks!

https://reviews.llvm.org/D125482 doesn't exhibit the problem in the unit test, but now I have to wait for all the other tests to complete so hopefully nothing else breaks!

The warning went away with your fix. Thanks!

@bzcheeseman FYI, this is breaking the build with gcc 6 (for Debian stretch)
if you are aware of a simple fix, it would be appreciated
See https://github.com/llvm/llvm-project/issues/55626

(I am not sure I want to backport gcc 7 or 8 to Debian stretch ;)

thanks

@bzcheeseman FYI, this is breaking the build with gcc 6 (for Debian stretch)
if you are aware of a simple fix, it would be appreciated
See https://github.com/llvm/llvm-project/issues/55626

(I am not sure I want to backport gcc 7 or 8 to Debian stretch ;)

thanks

Moving discussion to the GH issue so it's a little clearer what's up :)

reames added a subscriber: reames.Jun 6 2022, 4:57 PM

Is there any chance that this change removed the assert on cast<X> that the thing being pointed to isa<X>? I'm looking at a local case where I have cast<FixedVectorType(T) succeeding on something which clearly *isn't* a fixed vector, and isa<FixedVectorType>(T) returning false on the same pointer. Its possible that I'm looking at some weird bit of undefined behavior, but since I can't find where the corresponding assert is in the new version of the code, I wanted to ask.

Is there any chance that this change removed the assert on cast<X> that the thing being pointed to isa<X>? I'm looking at a local case where I have cast<FixedVectorType(T) succeeding on something which clearly *isn't* a fixed vector, and isa<FixedVectorType>(T) returning false on the same pointer. Its possible that I'm looking at some weird bit of undefined behavior, but since I can't find where the corresponding assert is in the new version of the code, I wanted to ask.

That looks like what happened, yep. I meant to add it to the default implementation of doCast (so that you don't have to pay the price of isa) but apparently forgot to do so. I'll try and get to this when I get the chance - and clean up all the assert usage at once. If you need it urgently, asserting isa<T> inside cast is fine for now and I can come back and clean things up later.

Is there any chance that this change removed the assert on cast<X> that the thing being pointed to isa<X>? I'm looking at a local case where I have cast<FixedVectorType(T) succeeding on something which clearly *isn't* a fixed vector, and isa<FixedVectorType>(T) returning false on the same pointer. Its possible that I'm looking at some weird bit of undefined behavior, but since I can't find where the corresponding assert is in the new version of the code, I wanted to ask.

That looks like what happened, yep. I meant to add it to the default implementation of doCast (so that you don't have to pay the price of isa) but apparently forgot to do so. I'll try and get to this when I get the chance - and clean up all the assert usage at once. If you need it urgently, asserting isa<T> inside cast is fine for now and I can come back and clean things up later.

See https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033

If you could help with the cleanup here, it would be appreciated.

I got here from D131319, but I am not sure why this change is only touching comments but not the hand-written documentation. Is this CastInfo an internal detail that should not be publicly documented to the everyday documentation reader? Are the old-style "classof" method and enums still applicable?

I got here from D131319, but I am not sure why this change is only touching comments but not the hand-written documentation. Is this CastInfo an internal detail that should not be publicly documented to the everyday documentation reader? Are the old-style "classof" method and enums still applicable?

The old style classof is still valid.

Here is the document about the new CastInfo :

https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html#advanced-use-cases