This is an archive of the discontinued LLVM Phabricator instance.

Introduce isoneof<T0, T1, ...> as an extension of isa<T>
AbandonedPublic

Authored by serge-sans-paille on May 12 2017, 5:38 AM.

Details

Summary

In many places, the idiom:

if(isa<T0>(V) || isa<T1>(V) || isa<T2>(V)) //...

is used. This commit introduces a new function, isoneof, that captures this idiom:

if(isa<anyof<T0, T1, T2>>(V)) //...

That's somehow similar to the tuple version of isinstance in Python.

Note to the reviewers: I wonder whether isa<> should be made variadic instead?

Diff Detail

Repository
rL LLVM

Event Timeline

Added documentation in `docs/ProgrammersManual.rst`

sanjoy added a subscriber: sanjoy.May 12 2017, 9:32 AM

Overall, I think this is a really good abstraction, but given the scope of the change I'll wait for other stakeholders to chime in.

Right now it looks like isoneof<C> won't do the right thing -- we should definitely have it "gracefully degrade" to isa.

I won't vote for making isa variadic since that would read weird. isoneof, OTOH, is very understandable so my suggestion would be to stick with that.

grandinj added inline comments.
lib/Analysis/MemoryDependenceAnalysis.cpp
747 ↗(On Diff #98788)

accidental delete?

lib/Analysis/MemoryDependenceAnalysis.cpp
747 ↗(On Diff #98788)

No, there's a `operator bool() on CallSite that basically checks isoneof<CallInst, InvolkeInst>(...)` so I removed the duplicate here.

Make `isoneof<T> compatible with isa<T>`

@sanjoy I updated the `isoneof implementation to behave like isa` when only one parameter is provided

The only issue i have with this is that for a bunch of these class hierarchies, multiple isa is inefficient compared to switch based.
Unfortunately, i don't think C++ gives us a good way to specialize for that.

Oh well.
Still LGTM

@dberlin I agree. Not that to that respect, this commit does not change anything to the performance issue. Still it opens funny opportunities: one could imagine a constexpr function that would pre-compute the type lookup based on the argument type information, but that's definitively not part of this commit!

Any extra comments on this review? ready for merging ?

chandlerc edited edge metadata.May 20 2017, 4:51 PM

I'd like to request that you make this patch just introduce the tool, and maybe add a few usages. I think rewriting this much in a single patch seems like a bad idea. It'll make reverts if anything goes wrong pointlessly hard, etc.

Then when the build bots are all happy you can migrate chunks of code in more focused patches.

Another high-level comment: I don't like the name isoneof. To me, that parses to the relationship being tested is rather than isa which loses an important distinction. I'm not really sure what would be the best name though.

include/llvm/Support/Casting.h
145–153

Please make this a doxygen comment, and no need to repeat the name in it.

Also, your example code calls this isa instead of isonef.

Another high-level comment: I don't like the name isoneof. To me, that parses to the relationship being tested is rather than isa which loses an important distinction. I'm not really sure what would be the best name though.

One possibility is to figure out the C++ to make code like this work:

if (isa<oneof<T0, T1, T2>>(X))
  ...;

That's a bit more verbose, but we can read oneof<T0, T1, T2> as a sum type of T0, T1 and T2. I'd even be up for calling it sum or sumty, but not sure how others feel about that.

serge-sans-paille edited the summary of this revision. (Show Details)

@chandlerc Here is a minimal version of the patch, with the minimal hackery to support the `isa<anyof<...>> syntax. Note that it should support isa<anyof<>> (which is always false) and isa<anyof<T>> which should be quivalent to isa<T>`.

Up :-) All remarks have been taken into account and only a minimal patch is provided, I'll apply the change next in separated commits.

@chandlerc @mehdi_amini @sanjoy if that's okay with you, can you approve the review?

Honestly, the more I think about this, the more I think that the complexity it involves doesn't pull its weight.

First, we either have to deal with the semantic confusion of isa<T0, T1, T2, ...> or the readability burden of inventing and using a somewhat terrible name like isa_anyof or the technical complexity of magical tag templates that are *unqualified* and jammed in here for isa<anyof<...>>. Note that we have llvm::any_of already and it is very different! =[ None of these are insurmountable issues, but they add cost to this abstraction.

Second, the benefit is relatively small. Until the list of types is loooooong, isa<T0> || isa<T1> || isa<T2> really doesn't seem that bad. It's a bit more characters to type, but it neatly solves any and all of the semantic / syntactic ambiguity with zero complex implementation. Kinda nice.

But even worse, when these lists *do* get long, using isa, even with the new tool, is usually *the wrong approach*. People should instead be writing a proper switch over whatever data field is available rather than a series of tests. Once that series is long, I worry will not be reliable. And we have historically seen places where isa and dyn_cast did show up on the profile.

I feel like if we want to make chains of these things easier to code, that easier to code pattern should actually provide more value such as succinctly packaging up the single-dispatch approach. But currently, I don't think our system really exposes enough information to implement that more pattern-match-y approach, and it isn't clear that we really want to make such a big change as to introduce it.

Anyways, long story short, I'm no longer sure we should pursue this direction. Curious what others think though.

@chandlerc the extra complexity is worth 15 lines of codes, and thanks to your `anyof<>` proposal, it's very local.

As you noted, the benefits of the `anyof<...> syntax compared to a chaining of ||` is not on the performance side. It's slightly more concise, and when I did the full replacement, I often met example like this:

if(isa<T0>(Inst->getOperand(0)) || isa<T1>(Inst->getOperand(0))
    /* stuff*/;

With this proposal, it naturally turns into

if(isa<anyof<T0, T1>>(Inst->getOperand(0)))
    /* stuff*/;

which avoids the redundancy. Not a big deal yet it avoids some kind of redundancy. In other examples, you had

if (isa<T0>(V0) || isa <T1>(V1>)) 
    /* stuff*/

which cannot be reduced, but could be confusing while introducing `anyof<>` make the two situations visually different.

Again I admit the added value is arguable, so let see what the other think about it.

I actually like this syntax, because i think it's more concise *and* significantly less prone to errors (for the reason serge pointed out).
We *could* make it faster if we wanted (I think we could metaprogram a version that used switch if it could and there are greater than X args).
The other way requires every caller to be cognizant of when switch becomes more performant, and implement it themselves.

Bottom line: I'd rather see it abstracted than not abstracted.

mehdi_amini edited edge metadata.Jun 1 2017, 9:31 AM

Honestly, the more I think about this, the more I think that the complexity it involves doesn't pull its weight.

I'm fairly neutral on this.

First, we either have to deal with the semantic confusion of isa<T0, T1, T2, ...> or the readability burden of inventing and using a somewhat terrible name like isa_anyof

For a non-native speaker, I still don't understand why the correct form isn't is_anyof ( "is any of")? You may explain this to me tonight :)

or the technical complexity of magical tag templates that are *unqualified* and jammed in here for isa<anyof<...>>. Note that we have llvm::any_of already and it is very different! =[ None of these are insurmountable issues, but they add cost to this abstraction.

I thought exactly about these after the last update!

Second, the benefit is relatively small. Until the list of types is loooooong, isa<T0> || isa<T1> || isa<T2> really doesn't seem that bad. It's a bit more characters to type, but it neatly solves any and all of the semantic / syntactic ambiguity with zero complex implementation. Kinda nice.

Well I can't refrain but remembering all the people that didn't want to use the STL algorithm and write raw for loop using this same argument.
What's missing is that "any_of" express clearly the intent and is a lot easier to parse for a human. As Serge mentioned in his last example, DRY can be annoying to attain without such construct.

But even worse, when these lists *do* get long, using isa, even with the new tool, is usually *the wrong approach*. People should instead be writing a proper switch over whatever data field is available rather than a series of tests. Once that series is long, I worry will not be reliable.

Reliable on which aspect? Optimization?

And we have historically seen places where isa and dyn_cast did show up on the profile.

I see the positive aspect of it: more opportunities to improve compiler optimizations ;)
Driving syntactic changes because of optimizer failure makes me sad...

So three options:

  1. We give up with this idea
  2. We introduce the `isa<anyof<>>` syntax
  3. We introduce the `is_anyof<>` syntax

I leave the choice to the reviewers, here is my preference : 3 > 2 > 1

@mehdi_amini so what's the status? No consensus /reaction :-/

I'm fine either way with this patch, and @chandlerc didn't answer my last comment.

@chandlerc any thoughts on @mehdi_amini ? i also think that the `is_anyof<T0,...>` construct raises the abstraction without any gory implementation detail.

According to @chandlerc tehre is not much benefit in this idiom, closing the patch.