This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAGBuilder] Adds support for landingpads of token type
ClosedPublic

Authored by chenli on Dec 9 2015, 9:44 PM.

Details

Summary

This patch adds a check in visitLandingPad to see if landingpad's result type is token type. If so, do not create DAG nodes for its exception pointer and selector value. This patch enables the back end to handle landingpads of token type.

Diff Detail

Event Timeline

chenli updated this revision to Diff 42380.Dec 9 2015, 9:44 PM
chenli retitled this revision from to [EHPersonality] Add a new personality enum to represent langindPad of token type.
chenli updated this object.
chenli added a reviewer: JosephTremoulet.
chenli added a subscriber: llvm-commits.

We want the new personality to generally be treated like Unknown, correct? You should check for places in the code that look for EHPersonality::Unknown and make sure they agree for Token_LP. From a quick look, I think you'd want to update isNoOpWithoutInvoke in EHPersonalities.h and isCatchAll in InstructionCombining.cpp.

include/llvm/Analysis/EHPersonalities.h
29 ↗(On Diff #42380)

Since this one isn't actually some runtime's personality function, I think a comment explaining it would be helpful. Something along the lines of "similar to Unknown, except that LandingPads have token type", maybe?

lib/Analysis/EHPersonalities.cpp
32 ↗(On Diff #42380)

nit: the name "ProcessTokenLandingPad" seems funny to me. I presume you were modeling it after the line above? That's the name of the method in the CLR runtime, and is so named because it processes an exception, but "process" sounds funny to me in your case since the landingpad isn't being processed. Maybe something like "Token_LP_Personality"?

We want the new personality to generally be treated like Unknown, correct? You should check for places in the code that look for EHPersonality::Unknown and make sure they agree for Token_LP. From a quick look, I think you'd want to update isNoOpWithoutInvoke in EHPersonalities.h and isCatchAll in InstructionCombining.cpp.

Well, I don't think I want the new personality to be treated exactly like Unknown. I've looked at isNoOpWithoutInvoke when creating the patch, and I think in that case it should be treated as other known personalities and be safely removed (Unknown might have side effects that prevent the removal but we know this new personality doesn't have any, at least for now). I think I've missed 'isCatchAll', and I will update there.

include/llvm/Analysis/EHPersonalities.h
29 ↗(On Diff #42380)

Will do.

lib/Analysis/EHPersonalities.cpp
32 ↗(On Diff #42380)

Thanks for the correction. I will modify to a proper name here.

Well, I don't think I want the new personality to be treated exactly like Unknown. I've looked at isNoOpWithoutInvoke when creating the patch, and I think in that case it should be treated as other known personalities and be safely removed

Ah, ok. That makes sense, though it means that "similar to Unknown, except that LandingPads have token type" isn't exactly correct as a description for Token_LP. Since you're wanting both token-valued landingpads and to opt into isNoOpWithoutInvoke, I wonder if it wouldn't be better to go ahead and be specific about your runtime like the other non-Unknown enum values ... call it AzulJVM or whatever in the enum and use the name of an appropriate function in your runtime as the string. If you want the code in SelectionDAG to be more general, you could add a property like usesTokenLandingPads(EHPersonality Pers) to EHPersonalities.h which returns true for AzulJVM, and have SelectionDAG query through that.

reames edited edge metadata.Dec 10 2015, 4:38 PM

Well, I don't think I want the new personality to be treated exactly like Unknown. I've looked at isNoOpWithoutInvoke when creating the patch, and I think in that case it should be treated as other known personalities and be safely removed

Ah, ok. That makes sense, though it means that "similar to Unknown, except that LandingPads have token type" isn't exactly correct as a description for Token_LP. Since you're wanting both token-valued landingpads and to opt into isNoOpWithoutInvoke, I wonder if it wouldn't be better to go ahead and be specific about your runtime like the other non-Unknown enum values ... call it AzulJVM or whatever in the enum and use the name of an appropriate function in your runtime as the string. If you want the code in SelectionDAG to be more general, you could add a property like usesTokenLandingPads(EHPersonality Pers) to EHPersonalities.h which returns true for AzulJVM, and have SelectionDAG query through that.

I agree with Joseph's point here, but might suggest a slightly different approach. I don't see any value in having our specific personality function upstream. (I don't really have any objection to having it upstream either.)

Would it make sense to add a personality function to our downstream tree and then add an upstream personality function for testing purposes which really is "unknown but with token return"? The hook in isNoOpWithoutInvoke could then live in our local tree only. Since we're probably going to want other similar hooks over time, maybe this makes sense.

Ah, ok. That makes sense, though it means that "similar to Unknown, except that LandingPads have token type" isn't exactly correct as a description for Token_LP. Since you're wanting both token-valued landingpads and to opt into isNoOpWithoutInvoke, I wonder if it wouldn't be better to go ahead and be specific about your runtime like the other non-Unknown enum values ... call it AzulJVM or whatever in the enum and use the name of an appropriate function in your runtime as the string. If you want the code in SelectionDAG to be more general, you could add a property like usesTokenLandingPads(EHPersonality Pers) to EHPersonalities.h which returns true for AzulJVM, and have SelectionDAG query through that.

I don't know if we should tie the token-valued landingpad to a specific runtime. I think it should be same as any other landingpads. The only reason we need to make it special today is that we don't have the support of extracting exception pointers or selector values from it, and therefore we need a way to recognize it and not assign those registers to it. In a long term, I think we should remove it from EHPersonality enum after having the support of extracting exception pointers and selector values, and make it the same as landingpad of struct type.

sanjoy added a subscriber: sanjoy.Dec 10 2015, 5:13 PM

Does it make sense to see a token returning personality as "wrapping" another existing personality? I mean something like (psuedo code):

struct PersonalityKind {
  EHPersonality InnerPersonality;
  bool HasToken;
};

and then introduce an {i8*, i32} @llvm.get_exception_pointer_and_selector(token) and a token @llvm.get_gc_token(token). @llvm.get_exception_pointer_and_selector then returns whatever the landing pad would have directly returned for InnerPersonality, while @llvm.get_gc_token returns something we can hang gc.relocate and gc.result off of.

I don't know if we should tie the token-valued landingpad to a specific runtime

I don't think of it as tying them together. I think of it as your runtime/personality being the first, and currently only, runtime that uses token-valued landingpads. Another personality that also wants token-valued landingpads could add its entry to the enum and update the usesTokenLandingPads method to include itself.

see a token returning personality as "wrapping" another existing personality

Introducing a hierarchy here when we aren't for the other varying properties of personalities seems awkward to me.

add a personality function to our downstream tree and then add an upstream personality function for testing purposes which really is "unknown but with token return"

I don't see a reason why that wouldn't work, but it seems like it could be a hassle to maintain your out-of-tree personality.
_

I think several of these comments are getting at the underlying pre-existing issue of "isn't having an enum of recognized personalities with string matching to identify them a scaling issue?". I wouldn't mind having a long-term direction in mind for addressing that. I haven't seen any feedback on the scheme I suggested on the dev alias:

since we have llvm::Functions for the personality functions, maybe we could make them more self-describing via attributes? We'd have to come up with the set of relevant properties that characterize personality functions (the predicates in Analysis/EHPersonalities would be a start, but we'd also have to look at other places testing/switching on the enum), and for each define an attribute that a front-end could either attach to the personality Function or not to indicate whether that property applies, and we could get rid of the string matching and hard-coded enum

I think that in reality we have few enough personalities that it's fine if we don't address the scaling issue right now, but then if we're not addressing it I'd prefer we stick to the current pattern which is to have a hard-coded list of recognized personalities, plus "Unknown".

If you don't like adding your runtime's personality to the enum, perhaps we should go back to Chen's original suggestion of just checking the type of the landingpad in SelectionDAG. The reason we started going in this direction instead was "It seems a little backwards to check the landingpad's type as the first check, since the personality dictates the landingpad's type". But maybe it makes sense to allow that much variation within Unknown, since it is after all a bucket for "all the other personalities" -- if we wanted to be pedantic, SelectionDAG could allow either token or struct for Unknown, but require struct for all the known personalities that use landingpads.

If you don't like adding your runtime's personality to the enum, perhaps we should go back to Chen's original suggestion of just checking the type of the landingpad in SelectionDAG. The reason we started going in this direction instead was "It seems a little backwards to check the landingpad's type as the first check, since the personality dictates the landingpad's type". But maybe it makes sense to allow that much variation within Unknown, since it is after all a bucket for "all the other personalities" -- if we wanted to be pedantic, SelectionDAG could allow either token or struct for Unknown, but require struct for all the known personalities that use landingpads.

I dont think we mind adding our own runtime's personality, but I just don't know if that's necessary. Personally, I'd still prefer to have a generic enum. Maybe having a Unknown_With_Token_Type and make it similar to Unknown is the best choice for now? I don't think we need to care about isNoOpWithoutInvoke downstream since our current personality is Unknown anyway.

I dont think we mind adding our own runtime's personality, but I just don't know if that's necessary. Personally, I'd still prefer to have a generic enum. Maybe having a Unknown_With_Token_Type and make it similar to Unknown is the best choice for now? I don't think we need to care about isNoOpWithoutInvoke downstream since our current personality is Unknown anyway.

That sounds fine to me.

chenli updated this revision to Diff 42581.Dec 11 2015, 2:21 PM

Update patch to make the new enum similar to Unknown in all cases, except that it has landingpads of token types.

This LGTM but I'd feel better if @majnemer or someone could weigh in too.

rnk edited edge metadata.Dec 14 2015, 1:37 PM

Does it make sense to see a token returning personality as "wrapping" another existing personality? I mean something like (psuedo code):

Joseph, what did you think was wrong with checking for token type? I think the original approach is simpler. It's a little funky when you have to materialize resume {i8*, i32} from separate values, but oh well.

I can imagine wanting to use some kind of relocating GC scheme in a function that uses C++ EH, and needing to name out the right __gxx_personality* symbol for the platform.

In D15405#310252, @rnk wrote:

Joseph, what did you think was wrong with checking for token type? I think the original approach is simpler. It's a little funky when you have to materialize resume {i8*, i32} from separate values, but oh well.

It just seemed backwards to snoop the landingpad's type when we expect landingpad type to be dictated by the personality, nothing deeper than that.

I can imagine wanting to use some kind of relocating GC scheme in a function that uses C++ EH, and needing to name out the right __gxx_personality* symbol for the platform.

That's an interesting point. I do still wonder, if we were inventing landingpad over again today, whether we'd just always give it token type (though again, even if that seems the most sensible I don't think we need to switch to that right now).

I can imagine wanting to use some kind of relocating GC scheme in a function that uses C++ EH, and needing to name out the right __gxx_personality* symbol for the platform.

I think in general, we should allow all personality functions to have a token type. And adding a token type alternative for each of them (for example _gxx_personality_token_lp) might not be the best choice. But I don't mind taking this approach as a starting point (I don't minding taking the original approach either).

rnk added a comment.Dec 14 2015, 2:11 PM

It just seemed backwards to snoop the landingpad's type when we expect landingpad type to be dictated by the personality, nothing deeper than that.

I think it's fine. landingpad personalities don't impose a lot of constraints on the compiler. It seems nicer to treat them as more of an open set of well-behaved things.

That's an interesting point. I do still wonder, if we were inventing landingpad over again today, whether we'd just always give it token type (though again, even if that seems the most sensible I don't think we need to switch to that right now).

I guess if I were designing landingpad from scratch around a language with relocating GC, then yeah, I'd probably make it always use token type, and always use intrinsics to extract the values instead of extractvalue.

In D15405#310301, @rnk wrote:

It just seemed backwards to snoop the landingpad's type when we expect landingpad type to be dictated by the personality, nothing deeper than that.

I think it's fine. landingpad personalities don't impose a lot of constraints on the compiler. It seems nicer to treat them as more of an open set of well-behaved things.

Alright. Chen, sorry for the back-and-forth.

We do have this check in the verifier:

if (!LandingPadResultTy)
  LandingPadResultTy = LPI.getType();
else
  Assert(LandingPadResultTy == LPI.getType(),
         "The landingpad instruction should have a consistent result type "
         "inside a function.",
         &LPI);

we could either remove it or decide that landingpads can vary but must agree within a function. I'm fine with either.

rnk added a comment.Dec 14 2015, 2:39 PM

we could either remove it or decide that landingpads can vary but must agree within a function. I'm fine with either.

Checking for consistency in a function sounds good. It'd be good to update ExceptionHandling.rst to describe this usage of landingpad too.

In D15405#310301, @rnk wrote:

It just seemed backwards to snoop the landingpad's type when we expect landingpad type to be dictated by the personality, nothing deeper than that.

I think it's fine. landingpad personalities don't impose a lot of constraints on the compiler. It seems nicer to treat them as more of an open set of well-behaved things.

Alright. Chen, sorry for the back-and-forth.

Not a problem :)

We do have this check in the verifier:

if (!LandingPadResultTy)
  LandingPadResultTy = LPI.getType();
else
  Assert(LandingPadResultTy == LPI.getType(),
         "The landingpad instruction should have a consistent result type "
         "inside a function.",
         &LPI);

we could either remove it or decide that landingpads can vary but must agree within a function. I'm fine with either.

I'd prefer to keep the verifier and make the landingpad's type consistent within a function.

chenli retitled this revision from [EHPersonality] Add a new personality enum to represent langindPad of token type to [SelectionDAGBuilder] Adds support for landingpads of token type.
chenli updated this object.
chenli removed a reviewer: reames.

Update patch with initial approach as described by the updated summary.

@rnk -- as for ExceptionHandling.rst, I'd prefer to update gc.statepoint document first on how they use landingpads of token type, and then add a reference in ExceptionHandling.rst. Does that sounds reasonable to you?

rnk accepted this revision.Dec 15 2015, 4:35 PM
rnk edited edge metadata.

lgtm

This revision is now accepted and ready to land.Dec 15 2015, 4:35 PM
chenli closed this revision.Dec 15 2015, 8:51 PM