This is an archive of the discontinued LLVM Phabricator instance.

[IR] Add token types
ClosedPublic

Authored by majnemer on Aug 7 2015, 10:39 PM.

Details

Summary

This introduces the basic functionality to support "token types".
The motivation stems from the need to perform operations on a Value
whose provenance cannot be obscured.

There are several applications for such a type but my immediate
motivation stems from WinEH. Our personality routine enforces a
single-entry - single-exit regime for cleanups. After several rounds of
optimizations, we may be left with a terminator whose "cleanup-entry
block" is not entirely clear because control flow has merged two
cleanups together. We have experimented with using labels as operands
inside of instructions which are not terminators to indicate where we
came from but found that LLVM does not expect such exotic uses of
BasicBlocks.

Instead, we can use this new type to clearly associate the "entry point"
and "exit point" of our cleanup. This is done by having the cleanuppad
yield a Token and consuming it at the cleanupret.
The token type makes it impossible to obscure or otherwise hide the
Value, making it trivial to track the relationship between the two
points.

What is the burden to the optimizer? Well, it turns out we have already
paid down this cost by accepting that there are certain calls that we
are not permitted to duplicate, optimizations have to watch out for
such instructions anyway. There are additional places in the optimizer
that we will probably have to update but early examination has given me
the impression that this will not be heroic.

Diff Detail

Repository
rL LLVM

Event Timeline

majnemer updated this revision to Diff 31573.Aug 7 2015, 10:39 PM
majnemer retitled this revision from to [IR] Add token types.
majnemer updated this object.
majnemer added a subscriber: llvm-commits.
silvas added a subscriber: silvas.

+John McCall

LGTM aside from some questions about the return type exclusion.

lib/IR/Type.cpp
383 ↗(On Diff #31573)

Will this preclude having intrinsics that produce tokens? I'd have thought such a thing would make sense, though I don't have an immediate need for one.

test/Assembler/token.ll
6 ↗(On Diff #31573)

Seems odd to allow token parameters but not token returns; I'd have expected each kind of cross-function transfer to cause roughly the same amount of trouble.

rnk edited edge metadata.Aug 10 2015, 8:54 AM

Can you expand on the use cases for tokens? It sounded like there were more than just this on Friday.

Fundamentally, this is breaking the invariant that values produced by instructions Can Be Phied. We need more than one use case to want to break that rule.

lib/Transforms/Utils/SimplifyCFG.cpp
1644–1645 ↗(On Diff #31573)

Shouldn't this have a test?

test/Verifier/token1.ll
8 ↗(On Diff #31573)

I think you can combine these tests into one token-invalid.ll file. You get one verifier failure per function, so if you keep them split up like this, it'll work.

In D11861#220538, @rnk wrote:

Can you expand on the use cases for tokens? It sounded like there were more than just this on Friday.

Fundamentally, this is breaking the invariant that values produced by instructions Can Be Phied. We need more than one use case to want to break that rule.

The folks working on safepoints need something along these lines in order to correctly track pointer provenance. @sanjoy or @reames could probably do a better job describing the role of token types in "argument bundles".

lib/IR/Type.cpp
383 ↗(On Diff #31573)

We can always loosen the restriction later if we need to.

lib/Transforms/Utils/SimplifyCFG.cpp
1644–1645 ↗(On Diff #31573)

This change was made by inspection. I'll try to come up with a test case.

test/Assembler/token.ll
6 ↗(On Diff #31573)

Right, I'll preclude this possibility.

test/Verifier/token1.ll
8 ↗(On Diff #31573)

Sure, will do.

majnemer added inline comments.Aug 10 2015, 9:22 AM
test/Assembler/token.ll
6 ↗(On Diff #31573)

Actually, it should be OK for a function to accept token parameters so long as it cannot return a token. If it returned a token, it would be ambiguous which token would be returned.

test/Verifier/token1.ll
8 ↗(On Diff #31573)

I tried this and it didn't seem to work.

Running llvm-as over the following gave me a single error:

define void @test1(token %A, token %B) {
entry:
  br label %bb

bb:
  %phi = phi token [ %A, %bb ], [ %B, %entry]
; CHECK: PHI nodes cannot have token type!
  br label %bb
}

define void @test2(token %A, token %B) {
entry:
  br label %bb

bb:
  %sel = select i1 undef, token %A, token %B
; CHECK: select values cannot have token type
  br label %bb
}

define void @test3(token %A, token %B) {
entry: 
  alloca token 
; CHECK: invalid type for alloca 
  ret void 
}
@GV = external global token
; CHECK: invalid type for global variable
test/Assembler/token.ll
6 ↗(On Diff #31573)

The use case for return tokens that I can see is in the caller, to track a value back unobscurably to a callsite (which would require the call to be rewritten before/at lowering, so seems more likely with intrinsics than real functions).

The use case for parameter tokens I don't really understand. I think you're suggesting that it would be to introduce unobscurable interprocedural dataflow? Allowing that with multiple callers seems tantamount to allowing token PHIs.

At any rate, all of this is a thought experiment for possible future uses as far as I'm concerned, so I have no issue with the code going in as-is.

majnemer updated this revision to Diff 31691.Aug 10 2015, 11:00 AM
majnemer edited edge metadata.
  • Let functions return tokens
sanjoy edited edge metadata.EditedAug 10 2015, 11:35 AM

Couple of questions

  • [obsolete -- I see you disallow the token* type] Can we load / store through a pointer of type token *, allocated on the heap, or created via inttoptr?
  • Can we do this transform:
if (cond) {
  %t0 = call token @p()
  call @consume(token %t0)
} else {
  %t1 = call token @p()
  call @consume(token %t1)
}

TO

%t = call token (cond ? @p : @q)()
call @consume(token %t)

?

lib/Analysis/LoopInfo.cpp
211 ↗(On Diff #31691)

Just to be clear -- this is only conservatively correct, right? IIUC, we should be able to clone token producing instructions as long as we do not introduce phis / selects.

Couple of questions

  • [obsolete -- I see you disallow the token* type] Can we load / store through a pointer of type token *, allocated on the heap, or created via inttoptr?
  • Can we do this transform:
if (cond) {
  %t0 = call token @p()
  call @consume(token %t0)
} else {
  %t1 = call token @p()
  call @consume(token %t1)
}

TO

%t = call token (cond ? @p : @q)()
call @consume(token %t)

?

I don't think such a transformation would be canonical because the call is now indirect. Nevertheless, this wouldn't have an effect on my use case. Which would you prefer?

lib/Analysis/LoopInfo.cpp
211 ↗(On Diff #31691)

Correct.

majnemer updated this revision to Diff 31731.Aug 10 2015, 2:53 PM
majnemer edited edge metadata.
  • Let functions return tokens
  • Only allow intrinsics to return/pass tokens around

Do we feel we have reached consensus (perhaps weak consensus) or is there something you'd like to see addressed?

Nothing more for me.

Looks good to me too.

vsk added a subscriber: vsk.Aug 12 2015, 2:34 PM

Hi David,

Could you add a small snippet to test/Bitcode/compatibility.ll?

Something like:

define token @llvm.some.intrinsic() {
; CHECK: define token @llvm.some.intrinsic()
entry:
  ret token undef
; CHECK: ret token undef
}

You can find the rationale for this here: http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility

Otherwise, LGTM.

I realized I do have an issue to raise as I was going to hook up LLILC to the new EH representation.

lib/IR/Verifier.cpp
2200 ↗(On Diff #31731)

I think we'll need to allow token PHIs with undef, if we want to use this for EH pads and we still want code commoning in EH pads to be legal.

I.e., if we have something along the lines of

cleanup1:
  cleanuppad
  ; chunk of code
  cleanupret unwind label %next1
cleanup2:
  cleanuppad
  ; identical chunk of code
  cleanupret unwind label %next2

then it was my understanding that it was a key design point of the new EH representation that it would be legal (if unprofitable) to transform this to

cleanup1:
  cleanuppad
  br label %shared_chunk
tail1:
  cleanupret unwind label %next1
shared_chunk:
  %origin = phi i1 [ 0, %cleanup1 ], [ 1, %cleanup2 ]
  ; chunk of code
  br i1 %origin, label %tail1, label %tail2
cleanup2:
  cleanuppad
  br label %shared_chunk
tail2:
  cleanupret unwind label %next2

but if cleanuppad produces a token that cleanupret consumes:

cleanup1:
  %tok1 = cleanuppad
  ; chunk of code
  cleanupret token %tok1, unwind label %next1
cleanup2:
  %tok2 = cleanuppad
  ; identical chunk of code
  cleanupret token %tok2, unwind label %next2

then a blanket rule against token PHIs precludes the transformation (the token defs would no longer dominate the uses). On the other hand, if we allow PHI with undef (as such a transformation would naturally introduce for any SSA values live across the combined regions), we could create this:

cleanup1:
  %tok1 = cleanuppad
  br label %shared_chunk
tail1:
  cleanupret token %tok1_copy, unwind label %next1
shared_chunk:
  %origin = phi i1 [ 0, %cleanup1 ], [ 1, %cleanup2 ]
  %tok1_copy = phi token [ %tok1, %cleanup1 ], [ undef, %cleanup2 ]
  %tok2_copy = phi token [ undef, %cleanup1 ], [ %tok2, %cleanup2 ]
  ; chunk of code
  br i1 %origin, label %tail1, label %tail2
cleanup2:
  %tok2 = cleanuppad
  br label %shared_chunk
tail2:
  cleanupret token %tok2_copy, unwind label %next2

(on the EH side: if things have changed and we're ok with disallowing this transformation, that's fine with me but it puts cleanupcall on my critical path)

That said, there are different things that "PHI with undef" could mean:

  1. allow a token PHI with only one non-undef incoming value ( %x = phi token [ %y, %1 ], [ undef, %2 ], [ undef, %3 ] )
  2. allow a token PHI with only one distinct non-undef incoming value ( %x = phi token [ %y, %1 ], [ %y, %2 ], [ undef, %3 ] )
  3. also allow directly recursive token PHIs ( %x = phi token [ %y, %1 ], [ %x, %2] , [ undef, %3] )
  4. others?
  5. allow a web of token PHIs so long as there's only one unique incoming value, discounting undef and other phis in the web

#2 seems like a reasonable starting place to me, though any of them would be good enough for my immediate intended purpose (which is to encode finalies with this construct per Reid's suggestion (http://article.gmane.org/gmane.comp.compilers.llvm.devel/85937) as a staging plan until we add cleanupcall).

rjmccall edited edge metadata.Aug 12 2015, 8:42 PM

Could you clarify what, exactly, it is about your personality routine that requires a single exit to correspond to a single entry? Is that on a runtime level or just on a backend-lowering level? I am having trouble imagining what on a runtime level would rely on sniffing the address of the return instruction from the cleanup routine, but I am also having trouble imagining what it is about lowering that justifies such an incredibly invasive change to IR. You're introducing a type of "value" that instantly becomes a massive exception to every single standard rule about values. You can't put them in memory, you can't put them in PHIs, you can't put them in selects, you can't propagate them into or out of functions, you can't clone an instruction that contains a use, and I'm sure there are a dozen other difficulties.

This seems like a wildly general attempt at a solution, but its restrictions will probably prevent it from being useful for anything other than your EH problem, and I am very concerned that you will eventually discover that it isn't even all that useful or important for your EH problem, leaving this as an unwanted fruit simply bit-rotting on the vine.

Having support for #2 makes our job in WinEHPrepare not seriously harder, I'm OK with it. Do any of the other stakeholders see any issues?

I'm a little confused though. Won't the preparation logic color the shared region twice, causing it to be duplicated anyway? If this is just a stepping stone to cleanupcall and no other stakeholder needs it, could we strengthen the restriction later?

Could you clarify what, exactly, it is about your personality routine that requires a single exit to correspond to a single entry? Is that on a runtime level or just on a backend-lowering level? I am having trouble imagining what on a runtime level would rely on sniffing the address of the return instruction from the cleanup routine, but I am also having trouble imagining what it is about lowering that justifies such an incredibly invasive change to IR. You're introducing a type of "value" that instantly becomes a massive exception to every single standard rule about values. You can't put them in memory, you can't put them in PHIs, you can't put them in selects, you can't propagate them into or out of functions, you can't clone an instruction that contains a use, and I'm sure there are a dozen other difficulties.

There are already values which may not be PHI'd or stuck in selects: LabelTy, MetadataTy, FunctionTy
There are also values which may not be loaded or stores to in memory: opaque structures, MetadataTy, FunctionTy, LabelTy
There are also values which cannot be passed into or out of functions and instructions which may not be cloned.

This seems like a wildly general attempt at a solution, but its restrictions will probably prevent it from being useful for anything other than your EH problem, and I am very concerned that you will eventually discover that it isn't even all that useful or important for your EH problem, leaving this as an unwanted fruit simply bit-rotting on the vine.

It was, in fact, originally designed to handle a JIT deopt problem. I am merely reusing it for my EH purposes.

majnemer updated this revision to Diff 32026.Aug 12 2015, 8:58 PM
majnemer edited edge metadata.
  • Let functions return tokens
  • Only allow intrinsics to return/pass tokens around
  • Address Vedant's review comments

Could you clarify what, exactly, it is about your personality routine that requires a single exit to correspond to a single entry? Is that on a runtime level or just on a backend-lowering level? I am having trouble imagining what on a runtime level would rely on sniffing the address of the return instruction from the cleanup routine, but I am also having trouble imagining what it is about lowering that justifies such an incredibly invasive change to IR. You're introducing a type of "value" that instantly becomes a massive exception to every single standard rule about values. You can't put them in memory, you can't put them in PHIs, you can't put them in selects, you can't propagate them into or out of functions, you can't clone an instruction that contains a use, and I'm sure there are a dozen other difficulties.

There are already values which may not be PHI'd or stuck in selects: LabelTy, MetadataTy, FunctionTy
There are also values which may not be loaded or stores to in memory: opaque structures, MetadataTy, FunctionTy, LabelTy

There are no values of FunctionTy at all; I think that's true of opaque as well, but maybe you can make an undef. MetadataTy makes no pretense at all to being an ordinary value; metadata are legal in exactly two positions (operands of other metadata, and a special location on instructions). LabelTy is only legal in successor lists, and as far as I've ever been able to tell, there's not really any good reason for it exist, or indeed basic blocks to be Values at all; you could certainly implement blockaddress without that.

There are also values which cannot be passed into or out of functions

Beyond the above values?

and instructions which may not be cloned.

Not based on their uses, I don't think. In fact, I don't remember there being any instructions that can't be cloned at all — even indirectbr is only *dangerous* to clone, and only because it tends to cause catastrophic use-list scaling problems. The closest thing I can think of is that you can't move an indirectbr (or any of its blockaddress destinations) to a different function.

This seems like a wildly general attempt at a solution, but its restrictions will probably prevent it from being useful for anything other than your EH problem, and I am very concerned that you will eventually discover that it isn't even all that useful or important for your EH problem, leaving this as an unwanted fruit simply bit-rotting on the vine.

It was, in fact, originally designed to handle a JIT deopt problem. I am merely reusing it for my EH purposes.

This proposal is still under design; it's changed multiple times in this very thread. It's at least not obvious to me that it still satisfies everyone else's needs with these changes, because I don't understand anyone else's needs, because they seem to me to be very hand-wavey.

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

Could you clarify what, exactly, it is about your personality routine that requires a single exit to correspond to a single entry? Is that on a runtime level or just on a backend-lowering level? I am having trouble imagining what on a runtime level would rely on sniffing the address of the return instruction from the cleanup routine, but I am also having trouble imagining what it is about lowering that justifies such an incredibly invasive change to IR. You're introducing a type of "value" that instantly becomes a massive exception to every single standard rule about values. You can't put them in memory, you can't put them in PHIs, you can't put them in selects, you can't propagate them into or out of functions, you can't clone an instruction that contains a use, and I'm sure there are a dozen other difficulties.

There are already values which may not be PHI'd or stuck in selects: LabelTy, MetadataTy, FunctionTy
There are also values which may not be loaded or stores to in memory: opaque structures, MetadataTy, FunctionTy, LabelTy

There are no values of FunctionTy at all; I think that's true of opaque as well, but maybe you can make an undef.

opaque can be inside of aggregates, function types can have opaque as their return type.

MetadataTy makes no pretense at all to being an ordinary value; metadata are legal in exactly two positions (operands of other metadata, and a special location on instructions). LabelTy is only legal in successor lists, and as far as I've ever been able to tell, there's not really any good reason for it exist, or indeed basic blocks to be Values at all; you could certainly implement blockaddress without that.

LabelTy can actually be used in places outside of successor lists and blockaddress. However, things go badly if the BasicBlock referred to be the label becomes unreachable as we expect all the non-successor uses to be blockaddress.

There are also values which cannot be passed into or out of functions

Beyond the above values?

and instructions which may not be cloned.

Not based on their uses, I don't think. In fact, I don't remember there being any instructions that can't be cloned at all — even indirectbr is only *dangerous* to clone, and only because it tends to cause catastrophic use-list scaling problems. The closest thing I can think of is that you can't move an indirectbr (or any of its blockaddress destinations) to a different function.

The noduplicate function attribute imbues calls and invokes may not be cloned: http://llvm.org/docs/LangRef.html#function-attributes

This seems like a wildly general attempt at a solution, but its restrictions will probably prevent it from being useful for anything other than your EH problem, and I am very concerned that you will eventually discover that it isn't even all that useful or important for your EH problem, leaving this as an unwanted fruit simply bit-rotting on the vine.

It was, in fact, originally designed to handle a JIT deopt problem. I am merely reusing it for my EH purposes.

This proposal is still under design; it's changed multiple times in this very thread. It's at least not obvious to me that it still satisfies everyone else's needs with these changes, because I don't understand anyone else's needs, because they seem to me to be very hand-wavey.

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

My cleanup is materialized as a funclet which, by definition, has a single entry point. My personality routine has a restriction which requires that a cleanup transfer control to a single, specific, successor. This restriction comes about by the format of the unwind information that my personality routine uses to control its behavior, the cleanup has no mechanism to communicate where it would like to go next. I am in trouble if I have a cleanuppad which can reach two cleanuprets with different successors. By having my cleanuppad produce a value which may not be obscured and consume it in my cleanupret, I know that I can disregard other cleanupret instructions which appear reachable in the CFG but are dynamically unreachable.

In all fairness, token type doesn't preclude the possibility of a different solution for WinEH.
It might be less odious to encode the funclet property in the BasicBlocks themselves. For example, we could state that some BBs are part a funclet. Then we would only need to consider cleanuppad instructions which lead to cleanupret instructions in the same funclet (with a verifier rule that all such cleanupret instructions have the same successor).

I'm a little confused though. Won't the preparation logic color the shared region twice, causing it to be duplicated anyway? If this is just a stepping stone to cleanupcall and no other stakeholder needs it, could we strengthen the restriction later?

Yes. I think that the phi restriction and the need to clone in EH prep are very closely related; if you start with a cannot-be-PHIed def at the handler's single entry and a cannot-be-obscured use of it at every exit from the handler, then I don't see any way that a transformation could introduce a side-entry to that handler, because doing so would break the dominance relation between the entry and one of the exits, requiring an illegal PHI.

Supposing I'm right about that, and supposing we ultimately want the stricter PHI restriction, then since Clang duplicates cleanups and so only generates single-entry handlers, and since ultimately LLILC wants to use cleanupcall and so only generate single-entry handlers, it follows that cloning in EH preparation is just a stepping stone to cleanupcall. If that's the case, I'd rather spend cycles on cleanupcall than cloning so we get to our ultimate destination faster and without subtle/difficult-to-reason-about stepping-stone constructs along the way.

So that brings us to the question: do we ultimately want the stricter PHI restriction? I think the argument in favor of the stricter restriction is that it is simpler. I think the argument in favor of the looser restriction is that it allows transformations and IR generators more freedom, while still preserving the essential property that a token use can unambiguously be resolved to a single token def by walking the SSA graph.

I actually don't know whether we want that freedom for the EH case or not (where by 'we' I mean 'the community'; for LLILC specifically, we ultimately want cleanupcall and don't need this freedom). It allows code commoming across handlers, but since EH prep will necessarily undo such commoning (now that we've given up on using the new representation on all targets and translating it to landingpads in Dwarf EH Prep) that will never be beneficial from a CQ perspective, merely beneficial in the sense that the authors of these (hypothetical?) code commoning transformations won't have to check whether there are crossing token liveranges and abort if so.
I know that it used to be a design goal that "EH preparation should not assume that each basic block lives in exactly one funclet" [1], but I don't know if that's changed. I think the stricter restriction precludes that goal.

[1]: http://article.gmane.org/gmane.comp.compilers.llvm.devel/85915

opaque can be inside of aggregates, function types can have opaque as their return type.

Okay, but the corresponding aggregate/function type isn't usable as an aggregate/function: you can't GEP into the aggregate, and you can't call the function. My point is that arbitrary code does not have to worry about these types existing.

LabelTy can actually be used in places outside of successor lists and blockaddress. However, things go badly if the BasicBlock referred to be the label becomes unreachable as we expect all the non-successor uses to be blockaddress.

So, to me, that sounds like you *can't* use LabelTy in other places, but we just haven't formalized that restriction and enforced it in the verifier.

Not based on their uses, I don't think. In fact, I don't remember there being any instructions that can't be cloned at all — even indirectbr is only *dangerous* to clone, and only because it tends to cause catastrophic use-list scaling problems. The closest thing I can think of is that you can't move an indirectbr (or any of its blockaddress destinations) to a different function.

The noduplicate function attribute imbues calls and invokes may not be cloned: http://llvm.org/docs/LangRef.html#function-attributes

Ah, interesting.

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

My cleanup is materialized as a funclet which, by definition, has a single entry point. My personality routine has a restriction which requires that a cleanup transfer control to a single, specific, successor. This restriction comes about by the format of the unwind information that my personality routine uses to control its behavior, the cleanup has no mechanism to communicate where it would like to go next. I am in trouble if I have a cleanuppad which can reach two cleanuprets with different successors. By having my cleanuppad produce a value which may not be obscured and consume it in my cleanupret, I know that I can disregard other cleanupret instructions which appear reachable in the CFG but are dynamically unreachable.

What happens if the cleanupret becomes unreachable? Also, it doesn't sound like you really need a restriction that the token can't be used in multiple places: if someone cloned the cleanupret, you could make that work pretty easily by just creating multiple returns from your funclet.

Also, won't being able to use tokens as parameters and returns completely break your analysis? I mean, I can't imagine what sort of transformation would actually spontaneously extract a function like that, but if you were willing to rely on statements like that, you wouldn't need token types at all, because no plausible transformation is going to spontaneously insert intermediate SSA defs, either.

opaque can be inside of aggregates, function types can have opaque as their return type.

Okay, but the corresponding aggregate/function type isn't usable as an aggregate/function: you can't GEP into the aggregate, and you can't call the function. My point is that arbitrary code does not have to worry about these types existing.

The following has Value * which cannot be demoted, it passes the verifier:

%X = type opaque

declare %X @g()

declare %X @h()

declare void @i(%X)

define %X @f(i1 %A) {
entry:
  br i1 %A, label %bb1, label %bb2

bb1:                                              ; preds = %entry
  %V1 = call %X @h()
  br label %bb

bb2:                                              ; preds = %entry
  %V2 = call %X @g()
  br label %bb

bb:                                               ; preds = %bb2, %bb1
  %P = phi %X [ %V1, %bb1 ], [ %V2, %bb2 ]
  ret %X %P
}

LabelTy can actually be used in places outside of successor lists and blockaddress. However, things go badly if the BasicBlock referred to be the label becomes unreachable as we expect all the non-successor uses to be blockaddress.

So, to me, that sounds like you *can't* use LabelTy in other places, but we just haven't formalized that restriction and enforced it in the verifier.

That's a fair assessment.

Not based on their uses, I don't think. In fact, I don't remember there being any instructions that can't be cloned at all — even indirectbr is only *dangerous* to clone, and only because it tends to cause catastrophic use-list scaling problems. The closest thing I can think of is that you can't move an indirectbr (or any of its blockaddress destinations) to a different function.

The noduplicate function attribute imbues calls and invokes may not be cloned: http://llvm.org/docs/LangRef.html#function-attributes

Ah, interesting.

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

My cleanup is materialized as a funclet which, by definition, has a single entry point. My personality routine has a restriction which requires that a cleanup transfer control to a single, specific, successor. This restriction comes about by the format of the unwind information that my personality routine uses to control its behavior, the cleanup has no mechanism to communicate where it would like to go next. I am in trouble if I have a cleanuppad which can reach two cleanuprets with different successors. By having my cleanuppad produce a value which may not be obscured and consume it in my cleanupret, I know that I can disregard other cleanupret instructions which appear reachable in the CFG but are dynamically unreachable.

What happens if the cleanupret becomes unreachable?

Then we are left with a cleanuppad whose Value * is unused and the cleanupret would be deleted. This should be fine.

Also, it doesn't sound like you really need a restriction that the token can't be used in multiple places: if someone cloned the cleanupret, you could make that work pretty easily by just creating multiple returns from your funclet.

I can have multiple returns from my funclet, they just all have to go to the same place.

A token value can be used in multiple places, the users can be cloned just fine. What cannot be cloned is the instruction which produces the value as any phi nodes would be problematic.

Also, won't being able to use tokens as parameters and returns completely break your analysis? I mean, I can't imagine what sort of transformation would actually spontaneously extract a function like that, but if you were willing to rely on statements like that, you wouldn't need token types at all, because no plausible transformation is going to spontaneously insert intermediate SSA defs, either.

I have a verifier rule against parameters or returns of token type except when inside of intrinsics. Intrinsics, for obvious reasons, will have to make their token semantics quite clear.

The following has Value * which cannot be demoted, it passes the verifier:

I assume it crashes all the backends, though. Still, interesting; do you know if that's intentional?

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

My cleanup is materialized as a funclet which, by definition, has a single entry point. My personality routine has a restriction which requires that a cleanup transfer control to a single, specific, successor. This restriction comes about by the format of the unwind information that my personality routine uses to control its behavior, the cleanup has no mechanism to communicate where it would like to go next. I am in trouble if I have a cleanuppad which can reach two cleanuprets with different successors. By having my cleanuppad produce a value which may not be obscured and consume it in my cleanupret, I know that I can disregard other cleanupret instructions which appear reachable in the CFG but are dynamically unreachable.

What happens if the cleanupret becomes unreachable?

Then we are left with a cleanuppad whose Value * is unused and the cleanupret would be deleted. This should be fine.

Also, it doesn't sound like you really need a restriction that the token can't be used in multiple places: if someone cloned the cleanupret, you could make that work pretty easily by just creating multiple returns from your funclet.

I can have multiple returns from my funclet, they just all have to go to the same place.

A token value can be used in multiple places, the users can be cloned just fine. What cannot be cloned is the instruction which produces the value as any phi nodes would be problematic.

Okay. But cloning the instruction would be fine if you cloned all of the uses as well?

Also, won't being able to use tokens as parameters and returns completely break your analysis? I mean, I can't imagine what sort of transformation would actually spontaneously extract a function like that, but if you were willing to rely on statements like that, you wouldn't need token types at all, because no plausible transformation is going to spontaneously insert intermediate SSA defs, either.

I have a verifier rule against parameters or returns of token type except when inside of intrinsics. Intrinsics, for obvious reasons, will have to make their token semantics quite clear.

Makes sense.

Okay, I may have just have misunderstood some of the restrictions you were proposing. I have no further objections.

The following has Value * which cannot be demoted, it passes the verifier:

I assume it crashes all the backends, though.

Oh, for sure :)

Still, interesting; do you know if that's intentional?

No idea. Pointers to opaque make sense but we forbid loading them.
The only real use-case that I can imagine is for special, magick, intrinsics.

Anyway, I am not deeply objecting to the feature; I would just like to understand why you need it for EH.

My cleanup is materialized as a funclet which, by definition, has a single entry point. My personality routine has a restriction which requires that a cleanup transfer control to a single, specific, successor. This restriction comes about by the format of the unwind information that my personality routine uses to control its behavior, the cleanup has no mechanism to communicate where it would like to go next. I am in trouble if I have a cleanuppad which can reach two cleanuprets with different successors. By having my cleanuppad produce a value which may not be obscured and consume it in my cleanupret, I know that I can disregard other cleanupret instructions which appear reachable in the CFG but are dynamically unreachable.

What happens if the cleanupret becomes unreachable?

Then we are left with a cleanuppad whose Value * is unused and the cleanupret would be deleted. This should be fine.

Also, it doesn't sound like you really need a restriction that the token can't be used in multiple places: if someone cloned the cleanupret, you could make that work pretty easily by just creating multiple returns from your funclet.

I can have multiple returns from my funclet, they just all have to go to the same place.

A token value can be used in multiple places, the users can be cloned just fine. What cannot be cloned is the instruction which produces the value as any phi nodes would be problematic.

Okay. But cloning the instruction would be fine if you cloned all of the uses as well?

Yes, it is totally fine to clone a BasicBlock wholesale where the token producer feeds into the user.

Also, won't being able to use tokens as parameters and returns completely break your analysis? I mean, I can't imagine what sort of transformation would actually spontaneously extract a function like that, but if you were willing to rely on statements like that, you wouldn't need token types at all, because no plausible transformation is going to spontaneously insert intermediate SSA defs, either.

I have a verifier rule against parameters or returns of token type except when inside of intrinsics. Intrinsics, for obvious reasons, will have to make their token semantics quite clear.

Makes sense.

Okay, I may have just have misunderstood some of the restrictions you were proposing. I have no further objections.

Great, thanks!

This revision was automatically updated to reflect the committed changes.
Herald added a reviewer: ctetreau. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: jdoerfert. · View Herald Transcript