In AAIsDeadFunction, don't include AliveSuccessors that are UB.
Diff Detail
Event Timeline
I think we need to query AAUB from AAIsDead in the updateImpl. See the TODO in AAIsDeadFunction::updateImpl(Attributor &A). Similarly, in AAUB we should go through the explorer context when we add something to the knownUB set. So if I is knownUB, make all instructions in the must-be-executed-context of I knownUB, thus insert all into the set. In AAIsDead we can then simply query isKnownUB and that will only need to look into the set (as before). In addition to the TODO mentioned before we need the same logic in the initialize from AAIsDeadFunction before we add the entry block as live.
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
2170 | We should avoid caching stuff in the AAs. isKnownToCauseUB can take an Attributor reference. | |
3118 | As mentioned above, you can add the Attributor as a argument to calls. |
I didn't see that at all, thanks! I'll give it a try.
Similarly, in AAUB we should go through the explorer context when we add something to the knownUB set. So if I is knownUB, make all instructions in the must-be-executed-context of I knownUB, thus insert all into the set. In AAIsDead we can then simply query isKnownUB and that will only need to look into the set (as before).
If I understand it correctly, this will mark as UB all the instructions from the UB instruction and forwards (i.e. the must be executed context goes to the successors). That will probably complicate things as to see if a BB is dead, with the current code, it makes sense to see its first instruction right?
In addition to the TODO mentioned before we need the same logic in the initialize from AAIsDeadFunction before we add the entry block as live.
Oh yes, I had added and removed that basically because that I think can't work with assumed info easily. I'll add it with known info to make clear what I mean.
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3118 | Ok, thanks, I'll do it that way. |
- Query AAUB in identifyAliveSuccessors() for BranchInst.
This is essentially a proposed PoC on how to use AAUB in AAIsDeadFunction::updateImpl(). The TODO there says "look for (assumed) UB to backwards propagate "deadness"".
In my understanding that seems kind of unorthodox since the whole process is moving forwards (with an "artificial" recursion so it's not easy to "go back" since we never actually return back). Because of that, I thought it may be better instead of that to just "not go forwards" on successors (and BBs) that have UB (not going forwards into them means they're not put in AssumedLiveBlocks, effectively keeping them dead).
- Passed the attributor in isAssumedDead() etc. (that resulted in changes all over the place, I hope that's ok).
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3112–3115 | With the known info it's easy because we don't need to remember that we were based on assumed info (in the updateImpl()). But now that I see that again, it probably is as simple as inserting Front in the ToBeExploredFrom (always) and only make the EntryBlock live if it is not assumed to cause UB. |
The must-be-executed-context is not defined to only contain "successors". It might right now but that will change eventually.
I think we should stick to known information right now.
I still believe this is more complex and less general than the approach I tried to describe. Maybe the following is a good compromise in the direction I think we should going but working already right now:
- Since the must-be-executed-context is not collecting predecessors yet, we need to change that. The code actually exists already, it just needs to be separated from some other improvements and put for review again. I'll look into that (or @uenoku you can if you want to).
- Collecting instructions that are executed always with one which causes UB in the "knownUB" set seems logical to me. After all, their execution is "eventually" leading to UB. Having them in the set gives us a single consistent way to check instead of iterating over must-be-executed contexts all the time. We can keep the context loop in isKnownUB for now but we should eventually not do that (thus we should add a TODO).
- During the liveness exploration we should check if the beginning of a block is known to cause UB, if so, we do not make it live. We can even check on the instruction level if we want to later. For now, we should be able to do all this in the assumeLive method by "not assuming it is live" if it leads to UB.
Are no tests affected?
Ok, isAssumedToCauseUB() is overoptimistic right now.
I still believe this is more complex and less general than the approach I tried to describe. Maybe the following is a good compromise in the direction I think we should going but working already right now:
- Since the must-be-executed-context is not collecting predecessors yet, we need to change that. The code actually exists already, it just needs to be separated from some other improvements and put for review again. I'll look into that (or @uenoku you can if you want to).
Great, I didn't know this was planned.
- Collecting instructions that are executed always with one which causes UB in the "knownUB" set seems logical to me. After all, their execution is "eventually" leading to UB. Having them in the set gives us a single consistent way to check instead of iterating over must-be-executed contexts all the time. We can keep the context loop in isKnownUB for now but we should eventually not do that (thus we should add a TODO).
I surely agree. My comment was relative to the current MustBeExecutedContext. That is, since it currently only looks successors (and I didn't know it was planned to change), it would be difficult to add the right predecessors to the set when we found a UB instruction.
And that would then make more difficult the fact that to connect it with AAIsDead, I was looking at the start of the block (which could be predecessor of a UB instruction). Now it's clear, thanks!
- During the liveness exploration we should check if the beginning of a block is known to cause UB, if so, we do not make it live. We can even check on the instruction level if we want to later. For now, we should be able to do all this in the assumeLive method by "not assuming it is live" if it leads to UB.
This is sort of what I was going for (if you see the identifyAliveSuccessors for BranchInst, it marks alive only BBs whose first instruction is not UB, hence the whole forwards vs backwards etc.). But as you said, I made it too complicated as instead
of modifying all identifyAliveSuccessors() etc. I could just put the code in assumeLive() which I will now do. :)
Are no tests affected?
A lot, I'll upload in the next diff.
EDIT:
I think we should stick to known information right now.
There's one problem with that when I tried it in assumeLive(). It queries AAUB and blocks that are not yet known to cause UB are assumed live and are never re-tested.
There's one problem with that when I tried it in assumeLive(). It queries AAUB and blocks that are not yet known to cause UB are assumed live and are never re-tested.
I see. Now this becomes a little more involved. What we can do is to call AAUB::isAssumedToCauseUB(BB.front()). Since we haven't visited BB yet, we will say true if there are instructions in there that might cause UB.
Now, that will require us to keep track of the instruction that transferred control to BB as we have to ask AAUB again. Not in assumeLive but in the updateImpl, before the the if(UsedAssumedInformation) check, we can filter the AliveSuccessors based on isAssumedToCauseUB(BB). If we filtered a block we need to set UsedAssumedInformation to true so we will revisit that instruction in the future. If isAssumedToCauseUB(BB) is true and isKnownToCauseUB(BB) as well, we can even remove the block without setting UsedAssumedInformation.
Thx. There is a prototype patch out there that you can/should use as a starting point.
Thanks for your suggestion Johannes, fortunately that was an easy change. But it also impacts about 26 tests or so, which I won't be able to fix today.
However now the interaction between AAUB and AAIsDead becomes interesting. For example, for this code:
define internal i1 @ret_undef() { ret i1 undef } define void @cond_br_on_undef_interproc() { %cond = call i1 @ret_undef() br i1 %cond, label %t, label %e t: ret void e: ret void }
the Attributor leaves... you guessed it, nothing. :)
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 | I think something like make_filter_range from llvm/include/llvm/ADT/STLExtras.h could make this nicer. Or maybe we just need to put it in a helper function, or both. |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 | Yes, using a helper would be better. make_filter_range seems cool, I didn't know it. TBH though, I think it gives less control and predictability. It's not as clear to me as in the current what happens under the hood (apart from the fact that we'll query 2 times the AAUB). Also, FWIW, the current will have a worst case of one allocation and one free and an average case of no allocation / free (assuming that most instructions have less than 8 successors). |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 | Btw, I forgot to explain the AssumedLiveBlocks.erase(BB) part. Basically, if an alive successor is UB and its parent BB has been put into AssumedLiveBlocks, then we should mark that BB dead by removing it. That however seems bad to me. Up to now, we only inserted to it and it feels that something can go badly if we start removing stuff (like, go to an endless loop or the not having a monotone procedure). |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 | I haven't read the code in detail, some observations: if (AAUB.isKnownToCauseUB(AliveSuccessor) || (Assumed = AAUB.isAssumedToCauseUB(AliveSuccessor))) { Assumed is always true or uninitialized here. Reading it after results in true or UB. What you want is to check assumed and if true check known to determine if you used assumed information. No need for count if you call erase. Just erase/insert stuff, the return value even tells you if it was in before. I would like to understand the situation in which we think a block is UB *only after* it was put in AssumedLiveBlocks. We should not erase it but add an assertion so we can see if it triggers on the test. I hope it does not (and we can keep the assertion) as that would mean it works properly. |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 |
Yes, it should be initialized to false. Well, the reason the code has been written in this (maybe weird) way is because if Known returns true, we don't have to call assumed (which we will do if we always call assumed first). I think that if we just initialize to false it will be ok.
Ty, I didn't know that.
A simple way this can happen is with that: define void @cond_br_on_undef_interproc() { br i1 undef, label %t, label %e t: ret void e: ret void } Note that br i1 undef, ... is in the entry block, which is assumed live in initialize() based on known info (and of course, at this point, it's not yet known it causes UB). So, this becomes: define void @cond_br_on_undef_interproc() { unreachable t: unreachable e: unreachable } which is ok I guess, but we could have deleted the whole function (which we didn't since the block is still alive). |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 | Nit: Actually, we can remove Assumed completely and use UsedAssumedInformation which will probably make the code maybe too smart for our own good. :P |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 |
We need to apply the same logic (via the helper) in the initialize.
That is forbidden.
I think the logic in the initialize will allow us to place an assertion here.
Variables do not cost anything. Make the core readable first. |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 |
Sorry, I didn't mean to say "go" as "branch to". I meant when we add other instructions to the entry block.
Sorry, I didn't get that. Could you please specify what logic?
Unfortunately, I think that the fact that isAssumedToCauseUB() is over-optimistic will make all this more and more complicated. define i32 @example(i1* %alloc) { %cond = load i1, i1* %alloc br i1 %cond, label %t, label %e t: ret i32 1 e: ret i32 2 } (and assuming we're having the current thing in initialize()).
This means that the br never gets live and this procedure continues and we're left with: define i32 @cond_br_on_undef_uninit(i1* nocapture nofree nonnull readonly dereferenceable(1) %alloc) #0 { %cond = load i1, i1* %alloc br i1 %cond, label %t, label %e t: ; preds = %0 unreachable e: ; preds = %0 unreachable } |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 |
The same as you apply now to filter "alive successors". We should not mark entry live in the initilize if AAUB::isAssumedUB(EntryBB) is true. We probably need to redefine KnownDeadEnds such that we can add the entry block first instruction during initilize and it will not be assumed live. That will require some changes but should be possible. Let's talke that next year ;) |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
3424 |
Yes, happy new year. :) |
Sorry for being late, this is an exam period so time is very limited. I saw again the problem and it changed a bit but it's similar. To be sure we're on the same page:
In this:
define i32 @example(i1* %alloc) { %cond = load i1, i1* %alloc br i1 %cond, label %t, label %e t: ret i32 1 e: ret i32 2 }
The block of br initially is assumed live, but AAIsDeadFunction::updateImpl() is called, which, because the br is assumed UB, removes the block from the AssumedLiveBlocks.
AAUB looks only live instructions and the br now will never be live (also maybe the fact that we're removing - and it's the only place - messes with the monotony).
I think one solution is to only remove a block if it is known UB.
No problem. I just rebased and at first glance, the problem now does not exist, meaning the @example is output as is.
But a lot of things have changed in the Attributor so certainly I don't understand what is happening. I'll check again tomorrow.
I've been looking today again at this and it's the first time in the Attributor that I don't understand what is happening. This diff contains some prints and AAIsDeadFunction::manifest() is never called. I'll continue looking into this but any help is appreciated, I have lost episodes.
- Rebased with AAUB not manifesting
In this diff, note that I have commented the manifesting in AAUB, so that, for now, we can test UB blocks are deleted because of AAIsDead.
Btw, may I throw the idea that this be done in the end ? That is, blocks be deleted only because of AAIsDead.
Anyway, here are some examples:
define i32 @cond_br_on_undef_interproc(i1 %cond) { br i1 %cond, label %t, label %e t: %a = load i32, i32* null ret i32 1 e: ret i32 2 }
In here, happens what we expect. The entry block is marked live initially and then in AAIsDeadFunction::updateImpl() the block with the UB is removed from alive blocks.
When the manifest phase in Attributor::run() comes, AAIsDeadFunction has 2/3 live blocks, so it is considered live (the AA) and its manifest ultimately deletes the block.
define i32 @example(i1* %alloc) { %cond = load i1, i1* %alloc br i1 %cond, label %t, label %e t: ret i32 1 e: ret i32 2 }
Here, isAssumedToCauseUB() is overoptimistic and that means that the branch is assumed UB and its block is removed in AAIsDeadFunction::updateImpl().
The catch is that although assumed information is used, AAUB never corrects that since it looks at live instructions for UB.
That's a problem in and of itself but that was happening before. The weird thing is that in the manifest stage of run, AAIsDeadFunction is considered
dead because the entry point of the function is considered dead. Shouldn't that mean that the function has to be deleted?
Ping for when you have time. :)
As a note, I think that we shouldn't delete a block based on assumed info. Because then AAUB can never correct it.
This does not completely remove the problems but it should lead to a better path.
A lot of tests fail but I'm not sure if it makes sense to include any tests that test the new behavior yet. Do you want invalid tests?
For example, this now:
define i32 @example(i1* %alloc) { %cond = load i1, i1* %alloc br i1 %cond, label %t, label %e t: ret i32 1 e: ret i32 2 }
results in:
define i32 @example(i1* nocapture nofree nonnull readnone dereferenceable(1) %alloc) #0 { br i1 undef, label %t, label %e t: ; preds = %0 unreachable e: ; preds = %0 unreachable }
And again, as far as I can understand, removing something from alive successors means it isn't getting updated by the AAUB because AAUB never sees it (as it is dead).
Honestly, unfortunately I can't see how this connection fits into the whole picture given the current structure.
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
2285 | No, I think a previous commit states that I put that to test only what AAIsDead deletes. I removed it now. | |
2322 | Yes sorry, that was a leftover. | |
3418 | I agree, as I had stated in a previous comment:
which as far as I can understand, still holds. |
We should avoid caching stuff in the AAs. isKnownToCauseUB can take an Attributor reference.