Details
Diff Detail
Event Timeline
llvm/lib/Transforms/IPO/Attributor.cpp | ||
---|---|---|
616 | Isn't this almost the stripAndAccumulate code? How hard would it be to pass some kind of AttributorInfo object to that function (optionally) which triggers the lookup. Or more generically, a callback that takes a Value and returns a lower bound offset (plus indicates success/failure). | |
llvm/test/Transforms/Attributor/dereferenceable-1.ll | ||
234 | Isn't this the fill_range code? |
This patch is not complete yet.
Currently it can only use the known information. I would like to make it possible for it to use the assumed information as well.
The problem with that is:
We need to iterate over the uses and take the maximum of the current dereferancability that is indicated by that use.
this works fine for the known info. But since the lower bound of the assumed constant range can decrease over time,
taking max with existing state is a problem.
The way that the AAFromMustBeExecutedContext is wired makes things a bit tricky and
I would like to get your opinions on this before I proceed.
Can you explain why we need to catch overflows now but not before? I mean, the values determine by the external analysis are valid lower bounds, what is different from versioning on them and making them constant in one of the versions.
I think we should finish this one and then do the following:
We actually don't want the minimal offset but the maximal known offset, right? In addition to renaming the function we will check if the range from AAConstantRange is known, if so, we take the maximum instead.
llvm/include/llvm/IR/Operator.h | ||
---|---|---|
563 | = nullptr is not sufficient? unfortunate. Please describe what external analysis exactly does here. I like the solution of adding this callback though. We need more of these soon. | |
llvm/lib/IR/Value.cpp | ||
644 | Put the entire thing in an if (ExternalAnalysis) please. Or better !ExternaAnalysis then old code, else this code. | |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
552 | Style: UseAssumed, Range, Value, etc. |
llvm/test/Transforms/Attributor/dereferenceable-1.ll | ||
---|---|---|
214 | OK, I will. For us to be able to deduce a 30, there would have to be separate mechanism that deduces the biggest number Perhaps we could use SCEV ? | |
234 | No it is not. Currently this patch does not work with for loops. This is because with for loops AAFromMustBeExecutedContext looks at the branch at the top; This is probably the reason why only a single test is affected by this patch. I will address this issue with a separate patch that passes LoopInfo, DominatorTree and the PostDominatorTree |
void internal test(int index, char *ptrArg) { ptrArg[index] = 'A'; } test(10, ptrA); test(30, ptrB); test(40, ptrC);
Based on call sites the index argument of the test function would be in range [10, 40] right ?
but marking ptrArg dereferancable(41) would be wrong woudn't it ?
The reason that I made the stripAndAccumulateConstantOffsets overflow aware is that the range can be lower than what it can be in reality.
I thought that these differences can result in unintended underflows.
For accumulateConstantOffset, I think for non inbound GEP's overflows should be ok.
but for stripAndAccumulateConstantOffsets bailing out should probably be the default behaviour.
Correct. First though was: we need something special for the loop case so we know we reach the upper bound. After all, that is probably the most important case to get.
The reason that I made the stripAndAccumulateConstantOffsets overflow aware is that the range can be lower than what it can be in reality.
I thought that these differences can result in unintended underflows.
I see. Agreed.
For accumulateConstantOffset, I think for non inbound GEP's overflows should be ok.
but for stripAndAccumulateConstantOffsets bailing out should probably be the default behaviour.
OK.
llvm/test/Transforms/Attributor/dereferenceable-1.ll | ||
---|---|---|
234 | Right. That reminds me of D64974, though I'm unsure if we still need it. What we need is to make the explorer aware of CanProveNotTakenFirstIteration. Similar to the use of that function that exists, we can go to the non-exit block from a header in getMustBeExecutedNextInstruction (or findForwardJoinPoint) if we know that edge is taken at least once. To not loose the code after the loop we should add a stack of unexplored edges. The exit edge goes there if the loop is known not to be endless (see findForwardJoinPoint). If we are out of forward instructions we can pop an edge from the stack and continue. Alternatively, we could check if the loop was not endless when we visit the header for the second time. |
llvm/test/Transforms/Attributor/willreturn.ll | ||
---|---|---|
364 | Why can't we get dereferenceable(4) for p in this case? |
llvm/test/Transforms/Attributor/willreturn.ll | ||
---|---|---|
364 | Ok for this example specifically, the variable n can be negative so the ans +=p[n] may never be executed. But in general, this patch doesn't work with loops that are not do {} while (cond) This is because AAFromMustBeExecutedContext is not aware when the first iteration is always going to be ran. This needs to be addressed with a separate path and would probably improve many other deductions as well. |
llvm/test/Transforms/Attributor/willreturn.ll | ||
---|---|---|
364 | IIUC, loop rotation can help here because it provides this guarantee. |
llvm/test/Transforms/Attributor/willreturn.ll | ||
---|---|---|
364 | Oh, I thought %n is assumed to be positive:) Thanks. |
llvm/test/Transforms/Attributor/dereferenceable-1.ll | ||
---|---|---|
214 |
that the "Value" in question is going to be guaranteed to take.
I think so, what I thought once is: |
This is because AAFromMustBeExecutedContext is not aware when the first iteration is always going to be ran.
This needs to be addressed with a separate path and would probably improve many other deductions as well.
Agreed and agreed.
Perhaps we could use SCEV ?
[...]
We could. What we want (regardless of how) is to track in AAConstantRange (conceptually) two ranges: (1) potential value range, (2) known value range. This is not the same as the assumed/known we track now. Right now bot track the potential value range, that is what value can this llvm::Value potentially have at runtime. I think tracking the second, thus, what value is this llvm::Value going to have at runtime (if it is executed) could be tracked in the same AA or a different one, depending on how much logic we can share. This is also a follow up patch (or two).
I'll have to go through the logic in this patch tomorrow (=rested) but I think it looks pretty good.
Agreed.
Currently AAConstantRange tracks the range that a "llvm::value" is guaranteed to be within in runtime.
But for loops, this is results in information loss.
So I think what you suggest is that we track a specific range for loop like situations.
for an actual loop this would be, the first and the last index of the counter.
and when we are propagating this range from phi nodes, select instructions, call sites (of internal functions) we would intersect them together.
For example:
void test_use(char *ptrA, char *ptrB) { for (int i = 10; i < 100; i++) { test(ptrA, i); //i is in [10, 99] loop range. } for (int i = 5; i < 150; i++) { test(ptrB, i); //i is in [5, 149] loop range. } } //i is in [10, 99] loop range //ptr is dereferencable(100) void internal test(char *ptr, int i) { ptr[i] = 'A'; }
We should finish this review and then focus on the next steps. I added more comments but I think the general logic is fine.
(Next step would be to rename the range in AAConstantRange to PossibleRange, or similar, and add a second range, e.g., ObservedRange which we need to deduce. We can then use the max(PossibleRange.minimum(), ObservedRange.maximum()) for dereferenceable deduction.
llvm/lib/IR/Operator.cpp | ||
---|---|---|
79 | Style: I think it would be easier to read if you split the above if-else-cascade and use early exits: MinimalIndex = ConstOffset->getValue(); continue; } // The operand is not constant, check if an external analysis was provided. if (!ExternalAnalysis) return false; Do we need to track overflow *before* we use an external analysis? If not we don't need 75/76. Below we can than use UsedExternalAnalysis in 88 and remove it from 95. We should also be able to use a single overflowed flag. | |
99 | Put the simple case first, maybe add a continue to avoid the "else" | |
llvm/lib/IR/Value.cpp | ||
649 | Simple case first please. | |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
552 | Some names are still starting with a lower case letter. Do we have a test where the assumed range minimum is negative? | |
3749 | Merge in a single debug message and line. | |
3803 | No need for the newline in the beginning. TBH, the entire message doesn't help much given that we see the update debug message already. |
Eliminated redundant debug messages, Style fixes, Added negative test case,
Don't use external analysis if the get operand is a struct type.
llvm/lib/IR/Operator.cpp | ||
---|---|---|
79 | We probably don't need to detect a overflow before the use of external analysis I was just being safe. I am not sure what you mean by the early exit change. So the value of the ConstantOffset needs to pass through something that would detect a overflow if ExternalAnalysis is present and it has been used. We could do it with a lambda like this: AddArrayIndex(ConstOffset->getValue()); continue; } Also do we really need to have the old and the new code in this style: if (!External Analysis) { //old } else { //new } New code shouldn't really behave any different other than detecting overflows/underflows. |
llvm/lib/IR/Operator.cpp | ||
---|---|---|
79 |
|
Two more minor comments, other than that the code looks good. Please update so I can commit it.
llvm/lib/IR/Operator.cpp | ||
---|---|---|
60 | Nit: Don't call it MinimalIndex as there might be use cases that over-approximate the number. Index is just fine I think. | |
llvm/lib/Transforms/IPO/Attributor.cpp | ||
566 | Nit: AllowNonInbounds and `Value above. Check other variable names as well. |
Apologies for the delay, can you rebase this and provide me with "Firstname Lastname <email>" from you so I can attribute it to you?
Rebased.
Small logic change in GEPOperator::accumulateConstantOffset to bailout on scalable vector types
except for when the offset is zero.
Not allowing zero breaks @test_accumulate_constant_offset_vscale_zero
see https://reviews.llvm.org/rGef64ba831194c7deac8882a325ea9bea64eb612a
= nullptr is not sufficient? unfortunate.
Nit: -value +Offset, or remove the names.
Please describe what external analysis exactly does here.
I like the solution of adding this callback though. We need more of these soon.