Previously the RedoInsts was processed at the end of the block.
However it was possible that it left behind some instructions that
were not canonicalized.
This should guarantee that any previous instruction in the basic
block is canonicalized before we process a new instruction.
Details
- Reviewers
majnemer
Diff Detail
Event Timeline
Hi David,
Let’s consider:
%_0 = add i32 %in, 1 %_1 = mul i32 %in, -2 %_2 = add i32 %_0, %_1 %_3 = add i32 %_1, %_2 %_4 = add i32 %_3, 1 %_5 = add i32 %in, %_3 ret i32 %_5
Now when processing %3, the expression is:
"%1 + %in + 1 + %1”
and it is factorized to:
“%in + 1 + ( 2 * %1)”
the IR becomes:
%_0 = add i32 %in, 1 %_1 = mul i32 %in, -2 %factor = mul i32 %_1, 2 %_2 = add i32 %in, 1 %_3 = add i32 %_2, %factor %_4 = add i32 %_3, 1 %_5 = add i32 %in, %_3 ret i32 %_5
And %factor was added to the RedoInsts list because we known it might not be canonicalized.
And indeed in this case it is 2*(%1*-2) which can be turned into -4*%1.
Let continue to %5, it will consider
%in + %in + 1 + %factor
It will try to factorize, and list for all of the operands the factors to find some in common.
When processing %factor, the list of factors is
%in * -2 * 2
However this contains two constant and is forbidden because it should have been folded earlier. Because RedoInsts was processed at the end of the block instead of at the end of a transformation this canonicalization would have happened later.
Does it makes sense?
Thanks.
I still don't understand why this is the correct approach? Are we missing an optimization without your change?
Hi Chad,
No we are not missing an optimization without my approach, we are just hitting an assert...
I think it is the right thing to do anyway because I don't see the point of growing a long list of instructions to reprocess for later if you can do it now.
Mehdi
My fuzzer found a case where we hit this same assert with this patch but not without. So it seems I still have to work on it.... Hold on for the review.
lib/Transforms/Scalar/Reassociate.cpp | ||
---|---|---|
1956 | I think this comment needs to be expanded a bit. Why must we refrain from optimizing the PHI operands? | |
1957 | Please format this a little nicer. | |
2284 | Why is topological in quotes? | |
2292–2309 | Instead of having both RedoInsts and Worklist, what if we had a single SetVector that held both? | |
2296–2299 | This would probably be better as: SmallSet<Instruction *, 8> Worklist(BI->begin(), BI->end()); |
Thanks for your comment. I'll update the patch soon. I found out that I could use the RankMap to simplify what I'm trying to achieve.
I also found another but where we end up with an instruction moved to a new block (and hit an assertion).
lib/Transforms/Scalar/Reassociate.cpp | ||
---|---|---|
1956 | Because the operand might be locate in a block that wasn't already processed. | |
2284 | Because we don't have a DAG. | |
2292–2309 | I don't see how would it be possible? | |
2296–2299 | That was my first intention, but it seems that SmallSet does not provide such constructor. I also tried: SmallSet<Instruction *, 8> Worklist; Worklist.insert(BI->begin(), BI->end()); But was bitten by: "candidate function not viable: no known conversion from 'llvm::Instruction' to 'llvm::Instruction *' for 1st argument; take the address of the argument with &" |
Improve the filtering of instructions to reprocess based on basic block rank.
Add a new "crashing" test case that triggered this bug.
Reuse the RPOT iterator with BuildRankMap() for efficiency purpose.
Note, this patch has been ran over the weekend on my fuzzer without any
new crash.
(Wow, this is from a while ago.)
If three of these tests are crashing, shouldn't we just commit the fix?
(Does it still apply?)
Chad, do you still have concerns? Is it about compile time or something
else?
Hi Mehdi,
Please consider merging the test files. They all have the same test line and as far as I know you are fixing the source of one crash, not eight.
Also please:
- Add check lines.
- Use opt -instnamer to get rid of the %[num] variables.
My 2c.
Thanks,
-Quentin
Only 3 test are still crashing now, but at the time I wrote the patch, I iterated on a fuzzer and these six tests (not eight) were stressing different patterns and different part of reassociate, so they're absolutely not redundant.
I'm not sure either what CHECK line to put for a compiler crash non-regression, there are multiple cases in the test suite like this where FileCheck is not involved.
(and the fact that 3 tests were fixed in the meantime shows that they're independent)
What Quentin meant in his review was that you should concatenate all of the tests in a single file and run them that way, in general that's "accepted practice" for the project. I'm adding David as a reviewer as he's been doing that. Once you merge the testcases and David says ok, then we're good.
Thanks!
Sorry, this one fell completely off my radar.
Per Quentin/Eric, I would merge all the tests into a single file. Quentin's suggestion to use the instnamer seems fairly reasonable as well.
I'd like to hear David's comments before providing a LGTM as he has actually reviewed this patch in earnest.
I assume this should also be merged into the 3.8 branch, if we're actually crashing without this fix.
lib/Transforms/Scalar/Reassociate.cpp | ||
---|---|---|
2264 | A "Topological" what? A "Topological" ranking... The sentences seems incomplete. | |
2265 | were processed before -> have been processed. (Or something similar) | |
2293 | No need for extra curly brackets. | |
2304 | No need for extra curly brackets. |
lib/Transforms/Scalar/Reassociate.cpp | ||
---|---|---|
2265 | ben -> been ? | |
2301–2302 | Is it possible for I to equal II ? Removing this check doesn't seem to make any tests fail. | |
test/Transforms/Reassociate/prev_insts_canonicalized.ll | ||
2 | I'd follow the pattern used by the other tests: ; RUN: opt < %s -reassociate -S | FileCheck %s This will ensure that the test's filename will not show up in the test's output. |
I think this comment needs to be expanded a bit. Why must we refrain from optimizing the PHI operands?