- User Since
- Jan 23 2018, 4:17 PM (177 w, 2 d)
Fri, Jun 11
Fix issue found by buildbot. Locs for Scalar should be requested after VectorMap.operator since it may invalidate those.
Thu, Jun 10
Dummy update to restart testing
Wed, Jun 9
Wed, Jun 2
Tue, May 25
Wed, May 19
May 13 2021
May 12 2021
Apr 30 2021
Added comment + removed useless check + replaced explicit for with llvm::any_of
Apr 29 2021
Apr 27 2021
Apr 22 2021
Extracted small NFC out and rebase
Apr 21 2021
I think, more robust approach would be not to depend on existing instructions but create the required ones during phi translation as needed. Phi translation already does this to some extent. In D90095 (which prototypes PHI translation support in DSE) pointers are simplified by GetPointerBaseWithConstantOffset during PHI translation what effectively removes dependencies on existing GEPs and BitCasts.
Apr 15 2021
Apr 12 2021
Thanks for noting. Fixed.
Apr 9 2021
Apr 7 2021
Thanks for letting me know. Hadn't seen this before. I can reproduce the issue and working on the fix.
Mar 31 2021
I don't really understand why we need this separate heuristic for runtime checks. Why don't we simply add cost of runtime checks (possibly with some small scaling to be safe) to total cost of vector loop and just use existing cost model to decide?
Mar 11 2021
In general this patch seems to take exactly the same direction as D90095 and is a strict subset of it. We can choose to either re-build functionality step by step or take D90095 and split it to several smaller pieces. D90095 adds the follwoing functionality:
- Replaces main 'do-while' loop to 'for' loop in getDomMemoryDef. This is NFC and aimed at simplifying code structure and allows to minimize future changes.
- Supports phi translation to any predecessor (not only immediate one)
- Is able to phi translate in presence of geps and casts. In order to support this we have to track offests.
Stepped back to use SCEVExpander
Mar 5 2021
I'd like to understand why you chose to go the way it is instead of taking cost of runtime checks into account in the cost model itself? To me, the cost model is the right place for that. I would expect to see something similar to what I did in https://reviews.llvm.org/D71053 for LoopVectorizationPlanner::mayDisregardRTChecksOverhead
Mar 3 2021
Mar 2 2021
I actually think this patch should not have been reverted in the first place.
SCEV is known to be not good with undef, i don't really see why we should start blocking on that now.
So i would personally recommend to directly revert your revert.
If I'm getting it right you are OK to commit the patch in its current state. If this is the case please unblock it.
You better remove (WIP) suffix if this patch is ready for review. Otherwise people may think it still in progress...
Uhm, no? SCEV only supports integers and pointers.
I looked at the failing list one more time and indeed it was a pointer to double case.... In total, there are 54 failing tests which needs to be investigated... Roman, Are you working in this direction?
SCEV Expander should be fixed separately anyway and I don't think there is strong dependence between these pieces.
Fix found verification issue
Feb 26 2021
commit 13a5cac2ba919b4d02a296428b58919231e08569 (HEAD -> main, origin/main)
Author: Evgeniy Brevnov <firstname.lastname@example.org>
Date: Fri Feb 26 19:23:32 2021 +0700
I think it's better to revert it for now. Looks like the topic is not that trivial and we should first agree on the fix. There is no need to hurry.
Like i have already said, the fix is https://alive2.llvm.org/ce/z/RkBWxC.
Let me just fix this then.
I think the solution is to use >= instead of > when we do min/max reassociation. In other words, originally we had 'any' > MAX_INT which is known to be false. If we want semantically equal but reassociated expression we should invert the comparison logic. In other words we should check MAX_INT >= 'any' which is known to be true and MAX_INT will be selected.
Just to add to what Roman wrote, thinking of the code as max(,) is misleading. The code is doing icmp sgt INT_MAX, undef which can evaluate to true or false. But we cannot assume that undef from now on is equal to INT_MAX just because the comparison evaluated to true.
I think I understand the problem now. 'any'>max_int is known to be false (first case), while max_int >'any' is unknown (second case)....so this is not verification issue....
Feb 25 2021
I think this is an issue of verification itself. In the first case max(0, undef)=>any and max(any, max_int)=>max_int. In the second case max(max_int, undef)=>x03002006. I believe the behavior of the verifier is inconsistent in these two cases and max(max_int, undef) should be evaluated to max_int as well. We can do the following trivial transformations to prove that: max(max_int, undef) is trivially equal to max(max_int, max(undef, undef)) and max(undef, undef) should be evaluated to 'any' since max(0, undef) is evaluated to 'any' in the first case. Thus we get max(max_int, any) which is evaluated to 'max_int' in the first case. So max(max_int, undef) should be evaluated to 'max_int' but not 'x03002006'.
Feb 24 2021
Thanks for doing the changes. Overall this version looks fine to me. Please go ahead with some final cleanups.
Feb 18 2021
I appreciate your time to review and experimenting with the patch. I agree it's kind of unexpected to see less stores get removed. Would it be possible for you to share an IR where that happens (not reduced one is fine)? I'm asking for this because I don't have access to those SPECs.
Feb 17 2021
Feb 16 2021
I started looking into this... Hopefully, will be able to provide my feedback by tomorrow.
Feb 10 2021
Feb 6 2021
I'm going to take a look next week if nobody outstrip me...
Jan 11 2021
Dec 24 2020
Yes. Please review proposed fix https://reviews.llvm.org/D93803. Evgeniy
Thanks for letting me know. Looking now...
Updated test case due to https://reviews.llvm.org/D79485 has landed.
Dec 23 2020
Added test case.
Dec 22 2020
Dec 20 2020
I had already tried to measure performance with the test-suite previously with out success. This time again I observe big variation. I'm using dedicated performance machine which runs only my process. I've build test-suite as follows:
Dec 18 2020
Dec 4 2020
Dec 3 2020