User Details
- User Since: Nov 19 2015, 1:56 AM (276 w, 1 d)
Sep 9 2018
What does this comment in InstCombineSelect refer to?
/// TODO: Also support a - UMIN(a,b) patterns.
This is an old version of an already committed patch; I think we should close this.
Mar 6 2018
It looks like all possible umin patterns are already handled by the existing patterns we added for max. The last tests in psubus.ll are also a canonical cmp-select-sub sequence. What is missing that I should add?
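For reference, the equivalence behind these patterns can be checked on scalars. The sketch below is not the InstCombine or X86 backend code; the helper names (usubsat, via_umin, via_umax, via_cmp_select, check_all_pairs) are hypothetical and exist only for illustration. It exhaustively checks on 8-bit values that a - umin(a, b), umax(a, b) - b, and the cmp-select-sub form all compute the same unsigned saturating subtraction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference: unsigned saturating subtraction (what psubus computes per lane).
static uint8_t usubsat(uint8_t a, uint8_t b) {
  return a > b ? static_cast<uint8_t>(a - b) : 0;
}

// a - umin(a, b): the subtrahend never exceeds a, so the result cannot wrap.
static uint8_t via_umin(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(a - std::min(a, b));
}

// umax(a, b) - b: the minuend is never below b, so the result cannot wrap.
static uint8_t via_umax(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(std::max(a, b) - b);
}

// cmp-select-sub: subtract with wrap-around, then select 0 when a <= b.
static uint8_t via_cmp_select(uint8_t a, uint8_t b) {
  uint8_t diff = static_cast<uint8_t>(a - b);  // may wrap when a < b
  return a > b ? diff : 0;
}

// Checks every 8-bit pair and returns how many pairs were checked.
static int check_all_pairs() {
  int checked = 0;
  for (int a = 0; a < 256; ++a) {
    for (int b = 0; b < 256; ++b) {
      uint8_t x = static_cast<uint8_t>(a);
      uint8_t y = static_cast<uint8_t>(b);
      uint8_t expected = usubsat(x, y);
      assert(via_umin(x, y) == expected);
      assert(via_umax(x, y) == expected);
      assert(via_cmp_select(x, y) == expected);
      ++checked;
    }
  }
  return checked;
}
```

Since the three forms agree on every input, a matcher that already recognizes the max and cmp-select forms would leave only the explicit a - umin(a, b) spelling as a separate case.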
Feb 5 2018
Fixed last comments. Thank you for your help, could you please commit it for me?
Feb 4 2018
Thank you for your comments, fixed them.
Jan 31 2018
Fixed the comments
Dec 28 2017
Dec 21 2017
Fixed comments.
Dec 20 2017
Oct 27 2017
Oct 11 2017
Thanks, I fixed your comment in the additional patch below.
Oct 9 2017
Thanks! Could you please commit it for me?
Oct 2 2017
Fixed the problem in the min case.
Sep 27 2017
Sep 26 2017
Added 512bit support and fixed comments
Added 512bit support.
Sep 7 2017
Sep 6 2017
Moved the second patch (backend part) to a separate revision for convenience: https://reviews.llvm.org/D37534.
tests, rebased
I can't commit, could you commit this for me please?
I tried this, but it caused other changes. Is that OK?
This is the second patch, containing the backend code for the new pattern.
May 16 2017
May 12 2017
May 2 2017
Apr 30 2017
No, I can actually update all the tests like this.
Apr 28 2017
Feb 6 2017
Gentle ping
Jan 26 2017
Sorry, fixed clang-format.
I updated the patch to trunk, because the test output changed due to the rL292479 optimization. I failed to find a better check based on TLI.isLegal.., because the optimization doesn't require an exact type. For example, in the case of SSE4.1, isOperationLegalOrCustom(MIN, v8i32) returns false and disables the optimization. However, the transformation works fine, as the illegal operation is split into two v4i32 operations. The hasSSE41 check is sufficient, because it covers both v8i16 and v4i32. i8 (SSE has an i8-based umin) can't be used in this optimization, because the ExtType is always larger than the psubus type and so is at least i16.
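The point about the ExtType being wider than the psubus type can be illustrated on scalars. This is only a sketch under stated assumptions, not the patch's code: it assumes the pattern subtracts a 32-bit value y from a 16-bit value x with the result clamped at zero, and all function names are hypothetical. Clamping y with umin against 0xFFFF before truncating lets the narrow saturating subtract match the wide computation, while plain truncation does not.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Wide reference: compute max(zext(x) - y, 0) in 32 bits, then narrow.
static uint16_t wide_reference(uint16_t x, uint32_t y) {
  uint32_t wx = x;
  return wx > y ? static_cast<uint16_t>(wx - y) : 0;
}

// Narrow form: clamp y to the 16-bit range with umin, truncate, then
// perform a 16-bit unsigned saturating subtract.
static uint16_t narrow_with_umin(uint16_t x, uint32_t y) {
  uint16_t ty = static_cast<uint16_t>(std::min<uint32_t>(y, 0xFFFFu));
  return x > ty ? static_cast<uint16_t>(x - ty) : 0;
}

// Counter-example form: truncating y without the clamp drops its high bits.
static uint16_t narrow_without_umin(uint16_t x, uint32_t y) {
  uint16_t ty = static_cast<uint16_t>(y);  // high bits silently discarded
  return x > ty ? static_cast<uint16_t>(x - ty) : 0;
}

// Compares the clamped narrow form against the wide reference for every
// 16-bit x and a few boundary values of y.
static bool clamped_form_matches() {
  const uint32_t ys[] = {0u, 1u, 0x7FFFu, 0xFFFFu,
                         0x10000u, 0x10001u, 0xFFFFFFFFu};
  for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
    for (uint32_t y : ys) {
      uint16_t nx = static_cast<uint16_t>(x);
      if (narrow_with_umin(nx, y) != wide_reference(nx, y)) return false;
    }
  }
  return true;
}
```

With x = 5 and y = 0x10001, the wide reference and the clamped form both give 0, whereas the unclamped truncation turns y into 1 and gives 4.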
Jan 9 2017
This optimization requires the umin instruction, which didn't exist before SSE4.1, so I added an SSE4.1 requirement for it and added a new target to the test. I haven't found a profitable equivalent of umin in SSE/SSE2 for this situation.
Nov 10 2016
- Changed the regex-based test to one generated by the update_llc_test_checks.py script
- Replaced getScalarType().getSizeInBits() with VT.getScalarSizeInBits()
Nov 8 2016
Gentle ping