Page MenuHomePhabricator

yulia_koval (Julia Koval)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 19 2015, 1:56 AM (160 w, 2 d)

Recent Activity

Sep 9 2018

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

What does this comment in InstCombineSelect refer to?

/// TODO: Also support a - UMIN(a,b) patterns.

Sep 9 2018, 11:55 PM
yulia_koval abandoned D33118: Another version of patch for D25987.

This is some old version of already commited patch, I think we need to close this.

Sep 9 2018, 11:49 PM

Mar 6 2018

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

Did D41480 affect anything with these tests? We still need to handle 'umin' patterns in IR and update the tests here with canonical IR forms to show psubus codegen?

Looks like all possible umin patterns are already handled in existing patterns we added for max. The last tests in psubus.ll is also cannonical cmp-select-sub sequence. What should I add which is missing?

Mar 6 2018, 4:27 AM

Feb 5 2018

yulia_koval updated the diff for D41480: Unsigned saturation subtraction canonicalization [Instcombine part].

Fixed last comments. Thank you for your help, could you please commit it for me?

Feb 5 2018, 7:53 AM

Feb 4 2018

yulia_koval updated the diff for D41480: Unsigned saturation subtraction canonicalization [Instcombine part].

Thank you for your comments, fixed them.

Feb 4 2018, 11:52 PM

Jan 31 2018

yulia_koval updated the diff for D41480: Unsigned saturation subtraction canonicalization [Instcombine part].

Fixed the comments

Jan 31 2018, 11:37 PM

Dec 28 2017

yulia_koval added inline comments to D41480: Unsigned saturation subtraction canonicalization [Instcombine part].
Dec 28 2017, 8:26 AM

Dec 21 2017

yulia_koval updated the diff for D41480: Unsigned saturation subtraction canonicalization [Instcombine part].

Fixed comments,

Dec 21 2017, 11:26 PM

Dec 20 2017

yulia_koval added reviewers for D41480: Unsigned saturation subtraction canonicalization [Instcombine part]: zvi, spatel, DavidKreitzer.
Dec 20 2017, 11:56 PM
yulia_koval created D41480: Unsigned saturation subtraction canonicalization [Instcombine part].
Dec 20 2017, 11:55 PM

Oct 27 2017

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

@yulia_koval What is the state of this after D37510 and D37534?

Oct 27 2017, 9:56 PM

Oct 11 2017

yulia_koval updated the diff for D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].

Thanks, fixed your comment in additional patch below.

Oct 11 2017, 11:45 PM

Oct 9 2017

yulia_koval added a comment to D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].

Thanks! Could you please commit it for me?

Oct 9 2017, 12:17 AM

Oct 2 2017

yulia_koval updated the diff for D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].

Fixed problem on min case.

Oct 2 2017, 12:57 AM

Sep 27 2017

yulia_koval added a comment to D37510: Unsigned saturation subtraction canonicalization.
In D37510#882071, @zvi wrote:

LGTM

Sep 27 2017, 7:15 AM
yulia_koval updated the diff for D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].
Sep 27 2017, 5:21 AM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.
Sep 27 2017, 5:04 AM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.
Sep 27 2017, 5:02 AM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.
Sep 27 2017, 3:08 AM

Sep 26 2017

yulia_koval retitled D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part] from Unsigned saturation subtraction canonicalization [the backend part] to [X86] Unsigned saturation subtraction canonicalization [the backend part].
Sep 26 2017, 10:27 PM
yulia_koval updated the diff for D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].

Added 512bit support and fixed comments

Sep 26 2017, 10:24 PM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.

Added 512bit support.

Sep 26 2017, 10:07 PM

Sep 7 2017

yulia_koval added inline comments to D37510: Unsigned saturation subtraction canonicalization.
Sep 7 2017, 3:38 PM
yulia_koval added a comment to D37510: Unsigned saturation subtraction canonicalization.

Is this patch intended to just add the baseline tests, or is there an associated code change that didn't get uploaded?

Sep 7 2017, 3:35 PM

Sep 6 2017

yulia_koval updated the summary of D37510: Unsigned saturation subtraction canonicalization.
Sep 6 2017, 1:52 PM
yulia_koval added a comment to D37510: Unsigned saturation subtraction canonicalization.

Moved second patch(backend part) to extra revision for convenience https://reviews.llvm.org/D37534.

Sep 6 2017, 1:48 PM
yulia_koval created D37534: [X86] Unsigned saturation subtraction canonicalization [the backend part].
Sep 6 2017, 1:46 PM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.

tests, rebased

Sep 6 2017, 1:41 PM
yulia_koval added a comment to D37523: Update_llc_tests for psubus.ll test.

I can't commit, could you commit this for me please?

Sep 6 2017, 11:50 AM
yulia_koval updated the diff for D37523: Update_llc_tests for psubus.ll test.

Tried this, it caused other changes, is it ok?

Sep 6 2017, 11:24 AM
yulia_koval added a comment to D37510: Unsigned saturation subtraction canonicalization.
In D37510#862264, @zvi wrote:

Please update the Summary with more details about the combine that was added.

Sep 6 2017, 11:08 AM
yulia_koval created D37523: Update_llc_tests for psubus.ll test.
Sep 6 2017, 11:01 AM
yulia_koval added reviewers for D37510: Unsigned saturation subtraction canonicalization: n.bozhenov, zvi, spatel, DavidKreitzer.
Sep 6 2017, 5:15 AM
yulia_koval updated the summary of D37510: Unsigned saturation subtraction canonicalization.
Sep 6 2017, 5:08 AM
yulia_koval changed the visibility for D37510: Unsigned saturation subtraction canonicalization.
Sep 6 2017, 4:58 AM
yulia_koval updated the diff for D37510: Unsigned saturation subtraction canonicalization.

This is the second patch, containing backend-code for new pattern.

Sep 6 2017, 4:56 AM
yulia_koval created D37510: Unsigned saturation subtraction canonicalization.
Sep 6 2017, 4:56 AM

May 16 2017

yulia_koval added a comment to D32643: Update tests in psubus.ll.

Do you need someone to commit this for you? I just hit a case where another transform would affect codegen in this file, so I'd like to make sure these tests are minimal.

May 16 2017, 10:05 PM

May 12 2017

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

But do you agree that it is reasonable to write this function in C as:

void goo(unsigned short *p, int max, int n) {
  int i;
  unsigned m;
  for (i = 0; i < n; i++) {
    m = *--p;
    unsigned umax = m > max ? m : max;
    *p = (unsigned short)(umax - max);
  }
}
May 12 2017, 1:26 AM
yulia_koval created D33118: Another version of patch for D25987.
May 12 2017, 1:25 AM

May 2 2017

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

I think we need to take a step back here. The tests are assuming a specific form of IR, but I'm not sure it's canonical. If we change this upstream (in InstCombine), the transform may fail to activate when you hoped it would. It's also possible that generic DAG combines could interfere with the pattern you're expecting. See rL301781 as an example. In either case, this seems like a fragile solution because the pattern can change above this.

http://rise4fun.com/Alive/5e4
Unless I made a mistake in transcribing that, the 1st form is what you're expecting in these tests. The 2nd form might be considered better since it includes a canonical 'max'.

Can you test this patch with more examples like this to see if it behaves as expected:

define i16 @scalar(i16 %x, i32 %y) nounwind {
  %lhs = zext i16 %x to i32
  %cond = icmp ugt i32 %lhs, %y
  %sub = sub i32 %lhs, %y
  %truncsub = trunc i32 %sub to i16
  %res = select i1 %cond, i16 %truncsub, i16 zeroinitializer
  ret i16 %res
}

define i16 @scalar_max(i16 %x, i32 %y) nounwind {
  %lhs = zext i16 %x to i32
  %cond = icmp ugt i32 %lhs, %y
  %umax = select i1 %cond, i32 %lhs, i32 %y
  %sub = sub i32 %umax, %y
  %truncsub = trunc i32 %sub to i16
  ret i16 %truncsub
}

define <8 x i16> @test15(<8 x i16> %x, <8 x i32> %y) nounwind {
  %lhs = zext <8 x i16> %x to <8 x i32>
  %cond = icmp ugt <8 x i32> %lhs, %y
  %sub = sub <8 x i32> %lhs, %y
  %truncsub = trunc <8 x i32> %sub to <8 x i16>
  %res = select <8 x i1> %cond, <8 x i16> %truncsub, <8 x i16> zeroinitializer
  ret <8 x i16> %res
}

define <8 x i16> @test15_max(<8 x i16> %x, <8 x i32> %y) nounwind {
  %lhs = zext <8 x i16> %x to <8 x i32>
  %cond = icmp ugt <8 x i32> %lhs, %y
  %umax = select <8 x i1> %cond, <8 x i32> %lhs, <8 x i32> %y
  %sub = sub <8 x i32> %umax, %y
  %truncsub = trunc <8 x i32> %sub to <8 x i16>
  ret <8 x i16> %truncsub
}

define <8 x i8> @test15_narrow(<8 x i8> %x, <8 x i16> %y) nounwind {
  %lhs = zext <8 x i8> %x to <8 x i16>
  %cond = icmp ugt <8 x i16> %lhs, %y
  %sub = sub <8 x i16> %lhs, %y
  %truncsub = trunc <8 x i16> %sub to <8 x i8>
  %res = select <8 x i1> %cond, <8 x i8> %truncsub, <8 x i8> zeroinitializer
  ret <8 x i8> %res
}

define <8 x i8> @test15_narrow_max(<8 x i8> %x, <8 x i16> %y) nounwind {
  %lhs = zext <8 x i8> %x to <8 x i16>
  %cond = icmp ugt <8 x i16> %lhs, %y
  %umax = select <8 x i1> %cond, <8 x i16> %lhs, <8 x i16> %y
  %sub = sub <8 x i16> %umax, %y
  %truncsub = trunc <8 x i16> %sub to <8 x i8>
  ret <8 x i8> %truncsub
}
May 2 2017, 7:26 AM

Apr 30 2017

yulia_koval updated the diff for D32643: Update tests in psubus.ll.

No, I can actually update all the tests like this.

Apr 30 2017, 4:14 AM

Apr 28 2017

yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.
Apr 28 2017, 6:52 AM
yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.

It's difficult for me to see what's going on in these tests because there are a bunch of irrelevant load/store/gep/bitcast instructions. Can you remove those?

If you do that, the root of the problem is very similar to:
https://reviews.llvm.org/rL286776

Ie, if we recognized the umin/umax patterns in IR as ISD::UMIN/ISD::UMAX when we created the DAG, the subsequent matching to the x86-specific PSUBUS might already happen with the existing lowering?

Apr 28 2017, 6:45 AM
yulia_koval created D32643: Update tests in psubus.ll.
Apr 28 2017, 6:41 AM

Feb 6 2017

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

Gentle ping

Feb 6 2017, 12:45 AM

Jan 26 2017

yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.

Sorry, fixed clang-format.

Jan 26 2017, 6:32 AM
yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.

I updated the patch to trunk, because test output changed due to rL292479 optimization. I failed to find a better check, based on TLI.isLegal.., because it doesn't requires exact type. For example, in case of SSE4.1, isOperationLegalOrCustom(MIN, v8i32) returns false and disables the optimization. However, the transformation works fine as the illegal operation is split into two v4i32 operations. The hasSSE41 check is sufficient, because it includes both v8i16 and v4i32 for each integer type. i8(SSE hass i8 based umin) can't be used in this optimisation, because the ExtType is always larger then psubus type and can be minimum i16.

Jan 26 2017, 3:46 AM

Jan 9 2017

yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.

This optimization requires the umin instruction. It didn't exist before SSE4.1, so I added requirement for SSE4 for it and added new target to test. I haven't found profitable equivalent for umin in sse\sse2 for this situation.

Jan 9 2017, 5:15 AM

Nov 10 2016

yulia_koval updated the diff for D25987: [X86] New pattern to generate PSUBUS from SELECT.
  1. Changed regex-based test to generated from update_llc_test_checks.py script
  2. Replaced getScalarType().getSizeInBits() with VT.getScalarSizeInBits()
Nov 10 2016, 6:27 AM

Nov 8 2016

yulia_koval added a comment to D25987: [X86] New pattern to generate PSUBUS from SELECT.

Gentle ping

Nov 8 2016, 3:27 AM
yulia_koval added a reviewer for D25987: [X86] New pattern to generate PSUBUS from SELECT: chandlerc.
Nov 8 2016, 3:26 AM

Oct 26 2016

yulia_koval retitled D25987: [X86] New pattern to generate PSUBUS from SELECT from to [X86] New pattern to generate PSUBUS from SELECT.
Oct 26 2016, 7:29 AM

Nov 19 2015

yulia_koval updated subscribers of D12498: X86: add an interrupt calling convention.
Nov 19 2015, 1:56 AM