
nemanjai (Nemanja Ivanovic)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 23 2015, 9:38 AM (234 w, 2 d)

Recent Activity

Yesterday

nemanjai committed rG3d68adebc579: [PowerPC][NFC] Precomit test case for upcoming patch (authored by nemanjai).
[PowerPC][NFC] Precomit test case for upcoming patch
Sun, Jul 21, 2:05 PM
nemanjai committed rL366661: [PowerPC][NFC] Precomit test case for upcoming patch.
[PowerPC][NFC] Precomit test case for upcoming patch
Sun, Jul 21, 2:03 PM
nemanjai committed rG73d641a23c29: [PowerPC][NFC] Regenerate test using script (authored by nemanjai).
[PowerPC][NFC] Regenerate test using script
Sun, Jul 21, 11:43 AM
nemanjai committed rL366659: [PowerPC][NFC] Regenerate test using script.
[PowerPC][NFC] Regenerate test using script
Sun, Jul 21, 11:42 AM

Tue, Jul 16

nemanjai accepted D54409: PowerPC/SPE: Fix load/store handling for SPE.

LGTM other than a few minor nits.

Tue, Jul 16, 2:06 PM · Restricted Project
nemanjai accepted D56703: PowerPC: Fix register spilling for SPE registers.

Oops. Forgot to accept.

Tue, Jul 16, 1:56 PM · Restricted Project
nemanjai added a comment to D56703: PowerPC: Fix register spilling for SPE registers.

LGTM, although the code here looks really messy. That said, it looks messy regardless of this patch.

Tue, Jul 16, 1:56 PM · Restricted Project

Thu, Jul 11

nemanjai updated the diff for D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.

Clean up some of the comments, the function naming, and the single-use check, and address the other comments from Jinsong. Thanks for the review, Jinsong.

Thu, Jul 11, 4:28 AM · Restricted Project

Wed, Jul 10

nemanjai added inline comments to D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.
Wed, Jul 10, 5:19 PM · Restricted Project

Mon, Jul 8

nemanjai added a comment to D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.

@jsji @amyk Do you guys have any further comments regarding this?

Mon, Jul 8, 5:05 AM · Restricted Project

Fri, Jul 5

nemanjai committed rG6c9a392c8eb0: [PowerPC] Move TOC save to prologue when profitable (authored by nemanjai).
[PowerPC] Move TOC save to prologue when profitable
Fri, Jul 5, 11:39 AM
nemanjai committed rL365232: [PowerPC] Move TOC save to prologue when profitable.
[PowerPC] Move TOC save to prologue when profitable
Fri, Jul 5, 11:39 AM
nemanjai closed D63803: [PowerPC] Move TOC save to prologue when profitable.
Fri, Jul 5, 11:39 AM · Restricted Project
nemanjai added inline comments to D63803: [PowerPC] Move TOC save to prologue when profitable.
Fri, Jul 5, 11:39 AM · Restricted Project
nemanjai added inline comments to D64220: [PowerPC] Remove redundant load immediate instructions.
Fri, Jul 5, 11:17 AM · Restricted Project
nemanjai requested changes to D61961: [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32.

A few minor nits and a functional problem that needs to be addressed (exiting if the input is not a v4f32).

Fri, Jul 5, 11:14 AM · Restricted Project

Mon, Jul 1

nemanjai updated the diff for D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.

Missed some code cleanup on the last upload.

Mon, Jul 1, 11:39 AM · Restricted Project
nemanjai updated the diff for D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.

Updated to handle the load->shuffle pattern in addition to the load->build_vector pattern.

Mon, Jul 1, 11:05 AM · Restricted Project
nemanjai created D64024: [PowerPC][Altivec] Emit correct builtin for single precision vec_all_ne.
Mon, Jul 1, 10:49 AM · Restricted Project

Wed, Jun 26

nemanjai added a comment to D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.

Thanks @jsji for letting me know, and thanks @nemanjai for the handling of the load and splats!

I think this mostly looks good to me. I'm curious though, we also have test/CodeGen/PowerPC/load-v4i8-improved.ll that has a load->shift/permute->splat case. Could this patch be extended to cover this?

Wed, Jun 26, 10:58 AM · Restricted Project
nemanjai updated the diff for D63803: [PowerPC] Move TOC save to prologue when profitable.

Remove unnecessary function and fix up conditions for the transformation.

Wed, Jun 26, 5:46 AM · Restricted Project
nemanjai added inline comments to D63803: [PowerPC] Move TOC save to prologue when profitable.
Wed, Jun 26, 4:26 AM · Restricted Project

Tue, Jun 25

nemanjai committed rG4c64c62b9af1: [NFC] Fix buildbot breaks due to r364375 (authored by nemanjai).
[NFC] Fix buildbot breaks due to r364375
Tue, Jun 25, 7:50 PM
nemanjai committed rL364377: [NFC] Fix buildbot breaks due to r364375.
[NFC] Fix buildbot breaks due to r364375
Tue, Jun 25, 7:46 PM
nemanjai created D63803: [PowerPC] Move TOC save to prologue when profitable.
Tue, Jun 25, 7:42 PM · Restricted Project
nemanjai committed rG69822ae10600: [PowerPC][NFC] Add a TOC save test case prior to posting a related patch (authored by nemanjai).
[PowerPC][NFC] Add a TOC save test case prior to posting a related patch
Tue, Jun 25, 7:03 PM
nemanjai committed rL364375: [PowerPC][NFC] Add a TOC save test case prior to posting a related patch.
[PowerPC][NFC] Add a TOC save test case prior to posting a related patch
Tue, Jun 25, 7:03 PM
nemanjai committed rG8265e8ff3656: [PowerPC] Mark FCOPYSIGN legal for FP vectors (authored by nemanjai).
[PowerPC] Mark FCOPYSIGN legal for FP vectors
Tue, Jun 25, 6:51 PM
nemanjai committed rL364373: [PowerPC] Mark FCOPYSIGN legal for FP vectors.
[PowerPC] Mark FCOPYSIGN legal for FP vectors
Tue, Jun 25, 6:50 PM
nemanjai closed D63634: [PowerPC] Mark FCOPYSIGN legal for FP vectors.
Tue, Jun 25, 6:50 PM · Restricted Project
nemanjai added a comment to rL363040: [DAGCombine] GetNegatedExpression - constant float vector support (PR42105).

This breaks test cases on a few targets. Please refer to the original review for details.

Tue, Jun 25, 4:58 AM
nemanjai committed rG47b7d13459a7: [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern (authored by nemanjai).
[PowerPC] Emit XXSEL for vec_sel and code that has the same pattern
Tue, Jun 25, 3:49 AM
nemanjai committed rL364289: [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern.
[PowerPC] Emit XXSEL for vec_sel and code that has the same pattern
Tue, Jun 25, 3:46 AM
nemanjai closed D61658: [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern.
Tue, Jun 25, 3:46 AM · Restricted Project

Mon, Jun 24

nemanjai added a comment to D62963: [DAGCombine] GetNegatedExpression - constant float vector support (PR42105).

The same issue happens with the arm64, aarch64, and nvptx triples.

Mon, Jun 24, 12:20 PM · Restricted Project
nemanjai added a comment to D62963: [DAGCombine] GetNegatedExpression - constant float vector support (PR42105).

This breaks (at least) PowerPC with the typical DAG Combine cycle (i.e. one combine undoes the other in a cycle). Here's a minimal test case to show this:

define dso_local <4 x double> @sub(double %b, double* nocapture readonly %ptr) local_unnamed_addr {
entry:
  %arrayidx = getelementptr inbounds double, double* %ptr, i64 45320
  %0 = load double, double* %arrayidx, align 4
  %vecinit = insertelement <4 x double> undef, double %0, i32 0
  %arrayidx1 = getelementptr inbounds double, double* %ptr, i64 176
  %1 = load double, double* %arrayidx1, align 4
  %vecinit2 = insertelement <4 x double> %vecinit, double %1, i32 1
  %arrayidx3 = getelementptr inbounds double, double* %ptr, i64 2734
  %2 = load double, double* %arrayidx3, align 4
  %vecinit4 = insertelement <4 x double> %vecinit2, double %2, i32 2
  %arrayidx5 = getelementptr inbounds double, double* %ptr, i64 7
  %3 = load double, double* %arrayidx5, align 4
  %vecinit6 = insertelement <4 x double> %vecinit4, double %3, i32 3
  %splat.splatinsert = insertelement <4 x double> undef, double %b, i32 0
  %splat.splat = shufflevector <4 x double> %splat.splatinsert, <4 x double> undef, <4 x i32> zeroinitializer
  %div = fdiv fast <4 x double> %vecinit6, %splat.splat
  %sub = fsub fast <4 x double> <double 0.000000e+00, double 0.000000e+00, double 0.000000e+00, double 0.000000e+00>, %div
  ret <4 x double> %sub
}

Compile with llc -mtriple=powerpc64le-unknown-unknown
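
For readers unfamiliar with the failure mode, here is a minimal sketch of what a combine cycle looks like. It is a toy model only: the two rewrite rules (`push_neg_into_div` and `pull_neg_out_of_div`) and the tuple-based expression form are hypothetical illustrations, not the actual DAG combines involved in D62963. It assumes nothing about LLVM's API and just shows two rewrites that each undo the other, plus a visited-set check that detects the loop.

```python
# Toy model of a combine cycle: two rewrites that each undo the other.
# Expressions are nested tuples such as ("fneg", ("fdiv", "x", "y")).
# Both rules are hypothetical and are NOT the actual combines from D62963.

def push_neg_into_div(expr):
    """fneg(fdiv(a, b)) -> fdiv(fneg(a), b); returns None if no match."""
    if isinstance(expr, tuple) and expr[0] == "fneg":
        inner = expr[1]
        if isinstance(inner, tuple) and inner[0] == "fdiv":
            return ("fdiv", ("fneg", inner[1]), inner[2])
    return None


def pull_neg_out_of_div(expr):
    """fdiv(fneg(a), b) -> fneg(fdiv(a, b)); returns None if no match."""
    if isinstance(expr, tuple) and expr[0] == "fdiv":
        lhs = expr[1]
        if isinstance(lhs, tuple) and lhs[0] == "fneg":
            return ("fneg", ("fdiv", lhs[1], expr[2]))
    return None


RULES = [push_neg_into_div, pull_neg_out_of_div]


def combine(expr, max_steps=100):
    """Apply the first matching rule at the root, repeatedly.

    Returns (final_expr, cycled). cycled is True when a rewrite produces
    an expression that was already seen, meaning the rules would keep
    rewriting forever. If max_steps runs out first, cycled is False.
    """
    seen = {expr}
    for _ in range(max_steps):
        for rule in RULES:
            rewritten = rule(expr)
            if rewritten is not None:
                break
        else:
            return expr, False  # no rule matched: fixed point reached
        if rewritten in seen:
            return rewritten, True
        seen.add(rewritten)
        expr = rewritten
    return expr, False
```

Calling `combine(("fneg", ("fdiv", "x", "y")))` reports a cycle after two rewrites, while an input that matches neither rule, such as `("fdiv", "x", "y")`, is returned unchanged. A real combiner has no such visited set, which is why a pair of mutually inverse combines shows up as a hang rather than an error.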

Mon, Jun 24, 11:38 AM · Restricted Project

Sat, Jun 22

nemanjai added a comment to D63676: Early exit from Hoist() in machine licm pass based on block hotness.

Added a couple more reviewers that either recently modified this file or I think may be interested in the proposed change.

Sat, Jun 22, 10:38 AM · Restricted Project
nemanjai added reviewers for D63676: Early exit from Hoist() in machine licm pass based on block hotness: chandlerc, craig.topper, eli.friedman.
Sat, Jun 22, 10:38 AM · Restricted Project
nemanjai updated the diff for D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst.

Remove the double cast. Simplify the test case. Rename the temp.

Sat, Jun 22, 10:32 AM · Restricted Project
nemanjai added inline comments to D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst.
Sat, Jun 22, 10:09 AM · Restricted Project

Jun 20 2019

nemanjai added inline comments to D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.
Jun 20 2019, 6:26 PM · Restricted Project
nemanjai created D63636: [PowerPC][Altivec] Fix offsets for vec_xl and vec_xst.
Jun 20 2019, 6:23 PM · Restricted Project
nemanjai created D63634: [PowerPC] Mark FCOPYSIGN legal for FP vectors.
Jun 20 2019, 5:13 PM · Restricted Project
nemanjai added inline comments to D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.
Jun 20 2019, 2:06 PM · Restricted Project
nemanjai created D63624: [PowerPC] Exploit single instruction load-and-splat for word and doubleword.
Jun 20 2019, 2:06 PM · Restricted Project

Jun 13 2019

nemanjai added a comment to D61228: [PowerPC] Set the innermost hot loop to align 32 bytes.

Also, please make sure that the summary and text of the commit message do not mention PGO, since it is no longer really considered.

Jun 13 2019, 11:15 AM · Restricted Project

Jun 6 2019

nemanjai created D62993: [PowerPC] Emit scalar min/max instructions with unsafe fp math.
Jun 6 2019, 8:39 PM · Restricted Project
nemanjai updated the diff for D61658: [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern.

Removed redundant pattern.
Added commuted tests, a v2i64 test and a negative (v4i1) test.

Jun 6 2019, 8:04 PM · Restricted Project
nemanjai added inline comments to D61658: [PowerPC] Emit XXSEL for vec_sel and code that has the same pattern.
Jun 6 2019, 8:01 PM · Restricted Project
nemanjai committed rGef4a3aa549ea: [PowerPC] Exploit the vector min/max instructions (authored by nemanjai).
[PowerPC] Exploit the vector min/max instructions
Jun 6 2019, 4:48 PM
nemanjai committed rL362759: [PowerPC] Exploit the vector min/max instructions.
[PowerPC] Exploit the vector min/max instructions
Jun 6 2019, 4:47 PM
nemanjai closed D47332: [PowerPC] Exploit the vector min/max instructions.
Jun 6 2019, 4:47 PM · Restricted Project

Jun 4 2019

nemanjai committed rG7c842fadf100: [PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible (authored by nemanjai).
[PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible
Jun 4 2019, 7:34 PM
nemanjai committed rL362576: [PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible.
[PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible
Jun 4 2019, 7:33 PM
nemanjai closed D60402: [PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible.
Jun 4 2019, 7:33 PM · Restricted Project
nemanjai committed rGcfb6c82172e7: [PowerPC][NFC] Add codegen test for consecutive stores of vector elements (authored by nemanjai).
[PowerPC][NFC] Add codegen test for consecutive stores of vector elements
Jun 4 2019, 7:09 PM
nemanjai committed rL362573: [PowerPC][NFC] Add codegen test for consecutive stores of vector elements.
[PowerPC][NFC] Add codegen test for consecutive stores of vector elements
Jun 4 2019, 7:08 PM
nemanjai closed D62843: [PowerPC][NFC] Add tests to show current codegen of consecutive stores of vector elements.
Jun 4 2019, 7:08 PM · Restricted Project
nemanjai added a comment to D62843: [PowerPC][NFC] Add tests to show current codegen of consecutive stores of vector elements.

I'll commit this for you.

Jun 4 2019, 7:02 PM · Restricted Project
nemanjai accepted D62843: [PowerPC][NFC] Add tests to show current codegen of consecutive stores of vector elements.

LGTM. I assume you're only posting this for review because you don't have commit access.

Jun 4 2019, 7:01 PM · Restricted Project
nemanjai added a comment to D59881: Initial support for vectorization using MASSV (IBM MASS vector library).

The LLVM portion of this patch committed in https://reviews.llvm.org/rL362568.

Jun 4 2019, 7:01 PM · Restricted Project
nemanjai committed rG6321c6806591: Initial support for vectorization using MASSV (IBM MASS vector library) (authored by nemanjai).
Initial support for vectorization using MASSV (IBM MASS vector library)
Jun 4 2019, 6:56 PM
nemanjai committed rL362571: Initial support for vectorization using MASSV (IBM MASS vector library).
Initial support for vectorization using MASSV (IBM MASS vector library)
Jun 4 2019, 6:55 PM
nemanjai closed D59881: Initial support for vectorization using MASSV (IBM MASS vector library).
Jun 4 2019, 6:55 PM · Restricted Project
nemanjai committed rGfe97754acff1: Initial support for IBM MASS vector library (authored by nemanjai).
Initial support for IBM MASS vector library
Jun 4 2019, 6:29 PM
nemanjai committed rL362568: Initial support for IBM MASS vector library.
Initial support for IBM MASS vector library
Jun 4 2019, 6:29 PM
nemanjai added a comment to D59881: Initial support for vectorization using MASSV (IBM MASS vector library).

I'll commit this for you soon, Jeeva.

Jun 4 2019, 5:02 PM · Restricted Project
nemanjai committed rGaed7227b7178: Revert r362472 as it is breaking PPC build bots (authored by nemanjai).
Revert r362472 as it is breaking PPC build bots
Jun 4 2019, 11:46 AM
nemanjai committed rL362539: Revert r362472 as it is breaking PPC build bots.
Revert r362472 as it is breaking PPC build bots
Jun 4 2019, 11:46 AM
nemanjai added a comment to rL362472: [DAGCombine] Match a pattern where a wide type scalar value is stored by….

This was identified as the culprit commit that broke the PPC big endian bot: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/27963

Jun 4 2019, 5:57 AM

Jun 3 2019

nemanjai committed rGbad43d8f49cc: [PowerPC] Look through copies for compare elimination (authored by nemanjai).
[PowerPC] Look through copies for compare elimination
Jun 3 2019, 12:08 PM
nemanjai committed rL362438: [PowerPC] Look through copies for compare elimination.
[PowerPC] Look through copies for compare elimination
Jun 3 2019, 12:06 PM
nemanjai closed D59633: [PowerPC] Look through copies for compare elimination.
Jun 3 2019, 12:06 PM · Restricted Project
nemanjai committed rG009d08f313c4: [PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines (authored by nemanjai).
[PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines
Jun 3 2019, 9:18 AM
nemanjai committed rL362412: [PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines.
[PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines
Jun 3 2019, 9:18 AM
nemanjai closed D62741: [PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines.
Jun 3 2019, 9:18 AM · Restricted Project
nemanjai accepted D40554: [PowerPC] Fix bugs in sign-/zero-extension elimination.

LGTM.

Jun 3 2019, 4:11 AM · Restricted Project

May 31 2019

nemanjai created D62741: [PowerPC] Set PROT_READ flag for MF_EXEC to prevent segfaults on PPC machines.
May 31 2019, 9:53 AM · Restricted Project

May 17 2019

nemanjai added a comment to D61228: [PowerPC] Set the innermost hot loop to align 32 bytes.

Okay, but we just call MBB->getParent()->getFunction().hasProfileData(). Where do we actually check that the loop is hot?

Also, even if we align the majority of loops, how much does that really cost us? The code-size impact could be minor compared to the perf improvement, and if so, we should just always do it. It is still true that most users don't use PGO.

May 17 2019, 5:57 AM · Restricted Project

May 16 2019

nemanjai accepted D61966: [PowerPC][NFC] Add a tests for Reordering CSR reloads in epilogue to follow the same order as CSR saves in the prologue.

Please name the test case csr-save-restore-order.ll rather than using the word reverse since the reversal is something you do once and the test case shows the order of saves/restores.

May 16 2019, 3:00 PM · Restricted Project
nemanjai added inline comments to D60506: [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators.
May 16 2019, 2:45 PM · Restricted Project
nemanjai accepted D59881: Initial support for vectorization using MASSV (IBM MASS vector library).

LGTM. Thanks for getting this done.

May 16 2019, 2:32 PM · Restricted Project
nemanjai added a comment to D61228: [PowerPC] Set the innermost hot loop to align 32 bytes.

Can you explain why we don't always do this? You're checking for profiling data but then not using it?

May 16 2019, 2:29 PM · Restricted Project

May 14 2019

nemanjai added a comment to D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.

It would be interesting to see how that result translates on a more typical x86 build machine. Either way, I suspect we'll get different opinions about whether a 0.3% time increase is minimal and whether that cost is worth paying for the runtime perf gains. This might be a case for differentiating between -O2 and -O3?

May 14 2019, 2:10 PM · Restricted Project

May 13 2019

nemanjai updated the diff for D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.

Remove an unrelated change that snuck in by accident.

May 13 2019, 6:40 PM · Restricted Project
nemanjai updated the diff for D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.

Update the new test case. Thanks @efriedma for the tip.

May 13 2019, 6:35 PM · Restricted Project
nemanjai committed rG1d662316cbff: [Pass Pipeline][NFC] Add a test prior to committing D61726 (authored by nemanjai).
[Pass Pipeline][NFC] Add a test prior to committing D61726
May 13 2019, 2:15 PM
nemanjai committed rL360620: [Pass Pipeline][NFC] Add a test prior to committing D61726.
[Pass Pipeline][NFC] Add a test prior to committing D61726
May 13 2019, 2:15 PM

May 10 2019

nemanjai added a comment to D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.
May 10 2019, 10:18 AM · Restricted Project
nemanjai committed rG34dc3aca407c: Pull r360426 as it is breaking the build bots. (authored by nemanjai).
Pull r360426 as it is breaking the build bots.
May 10 2019, 9:01 AM
nemanjai committed rL360437: Pull r360426 as it is breaking the build bots.
Pull r360426 as it is breaking the build bots.
May 10 2019, 9:01 AM
nemanjai committed rG7a41cd5b8884: Another attempt to fix the build bot breaks after r360426 (authored by nemanjai).
Another attempt to fix the build bot breaks after r360426
May 10 2019, 8:46 AM
nemanjai committed rL360434: Another attempt to fix the build bot breaks after r360426.
Another attempt to fix the build bot breaks after r360426
May 10 2019, 8:45 AM
nemanjai committed rG0f991c65f2c1: Fix build break after r360426 (authored by nemanjai).
Fix build break after r360426
May 10 2019, 8:10 AM
nemanjai committed rL360433: Fix build break after r360426.
Fix build break after r360426
May 10 2019, 8:09 AM
nemanjai updated the diff for D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.

Move the newly added test case and update it to show only the difference in behaviour (after committing it in r360426 to show the current behaviour).
Move the additional run of reassociation before the late LICM pass. I assumed that this is a good place for it in the pipeline since LICM might move things out of the loop and potentially take away some opportunities. This is based only on a weak hunch, and I am very much open to suggestions for a better place for this in the pipeline.

May 10 2019, 6:52 AM · Restricted Project
nemanjai committed rGcfc89896e018: [Pass Pipeline][NFC] Add a test prior to committing D61726 (authored by nemanjai).
[Pass Pipeline][NFC] Add a test prior to committing D61726
May 10 2019, 6:45 AM
nemanjai committed rL360426: [Pass Pipeline][NFC] Add a test prior to committing D61726.
[Pass Pipeline][NFC] Add a test prior to committing D61726
May 10 2019, 6:45 AM

May 9 2019

nemanjai added a comment to D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline.

Running reassociate after unroll probably makes sense. But I'd like to see compile-time numbers.

How carefully have you considered the exact placement? It looks like you're using different placement for the legacy vs. new pass manager. Do we want to run before the late LICM pass?

May 9 2019, 12:27 PM · Restricted Project
nemanjai added a reviewer for D61726: [Pass Pipeline] Run another round of reassociation after loop pipeline: tstellar.
May 9 2019, 5:40 AM · Restricted Project