jbhateja (Jatin Bhateja)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 23 2017, 9:51 AM (29 w, 6 d)

Recent Activity

Mon, Nov 13

jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Mon, Nov 13, 9:54 AM
jbhateja committed rL318050: [SCEV] Handling for ICmp occuring in the evolution chain..
[SCEV] Handling for ICmp occuring in the evolution chain.
Mon, Nov 13, 8:44 AM
jbhateja closed D38494: [SCEV] Handling for ICmp occuring in the evolution chain. by committing rL318050: [SCEV] Handling for ICmp occuring in the evolution chain..
Mon, Nov 13, 8:44 AM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Formatting change.
  • Rebasing for commit.
Mon, Nov 13, 8:33 AM
jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Mon, Nov 13, 8:13 AM

Sun, Nov 12

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

ping @reviewers

Sun, Nov 12, 6:47 AM

Fri, Nov 10

jbhateja committed rL317895: [WebAssembly] Fix stack offsets of return values from call lowering..
[WebAssembly] Fix stack offsets of return values from call lowering.
Fri, Nov 10, 8:26 AM
jbhateja closed D39866: [WebAssembly] Fix stack offsets of return values from call lowering. by committing rL317895: [WebAssembly] Fix stack offsets of return values from call lowering..
Fri, Nov 10, 8:26 AM
jbhateja added a reviewer for D39866: [WebAssembly] Fix stack offsets of return values from call lowering.: alexcrichton.
Fri, Nov 10, 5:18 AM

Thu, Nov 9

jbhateja added a reviewer for D39866: [WebAssembly] Fix stack offsets of return values from call lowering.: vadimcn.
Thu, Nov 9, 1:19 PM
jbhateja created D39866: [WebAssembly] Fix stack offsets of return values from call lowering..
Thu, Nov 9, 1:17 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Thu, Nov 9, 12:56 PM

Wed, Nov 8

jbhateja accepted D39529: [WebAssembly] Call signExtend to get sign extended register [NFCI].
Wed, Nov 8, 11:00 AM
jbhateja added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@RKSimon, @lsaba , @jmolly , all your comments have been addressed. Kindly verify so that I can land this into trunk.

Wed, Nov 8, 10:57 AM
jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Wed, Nov 8, 10:53 AM

Sun, Nov 5

jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Sun, Nov 5, 8:28 PM

Sat, Nov 4

jbhateja updated the diff for D39529: [WebAssembly] Call signExtend to get sign extended register [NFCI].
  • Adding a test for signext of function arguments.
Sat, Nov 4, 11:32 AM

Fri, Nov 3

jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

1/ Making the factorization alog iterative. This was earlier commited with

Diff : https://reviews.llvm.org/D35014?id=116144
but some how got removed in successive commits.

2/ Rebasing again. All comments are resolved.

Fri, Nov 3, 3:11 AM

Wed, Nov 1

jbhateja added a reviewer for D39529: [WebAssembly] Call signExtend to get sign extended register [NFCI]: sunfish.
Wed, Nov 1, 11:39 PM
jbhateja created D39529: [WebAssembly] Call signExtend to get sign extended register [NFCI].
Wed, Nov 1, 11:36 PM
jbhateja added a comment to D39252: [WebAssembly] Handle errors in getRegFor{Uns,S}ignedValue..

ConstantExpr are not handled by FastISel.

Wed, Nov 1, 6:16 AM

Tue, Oct 31

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

The problem is, if under ICmp's operand node has a ICmp can evolutable? (or any other conditional expressions).
This only can be solved by registering this chain into getSCEV.

; RUN: opt -analyze -scalar-evolution < %s | FileCheck %s --check-prefix=CHECK-ANALYSIS

define i32 @foo() {
  br label %do_start

do_start:
  %inc = phi i32 [ 0, %0 ], [ %add, %do_start ]
  %cmp = icmp slt i32 %inc, 10000
  %add_cond = add nsw i32 %inc, 2
  %sel = select i1 %cmp, i32 %add_cond, i32 %inc
  %add = add nsw i32 %sel, 1
  br i1 %cmp, label %do_start, label %do_end

; CHECK-ANALYSIS: Loop %do_start: backedge-taken count is 3334
; CHECK-ANALYSIS: Loop %do_start: max backedge-taken count is 3334
; CHECK-ANALYSIS: Loop %do_start: Predicated backedge-taken count is 3334

do_end:
  ret i32 0
}

This case is "actaully" generated from Clang executable using -O3 -S -emit-llvm option.
that mean this case can be "always" appear.

To fix this case, we need to create SCEVConditional. that I commented here.

Okay, just we need to make sure we are evoluting conditional instructions or just checking that instructions are true or false for now.

first one.
If we are "actaully" evoluting a conditional instructions. better creating SCEVConditonal, checking that is true or false when is computing backedge taken count on each AddRec step.
I think this is a correct way for handling conditional instructions.
but, this will cause SCEV jobs MUUUUCHHH SLOWER.

second one.
If we are just checking that instructions are true or false. we do not need to rewrite it, rewriting value means that we handle that value as that "actual" SCEV. (register SCEV on SCEVCallbackVH)
this mean, this is same to write this chain on to createSCEV that we did at first.
so we have to handle it as a "except" of SCEV chain. better inlining code into createAddRecFromPHI.

and just note this issue CANNOT be solved just by checking ICmp has conditional instruction.

Just try pulling out earlier diff : https://reviews.llvm.org/D38494?id=120193 , this was fixing your new case.
You shuld attempt an incremental fix once this patch goes through.

That isn't a POINT, Please read the point that I commented.
Only that handling each test case dosn't means fixing a "point" of problem.

Is the point of this problem is handling the problem just by CHECKING that ICmp is true, or false?
OR are we REALLY wants to handle condtional instructions.
The point of problem is, current SCEV chain cannot handle conditonal expressions.
NOT only a clang, clang emits PHI node when its conditional, LLVM is a toolchain that we can create a compiler.

so I wanted to listen the opinion that "how about handling condtional variables by just creating SCEVCondtional, and comparing at computing backedge-count by visiting each step."

Tue, Oct 31, 1:59 AM

Mon, Oct 30

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

The problem is, if under ICmp's operand node has a ICmp can evolutable? (or any other conditional expressions).
This only can be solved by registering this chain into getSCEV.

; RUN: opt -analyze -scalar-evolution < %s | FileCheck %s --check-prefix=CHECK-ANALYSIS

define i32 @foo() {
  br label %do_start

do_start:
  %inc = phi i32 [ 0, %0 ], [ %add, %do_start ]
  %cmp = icmp slt i32 %inc, 10000
  %add_cond = add nsw i32 %inc, 2
  %sel = select i1 %cmp, i32 %add_cond, i32 %inc
  %add = add nsw i32 %sel, 1
  br i1 %cmp, label %do_start, label %do_end

; CHECK-ANALYSIS: Loop %do_start: backedge-taken count is 3334
; CHECK-ANALYSIS: Loop %do_start: max backedge-taken count is 3334
; CHECK-ANALYSIS: Loop %do_start: Predicated backedge-taken count is 3334

do_end:
  ret i32 0
}

This case is "actaully" generated from Clang executable using -O3 -S -emit-llvm option.
that mean this case can be "always" appear.

To fix this case, we need to create SCEVConditional. that I commented here.

Okay, just we need to make sure we are evoluting conditional instructions or just checking that instructions are true or false for now.

first one.
If we are "actaully" evoluting a conditional instructions. better creating SCEVConditonal, checking that is true or false when is computing backedge taken count on each AddRec step.
I think this is a correct way for handling conditional instructions.
but, this will cause SCEV jobs MUUUUCHHH SLOWER.

second one.
If we are just checking that instructions are true or false. we do not need to rewrite it, rewriting value means that we handle that value as that "actual" SCEV. (register SCEV on SCEVCallbackVH)
this mean, this is same to write this chain on to createSCEV that we did at first.
so we have to handle it as a "except" of SCEV chain. better inlining code into createAddRecFromPHI.

and just note this issue CANNOT be solved just by checking ICmp has conditional instruction.

Mon, Oct 30, 10:42 PM
jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Mon, Oct 30, 10:32 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Review comments resolution.
Mon, Oct 30, 10:32 PM
jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

:( that I commented didn't reviewed. this is why I wanted to make other reversion.

Are you saying that you had made the same comments as I did above? If so, that wasn't clear from what you wrote. You can help authors by:

  • Adding inline comments instead of a top level comment, knowing which part of the patch you're referring to helps comprehension
  • Being specific; instead of saying "this is very general" say "use cast instead of switch"

No, I wanted to say all that we are missing a very important point.

Okay, just we need to make sure we are evoluting conditional instructions or just checking that instructions are true or false for now.

first one.
If we are "actaully" evoluting a conditional instructions. better creating SCEVConditonal, checking that is true or false when is computing backedge taken count on each AddRec step.
I think this is a correct way for handling conditional instructions.
but, this will cause SCEV jobs MUUUUCHHH SLOWER.

second one.
If we are just checking that instructions are true or false. we do not need to rewrite it, rewriting value means that we handle that value as that "actual" SCEV. (register SCEV on SCEVCallbackVH)
this mean, this is same to write this chain on to createSCEV that we did at first.
so we have to handle it as a "except" of SCEV chain. better inlining code into createAddRecFromPHI.

This is the actual point. not only fixing a ICmp, this will not fix a "point" that bug reported.
and this will cause same bug, but from littlely different code.
It just fixing only that "specific" IR assembles in a very narrow range.

Mon, Oct 30, 7:47 PM
jbhateja added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@RKSimon, requested revision changes have been made as per your comments. Can you please validate.

Mon, Oct 30, 7:27 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Review comments resolution.
Mon, Oct 30, 1:09 PM

Sun, Oct 29

jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
  • Rebasing
  • Review comments resolution.
Sun, Oct 29, 10:12 AM
jbhateja added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Sun, Oct 29, 9:55 AM
jbhateja retitled D35014: [X86] Improvement in CodeGen instruction selection for LEAs. from [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. to [X86] Improvement in CodeGen instruction selection for LEAs..
Sun, Oct 29, 8:36 AM

Sat, Oct 28

jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Review comment resolutions.
Sat, Oct 28, 4:19 AM

Tue, Oct 24

jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Review comments resolutions.
Tue, Oct 24, 11:54 PM
jbhateja retitled D38494: [SCEV] Handling for ICmp occuring in the evolution chain. from [ScalarEvolution] Handling for ICmp occuring in the evolution chain. to [SCEV] Handling for ICmp occuring in the evolution chain..
Tue, Oct 24, 1:59 AM
jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

@reviewers, kindly let me know if any other comments.

I hope just understand that I can be rude a bit, please understand I'm not really good at english. ( I'm just a korean highschool student. )

Tue, Oct 24, 1:44 AM

Mon, Oct 23

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

@reviewerskindly let me know if any other comments

Mon, Oct 23, 7:57 AM
jbhateja added a comment to D37660: [ScalarEvolution] Handling Conditional Instruction in SCEV chain..

Few general concerns.

1/  Always submit patch with few test cases (valid for any patch).
2/  Patch https://reviews.llvm.org/D38494 is taking care of this problem. I don't see what is your point in duplicating the efforts.

You are on the review list also.

Mon, Oct 23, 3:40 AM

Sat, Oct 21

jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Sat, Oct 21, 10:37 AM

Oct 18 2017

jbhateja reopened D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

[llvm] r316129 - Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain."

Oct 18 2017, 7:49 PM
jbhateja added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

Patch has been regressed through chrome test sweet.
No issues reported. Thanks to Hans Wennborg (hans@chromium.org) for validating it.

Oct 18 2017, 11:42 AM

Oct 17 2017

jbhateja added a comment to D37660: [ScalarEvolution] Handling Conditional Instruction in SCEV chain..
Oct 17 2017, 7:15 PM
jbhateja committed rL316054: [ScalarEvolution] Handling for ICmp occuring in the evolution chain..
[ScalarEvolution] Handling for ICmp occuring in the evolution chain.
Oct 17 2017, 6:36 PM
jbhateja closed D38494: [SCEV] Handling for ICmp occuring in the evolution chain. by committing rL316054: [ScalarEvolution] Handling for ICmp occuring in the evolution chain..
Oct 17 2017, 6:36 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

If a compare instruction is same or inverse of the compare in the
branch of the loop latch, then return a constant evolution node.
Currently scope of evaluation is limited to SCEV computation for
PHI nodes.

Oct 17 2017, 6:29 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

Formatting changes.
Test case extension from .txt to .ll

Oct 17 2017, 9:01 AM
jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

Go to diff history, and download diff. and re upload it.

Oct 17 2017, 6:17 AM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

Removing unrelated changes uploaded by mistake in previous diff.

Oct 17 2017, 6:15 AM
jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

I don't understand why is there are CG commits here....

Oct 17 2017, 5:59 AM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Updating for a review comment.
Oct 17 2017, 5:21 AM
jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Oct 17 2017, 3:10 AM
jbhateja added a comment to D37660: [ScalarEvolution] Handling Conditional Instruction in SCEV chain..

Hi Jun Ryoung Ju,

Oct 17 2017, 3:03 AM

Oct 16 2017

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

With this IR:

define void @f() {
entry:
  br label %loop

loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
  %iv.inc = add i32 %iv, 1
  %cmp = icmp ne i32 %iv.inc, 10
  %cmp.zext = zext i1 %cmp to i32
  br i1 %cmp, label %loop, label %leave

leave:
  ret void
}

and opt -analyze -scalar-evolution , on master:

Printing analysis 'Scalar Evolution Analysis' for function 'f':
Classifying expressions for: @f
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
  -->  {0,+,1}<%loop> U: [0,10) S: [0,10)               Exits: 9                LoopDispositions: { %loop: Computable }
  %iv.inc = add i32 %iv, 1
  -->  {1,+,1}<%loop> U: [1,11) S: [1,11)               Exits: 10               LoopDispositions: { %loop: Computable }
  %cmp.zext = zext i1 %cmp to i32
  -->  (zext i1 %cmp to i32) U: [0,2) S: [0,2)          Exits: 0                LoopDispositions: { %loop: Variant }
Determining loop execution counts for: @f
Loop %loop: backedge-taken count is 9
Loop %loop: max backedge-taken count is 9
Loop %loop: Predicated backedge-taken count is 9
 Predicates:

Loop %loop: Trip multiple is 10

and with your patch:

Printing analysis 'Scalar Evolution Analysis' for function 'f':
Classifying expressions for: @f
  %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
  -->  {0,+,1}<%loop> U: [0,10) S: [0,10)               Exits: 9                LoopDispositions: { %loop: Computable }
  %iv.inc = add i32 %iv, 1
  -->  {1,+,1}<%loop> U: [1,11) S: [1,11)               Exits: 10               LoopDispositions: { %loop: Computable }
  %cmp.zext = zext i1 %cmp to i32
  -->  1 U: [1,2) S: [1,2)              Exits: 1                LoopDispositions: { %loop: Invariant }
Determining loop execution counts for: @f
Loop %loop: backedge-taken count is 9
Loop %loop: max backedge-taken count is 9
Loop %loop: Predicated backedge-taken count is 9
 Predicates:

Loop %loop: Trip multiple is 10

Note that the SCEV for %cmp.zext is different. I'm not sure how you got a
different answer on your end, since this should basically be the same as the IR
from Hal's example. Perhaps in the example you tried the icmp feeding in to the
latch was a different llvm::Instruction than the icmp feeding into the zext? If
so, that's a missing-CSE bug.

Oct 16 2017, 7:33 PM
jbhateja updated the diff for D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
  • Limiting the evaluation of ICmp to computation of SCEV of PHI nodes.
Oct 16 2017, 7:28 PM

Oct 14 2017

jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
  • Operands of LEA must be of same register class, this constraint is as per Intel's architecture manual.
  • Remove map entry from LEAs map if value list becomes empty.
  • Rebase.
Oct 14 2017, 1:17 AM
jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
  • D35014 : Review comments resolution
  • Removing 2 tests, pulled their latest renamed versions from trunk.
  • [X86] : Factorize LEA, handling for patterns involing SUBREG_TO_REG as LEA operands.
  • Few more changes for LEA factorization.
  • Updating test lea-opt-cse3.ll
  • Formatting changes.
  • Formatting changes
  • Changes to avoid creating costly LEAs in loops, strength reduction for simple LEAs with unit scale
  • Updating test.
  • [X86] Limiting the scope of DAG operands folding while AM based instruction selection to LEAs.
  • Merge from trunk.
  • Extending aggressive AM based folding for LEAs to cover more cases.
  • Updating test post rebase.
  • Formatting changes + fine tuning pattern matching condition.
  • Adding a check for subtarget feature Slow3OpLEA in pattern matching.
  • Few synthetic changes.
  • Undefining result operand of factored statement to preserve SSA nature of Machine IR.
  • Merge branch 'master' of https://github.com/llvm-mirror/llvm
  • Merge branch 'master' of https://github.com/llvm-mirror/llvm
  • Updating tests for reported PRs for initial patch.
  • Merge branch 'master' of https://github.com/llvm-mirror/llvm
  • Pull from trunk.
  • Operands of LEAs must be of same register class.
  • Revert "Operands of LEAs must be of same register class."
Oct 14 2017, 1:06 AM

Oct 13 2017

jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
  • Operands of factored LEA must belong to same register class as per Intel's Architecture Manual.
  • Some code reorganization + rebase.
Oct 13 2017, 11:37 AM

Oct 12 2017

jbhateja added inline comments to D37660: [ScalarEvolution] Handling Conditional Instruction in SCEV chain..
Oct 12 2017, 8:11 AM

Oct 4 2017

jbhateja added inline comments to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Oct 4 2017, 2:26 AM
jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

can you explain me why? (I may I didn't understood why do we need to process without loop latch)
I thought without latch cannot be evoluted from ICmp evolution.

without latch loop have to me evoluted from previous SCEV operation. isn't it?

I don't understand your statement.

The point is that, in the last loop iteration, latch condition is not true, so replacing it with true is incorrect. If you have:

int values[11];
int i = 0;
do {

values[i] = i != 10;

} while (i++ != 10);

After the loop, values should be { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 }. In the 11th iteration, the comparison will be false. With this patch, we'd store 1 into the values array, even for the last iteration.

Oct 4 2017, 2:21 AM
jbhateja committed rL314886: [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post….
[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post…
Oct 4 2017, 2:04 AM

Oct 3 2017

jbhateja added a comment to D38494: [SCEV] Handling for ICmp occuring in the evolution chain..

For following C and IR references.

Oct 3 2017, 8:32 PM
jbhateja retitled D38494: [SCEV] Handling for ICmp occuring in the evolution chain. from [ScalarEvolution] Handling for ICmp occuring in then evolution chain. to [ScalarEvolution] Handling for ICmp occuring in the evolution chain..
Oct 3 2017, 1:53 AM
jbhateja accepted D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@reviewers, if no more comment I shall be landing this into trunk since required revision changes post acceptance are through.

Oct 3 2017, 1:46 AM
jbhateja edited reviewers for D38494: [SCEV] Handling for ICmp occuring in the evolution chain., added: sanjoy, hfinkel, junryoungju; removed: llvm-commits.
Oct 3 2017, 12:40 AM
jbhateja created D38494: [SCEV] Handling for ICmp occuring in the evolution chain..
Oct 3 2017, 12:36 AM

Sep 29 2017

jbhateja added a comment to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..

@reviewers, please let me know if there are any more comments on this patch.

Sep 29 2017, 11:30 AM
jbhateja added a comment to D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..

I'm not sure about this approach; I don't think most of this needs to be done in the X86 backend at all, and much of it shouldn't even be done in the DAG. Most of the code appears to be better handled in a mixture of SimplifyDemandedVectorElts and SLPVectorizer.

PR33758 was about improving codegen for horizontal reductions, so we'd probably be better off having the backend optimize for @llvm.experimental.vector.reduce.add.* (or the legalized patterns it produces), and then getting the vectorizers to create these properly.

Sep 29 2017, 7:47 AM
jbhateja added reviewers for D36454: [X86] Changes to extract Horizontal addition operation for AVX-512.: zvi, spatel.
Sep 29 2017, 2:08 AM

Sep 28 2017

jbhateja added inline comments to D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..
Sep 28 2017, 8:22 AM
jbhateja updated the diff for D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..
  • Changes to cover more patterns for [f]hadd/[f]sub for AVX512 vector types.
  • Generic routines added which looks at uses / undef operands of an operation for scaling it down.
Sep 28 2017, 8:15 AM
jbhateja committed rL314385: [X86] Adding more cases to horizontal [f]add/[f]sub for avx512..
[X86] Adding more cases to horizontal [f]add/[f]sub for avx512.
Sep 28 2017, 12:42 AM
jbhateja closed D38344: [X86] Adding more cases to horizontal [f]add/[f]sub for avx512. by committing rL314385: [X86] Adding more cases to horizontal [f]add/[f]sub for avx512..
Sep 28 2017, 12:42 AM
jbhateja accepted D38344: [X86] Adding more cases to horizontal [f]add/[f]sub for avx512..
Sep 28 2017, 12:36 AM
jbhateja created D38344: [X86] Adding more cases to horizontal [f]add/[f]sub for avx512..
Sep 28 2017, 12:35 AM

Sep 27 2017

jbhateja accepted D37880: Fix an out-of-bounds shufflevector index bug.

LGTM with a minor comment, please run clang-format. over the changes for better inndentation.

Sep 27 2017, 10:21 PM
jbhateja added a comment to D34596: [X86]: Adding a new priority function 'guided-src' for Scheduler DAG instruction scheduling..

Sorry about this review stalling. To give at least some feedback:

  • We generally want to phase out the SelectionDAG schedulers and only have them do a minimal amount of work. The actual scheduling should happen in the MachineScheduler. So ideally we would not add new features here. (That said I can see why you are doing it here: the MachineScheduler currently does not reorder flags producers/users)
  • I believe none of the currently active LLVM developers is really familiar with the SelectionDAG schedulers.

    I will try to find time for an actual review soon (otherwise please keep pinging).
Sep 27 2017, 10:14 PM
jbhateja added a comment to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..

@spatel , @reviewiews , can this land now into trunk ?

I haven't actually looked at the builder changes since you revised them, so I defer to @hfinkel is he's already approved that part.

I don't think that I approved anything yet. I can take a holistic look at the patch later today.

Sep 27 2017, 10:53 AM
jbhateja added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@jmolloy , @RKSimon , this patch has been reviewd and due to regression was opened again for review, required changes have
been made, can this land now in trunk if there are no more observations from any reviewers.

Sep 27 2017, 4:38 AM

Sep 26 2017

jbhateja added inline comments to D37880: Fix an out-of-bounds shufflevector index bug.
Sep 26 2017, 3:30 PM
jbhateja updated the diff for D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..
Sep 26 2017, 3:29 PM
jbhateja added a comment to D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..

I don't think we should shrinking operations based on undef inputs. I think we should be shrinking them based on what elements are consumed by their users. There's no reason the shuffles in these reductions have to have undef elements. For integers we may rewrite the shuffle mask to undef in InstCombine if the elements aren't used down stream. But we don't do that for FP.

As an example, why shouldn't we be able use a horizontal add for this

define <4 x double> @fadd_noundef(<8 x double> %x225, <8 x double> %x227) {

  %x226 = shufflevector <8 x double> %x225, <8 x double> %x227, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>
  %x228 = shufflevector <8 x double> %x225, <8 x double> %x227, <8 x i32> <i32 1, i32 9, i32 3, i32 11, i32 5 ,i32 13, i32 7, i32 15>
  %x229 = fadd <8 x double> %x226, %x228
  %x230 = shufflevector <8 x double> %x229, <8 x double> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  ret <4 x double> %x230
}
Sep 26 2017, 3:29 PM
jbhateja added reviewers for D37880: Fix an out-of-bounds shufflevector index bug: zvi, aymanmus.
Sep 26 2017, 3:29 PM

Sep 25 2017

jbhateja added a comment to D37880: Fix an out-of-bounds shufflevector index bug.

LGTM

Sep 25 2017, 8:57 AM
jbhateja added a comment to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..

@spatel , @reviewiews , can this land now into trunk ?

Sep 25 2017, 5:02 AM

Sep 21 2017

jbhateja updated the diff for D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..
  • Updating test case with more than one uses of sqrt / mul.
Sep 21 2017, 10:58 PM
jbhateja committed rL313964: [X86] Updating the test case for FMF propagation..
[X86] Updating the test case for FMF propagation.
Sep 21 2017, 10:50 PM
jbhateja closed D38163: [X86] Updating the test case for FMF propagation. by committing rL313964: [X86] Updating the test case for FMF propagation..
Sep 21 2017, 10:50 PM
jbhateja accepted D38163: [X86] Updating the test case for FMF propagation..
Sep 21 2017, 10:45 PM
jbhateja created D38163: [X86] Updating the test case for FMF propagation..
Sep 21 2017, 10:45 PM
jbhateja added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@reviewers, required revision change are through, let me know if this can land back.

Sep 21 2017, 12:21 PM
jbhateja added a comment to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..

I've added some more FMF tests at rL313893 which I think this patch will miscompile. Please rebase/update.

As I suggested before, this patch shouldn't try to enable multiple DAG combines with node-level FMF. It's not as straightforward as you might think.

Pick exactly one combine if you want to show that this patch is working as intended. The llvm.muladd intrinsic test that you have here with a target that supports 'fma' (not plain x86) seems like a good choice to me. If we have a strict op in IR, it should produce an fma instruction. If we have a fast op in IR, it should produce the simpler fmul instruction?

My understanding and code changes are based LLVM Ref Manual 's section about Fast-Math flags" (http://llvm.org/docs/LangRef.html#fast-math-flags)

Which say for FMF flag NaN "Allow optimizations to assume the arguments and result are not NaN".

Now in following case which has been added by you

%y = call float @llvm.sqrt.f32(float %x)
%z = fdiv fast float 1.0, %y
ret float %z

We dont have fast flag over intrinsic but DAGCombining for fdiv sees a fast flag and assume result (%z) and arguments (constant , %y) as not a Nan and goes ahead and generates a reciprocal sqrt. If you remove fast from fdiv and add it to intrinsic then FMF opt at fdiv will not kick in.

Can you please let me know what you expected here.

I expect that the sqrt result is strict. Ie, it should use sqrtss if this is x86-64. We're not allowed to use rsqrtss and lose precision on that op.

That said, my memory of exactly how op-level FMF should work is fuzzy. If anyone else remembers or can link to threads where we've discussed this, please feel free to jump in. :)

Sep 21 2017, 12:16 PM
jbhateja added a comment to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..

I've added some more FMF tests at rL313893 which I think this patch will miscompile. Please rebase/update.

As I suggested before, this patch shouldn't try to enable multiple DAG combines with node-level FMF. It's not as straightforward as you might think.

Pick exactly one combine if you want to show that this patch is working as intended. The llvm.muladd intrinsic test that you have here with a target that supports 'fma' (not plain x86) seems like a good choice to me. If we have a strict op in IR, it should produce an fma instruction. If we have a fast op in IR, it should produce the simpler fmul instruction?

Sep 21 2017, 11:53 AM
jbhateja added inline comments to D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..
Sep 21 2017, 3:10 AM
jbhateja updated the diff for D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management..
  • Review comments resolutions.
Sep 21 2017, 3:07 AM
jbhateja committed rL313869: [X86] Adding a testpoint for fast-math flags propagation..
[X86] Adding a testpoint for fast-math flags propagation.
Sep 21 2017, 2:55 AM
jbhateja closed D38127: [X86] Adding a testpoint for fast-math flags propagation. by committing rL313869: [X86] Adding a testpoint for fast-math flags propagation..
Sep 21 2017, 2:55 AM
jbhateja accepted D38127: [X86] Adding a testpoint for fast-math flags propagation..
Sep 21 2017, 2:50 AM
jbhateja created D38127: [X86] Adding a testpoint for fast-math flags propagation..
Sep 21 2017, 2:49 AM
jbhateja updated the diff for D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Sep 21 2017, 12:15 AM

Sep 19 2017

jbhateja added inline comments to D36454: [X86] Changes to extract Horizontal addition operation for AVX-512..
Sep 19 2017, 4:15 AM