lsaba (Lama)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 3 2016, 5:39 AM (94 w, 3 d)

Recent Activity

Mon, May 21

lsaba committed rL332849: [X86] - Avoid SFB pass - fix bug in updating the offsets for newly created….
Mon, May 21, 9:27 AM

Apr 26 2018

lsaba committed rL330939: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.
Apr 26 2018, 6:19 AM
lsaba closed D45823: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.
Apr 26 2018, 6:19 AM

Apr 25 2018

lsaba updated the diff for D45823: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.

done @craig.topper

Apr 25 2018, 4:08 AM

Apr 24 2018

lsaba updated the diff for D45823: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.

fix @craig.topper comment

Apr 24 2018, 1:26 AM

Apr 22 2018

lsaba added a comment to D45823: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.

ping

Apr 22 2018, 3:39 AM

Apr 19 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

This seems to break the Machine Verifier. I filed PR37153. Do you mind taking a look please? Thanks!

Apr 19 2018, 9:07 AM
lsaba created D45823: [X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153.
Apr 19 2018, 9:06 AM

Apr 18 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

This seems to break the Machine Verifier. I filed PR37153. Do you mind taking a look please? Thanks!

Apr 18 2018, 12:57 AM

Apr 6 2018

lsaba abandoned D43619: [X86] Limit Store Forwarding Block only to cases where we can prove that the memcpy does not overlap.

@zvi the changes in this commit were added to https://reviews.llvm.org/D41330 and committed already. I will close this patch

Apr 6 2018, 1:31 AM

Apr 2 2018

lsaba committed rL328973: [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346.
Apr 2 2018, 6:53 AM
lsaba closed D41330: [X86] Reduce Store Forward Block issues in HW.
Apr 2 2018, 6:53 AM

Mar 30 2018

lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Mar 30 2018, 12:30 AM

Mar 29 2018

lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

Updated with @craig.topper comments

Mar 29 2018, 7:11 AM
lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Mar 29 2018, 7:10 AM

Mar 27 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

The changes from just D43619 look correct but I haven't looked deeply at this part of the patch. From the comments it looks like the patch is a bit out of date as std::map should have been converted back to switch statements.

Mar 27 2018, 11:55 PM
lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

Ping @reviewers: if there are no more comments, I'd like to re-submit this.

Mar 27 2018, 4:18 AM

Mar 22 2018

lsaba added a comment to D43619: [X86] Limit Store Forwarding Block only to cases where we can prove that the memcpy does not overlap.

Fixed comments in review https://reviews.llvm.org/D41330

Mar 22 2018, 6:44 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

fix comments from @niravd and @craig.topper in review https://reviews.llvm.org/D43619#inline-381377 (closing that one)

Mar 22 2018, 6:43 AM

Mar 19 2018

lsaba reopened D41330: [X86] Reduce Store Forward Block issues in HW.
Mar 19 2018, 8:30 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

Fixed the reported bug (added a case to bug 36346, based on code provided by Richard Smith)
and limited the transformation to 64-bit only.

Mar 19 2018, 8:23 AM

Feb 22 2018

lsaba added a comment to D43619: [X86] Limit Store Forwarding Block only to cases where we can prove that the memcpy does not overlap.

@chandlerc there's a good chance this solves the miscompilations you are seeing

Feb 22 2018, 6:51 AM
lsaba created D43619: [X86] Limit Store Forwarding Block only to cases where we can prove that the memcpy does not overlap.
Feb 22 2018, 6:51 AM

Feb 20 2018

lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Feb 20 2018, 9:23 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

@chandlerc Thanks for your comments.
still checking the option of using tablegen for some of the tables.
Any luck with the reproducer?

Feb 20 2018, 9:23 AM

Feb 14 2018

lsaba committed rL325128: [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346.
Feb 14 2018, 7:03 AM

Feb 11 2018

lsaba committed rL324836: fix test/CodeGen/X86/fixup-sfb.ll test failure after commit https://reviews..
Feb 11 2018, 2:35 AM
lsaba closed D41330: [X86] Reduce Store Forward Block issues in HW.

committed in https://reviews.llvm.org/rL324835

Feb 11 2018, 1:38 AM
lsaba committed rL324835: [X86] Reduce Store Forward Block issues in HW.
Feb 11 2018, 1:36 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

fixed TRI @craig.topper

Feb 11 2018, 12:26 AM

Feb 8 2018

lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Feb 8 2018, 12:37 PM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

addressed @craig.topper 's comments

Feb 8 2018, 12:37 PM
lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

ping to reviewers

Feb 8 2018, 12:04 AM

Feb 6 2018

lsaba added a reviewer for D41330: [X86] Reduce Store Forward Block issues in HW: oren_ben_simhon.
Feb 6 2018, 6:13 AM

Feb 4 2018

lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

addressed @craig.topper comments

Feb 4 2018, 1:31 AM

Jan 31 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.
Jan 31 2018, 9:25 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

addressed comments from @RKSimon and @craig.topper

Jan 31 2018, 9:21 AM

Jan 25 2018

lsaba accepted D42522: [X86] Fix killed flag handling in X86FixupLea pass.
Jan 25 2018, 1:11 AM

Jan 16 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

Pinging again.

Jan 16 2018, 1:17 AM

Jan 8 2018

lsaba added a comment to D41330: [X86] Reduce Store Forward Block issues in HW.

Ping.

Jan 8 2018, 12:32 AM

Jan 3 2018

lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

addressed @hfinkel comments.
ping to reviewers.

Jan 3 2018, 12:04 AM

Dec 26 2017

lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

add MMO info

Dec 26 2017, 4:00 AM

Dec 20 2017

lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Dec 20 2017, 7:08 AM

Dec 19 2017

lsaba added inline comments to D41330: [X86] Reduce Store Forward Block issues in HW.
Dec 19 2017, 5:40 AM
lsaba updated the diff for D41330: [X86] Reduce Store Forward Block issues in HW.

Addressed Craig's comments

Dec 19 2017, 5:40 AM

Dec 17 2017

lsaba added a reviewer for D41330: [X86] Reduce Store Forward Block issues in HW: zansari.
Dec 17 2017, 3:26 AM
lsaba created D41330: [X86] Reduce Store Forward Block issues in HW.
Dec 17 2017, 3:25 AM

Dec 5 2017

lsaba accepted D39720: [X86][AVX512] lowering kunpack intrinsic - llvm part.
Dec 5 2017, 7:20 AM
lsaba accepted D39719: [X86][AVX512] lowering kunpack intrinsic - clang part.
Dec 5 2017, 5:17 AM

Nov 26 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@RKSimon No more comments from my side

Nov 26 2017, 12:02 AM

Nov 21 2017

lsaba accepted D39421: [InstCombine] Extracting common and-mask for shift operands of Or instruction.

LGTM after fixing the minor comments

Nov 21 2017, 3:28 AM
lsaba added inline comments to D39421: [InstCombine] Extracting common and-mask for shift operands of Or instruction.
Nov 21 2017, 3:28 AM

Nov 7 2017

lsaba accepted D38671: [X86][AVX512] lowering shuffle i/f intrinsic - llvm part.
Nov 7 2017, 8:16 AM
lsaba accepted D38672: [X86][AVX512] lowering shuffle f/i intrinsic - clang part.
Nov 7 2017, 8:13 AM
lsaba added inline comments to D38671: [X86][AVX512] lowering shuffle i/f intrinsic - llvm part.
Nov 7 2017, 7:41 AM
lsaba added inline comments to D38672: [X86][AVX512] lowering shuffle f/i intrinsic - clang part.
Nov 7 2017, 7:38 AM

Sep 13 2017

lsaba accepted D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Sep 13 2017, 7:13 AM

Sep 6 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

3-operand LEAs are costly starting with the SandyBridge target. Is there a limitation in the code on the targets this transformation applies to? If not, I think there should be.
You can check the Slow3OpsLEA feature for the full list of targets.

Sep 6 2017, 12:13 AM

Sep 4 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@lamas, @reviewers, the comments have been taken care of. Let me know if there is anything else.

Sep 4 2017, 8:19 AM
lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@lamas, @reviewers, the comments have been taken care of. Let me know if there is anything else.

Sep 4 2017, 12:24 AM

Aug 31 2017

lsaba added a comment to D37289: [X86] Speculatively load operands of select instruction.

I didn't look at the implementation, but why is it safe to speculate loads in these tests? I can create an example where one of the pointers in the select is unmapped, so speculating that load will crash in the general case.

The implementation handles a specific case where both operands of the select are GEPs into elements of the same struct. Correct me if I'm wrong, but this should be safe.

I don't see how being elements of one struct changes anything. Just because one pointer is dereferenceable does not make the other dereferenceable? You would need 'dereferenceable' metadata or some other means to know that either load is safe to hoist ahead of the select.

You're proposing this transform as an x86-specific pass, so maybe I'm missing something. Is there some feature of x86 that makes speculating the load safe? I'm guessing no because I tested the example I was thinking of on x86, and this transform crashes there.

This is the transformation I am interested in doing:

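struct S {
  int a;
  int b;
};

from:

int foo(struct S *s, int x) {
  int c;
  if (x)
    c = s->a;  /* only the selected field is loaded */
  else
    c = s->b;
  return c;
}

to:

int foo(struct S *s, int x) {
  int c1 = s->a;  /* both fields are loaded unconditionally */
  int c2 = s->b;
  int c = x ? c1 : c2;
  return c;
}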

I am assuming this transformation is legal in C for the given struct with the given types, since the entire struct is allocated. Is my assumption wrong? The idea is to limit the pass to these cases (I am uploading a patch that limits the current implementation further).

Yes, I think your assumption is wrong. It's contrived, but consider this possibility based on the current version of the patch:

#include <stdlib.h>
typedef struct S {
  char padding[4088]; // not necessary, but it might make it easier for GuardMalloc or valgrind to see the bug
  struct S *p1;
  struct S *p2;
} S;

S *Sptr;

void init() {
  Sptr = malloc(4096); // sorry, p2 - no space for you
  Sptr->p1 = "1239"; // crazy, but just to prove a point
}

When input to a wrapper test similar to the test in this patch, this is safe without this transform, but crashes after. You need something to tell you the loads are dereferenceable (and whatever that information is will not be x86-specific, so there's no need to make an x86 pass for it).
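The objection above comes down to whether each speculated load stays inside the allocated object. A minimal sketch of that check follows, assuming the same struct layout as the example; `field_in_bounds` and `both_loads_in_bounds` are hypothetical helpers written for illustration and are not part of the patch or of LLVM. On a typical 64-bit target `p2` starts at byte offset 4096, so a 4096-byte allocation covers `p1` but not `p2`.

```c
#include <stddef.h>

/* Same layout as the struct in the example above. */
typedef struct S {
  char padding[4088];
  struct S *p1;
  struct S *p2;
} S;

/* Hypothetical helper: returns 1 if a field of `size` bytes at byte offset
   `off` lies entirely inside an allocation of `alloc` bytes, else 0. */
static int field_in_bounds(size_t alloc, size_t off, size_t size) {
  return off <= alloc && size <= alloc - off;
}

/* Hypothetical helper: loading both p1 and p2 ahead of the select is only
   safe if both fields are inside the allocation. */
static int both_loads_in_bounds(size_t alloc) {
  return field_in_bounds(alloc, offsetof(S, p1), sizeof(S *)) &&
         field_in_bounds(alloc, offsetof(S, p2), sizeof(S *));
}
```

Calling `both_loads_in_bounds(sizeof(S))` reports that both loads are in bounds, while an allocation that ends before `p2` does not. A compiler has no such size at hand for an arbitrary pointer, which is why the comment asks for dereferenceable information.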

Aug 31 2017, 7:40 AM
lsaba updated the diff for D37289: [X86] Speculatively load operands of select instruction.

Addressed aaboud's comments and limited the optimization to non-aggregate GEP accesses.

Aug 31 2017, 7:11 AM
lsaba added a comment to D37289: [X86] Speculatively load operands of select instruction.

I didn't look at the implementation, but why is it safe to speculate loads in these tests? I can create an example where one of the pointers in the select is unmapped, so speculating that load will crash in the general case.

The implementation handles a specific case where both operands of the select are GEPs into elements of the same struct. Correct me if I'm wrong, but this should be safe.

I don't see how being elements of one struct changes anything. Just because one pointer is dereferenceable does not make the other dereferenceable? You would need 'dereferenceable' metadata or some other means to know that either load is safe to hoist ahead of the select.

You're proposing this transform as an x86-specific pass, so maybe I'm missing something. Is there some feature of x86 that makes speculating the load safe? I'm guessing no because I tested the example I was thinking of on x86, and this transform crashes there.

Aug 31 2017, 7:00 AM
lsaba added a comment to D37289: [X86] Speculatively load operands of select instruction.

I didn't look at the implementation, but why is it safe to speculate loads in these tests? I can create an example where one of the pointers in the select is unmapped, so speculating that load will crash in the general case.

Aug 31 2017, 12:03 AM

Aug 30 2017

lsaba created D37289: [X86] Speculatively load operands of select instruction.
Aug 30 2017, 1:26 AM
lsaba abandoned D37288: [X86] Speculatively load operands of select instruction.
Aug 30 2017, 1:19 AM
lsaba created D37288: [X86] Speculatively load operands of select instruction.
Aug 30 2017, 12:46 AM

Aug 17 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

It seems there are still correctness issues in the patch. I ran the llvm-test-suite and got a couple of run failures:
multisource_applications_alac_encode_alacconvert_encode
multisource_applications_jm_lencod_lencod

Aug 17 2017, 12:05 AM

Aug 9 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@reviewers, kindly let me know if there are any more comments apart from the last comment from lsaba.
Thanks.

Hi,
I ran the patch on several benchmarks to check performance. Overall the changes look good, but there is a regression in one of the benchmarks (EEMBC/coremark-pro), caused by an undesired lea instruction being created instead of the add instruction that was created previously. I am working on a simple reproducer for the problem and would appreciate your patience.

Thanks

Aug 9 2017, 8:20 AM

Aug 8 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

@reviewers, kindly let me know if there are any more comments apart from the last comment from lsaba.
Thanks.

Aug 8 2017, 9:57 AM

Aug 7 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Aug 7 2017, 5:19 AM

Aug 6 2017

lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

Pinging reviewers. Kindly share your comments.
Thanks

Aug 6 2017, 1:33 AM

Jul 31 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 31 2017, 4:50 AM

Jul 30 2017

lsaba committed rL309521: NFC: spell correction..
Jul 30 2017, 1:13 PM
lsaba closed D35885: NFC: spell correction. by committing rL309521: NFC: spell correction..
Jul 30 2017, 1:13 PM
lsaba accepted D36048: [X86]: Extending a test cases for LEA factorization..
Jul 30 2017, 6:37 AM

Jul 27 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 27 2017, 12:49 AM

Jul 26 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 26 2017, 4:22 AM

Jul 24 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 24 2017, 5:53 AM

Jul 19 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 19 2017, 8:04 AM

Jul 11 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 11 2017, 5:02 AM

Jul 10 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 10 2017, 6:20 AM

Jul 6 2017

lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 6 2017, 2:06 AM
lsaba added a comment to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..

Please add test cases for the optimization added in OptimizeLEAPass

Jul 6 2017, 1:42 AM
lsaba added inline comments to D35014: [X86] Improvement in CodeGen instruction selection for LEAs..
Jul 6 2017, 1:26 AM

May 18 2017

lsaba committed rL303333: [X86] Replace slow LEA instructions in X86.
May 18 2017, 1:25 AM

May 16 2017

lsaba committed rL303183: [X86] Replace slow LEA instructions in X86.
May 16 2017, 9:15 AM
lsaba closed D32277: Replace slow LEA instructions in X86 by committing rL303183: [X86] Replace slow LEA instructions in X86.
May 16 2017, 9:15 AM

May 8 2017

lsaba added inline comments to D32277: Replace slow LEA instructions in X86.
May 8 2017, 6:10 AM
lsaba updated the diff for D32277: Replace slow LEA instructions in X86.

Updated with Zvi's comments

May 8 2017, 4:41 AM
lsaba updated the diff for D32277: Replace slow LEA instructions in X86.
May 8 2017, 1:23 AM

May 7 2017

lsaba updated the diff for D32277: Replace slow LEA instructions in X86.

remove redundant variable

May 7 2017, 7:12 AM

May 3 2017

lsaba added inline comments to D32352: Go to eleven.
May 3 2017, 6:02 AM
lsaba updated the diff for D32277: Replace slow LEA instructions in X86.
May 3 2017, 5:44 AM

Apr 27 2017

lsaba added inline comments to D32352: Go to eleven.
Apr 27 2017, 1:53 AM
lsaba added inline comments to D32352: Go to eleven.
Apr 27 2017, 1:41 AM
lsaba added a comment to D32277: Replace slow LEA instructions in X86.

I did not get the conclusion: is R13 a bad register or not?
According to what I see in the comment it is not, but the patch still treats it as one.
Could you please clarify this?

Apr 27 2017, 12:18 AM

Apr 26 2017

lsaba added a comment to D32277: Replace slow LEA instructions in X86.

D32352 is looking at more aggressive conversion of IMUL to multiple LEA instructions.

Thanks for the notification. As far as I can see, the patch does not contain any 3-operand LEAs. The only case we need to be careful with is a 2-operand LEA with RBP/R13/EBP as the base register. Since that is determined only after RA, I think it is better to let my patch fix those cases than to prevent that patch from running on the problematic targets.

Sorry for being pedantic about the naming, but for AMD architectures the 'slowlea' cases (whether it uses the ALU or AGU pipe) include scale != 1 (even for 2-ops), but it doesn't seem to be noticeably affected by RBP/R13/EBP. Hence my interest in PR32326 to try and make it easier to discriminate.

Will separating this feature from the existing slowLEA feature, which is used in SLM (and giving it another name), make it less confusing?

Yes please, we need to discriminate between different slow LEA behaviours and separate features is probably the most straightforward way to do it.
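The two notions of "slow LEA" discussed in this thread can be written as two separate predicates. The sketch below is only an illustration under stated assumptions: the `Reg` ids, the `LeaAddr` struct, and both function names are hypothetical and are not LLVM's definitions. It models an address of the form base + index*scale + disp, treats a LEA as slow in the 3-operand sense when all three components are present or when a 2-component form uses RBP/EBP/R13 as the base, and treats it as slow in the scale sense when scale != 1.

```c
#include <stdbool.h>

/* Hypothetical register ids for illustration; not LLVM's enum values. */
enum Reg { REG_NONE = 0, REG_RAX, REG_RBX, REG_RBP, REG_EBP, REG_R13 };

/* Hypothetical model of the address in: lea dst, [base + index*scale + disp] */
struct LeaAddr {
  enum Reg base;
  enum Reg index;
  int scale; /* 1, 2, 4 or 8 */
  long disp;
};

/* Number of address components (base, index, displacement) that are present. */
static int lea_component_count(const struct LeaAddr *a) {
  return (a->base != REG_NONE) + (a->index != REG_NONE) + (a->disp != 0);
}

static bool is_problem_base(enum Reg r) {
  return r == REG_RBP || r == REG_EBP || r == REG_R13;
}

/* Slow in the "3-operand" sense described in the thread: three components,
   or a 2-component form whose base is RBP/EBP/R13. */
static bool is_slow_3op_style(const struct LeaAddr *a) {
  int n = lea_component_count(a);
  return n == 3 || (n == 2 && is_problem_base(a->base));
}

/* Slow in the "scale" sense described for AMD targets: scale != 1,
   even for 2-component forms. */
static bool is_slow_scale_style(const struct LeaAddr *a) {
  return a->index != REG_NONE && a->scale != 1;
}
```

Keeping the predicates apart mirrors the request for separate features: a LEA such as [rax + rbx*2] is slow only under the scale rule, while [rbp + rbx] is slow only under the base-register rule.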

Apr 26 2017, 7:07 AM
lsaba added a comment to D32277: Replace slow LEA instructions in X86.

Could you please also add some negative tests when you cannot do such transformation?
For example, involving eflags.

Apr 26 2017, 7:05 AM
lsaba updated the diff for D32277: Replace slow LEA instructions in X86.

Updated patch with Zvi's comments

Apr 26 2017, 7:05 AM