Page MenuHomePhabricator

bipmis (Biplob Mishra)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 29 2022, 2:04 AM (20 w, 1 d)

Recent Activity

Thu, Aug 11

bipmis added inline comments to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..
Thu, Aug 11, 2:43 PM · Restricted Project, Restricted Project
bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Thanks David for reviewing. Have handled most of the nits.

Thu, Aug 11, 2:37 PM · Restricted Project, Restricted Project

Wed, Aug 3

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Gentle ping for review on this! Thanks.

Wed, Aug 3, 6:38 AM · Restricted Project, Restricted Project

Mon, Aug 1

bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Handle Review comments from nikic. Made changes to Alias Analysis.

Mon, Aug 1, 4:25 AM · Restricted Project, Restricted Project

Fri, Jul 29

bipmis added inline comments to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..
Fri, Jul 29, 3:07 AM · Restricted Project, Restricted Project
bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

@spatel I have updated the patch and it should handle the forward load scenarios. Have incorporated most of the review comments received.
The patch has been tested and passing all tests. Do review and suggest if you have more comments. Thanks.

Fri, Jul 29, 12:54 AM · Restricted Project, Restricted Project
bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Handle GEP in a more generic way as as requested earlier by @nikic
Add check for loads belonging to same BB.

Fri, Jul 29, 12:47 AM · Restricted Project, Restricted Project

Wed, Jul 20

bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Handle Review Comments from David.

Wed, Jul 20, 4:26 AM · Restricted Project, Restricted Project

Jul 18 2022

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

I should have mentioned. This being the base version have not enabled the same. Just targeting simple load scenarios in this patch. This was enabled in the the InstCombine patch. The same will be enabled in the subsequent patches to handle memory access b/w loads.

I don't understand the comment. Does the posted patch miscompile the example with a store? If so, then the patch can't be committed in this form.
If there are planned changes to the posted patch before it is ready for review, please note that in the patch status (or add "WIP" to the title).

Jul 18 2022, 9:08 AM · Restricted Project, Restricted Project
bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Updating the patch with AliasAnalysis. Test split up for respective Targets and new tests updated.
Currently handling only simple forward load patterns.

Jul 18 2022, 9:03 AM · Restricted Project, Restricted Project

Jul 15 2022

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

How does this code account for potential memory accesses between the loads that are getting combined?

define i16 @loadCombine_2consecutive_store_between(ptr %p) {
  %p1 = getelementptr i8, ptr %p, i32 1
  %l1 = load i8, ptr %p, align 2
  store i8 42, ptr %p  ; this must happen after a combined load?
  %l2 = load i8, ptr %p1

  %e1 = zext i8 %l1 to i16
  %e2 = zext i8 %l2 to i16
  %s2 = shl i16 %e2, 8
  %o1 = or i16 %e1, %s2
  ret i16 %o1
}
Jul 15 2022, 2:54 AM · Restricted Project, Restricted Project

Jul 12 2022

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

When switching this to AggressiveInstCombine, I would strongly recommend to start with a much more minimal patch. Handle only a single simple case, without any of the possible variants. We can build on that base later.

Strongly agree - there's a lot of potential for this to go wrong both in correctness and perf regressions, so we need to build up in steps.
AFAIK, the load combine pass did not have correctness problems when it died, so that source code would be a good reference.

Jul 12 2022, 3:11 AM · Restricted Project, Restricted Project
bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

A simple implementation in AggressiveInstCombine which handles the forward consecutive load sequences as provided in the tests.
The implementation is limited to a specific consecutive load pattern which reduces to a combined load only(One and only use of individual loads is to generate a wider load). This is not a generic LoadCombine as combining loads with other uses can result in poison propagation.

Jul 12 2022, 3:02 AM · Restricted Project, Restricted Project

Jul 1 2022

bipmis added inline comments to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..
Jul 1 2022, 9:07 AM · Restricted Project, Restricted Project
bipmis updated the diff for D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

Handle some of the review comments.
Added support for additional scenarios like reverse order loads and big-endian support.

Jul 1 2022, 8:15 AM · Restricted Project, Restricted Project

Jun 28 2022

bipmis added inline comments to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..
Jun 28 2022, 9:21 AM · Restricted Project, Restricted Project
bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

The main question I see here is whether this needs to be TTI based or not -- if yes, then it also shouldn't be in InstCombine. I think there's two reason why we might want TTI:

  • Do we want to create unaligned loads? Creating them is legal, but if the target does not support fast unaligned loads, the backend will break them up again. Should we only perform this optimization if TTI.allowsMisalignedMemoryAccesses reports a fast unaligned access?
  • Do we want to create loads with illegal sizes? For example, if we have a 64-bit target, so we want to combine two i64 loads into an i128 load, even though it will later be broken up again? (At least for the current implementation where both loads must have the same size, the question of whether to allow i24 loads or similar does not come up.)
Jun 28 2022, 9:17 AM · Restricted Project, Restricted Project

Jun 20 2022

bipmis abandoned D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..

@RKSimon The InstCombine patch D124119 can now handle this and the pattern should not appear in the DAG. We can abandon this review. Thanks.

Jun 20 2022, 2:31 AM · Restricted Project, Restricted Project

Jun 17 2022

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

This is being discussed in the discourse -https://discourse.llvm.org/t/load-widening-in-ir/61952

Jun 17 2022, 6:58 AM · Restricted Project, Restricted Project

Jun 14 2022

bipmis added a comment to D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..

@spatel Thanks for reviewing it. The common scenarios in llvm where we see this as an issue is with the cost involved in the inliner, unrolller and vectorizer which results in a sub-optimal code. One such simple example can be with dot product where vectorizer fails to generate a vectorized code as below:
https://godbolt.org/z/Ee4cbf1PG

Jun 14 2022, 2:57 AM · Restricted Project, Restricted Project

Jun 9 2022

bipmis requested review of D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load..
Jun 9 2022, 3:49 AM · Restricted Project, Restricted Project
bipmis committed rGd87bfa9ad0af: [InstCombine] Combine instructions of type or/and where AND masks can be… (authored by bipmis).
[InstCombine] Combine instructions of type or/and where AND masks can be…
Jun 9 2022, 2:59 AM · Restricted Project, Restricted Project

Jun 6 2022

bipmis updated the diff for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

Added regression tests to the patch.

Jun 6 2022, 4:50 AM · Restricted Project, Restricted Project

Jun 1 2022

bipmis added a reviewer for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined.: alexfh.
Jun 1 2022, 7:19 AM · Restricted Project, Restricted Project
bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

@alexfh I think it would be good if you can try the patch and approve for commit if it looks fine. Thanks.

Jun 1 2022, 7:18 AM · Restricted Project, Restricted Project
bipmis updated the diff for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

Update the patch to fix the test case

Jun 1 2022, 7:11 AM · Restricted Project, Restricted Project
bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

@alexfh Verified this is an issue and thanks for reverting. I have the fix for the same which I can commit. Let me know if I can do it now or at a later date. Thanks.

Jun 1 2022, 5:32 AM · Restricted Project, Restricted Project

May 31 2022

bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

@alexfh The transformation here is from one form of OR to the other which can actually be handled by the existing implementation in InstCombine. I dont think we saw an issue with compile time in llvm-test-suite. Would be good to analyze the specific test scenario which triggers this issue.

May 31 2022, 10:33 AM · Restricted Project, Restricted Project

May 16 2022

bipmis committed rGec4adf1f6c33: [InstCombine] Combine instructions of type or/and where AND masks can be… (authored by bipmis).
[InstCombine] Combine instructions of type or/and where AND masks can be…
May 16 2022, 4:44 AM · Restricted Project, Restricted Project
bipmis closed D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..
May 16 2022, 4:44 AM · Restricted Project, Restricted Project

May 12 2022

bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

@spatel Could you review the rebased patch and let me know if you have any other comments. Thanks.

May 12 2022, 2:46 AM · Restricted Project, Restricted Project

May 3 2022

bipmis updated the diff for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

Rebasing after incorporating the comments and test additions. Current implementation folds to a representation which can be reduced by the existing implementation in InstCombine.

May 3 2022, 8:18 AM · Restricted Project, Restricted Project

Apr 28 2022

bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

I have modified and committed the tests with modifications requested. Please have a look if this looks ok. Subsequently will rebase my patch.

Apr 28 2022, 11:37 AM · Restricted Project, Restricted Project
bipmis committed rGc38344dd29ef: InstCombine: Add no-one-use tests and create thwart complexity-based… (authored by bipmis).
InstCombine: Add no-one-use tests and create thwart complexity-based…
Apr 28 2022, 11:25 AM · Restricted Project, Restricted Project
bipmis added inline comments to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..
Apr 28 2022, 3:23 AM · Restricted Project, Restricted Project
bipmis added inline comments to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..
Apr 28 2022, 2:36 AM · Restricted Project, Restricted Project

Apr 27 2022

bipmis updated the diff for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

Rebasing patch with the changes as suggested with different commuted tests.

Apr 27 2022, 7:59 AM · Restricted Project, Restricted Project
bipmis committed rG70dbb5abd361: InstCombine: Add tests to show or-and scenarios which can be possibly be… (authored by bipmis).
InstCombine: Add tests to show or-and scenarios which can be possibly be…
Apr 27 2022, 6:19 AM · Restricted Project, Restricted Project

Apr 26 2022

bipmis added a comment to D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

This seems like a problem with the -reassociate pass - it seems to invert the form we want here.

However, I'm not opposed to another relatively simple reassociation transform here in instcombine (we have many already), but the patch as written is not general enough, missing commuted patterns/tests, and missing some kind of one-use check as noted earlier.

We could generalize to a canonicalization that pushes the 'and' values together, and then the optimization of combining mask constants will fall out from that:
https://alive2.llvm.org/ce/z/ZHgrvL

As code, that would be something like this:

// (A & B) | (C | (A & D)) --> ((A & B) | (A & D)) | C
if (match(Op0, m_And(m_Value(A), m_Value(B))) &&
    match(Op1, m_c_Or(m_Value(C), m_c_And(m_Specific(A), m_Value(D)))))
...

There are already 4 commuted possibilities there, but we'd also need to test for the pattern where "B" is the repeated operand on the right side:

// (A & B) | (C | (B & D)) --> ((A & B) | (B & D)) | C

...and swap the operands of the final 'or'. So 16 total patterns to test for? If we make the matches require constants, we can reduce the number of possibilities (since we know that constants are always op1/right-side), but it's a less general transform.

Apr 26 2022, 6:44 AM · Restricted Project, Restricted Project

Apr 22 2022

bipmis updated the diff for D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..

Thanks. Adding additional test for showing the other commute scenario. Also re-patching with the tests committed and comments incorporated.
Alive link for showing the transform is https://alive2.llvm.org/ce/z/XeMdDp

Apr 22 2022, 7:43 AM · Restricted Project, Restricted Project
bipmis committed rG237c4bada957: InstCombine: Add tests to show or-and scenarios which can be possibly be… (authored by bipmis).
InstCombine: Add tests to show or-and scenarios which can be possibly be…
Apr 22 2022, 7:24 AM · Restricted Project, Restricted Project

Apr 20 2022

bipmis added a comment to D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..

An alternate implementation with InstCombine IR Pass is at https://reviews.llvm.org/D124119.

Apr 20 2022, 12:59 PM · Restricted Project, Restricted Project
bipmis updated the summary of D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..
Apr 20 2022, 12:55 PM · Restricted Project, Restricted Project
bipmis requested review of D124119: [InstCombine] Combine instructions of type or/and where AND masks can be combined..
Apr 20 2022, 12:54 PM · Restricted Project, Restricted Project

Apr 11 2022

bipmis updated the diff for D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..

Adding additional simple tests to show simple scenarios where this optimization will be useful.

Apr 11 2022, 1:33 PM · Restricted Project, Restricted Project
bipmis committed rGd06fb9045b9a: AArch64 adding more tests to show the simple scenarios for or/and combine (authored by bipmis).
AArch64 adding more tests to show the simple scenarios for or/and combine
Apr 11 2022, 12:55 PM · Restricted Project, Restricted Project

Apr 5 2022

bipmis updated the diff for D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..

The patch has been rebased after committing the tests.

Apr 5 2022, 6:39 AM · Restricted Project, Restricted Project
bipmis committed rG90853d8f371d: Adding new tests to demonstrate code patterns with multiple or/and which can be… (authored by bipmis).
Adding new tests to demonstrate code patterns with multiple or/and which can be…
Apr 5 2022, 6:17 AM · Restricted Project, Restricted Project
bipmis committed rGedb452020569: rev16 instruction is being generated for a half word byte swap on a 32-bit… (authored by bipmis).
rev16 instruction is being generated for a half word byte swap on a 32-bit…
Apr 5 2022, 5:43 AM · Restricted Project, Restricted Project
bipmis closed D122643: [AArch64] Optimize pattern for converting a half word byte swap in a 64-bit input to a rev16 instruction.
Apr 5 2022, 5:43 AM · Restricted Project, Restricted Project
bipmis committed rGf2b4b2ebe7a3: Reverting changes to correct the commit message (authored by bipmis).
Reverting changes to correct the commit message
Apr 5 2022, 5:39 AM · Restricted Project, Restricted Project
bipmis committed rGafca54f0cfc3: [ARM][AArch64] Optimize pattern for converting a half word byte swap in a 64… (authored by bipmis).
[ARM][AArch64] Optimize pattern for converting a half word byte swap in a 64…
Apr 5 2022, 4:36 AM · Restricted Project, Restricted Project

Apr 4 2022

bipmis added a comment to D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..

Yes for some of the examples provided, the pattern is actually seen in the DAG.

Apr 4 2022, 8:14 AM · Restricted Project, Restricted Project
bipmis added a reviewer for D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined.: RKSimon.
Apr 4 2022, 6:57 AM · Restricted Project, Restricted Project
bipmis requested review of D123029: [AArch64] Optimize patterns where AND's on same operand with multiple masks can be combined..
Apr 4 2022, 6:30 AM · Restricted Project, Restricted Project

Mar 29 2022

bipmis requested review of D122643: [AArch64] Optimize pattern for converting a half word byte swap in a 64-bit input to a rev16 instruction.
Mar 29 2022, 3:19 AM · Restricted Project, Restricted Project