Page MenuHomePhabricator

datta.nagraj (Datta Nagraj)
User

Projects

User does not belong to any projects.

User Details

User Since
Sun, Jun 6, 1:04 AM (2 w, 1 d)

Recent Activity

Fri, Jun 18

datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@lebedev.ri @RKSimon - any more comments?

Fri, Jun 18, 8:30 AM · Restricted Project

Sun, Jun 13

datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Update comment for multiple use of zext test case

Sun, Jun 13, 6:59 PM · Restricted Project
datta.nagraj added inline comments to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.
Sun, Jun 13, 6:31 AM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Use Log2_32 function instead of APInt

Sun, Jun 13, 6:29 AM · Restricted Project

Sat, Jun 12

datta.nagraj added inline comments to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.
Sat, Jun 12, 1:25 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Hi @spatel Sir, doesn't https://alive2.llvm.org/ce/z/89_wJb prove that this works for even type smaller than log2 of the wide type ? I can add the same test to the unit test, but I am in doubt whether to add it as a check, since the example here proves that it works for difference of bitwidth of more than log 2 as well. I tested with 63, 2 sizes as well, and that works too. Please suggest if we require to add the log2 check, since these are working fine from the above examples.

That test proves that the transform is correct for that pair of types exactly. It does not prove that the transform is correct for all pairs of types. If you can prove *why* truncating the addition constant will always work, then we can proceed without the additional check. If not, please add the extra check and mark the test with a 'TODO' comment that references what we have discussed here.

I'd rather be safe than cause a miscompile on some pair of types that we did not think of (and then potentially have the whole patch reverted). :)

Sat, Jun 12, 1:24 AM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Do the opt only if the difference in bitwidth of zext src and dst is less than log2

Sat, Jun 12, 1:22 AM · Restricted Project

Fri, Jun 11

datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Add a test with bitwidth difference of more than log2 between zext src and trunc dst

Fri, Jun 11, 7:58 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Removed the hard cut off check. Please review Sir.

The hard check was definitely not right, but have you proven that a log2-of-bitwidth check is not necessary? The original Alive allowed for that kind of proof based on value type widths.
If we can't prove it, then I would include that check in this patch to be safe.
I don't know what the motivating source code looks like. It seems unlikely that we would ever truncate to a type smaller than log2 of the wide type in real code, but we can't rule it out, so it still deserves a regression test.

Fri, Jun 11, 7:49 AM · Restricted Project

Thu, Jun 10

datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

If we look at the sequence with trunc, then preconditions are different.
I'm not sure if we have *some* problematic combination of src bitwidth and bitwidth increase,
but the hard cut-off that is currently there is not correct.

It seems like we could go wrong if the narrow (destination) type can't hold at least the bitwidth of the wide type -- that's what we saw in the Alive2 example of the transform without the trunc -- but I haven't come up with an example where it fails when we have the trunc:
https://alive2.llvm.org/ce/z/89_wJb

I am currently not sure of what to do here Sir, also I have no clue as to why some unit tests are failing, because they don't even contain trunc inst, and don't even hit my changes.

I don't understand this comment. What is failing?

Thu, Jun 10, 10:00 PM · Restricted Project
datta.nagraj abandoned D104038: [InstCombine] Check if zext src and trunc dst have same types.
Thu, Jun 10, 8:52 PM · Restricted Project
datta.nagraj abandoned D103762: Revert "[llvm] Add interface to order inlining".
Thu, Jun 10, 8:51 PM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Removed the hard cut off check. Please review Sir.

Thu, Jun 10, 8:45 PM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Add a check to see that the source and dst size are same.

Thu, Jun 10, 8:43 PM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

If we look at the sequence with trunc, then preconditions are different.
I'm not sure if we have *some* problematic combination of src bitwidth and bitwidth increase,
but the hard cut-off that is currently there is not correct.

Thu, Jun 10, 9:23 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Sorry, to ask this here, but this is my first time here and want to ask a lame question. Who is supposed to mark the review comments as done, the developer or the reviewer (really confused in this.)

Thu, Jun 10, 8:20 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@lebedev.ri Sir, but if we don't take trunc into consideration, then how are we optimizing this?
Earlier it was 2 insts (zext - ctlz), and now its 3 insts(ctlz - add - zext). I agree that the ctlz , add are done on a lower datatype.
If we consider trunc as well, then we can convert 3 insts (zext - ctlz - trunc) to 2 insts (ctlz - add).
Shouldn't we do this opt only when the trunc is present?

In general - yes, in instcombine we should not increase the instruction count,
however as it was already hinted by @spatel, this seems like a rare edge case
where we should be okay with that. (unless @spatel disagrees?)

This could go either way, but I'd prefer that we have this more conventional (don't increase instructions) transform as a first step. If there's evidence that we benefit from the more general (no trunc required) transform, then we can always follow-up.
Definitely agree that we need to check for matching source and destination types to make this patch sound (and add a negative test to verify that).

Thu, Jun 10, 8:18 AM · Restricted Project
datta.nagraj added inline comments to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.
Thu, Jun 10, 8:12 AM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Check if size of zext src and trunc dst are same and above 5 bits

Thu, Jun 10, 8:08 AM · Restricted Project
datta.nagraj requested review of D104038: [InstCombine] Check if zext src and trunc dst have same types.
Thu, Jun 10, 8:05 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@lebedev.ri Sir, but if we don't take trunc into consideration, then how are we optimizing this?
Earlier it was 2 insts (zext - ctlz), and now its 3 insts(ctlz - add - zext). I agree that the ctlz , add are done on a lower datatype.
If we consider trunc as well, then we can convert 3 insts (zext - ctlz - trunc) to 2 insts (ctlz - add).
Shouldn't we do this opt only when the trunc is present?

Thu, Jun 10, 3:33 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@lebedev.ri In this opt: ctlz(zext(A)) --> add(ctlz(A), (bitwidth(zext(A))-bitwidth(A))

Thu, Jun 10, 3:11 AM · Restricted Project

Tue, Jun 8

datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@spatel Addressed review comments.

Tue, Jun 8, 10:19 PM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Updated unit test and added condition for multiple use check for zext

Tue, Jun 8, 10:15 PM · Restricted Project

Mon, Jun 7

datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Clang Format

Mon, Jun 7, 10:53 PM · Restricted Project
datta.nagraj added inline comments to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.
Mon, Jun 7, 10:07 PM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Remove replaceAllUses manually, and let LLVM do it

Mon, Jun 7, 10:03 PM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

@RKSimon I have addressed the review comments Sir, please have a look.

Mon, Jun 7, 8:26 AM · Restricted Project
datta.nagraj added a comment to D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Added more tests as per review comments.

Mon, Jun 7, 4:08 AM · Restricted Project
datta.nagraj updated the diff for D103788: [InstCombine] Eliminate casts to optimize ctlz operation.

Updated the unit tests

Mon, Jun 7, 4:04 AM · Restricted Project

Sun, Jun 6

datta.nagraj edited reviewers for D103788: [InstCombine] Eliminate casts to optimize ctlz operation, added: RKSimon, xbolva00; removed: simon, david.
Sun, Jun 6, 11:55 PM · Restricted Project
datta.nagraj requested review of D103788: [InstCombine] Eliminate casts to optimize ctlz operation.
Sun, Jun 6, 11:53 PM · Restricted Project
datta.nagraj requested review of D103762: Revert "[llvm] Add interface to order inlining".
Sun, Jun 6, 2:19 AM · Restricted Project