
lawben (Lawrence Benson)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 4 2023, 1:11 AM (12 w, 6 d)

Recent Activity

May 2 2023

lawben accepted D148624: [AArch64] Add sign bits handling for vector compare nodes.

Together with my observation in D148316 and the new tests, this looks good.

May 2 2023, 2:05 AM · Restricted Project, Restricted Project

Apr 28 2023

lawben added inline comments to D145301: Add more efficient vector bitcast for AArch64.
Apr 28 2023, 7:19 AM · Restricted Project, Restricted Project

Apr 27 2023

lawben added a comment to D148316: [AArch64] Add support for efficient bitcast in vector truncate store..

@dmgreen Thanks for your review. Could you please merge this with "Lawrence Benson <github@lawben.com>".

Apr 27 2023, 1:34 AM · Restricted Project, Restricted Project

Apr 26 2023

lawben added inline comments to D148316: [AArch64] Add support for efficient bitcast in vector truncate store..
Apr 26 2023, 2:47 AM · Restricted Project, Restricted Project
lawben updated the diff for D148316: [AArch64] Add support for efficient bitcast in vector truncate store..

Added check for illegal types. I could not get this check to actually fire. The type is legalized before the bitcast in the bitcast lowering case, and it is legalized before the
store becomes a truncating store. So in both cases other checks prevent bad things from happening. But it is probably still fine to keep it as a defensive check in case things
change.

Apr 26 2023, 2:41 AM · Restricted Project, Restricted Project

Apr 22 2023

lawben updated the diff for D148316: [AArch64] Add support for efficient bitcast in vector truncate store..

Address a few review comments.

Apr 22 2023, 4:52 AM · Restricted Project, Restricted Project
lawben added inline comments to D145301: Add more efficient vector bitcast for AArch64.
Apr 22 2023, 4:05 AM · Restricted Project, Restricted Project

Apr 19 2023

lawben added a comment to D148316: [AArch64] Add support for efficient bitcast in vector truncate store..

I dug into this a bit, and your vector compare sign bit patch does the right thing here. But the problem is unrelated and a bit annoying. At the time the sign_extend_inreg that we added is combined (to potentially remove it), there is a build_vector whose constants are (1 << vectorElementSize) - 1 (to negate via xor). But ComputeNumSignBits breaks here. The vector has 32-bit constants but we only have 8-bit vectors. So it detects 24 sign bits and then kinda gives up. So without changing code in SelectionDAG::ComputeNumSignBits or the negation of vector entries, there is no way to correctly determine the sign bits.

Apr 19 2023, 9:22 AM · Restricted Project, Restricted Project
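
The width mismatch described in the comment above can be reproduced with a small scalar model. This is only a sketch in Python, not LLVM code: `num_sign_bits` is a hypothetical helper that counts the leading bits equal to the sign bit, and it is not the implementation of SelectionDAG::ComputeNumSignBits. It assumes 8-bit vector elements whose xor constant is stored as a 32-bit value, as the comment states.

```python
def num_sign_bits(value: int, width: int) -> int:
    """Count the top bits of a width-bit value that equal its sign bit.

    The sign bit itself is included, so the result is in [1, width].
    Hypothetical helper for illustration only.
    """
    value &= (1 << width) - 1              # view value as a width-bit pattern
    sign = (value >> (width - 1)) & 1
    count = 0
    for bit in range(width - 1, -1, -1):   # walk from the top bit downwards
        if (value >> bit) & 1 != sign:
            break
        count += 1
    return count


element_bits = 8
all_ones = (1 << element_bits) - 1         # 0xFF, the xor constant for one lane

# Seen as a 32-bit constant, 0x000000FF has 24 leading zero bits.
sign_bits_as_i32 = num_sign_bits(all_ones, 32)

# Seen as an 8-bit lane, 0xFF is -1 and every bit is a sign bit.
sign_bits_as_i8 = num_sign_bits(all_ones, element_bits)
```

In this model the same constant yields 24 sign bits at 32 bits and 8 sign bits at 8 bits, which matches the "24 sign bits" observation in the comment.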
lawben added a comment to D148316: [AArch64] Add support for efficient bitcast in vector truncate store..

Does having sign bits for the vector compare nodes help, as in https://reviews.llvm.org/D148624? I wasn't able to find a good way to test that, the optimizer would always clear away the results for all the examples I tried prior to creating the compare nodes.

Apr 19 2023, 1:29 AM · Restricted Project, Restricted Project

Apr 18 2023

lawben added inline comments to D148316: [AArch64] Add support for efficient bitcast in vector truncate store..
Apr 18 2023, 3:23 AM · Restricted Project, Restricted Project

Apr 14 2023

lawben retitled D148316: [AArch64] Add support for efficient bitcast in vector truncate store. from Add support for efficient bitcast in vector truncate store. to [AArch64] Add support for efficient bitcast in vector truncate store..
Apr 14 2023, 2:31 AM · Restricted Project, Restricted Project
lawben requested review of D148316: [AArch64] Add support for efficient bitcast in vector truncate store..
Apr 14 2023, 2:27 AM · Restricted Project, Restricted Project

Apr 12 2023

lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Add test for float vector. This required a single-line change to convert the VecVT to an integer vector for sign-extend to work.

Apr 12 2023, 8:52 AM · Restricted Project, Restricted Project

Apr 11 2023

lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

Thanks for your effort in reviewing this patch. I think this solution is nicer than my original approach.

Apr 11 2023, 5:54 AM · Restricted Project, Restricted Project

Apr 10 2023

lawben added inline comments to D145301: Add more efficient vector bitcast for AArch64.
Apr 10 2023, 11:45 AM · Restricted Project, Restricted Project
lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Changed approach as suggested by @dmgreen. We now use an explicit sign-extend and ignore the vector compare. The sign-extend is removed in later steps if there is a vector compare,
so there is no overhead. This change allows us to determine the original type in more cases, as we can detect both SETCC and TRUNC.

Apr 10 2023, 11:42 AM · Restricted Project, Restricted Project
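
Why the explicit sign-extend costs nothing after a vector compare can be sketched with a scalar model. This is an illustration in Python, not the patch itself: `sext_inreg` is a hypothetical helper, and the sketch assumes that the extension is taken from the lowest bit of each 8-bit lane.

```python
def sext_inreg(value: int, from_bits: int, width: int) -> int:
    """Sign-extend the low from_bits of value to width bits.

    Returns the result as an unsigned width-bit pattern.
    Hypothetical helper for illustration only.
    """
    low_mask = (1 << from_bits) - 1
    low = value & low_mask
    if (low >> (from_bits - 1)) & 1:
        # Fill every bit above from_bits with ones.
        low |= ((1 << width) - 1) & ~low_mask
    return low


# Lanes produced by a compare are already all ones or all zeros,
# so extending from bit 0 does not change them.
compare_lanes = [0xFF, 0x00, 0xFF]
extended_compare = [sext_inreg(lane, 1, 8) for lane in compare_lanes]

# Lanes from a truncate only carry the flag in bit 0,
# so the extension is what turns them into full masks.
truncated_lanes = [0x01, 0x02, 0x03]
extended_truncated = [sext_inreg(lane, 1, 8) for lane in truncated_lanes]
```

In the compare case the extension is the identity, which is the situation where a later combine can drop it. In the truncate case it produces the all-ones or all-zeros lanes that the rest of the lowering relies on.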

Apr 6 2023

lawben added inline comments to D145301: Add more efficient vector bitcast for AArch64.
Apr 6 2023, 2:31 AM · Restricted Project, Restricted Project
lawben added inline comments to D145301: Add more efficient vector bitcast for AArch64.
Apr 6 2023, 2:28 AM · Restricted Project, Restricted Project
lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Addressed some review comments.

Apr 6 2023, 2:28 AM · Restricted Project, Restricted Project

Apr 3 2023

lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

@dmgreen I can split this into two patches. I'll remove the truncate store part and only focus on the bitcast for now.

Apr 3 2023, 11:24 AM · Restricted Project, Restricted Project

Mar 31 2023

lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Changed large parts of where this conversion takes place.

Mar 31 2023, 6:03 AM · Restricted Project, Restricted Project

Mar 29 2023

lawben added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

Thanks. Could you please use "Lawrence Benson <github@lawben.com>".

Mar 29 2023, 5:20 AM · Restricted Project, Restricted Project
lawben added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

@fhahn is there anything missing from your side? If not, could one of you or @dmgreen merge this? As this is my first patch, I obviously do not have push access :)

Mar 29 2023, 4:37 AM · Restricted Project, Restricted Project

Mar 25 2023

lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

If we have a comparison, we know that all bits are 1 or all bits are 0, so the least significant one is equal to all others.

Aren't the elements in a <N x i1> guaranteed to be 0 or -1 (so all zeros or all ones) anyways? And even if there was always an extra instruction emitted so that for compare + bitcast the flow would look like this: <initial compare> -> <compare returned bitmask> -> <use and-trick on the result of that>, I would assume that LLVM would just trivially optimize out the second compare if it knows that the result of the first compare already contains all zeros/all ones.

Mar 25 2023, 8:56 AM · Restricted Project, Restricted Project
lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

@Sp00ph Your example would not be optimized. The issue with that example is: how is a bitcast to i1 defined? The current logic in LLVM uses the least significant bit. But this trick does not work in that case, as we use bits 0 to n for lanes 0 to n, so we only use the least significant one for lane 0. If we have a comparison, we know that all bits are 1 or all bits are 0, so the least significant one is equal to all others. Without a comparison, we could shift the least significant bit and then do the rest, but that would need an extra instruction. Maybe this could be added in a follow-up? I'm happy to discuss options here. The current approach is a bit more conservative.

Mar 25 2023, 7:56 AM · Restricted Project, Restricted Project
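
The argument in the comment above can be checked with a scalar model of the AND trick. This is a sketch in Python under stated assumptions, not the code from the patch: both function names are hypothetical, lanes are 8 bits wide, and the model is limited to at most 8 lanes so that each per-lane constant `1 << i` fits in a lane.

```python
def bitmask_via_and_trick(lanes, lane_bits=8):
    """AND lane i with (1 << i), then add all lanes together.

    Models the AND plus horizontal add; only correct when every lane
    is all ones or all zeros. Hypothetical helper for illustration.
    """
    if len(lanes) > lane_bits:
        raise ValueError("model only covers len(lanes) <= lane_bits")
    return sum(lane & (1 << i) for i, lane in enumerate(lanes))


def bitmask_reference(lanes):
    """Bit i of the result is the least significant bit of lane i."""
    return sum((lane & 1) << i for i, lane in enumerate(lanes))


# Compare result: every lane is 0xFF or 0x00, so bit i of lane i
# equals bit 0 of lane i and both functions agree.
compare_lanes = [0xFF, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0xFF]

# Arbitrary lanes: only bit 0 is set, so lane i has no bit i for i > 0
# and the trick loses every lane except lane 0.
arbitrary_lanes = [0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00]

trick_on_compare = bitmask_via_and_trick(compare_lanes)
reference_on_compare = bitmask_reference(compare_lanes)
trick_on_arbitrary = bitmask_via_and_trick(arbitrary_lanes)
reference_on_arbitrary = bitmask_reference(arbitrary_lanes)
```

For the compare lanes both functions return 0b10001101. For the arbitrary lanes the reference returns 0b1111 while the trick returns 0b0001, which is why the approach needs the lanes to be known masks, or an extra step that spreads bit 0 across the lane first.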

Mar 24 2023

lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

@david-arm @dmgreen I rebased this into a single commit. I hope the changes are shown correctly now. All changes here are new, I did not modify any tests. I guess this was just shown incorrectly because I used multiple commits.

Mar 24 2023, 3:59 AM · Restricted Project, Restricted Project
lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Rebase onto main

Mar 24 2023, 3:50 AM · Restricted Project, Restricted Project
lawben updated the diff for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

An honest attempt to do a rebase with arc :)

Mar 24 2023, 3:45 AM · Restricted Project, Restricted Project

Mar 22 2023

lawben added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

@dmgreen Okay, I rebased onto main and submitted via arc. I hope this did the right thing, but it doesn't look broken. I guess my mistake in the other patch was to create a merge commit instead of rebasing. For future reference, I'll just always squash into one commit.

Mar 22 2023, 5:23 AM · Restricted Project, Restricted Project
lawben updated the diff for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

Rebase main

Mar 22 2023, 5:21 AM · Restricted Project, Restricted Project

Mar 21 2023

lawben updated the summary of D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..
Mar 21 2023, 6:31 AM · Restricted Project, Restricted Project
lawben added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

@dmgreen Thanks for the feedback. I've addressed the comments you made. I've renamed the method, as it does not require an AND anymore.

Mar 21 2023, 6:12 AM · Restricted Project, Restricted Project
lawben updated the diff for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

Add a path without AND mask.

Mar 21 2023, 6:08 AM · Restricted Project, Restricted Project
lawben updated the summary of D145301: Add more efficient vector bitcast for AArch64.
Mar 21 2023, 2:40 AM · Restricted Project, Restricted Project
lawben updated the diff for D145301: Add more efficient vector bitcast for AArch64.

Apply the bitcast to truncating stores.

Mar 21 2023, 2:40 AM · Restricted Project, Restricted Project

Mar 17 2023

lawben retitled D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask. from [AArch64] Use NEON's tbl1 for 16xi8 build vector with mask. to [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..
Mar 17 2023, 8:08 AM · Restricted Project, Restricted Project
lawben added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..
Mar 17 2023, 8:07 AM · Restricted Project, Restricted Project
lawben updated the diff for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

Add tests for exit cases and extend to v8i8.

Mar 17 2023, 6:30 AM · Restricted Project, Restricted Project
lawben abandoned D146295: Add tests for exit cases.

Sorry, this was meant to be an update to: https://reviews.llvm.org/D146212, not a new diff.

Mar 17 2023, 6:22 AM · Restricted Project, Restricted Project
lawben requested review of D146295: Add tests for exit cases.
Mar 17 2023, 6:19 AM · Restricted Project, Restricted Project

Mar 16 2023

lawben retitled D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask. from Use NEON's tbl1 for 16xi8 build vector with mask. to [AArch64] Use NEON's tbl1 for 16xi8 build vector with mask..
Mar 16 2023, 3:45 AM · Restricted Project, Restricted Project
lawben added a reviewer for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask.: dmgreen.
Mar 16 2023, 3:42 AM · Restricted Project, Restricted Project
lawben retitled D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask. from Use NEON' tbl1 for 16xi8 build vector with mask. to Use NEON's tbl1 for 16xi8 build vector with mask..
Mar 16 2023, 3:35 AM · Restricted Project, Restricted Project
lawben requested review of D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..
Mar 16 2023, 3:31 AM · Restricted Project, Restricted Project
lawben added a comment to D145301: Add more efficient vector bitcast for AArch64.

@t.p.northover As this is my first patch submitted to LLVM, this is just a short ping to check if there is something that I have missed or forgotten to do. I'm not yet familiar with the procedure.

Mar 16 2023, 2:00 AM · Restricted Project, Restricted Project

Mar 4 2023

lawben added a reviewer for D145301: Add more efficient vector bitcast for AArch64: t.p.northover.
Mar 4 2023, 1:48 AM · Restricted Project, Restricted Project
lawben requested review of D145301: Add more efficient vector bitcast for AArch64.
Mar 4 2023, 1:45 AM · Restricted Project, Restricted Project