cameron.mcinally (Cameron McInally)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 6 2015, 6:21 AM (385 w, 4 d)

Recent Activity

Apr 7 2022

cameron.mcinally added inline comments to D115924: [ConstantFolding] Unify handling of load from uniform value.
Apr 7 2022, 7:49 AM · Restricted Project, Restricted Project, Restricted Project

Apr 6 2022

Herald added a project to D115924: [ConstantFolding] Unify handling of load from uniform value: Restricted Project.
Apr 6 2022, 10:50 AM · Restricted Project, Restricted Project, Restricted Project

Apr 5 2022

cameron.mcinally accepted D120328: [DAGCombine] insert_subvector undef, (splat X), N2 -> splat X.

Ah, great. Thanks for working on this.

Apr 5 2022, 9:05 AM · Restricted Project, Restricted Project

Mar 15 2022

cameron.mcinally requested changes to D120328: [DAGCombine] insert_subvector undef, (splat X), N2 -> splat X.

Just spoke with Paul about this issue. We decided that defining the undef elements may be too aggressive. Updated patch to come...

Mar 15 2022, 9:09 AM · Restricted Project, Restricted Project

Mar 1 2022

cameron.mcinally committed rG70629d570bb6: [SVE] Update patterns to commute FMLS multiplication operands (authored by cameron.mcinally).
[SVE] Update patterns to commute FMLS multiplication operands
Mar 1 2022, 12:53 PM · Restricted Project
cameron.mcinally closed D120570: [SVE] Add pattern to commute FMLS operands.
Mar 1 2022, 12:53 PM · Restricted Project, Restricted Project
cameron.mcinally retitled D120570: [SVE] Add pattern to commute FMLS operands from [SVE] Add pattern to commute FMSB operands to [SVE] Add pattern to commute FMLS operands.
Mar 1 2022, 8:46 AM · Restricted Project, Restricted Project
cameron.mcinally updated the diff for D120570: [SVE] Add pattern to commute FMLS operands.

Addressed Paul's review...

Mar 1 2022, 8:21 AM · Restricted Project, Restricted Project
cameron.mcinally added inline comments to D120570: [SVE] Add pattern to commute FMLS operands.
Mar 1 2022, 7:18 AM · Restricted Project, Restricted Project

Feb 28 2022

cameron.mcinally updated the diff for D120570: [SVE] Add pattern to commute FMLS operands.

Updated Diff to implement commuting with PatFrags.

Feb 28 2022, 12:55 PM · Restricted Project, Restricted Project

Feb 25 2022

cameron.mcinally accepted D120328: [DAGCombine] insert_subvector undef, (splat X), N2 -> splat X.

LGTM. Thanks, Paul.

Feb 25 2022, 2:14 PM · Restricted Project, Restricted Project
cameron.mcinally requested review of D120570: [SVE] Add pattern to commute FMLS operands.
Feb 25 2022, 8:35 AM · Restricted Project, Restricted Project

Feb 23 2022

cameron.mcinally abandoned D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.

Good point. Replacing the lowered truncates with ptrue sounds like a win in the general case. Abandoning this Diff.

Feb 23 2022, 6:45 AM · Restricted Project

Feb 22 2022

cameron.mcinally added inline comments to D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.
Feb 22 2022, 12:04 PM · Restricted Project
cameron.mcinally updated the diff for D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.

Updated patch based on @david-arm's review.

Feb 22 2022, 11:58 AM · Restricted Project

Feb 18 2022

cameron.mcinally updated the diff for D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.

Fix formatting for the Lint bots.

Feb 18 2022, 1:53 PM · Restricted Project
cameron.mcinally added a comment to D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.

Why didn't or cannot InstCombine catch this?

Feb 18 2022, 1:28 PM · Restricted Project
cameron.mcinally updated the diff for D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.

Updated Diff.

Feb 18 2022, 12:04 PM · Restricted Project
cameron.mcinally requested review of D120152: [AArch64][SVE] Match VLS all-1's masks to PTRUE.
Feb 18 2022, 11:42 AM · Restricted Project

Feb 9 2022

cameron.mcinally abandoned D119285: [SVE] Bail out of constructDup(...) optimization for fixed width vectors > 128 bits.

Ah, sorry for the noise. Abandoning this Diff...

Feb 9 2022, 8:44 AM · Restricted Project
cameron.mcinally added a comment to D119252: [AArch64][SVE] Fix selection failure during lowering of shuffle_vector.

Hi @paulwalker-arm, this bug was actually found in user-written code in Gromacs, although there was only one instance of it, I think. So it is something users may see, just not very often!

Feb 9 2022, 8:42 AM · Restricted Project

Feb 8 2022

cameron.mcinally requested review of D119285: [SVE] Bail out of constructDup(...) optimization for fixed width vectors > 128 bits.
Feb 8 2022, 2:10 PM · Restricted Project

Jan 25 2022

cameron.mcinally abandoned D118047: [SVE] Fix VLS selection error from performPostLD1Combine(...).

I believe this is a duplicate of D117674, which had not been reviewed yet so I've just pushed it along.

Jan 25 2022, 7:07 AM · Restricted Project

Jan 24 2022

cameron.mcinally updated the diff for D118047: [SVE] Fix VLS selection error from performPostLD1Combine(...).

Further reduced test case, but still not great.

Jan 24 2022, 2:45 PM · Restricted Project
cameron.mcinally requested review of D118047: [SVE] Fix VLS selection error from performPostLD1Combine(...).
Jan 24 2022, 8:06 AM · Restricted Project
cameron.mcinally added a comment to D117795: [AArch64] Add some missing strict FP vector lowering.

Is it possible to break the 4 subtasks into separate reviews?

Jan 24 2022, 7:31 AM · Restricted Project, Restricted Project

Nov 1 2021

cameron.mcinally added a comment to D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.

Perhaps worth adding matching half/fp16 tests to sve-fixed-length-fp-fma.ll but otherwise looks good.

Nov 1 2021, 11:11 AM · Restricted Project
cameron.mcinally committed rG702fd3d323aa: [SVE] Fix VLS FMA matching for CodeGenOpt::Aggressive. (authored by cameron.mcinally).
[SVE] Fix VLS FMA matching for CodeGenOpt::Aggressive.
Nov 1 2021, 10:44 AM
cameron.mcinally closed D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.
Nov 1 2021, 10:44 AM · Restricted Project
cameron.mcinally updated the summary of D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.
Nov 1 2021, 8:47 AM · Restricted Project

Oct 30 2021

cameron.mcinally updated the diff for D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.

Updated Diff for @paulwalker-arm's reviews...

Oct 30 2021, 1:10 PM · Restricted Project

Oct 26 2021

cameron.mcinally updated the diff for D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.

Fix clang-format warning and add the missing '+' to "+sve".

Oct 26 2021, 11:36 AM · Restricted Project
cameron.mcinally added inline comments to D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.
Oct 26 2021, 10:14 AM · Restricted Project
cameron.mcinally requested review of D112557: [SVE] Fix VLS FMA generation at CodeGenOpt::Aggressive.
Oct 26 2021, 9:41 AM · Restricted Project

Jan 29 2021

cameron.mcinally updated the diff for D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.
Jan 29 2021, 1:49 PM · Restricted Project
cameron.mcinally updated the diff for D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

[NOT READY FOR REVIEW]

Jan 29 2021, 1:48 PM · Restricted Project

Jan 25 2021

cameron.mcinally added a comment to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

In D94444#2497697, @paulwalker-arm wrote:
<A x Elt> llvm.experimental.vector.extract.elements(<B x Elt> %invec, i32 index, i32 stride)

Jan 25 2021, 2:46 PM · Restricted Project
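
The quoted signature above takes an index and a stride. As a rough illustration only, here is a Python sketch of one plausible reading, assuming result element i is invec[index + i * stride]. The feed quotes only the signature, so these semantics, the function name extract_elements, and the explicit count parameter (standing in for the result width A) are all assumptions.

```python
def extract_elements(invec, index, stride, count):
    """Model of a strided extract: result[i] = invec[index + i * stride].

    Assumed semantics for illustration; `count` stands in for the
    result vector width, which the intrinsic would take from its type.
    """
    if index < 0 or stride < 1 or count < 1:
        raise ValueError("index must be >= 0, stride and count must be >= 1")
    last = index + (count - 1) * stride
    if last >= len(invec):
        raise ValueError("strided access runs past the end of the input")
    return [invec[index + i * stride] for i in range(count)]


invec = list(range(8))
evens = extract_elements(invec, index=0, stride=2, count=4)
odds = extract_elements(invec, index=1, stride=2, count=4)
```

Under this reading, extracting the evens or the odds is the same operation with index 0 or 1 and stride 2.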
cameron.mcinally added a comment to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Ok, I see where you are coming from now. LoopVectorize is keeping the shuffle result full by widening the load+shuffle to double wide. LV's double wide choice seems like a weird one, but I suppose if that sequence is codegen'd correctly, then it will work out.

Jan 25 2021, 8:15 AM · Restricted Project

Jan 22 2021

cameron.mcinally added a comment to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

In D94444#2497697, @paulwalker-arm wrote:
<A x Elt> llvm.experimental.vector.extract.elements(<B x Elt> %invec, i32 index, i32 stride)

Jan 22 2021, 8:18 AM · Restricted Project

Jan 19 2021

cameron.mcinally updated subscribers of D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Having said that, I wonder if we should revisit the idea of allowing shuffle vectors to accept step vector masks?

Jan 19 2021, 1:21 PM · Restricted Project

Jan 15 2021

cameron.mcinally added a comment to D94708: [IR] Introduce llvm.experimental.vector.splice intrinsic.

In D94444, @paulwalker-arm proposed a more generic extract vector intrinsic that accepts an index and stride. Now I'm wondering if we should just have a generic scalable shuffle vector intrinsic to handle all these operations under one intrinsic.

Jan 15 2021, 8:27 AM · Restricted Project

Jan 14 2021

cameron.mcinally added inline comments to D94708: [IR] Introduce llvm.experimental.vector.splice intrinsic.
Jan 14 2021, 12:37 PM · Restricted Project
cameron.mcinally added a comment to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

A bit of a flyby review as I'm still on holidays but to my mind many of the restrictions being proposed for the new intrinsic seem purely down to the design decision of splitting the input vector across two operands. I understand this is how the underlying instructions work for SVE but that does not seem like a good enough reason to compromise the IR.

So my first questions are whether the IR and ISD interfaces need to match and from an IR point of view what is the expected usage?

Jan 14 2021, 8:10 AM · Restricted Project

Jan 13 2021

cameron.mcinally added inline comments to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.
Jan 13 2021, 10:25 AM · Restricted Project
cameron.mcinally updated the diff for D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Add known minimum number of elements restrictions...

Jan 13 2021, 10:24 AM · Restricted Project

Jan 12 2021

cameron.mcinally updated the diff for D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Updated to @david-arm's suggested naming scheme...

Jan 12 2021, 1:49 PM · Restricted Project
cameron.mcinally accepted D94504: [SVE] Add ISel pattern for addvl.

I'm assuming scheduling the new addvls closer to their uses is a register pressure win?

Jan 12 2021, 1:14 PM · Restricted Project
cameron.mcinally updated the diff for D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Address some of @sdesmalen's comments, but deferring name changes...

Jan 12 2021, 9:02 AM · Restricted Project
cameron.mcinally added a comment to D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.

Thanks for creating this patch!

I chose to extract the even elements from a pair of vectors (full vector result), rather than a single vector (1/2 width vector result). This is in line with existing fixed shuffle vectors and can be extended to accept an undef argument if needed. The motivation behind this decision was that we'd want the result vector to be a full vector for performance reasons. It would also map well to SVE's LD2 and UZP1.

Are you also planning to add intrinsics for interleaving?

Jan 12 2021, 7:30 AM · Restricted Project
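
The comment above describes taking the even elements from a pair of vectors so that the result is a full-width vector. A minimal Python sketch of that idea, modeling vectors as lists, might look like the following. The name extract_evens is hypothetical and is not the intrinsic's name from the patch.

```python
def extract_evens(a, b):
    """Return the even-indexed elements of the concatenation of a and b.

    With two inputs of N elements each, the result also has N elements,
    i.e. a "full" vector rather than a half-width one.
    """
    if len(a) != len(b):
        raise ValueError("both input vectors must have the same length")
    return (a + b)[0::2]


lo = [0, 1, 2, 3]
hi = [4, 5, 6, 7]
evens = extract_evens(lo, hi)
```

Here evens has the same number of elements as each input, which is the property the comment is arguing for.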

Jan 11 2021

cameron.mcinally requested review of D94444: [RFC][Scalable] Add scalable shuffle intrinsic to extract evens from a pair of vectors.
Jan 11 2021, 12:51 PM · Restricted Project

Jan 7 2021

cameron.mcinally added a comment to D94193: [SVE] Unpacked scalable floating point ZIP/UZP/TRN.

Please can you add entries for nxv2f16 as well? That way all the legal fp types are covered.

Jan 7 2021, 7:58 AM · Restricted Project
cameron.mcinally committed rGf4013359b3da: [SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns (authored by cameron.mcinally).
[SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns
Jan 7 2021, 7:57 AM
cameron.mcinally closed D94193: [SVE] Unpacked scalable floating point ZIP/UZP/TRN.
Jan 7 2021, 7:57 AM · Restricted Project

Jan 6 2021

cameron.mcinally requested review of D94193: [SVE] Unpacked scalable floating point ZIP/UZP/TRN.
Jan 6 2021, 1:30 PM · Restricted Project

Jan 4 2021

cameron.mcinally committed rG92be640bd7d4: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are… (authored by cameron.mcinally).
[FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are…
Jan 4 2021, 12:44 PM
cameron.mcinally closed D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.
Jan 4 2021, 12:44 PM · Restricted Project
cameron.mcinally accepted D93607: [SVE] Lower vector CTLZ, CTPOP and CTTZ operations..

LGTM

Jan 4 2021, 7:57 AM · Restricted Project
cameron.mcinally added a comment to D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.

Ping.

Jan 4 2021, 7:49 AM · Restricted Project

Dec 26 2020

cameron.mcinally added inline comments to D93607: [SVE] Lower vector CTLZ, CTPOP and CTTZ operations..
Dec 26 2020, 9:46 AM · Restricted Project

Dec 17 2020

cameron.mcinally updated the diff for D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.

Add FIXME comment.

Dec 17 2020, 8:51 AM · Restricted Project

Dec 15 2020

cameron.mcinally added inline comments to D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.
Dec 15 2020, 7:41 AM · Restricted Project

Dec 14 2020

cameron.mcinally retitled D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed from [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are preserved to [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.
Dec 14 2020, 2:02 PM · Restricted Project
cameron.mcinally requested review of D93243: [FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed.
Dec 14 2020, 2:01 PM · Restricted Project
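
The retitled Diff above concerns FSUB(-0,X)->FNEG(X) when subnormals are flushed. A small Python model of why the two forms can differ is sketched below. It assumes that a flushing subtract replaces subnormal inputs and results with a same-signed zero, and that negation only flips the sign bit and is never flushed. Both assumptions belong to this sketch and are not taken from the patch; the helper names are hypothetical.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # smallest positive normal double


def flush(x):
    """Replace a subnormal value with a zero of the same sign (assumed model)."""
    if x != 0.0 and abs(x) < MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def fsub_flushed(a, b):
    """Subtract with subnormal inputs and result flushed to zero."""
    return flush(flush(a) - flush(b))


def fneg(x):
    """Negation modeled as a pure sign flip, with no flushing."""
    return -x


subnormal = 5e-324  # smallest positive subnormal double
via_fsub = fsub_flushed(-0.0, subnormal)  # subnormal input becomes +0.0 first
via_fneg = fneg(subnormal)                # subnormal magnitude is kept
```

In this model via_fsub is a negative zero while via_fneg is still a nonzero subnormal, so the rewrite is only value-preserving for inputs that are not flushed.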

Dec 11 2020

cameron.mcinally accepted D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.

LGTM

Dec 11 2020, 7:17 AM · Restricted Project

Dec 10 2020

cameron.mcinally added a comment to D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.

LGTM with one nit below...

Dec 10 2020, 1:29 PM · Restricted Project

Dec 4 2020

cameron.mcinally accepted D91362: [SelectionDAG] Add llvm.vector.{extract,insert} intrinsics.

I think @ctetreau's "first class citizen" argument on the RFC has merit though. But this patch is a good first step if we're not ready to extend ShuffleVector yet. I personally would like to see ShuffleVector extended eventually, since it would be easier to optimize.

Dec 4 2020, 9:54 AM · Restricted Project

Dec 1 2020

cameron.mcinally added a comment to D91362: [SelectionDAG] Add llvm.vector.{extract,insert} intrinsics.

Do we need to protect against mismatched element types? Or does legalization handle those exts/truncs?

Dec 1 2020, 8:24 AM · Restricted Project

Nov 12 2020

cameron.mcinally added inline comments to D91362: [SelectionDAG] Add llvm.vector.{extract,insert} intrinsics.
Nov 12 2020, 10:17 AM · Restricted Project

Nov 10 2020

cameron.mcinally added inline comments to D91077: [LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF..
Nov 10 2020, 7:35 AM · Restricted Project

Nov 4 2020

cameron.mcinally committed rGc126eb7529be: [SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL (authored by cameron.mcinally).
[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL
Nov 4 2020, 12:21 PM
cameron.mcinally closed D90644: [Legalizer][ARM][AArch64] Add legalizations for VECREDUCE_SEQ_FMUL.
Nov 4 2020, 12:20 PM · Restricted Project

Nov 3 2020

cameron.mcinally added a comment to D90644: [Legalizer][ARM][AArch64] Add legalizations for VECREDUCE_SEQ_FMUL.
  • In llvm/test/CodeGen/ARM/vecreduce-fmul-legalization-strict.ll and llvm/test/CodeGen/AArch64/vecreduce-fmul-legalization-strict.ll, use 1.0 instead of 0.0 as the start value. That was probably a copy&paste mistake from fadds.

That caught my eye too, but the 0.0 seemed okay since we can't peep this without NSZ (-0*0) and NNAN (0*NaN). Changing it to 1.0 isn't a big deal though...

Nov 3 2020, 2:11 PM · Restricted Project
cameron.mcinally added a comment to D90644: [Legalizer][ARM][AArch64] Add legalizations for VECREDUCE_SEQ_FMUL.
  • In llvm/test/CodeGen/ARM/vecreduce-fmul-legalization-strict.ll and llvm/test/CodeGen/AArch64/vecreduce-fmul-legalization-strict.ll, use 1.0 instead of 0.0 as the start value. That was probably a copy&paste mistake from fadds.
Nov 3 2020, 1:45 PM · Restricted Project
cameron.mcinally updated the diff for D90644: [Legalizer][ARM][AArch64] Add legalizations for VECREDUCE_SEQ_FMUL.

Reformat to appease pre-merge checks...

Nov 3 2020, 7:40 AM · Restricted Project

Nov 2 2020

cameron.mcinally requested review of D90644: [Legalizer][ARM][AArch64] Add legalizations for VECREDUCE_SEQ_FMUL.
Nov 2 2020, 1:45 PM · Restricted Project

Oct 30 2020

cameron.mcinally committed rGdda1e74b58bd: [Legalize] Add legalizations for VECREDUCE_SEQ_FADD (authored by cameron.mcinally).
[Legalize] Add legalizations for VECREDUCE_SEQ_FADD
Oct 30 2020, 2:03 PM
cameron.mcinally closed D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .
Oct 30 2020, 2:03 PM · Restricted Project
cameron.mcinally added inline comments to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .
Oct 30 2020, 2:02 PM · Restricted Project
cameron.mcinally added inline comments to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .
Oct 30 2020, 12:48 PM · Restricted Project
cameron.mcinally updated the diff for D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Update patch based on @nikic's comments...

Oct 30 2020, 12:48 PM · Restricted Project

Oct 28 2020

cameron.mcinally updated the diff for D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Updated patch with, I think, all the needed legalizations.

Oct 28 2020, 11:59 AM · Restricted Project

Oct 27 2020

cameron.mcinally added a comment to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Comment from ARM/ARMISelLowering.cpp:

Oct 27 2020, 2:41 PM · Restricted Project
cameron.mcinally added a comment to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Ah, I see it in ARM/. That will work...

Oct 27 2020, 11:57 AM · Restricted Project
cameron.mcinally updated the diff for D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Update 'neutral' element to -0.0.

Oct 27 2020, 11:45 AM · Restricted Project
cameron.mcinally added a comment to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

Ok, I can build that out. Are we okay with the suboptimal legalization though? I'll wait for that decision before putting more time into this.

Or does anyone see a clever fix for the illegal type legalization? It looks like we lost information during widening, so I'm not sure we can get it back in a non-hacky way.

Not sure I follow. If the neutral element is fixed, then the extra fadds should also get folded away. Or is there some additional sub-optimality here?

Oct 27 2020, 11:31 AM · Restricted Project
cameron.mcinally added a comment to D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .

A good example of this can be seen in @test_v3f32 from vecreduce-fadd-legalization-strict.ll. Here we end up with 4 FADDs, instead of the 3 FADDs required. The newly added FADD is the result of widening the illegal v3f32 vector type to v4f32, where the newly added element in the reduction is the "neutral" value, 0.0.

Looking at https://github.com/llvm/llvm-project/blob/5a3ef55a524bf9e072d98286e5febdb218b1fc72/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp#L7477-L7480, shouldn't this just be a matter of using -0.0 as the neutral element instead? If 0.0 is not actually neutral here, then this is not just suboptimal, it's incorrect. (We should fix this for the non-sequential case as well.)

Oct 27 2020, 10:45 AM · Restricted Project
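
The point about the neutral element can be checked with ordinary IEEE-754 arithmetic. The Python sketch below models a sequential (in-order) fadd reduction over a three-element vector and then over the same vector widened to four elements with either 0.0 or -0.0 as padding. The helper seq_fadd is a hypothetical stand-in for the reduction, not LLVM code.

```python
import math


def seq_fadd(start, elems):
    """In-order floating-point add reduction: ((start + e0) + e1) + ..."""
    acc = start
    for x in elems:
        acc = acc + x
    return acc


v3 = [-0.0, -0.0, -0.0]

exact = seq_fadd(-0.0, v3)                 # unpadded three-element result
padded_pos = seq_fadd(-0.0, v3 + [0.0])    # widened with +0.0
padded_neg = seq_fadd(-0.0, v3 + [-0.0])   # widened with -0.0


def sign_of(x):
    """Return -1.0 or 1.0 so that the sign of a zero is visible."""
    return math.copysign(1.0, x)
```

Padding with 0.0 turns a -0.0 result into +0.0, because -0.0 + 0.0 is +0.0 under round-to-nearest, whereas padding with -0.0 leaves the result unchanged. That is why -0.0, not 0.0, is the neutral element for fadd.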
cameron.mcinally added a reviewer for D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD : spatel.
Oct 27 2020, 10:39 AM · Restricted Project
cameron.mcinally requested review of D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD .
Oct 27 2020, 9:39 AM · Restricted Project

Oct 23 2020

cameron.mcinally added a comment to D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

[1] I just wanted to highlight my previous VBITS_EQ_256-COUNT-33: fadd comment as this gives us a bit more test coverage and is something that will obviously fail (in a good way) when the splitting work is available.

Oct 23 2020, 2:25 PM · Restricted Project
cameron.mcinally committed rGa1cc274cb35f: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation (authored by cameron.mcinally).
[SVE] Lower fixed length VECREDUCE_SEQ_FADD operation
Oct 23 2020, 2:24 PM
cameron.mcinally closed D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.
Oct 23 2020, 2:24 PM · Restricted Project
cameron.mcinally added a comment to D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

@paulwalker-arm, back to the splitting discussion...

Oct 23 2020, 11:01 AM · Restricted Project
cameron.mcinally added a comment to D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

The eventual goal here is to expand reductions during legalization only. The IR expansion exists because the DAG legalization support has been patchy historically, with VECREDUCE_SEQ_* being the last remaining hole.

Oct 23 2020, 10:53 AM · Restricted Project
cameron.mcinally updated the diff for D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

Updating patch, but not ready for a serious review yet as I haven't started the splitting work. I'm still not convinced we can handle splitting appropriately with the current setup, but will comment on that separately.

Oct 23 2020, 9:18 AM · Restricted Project
cameron.mcinally added a comment to D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

The new tests would be broken without the legalisation changes, so I'm assuming that those are enough coverage. Maybe I'm missing something though...

Are you sure? I took your patch for a test drive and removed all but the TLI.getOperationAction related change from Legalize*.{cpp, h} and the tests passed.

Oct 23 2020, 8:03 AM · Restricted Project

Oct 22 2020

cameron.mcinally added a comment to D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

Sorry @cameron.mcinally, I've not had much time for code reviews this week, although I will take a proper look tomorrow. I have a question though. You've added extra legalisation support but I don't see any explicit tests (or at least ones with matching check lines) for it. Is this something you need for this patch? (I'm guessing sve-fixed-length-fp-reduce.ll's stock NEON run line triggers the cases?) If so, then there really should be a NEON-specific test file that verifies the widening and scalarisation changes, as the NEON run line for the "fixed-length" tests is more about ensuring no SVE instructions slip through.

Oct 22 2020, 10:11 AM · Restricted Project
cameron.mcinally updated the diff for D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

Try again with 80 column fix...

Oct 22 2020, 7:51 AM · Restricted Project
cameron.mcinally updated the diff for D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation.

Fix 80 column issue. No other changes intended...

Oct 22 2020, 7:42 AM · Restricted Project

Oct 19 2020

cameron.mcinally committed rG629d1d117ae0: [SVE] Update vector reduction intrinsics in new tests. (authored by cameron.mcinally).
[SVE] Update vector reduction intrinsics in new tests.
Oct 19 2020, 11:28 AM
cameron.mcinally added a comment to D88707: [SVE] Lower fixed length VECREDUCE_AND operation.

Can you update the tests to use the new non-experimental intrinsic name?

Oct 19 2020, 10:22 AM · Restricted Project