Page MenuHomePhabricator

samparker (Sam Parker)
part-timer

Projects

User does not belong to any projects.

User Details

User Since
May 11 2015, 7:59 AM (310 w, 1 d)

Recent Activity

Tue, Apr 13

samparker accepted D100313: [TTI] NFC: Change getCFInstrCost to return InstructionCost.

LGTM

Tue, Apr 13, 8:16 AM · Restricted Project

Mon, Apr 12

samparker accepted D100291: [AArch64] Use type-legalization cost for code size memop cost..

Makes sense to me.

Mon, Apr 12, 2:31 AM · Restricted Project

Tue, Apr 6

samparker added inline comments to D99171: [WebAssembly] Fold xor by inverting branch target.
Tue, Apr 6, 1:01 AM · Restricted Project
samparker committed rGf1313b3b249a: [NFC][WebAssembly] Removed mangled name from test. (authored by samparker).
[NFC][WebAssembly] Removed mangled name from test.
Tue, Apr 6, 12:59 AM

Thu, Apr 1

samparker committed rG92e777148359: [WebAssembly] Invert branch condition on xor input (authored by samparker).
[WebAssembly] Invert branch condition on xor input
Thu, Apr 1, 1:24 AM
samparker closed D99171: [WebAssembly] Fold xor by inverting branch target.
Thu, Apr 1, 1:24 AM · Restricted Project
samparker added a comment to D99171: [WebAssembly] Fold xor by inverting branch target.

Cheers! And no thanks - I've had commit access for a few years.

Thu, Apr 1, 1:07 AM · Restricted Project

Wed, Mar 31

samparker added inline comments to D99649: [ARM] Updates to arm-block-placement pass.
Wed, Mar 31, 5:00 AM · Restricted Project
samparker added a reviewer for D99649: [ARM] Updates to arm-block-placement pass: samtebbs.
Wed, Mar 31, 4:55 AM · Restricted Project
samparker updated the diff for D99171: [WebAssembly] Fold xor by inverting branch target.
  • Moved PatLeaf to top-level file.
  • Added tests for missing FP conditions.
  • Added switch tests.
Wed, Mar 31, 1:33 AM · Restricted Project
samparker added a comment to D99171: [WebAssembly] Fold xor by inverting branch target.

Can you add tests that do explicit xors in the IR to show that it is not folded out where it shouldn't be?

Well this is what I mean, that I don't think this is possible. Writing a test in IR, the br has to take an i1, though this is expanded to i32 during codegen. I could write some IR tests with xors, but they should all be valid. AFAIK, I'd need a way of writing a test input in selection dag form to write a negative test. I have missed some FP conditions in the existing tests, so I'm going to add those.

Wed, Mar 31, 12:54 AM · Restricted Project

Wed, Mar 24

samparker updated the diff for D99171: [WebAssembly] Fold xor by inverting branch target.

Thanks all. I've added a PatLeaf to detect a boolean, but I wasn't sure how to write a negative test using LLVM IR considering that the branch always takes an i1. Any ideas?

Wed, Mar 24, 12:54 AM · Restricted Project

Tue, Mar 23

samparker added inline comments to D99171: [WebAssembly] Fold xor by inverting branch target.
Tue, Mar 23, 7:19 AM · Restricted Project
samparker updated the summary of D99171: [WebAssembly] Fold xor by inverting branch target.
Tue, Mar 23, 4:11 AM · Restricted Project
samparker updated the summary of D99171: [WebAssembly] Fold xor by inverting branch target.
Tue, Mar 23, 4:11 AM · Restricted Project
samparker requested review of D99171: [WebAssembly] Fold xor by inverting branch target.
Tue, Mar 23, 4:09 AM · Restricted Project

Mar 10 2021

samparker added a comment to D71992: [ARM] Unrestrict Armv8 IT blocks.

Hi Yvan,

Mar 10 2021, 5:58 AM · Restricted Project

Feb 17 2021

samparker added a comment to D94308: [MachineSink] SinkIntoLoop: analyse stores and aliases in between.

The X86 test is indeed large! Considering that it's just an unrolled loop with the same sequence repeated MANY times, could you not just delete 3/4 of the loop and update the indvar accordingly?

Feb 17 2021, 1:20 AM · Restricted Project

Feb 16 2021

samparker abandoned D76716: [ARM][MVE] Tail predicate VMAXV(unsigned) and VMAXAV.

Maybe? Not something that I'll be continuing with though.

Feb 16 2021, 9:13 AM · Restricted Project
samparker abandoned D65567: [ARM][ParallelDSP] Generate SMUAD.

No, don't think I have the time/energy to possibly introduce a bug :)

Feb 16 2021, 9:12 AM
samparker added inline comments to D96772: [LSR] Cleanup of getPreferredAddresingMode. NFC..
Feb 16 2021, 9:11 AM · Restricted Project

Feb 15 2021

samparker added a comment to D96600: [TTI] Unify FavorPostInc and FavorBackedgeIndex into getAddressingMode.

Nothing else from me.

Feb 15 2021, 1:40 AM · Restricted Project

Feb 12 2021

samparker added a comment to D96600: [TTI] Unify FavorPostInc and FavorBackedgeIndex into getAddressingMode.

But why not just order the logic so that it is an NFC?

Feb 12 2021, 6:51 AM · Restricted Project
samparker added a comment to D96600: [TTI] Unify FavorPostInc and FavorBackedgeIndex into getAddressingMode.

Shouldn't this be an NFC..? Looks like we should be favouring post-incs for MVE, even at optsize.

Feb 12 2021, 6:16 AM · Restricted Project
samparker added a comment to D89693: [AArch64] Favor pre-increments and implement TTI::getPreferredAddressingMode.

Sounds good. How about unifying it to one call to get the type of preferred indexing: none, pre and post?

Feb 12 2021, 2:35 AM · Restricted Project

Feb 10 2021

samparker accepted D91857: [ARM] Remove dead mov's in preheader of tail predicated loops.

cheers!

Feb 10 2021, 1:20 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

Thanks! Yes, agreed. I've also added a couple of tests before committing, just to check that unrolling is indeed disabled at -Os and -Oz.

Feb 10 2021, 12:27 AM · Restricted Project
samparker committed rG9d81ccc02ffb: [WebAssembly] Enable loop unrolling (authored by samparker).
[WebAssembly] Enable loop unrolling
Feb 10 2021, 12:27 AM
samparker closed D95125: [WebAssembly] Enable loop unrolling.
Feb 10 2021, 12:26 AM · Restricted Project

Feb 3 2021

samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

I've spent the day looking at lzma and my conclusions are that some loops are vectorized and/or unrolled but they're not on the hot path, and now I believe the improvement I saw on native was purely from secondary side effects. Loop unrolled code is ~4% larger and -msimd128 enabled is ~2.5%, so I'm still surprised you're not seeing change there.

Feb 3 2021, 8:53 AM · Restricted Project
samparker added inline comments to D95291: [CostModel] Remove VF from IntrinsicCostAttributes.
Feb 3 2021, 2:40 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

My node results are far more stable than wasmtime, on both x86_64 and aarch64 with and without liftoff. Those results correlate with yours and show no benefit of loop unrolling, but I'm seeing ~4% binary size increase too. So I'm surprised and confused and I guess I will investigate further.

Feb 3 2021, 2:37 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

And to qualify that, native execution runs vary by 10s of msec and -fno-unroll-loops is ~9% slower.

Feb 3 2021, 2:06 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

Thanks @tlively. I just took a look at lzma and yes, it looks like it should benefit from unrolling... What runtime are you running on? I've just done a few runs on wasmtime and, frankly, the benchmark looks far too noisey to be useful: my execution times are varying between ~6.5-8 seconds!

Feb 3 2021, 1:44 AM · Restricted Project

Feb 2 2021

samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

Gentle ping... and I'd like to clarify my point about code size: Wasm is actually no different to any other backend in that respect, where the user will can dictate whether it is the most important factor through optimisation levels. The ARM backend is a good example of this where we spend a considerable amount of time tuning our microcontrollers for both -Oz and -O3. The usage of optimisation levels nearly always requires some education, but I think this is easier than re-implementing loop unrolling elsewhere and/or letting users figure out how to enable sensible unrolling via the command line.

Feb 2 2021, 1:26 AM · Restricted Project

Jan 27 2021

samparker accepted D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

LGTM

Jan 27 2021, 12:46 AM · Restricted Project

Jan 26 2021

samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

But if we do this I'd recommend we at least update changelogs and docs in preparation, to try to minimize the annoyance for users.

Sure and users should definitely be recommended to compile at -Os if code size is their most important metric. From my, admittedly limited, benchmarking the difference between the optimisation levels is quite negligible, yet -Os can reduce size without suffering the performance impact often associated with -Oz.

Jan 26 2021, 3:14 AM · Restricted Project

Jan 25 2021

samparker added a comment to D95355: [CostModel] Handle CTLZ and CCTZ in getTypeBasedIntrinsicInstrCost.

I guess these costs look more reasonable compared to their scalar counterparts, though I don't think we have cttz for MVE. But as a general query, if these are being scalarized why isn't the cost 16? It looks like it could just be the accumulation costs of the scalar parts. Or have I missed something?

Jan 25 2021, 8:02 AM · Restricted Project
samparker added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Jan 25 2021, 6:53 AM · Restricted Project
samparker added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Jan 25 2021, 4:14 AM · Restricted Project
samparker added inline comments to D95291: [CostModel] Remove VF from IntrinsicCostAttributes.
Jan 25 2021, 3:38 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

I'm inclined to merge this and leave it to users for whom the code size increase is unacceptable to use -Os/-Oz/-fno-unroll-loops to disable it.

The unrolling would be disabled at -Os and -Oz, see

Jan 25 2021, 2:40 AM · Restricted Project

Jan 22 2021

samparker added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Jan 22 2021, 6:55 AM · Restricted Project
samparker added a comment to D95125: [WebAssembly] Enable loop unrolling.

Thanks for taking a look!

Jan 22 2021, 2:17 AM · Restricted Project

Jan 21 2021

samparker requested review of D95125: [WebAssembly] Enable loop unrolling.
Jan 21 2021, 3:57 AM · Restricted Project

Jan 18 2021

samparker accepted D92934: [ARM][MachineOutliner] Add stack fixup feature..

LGTM

Jan 18 2021, 6:46 AM · Restricted Project

Jan 15 2021

samparker added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Jan 15 2021, 6:51 AM · Restricted Project
samparker added reviewers for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink: shchenz, qcolombet.
Jan 15 2021, 6:03 AM · Restricted Project
samparker added a comment to D92934: [ARM][MachineOutliner] Add stack fixup feature..

Sorry for the big delay Yvan, I haven't really been working on LLVM recently.

Jan 15 2021, 1:33 AM · Restricted Project

Dec 17 2020

samparker accepted D92933: [ARM][MachineOutliner] Fix costs model..
Dec 17 2020, 5:25 AM · Restricted Project
samparker added inline comments to D92933: [ARM][MachineOutliner] Fix costs model..
Dec 17 2020, 5:04 AM · Restricted Project
samparker added inline comments to D92933: [ARM][MachineOutliner] Fix costs model..
Dec 17 2020, 3:13 AM · Restricted Project

Dec 11 2020

samparker added inline comments to D92933: [ARM][MachineOutliner] Fix costs model..
Dec 11 2020, 1:18 AM · Restricted Project

Dec 10 2020

samparker added inline comments to D92933: [ARM][MachineOutliner] Fix costs model..
Dec 10 2020, 12:35 AM · Restricted Project

Dec 8 2020

samparker added a comment to D92094: [CostModel]Replace FixedVectorType by VectorType in costgetIntrinsicInstrCost.

Please move the if (isa<ScalableVectorType>(RetTy)) return BaseT::getIntrinsicInstrCost(ICA, CostKind); bailing out logic into the beginning of each switch branch that does not work/is not currently tested. These can be removed on a case-by-case basis in the patches that fix these branches.

+1

Dec 8 2020, 12:52 AM · Restricted Project

Dec 3 2020

samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 3 2020, 2:49 AM · Restricted Project

Dec 2 2020

samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 2 2020, 7:18 AM · Restricted Project
samparker added a comment to D90836: [ARM][MachineOutliner] Add stack fixup feature..

Hi Yvan,

Dec 2 2020, 7:01 AM · Restricted Project
samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 2 2020, 5:21 AM · Restricted Project
samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 2 2020, 12:58 AM · Restricted Project
samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 2 2020, 12:57 AM · Restricted Project

Dec 1 2020

samparker added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 1 2020, 5:53 AM · Restricted Project
samparker accepted D92159: [LSR][NFC] don't collect chains when isNumRegsMajorCostOfLSR is false.

Okay, fair enough.

Dec 1 2020, 1:09 AM · Restricted Project

Nov 30 2020

samparker added a comment to D92094: [CostModel]Replace FixedVectorType by VectorType in costgetIntrinsicInstrCost.

I'd prefer if we didn't knowingly create broken paths here... does it currently just crash in the AArch64 backend or are there also failures in the generic parts? I'm thinking that we could just have a separate switch statement which checks for supported opcodes for scalable vectors and protects us from going down bad paths. A TODO comment could also be added to document why and what needs fixing up.

Nov 30 2020, 2:45 AM · Restricted Project

Nov 26 2020

samparker added a comment to D92094: [CostModel]Replace FixedVectorType by VectorType in costgetIntrinsicInstrCost.

I think it would be a good idea to test that the paths through getCmpSelInstrCost and getArithmeticInstrCost work too.

Nov 26 2020, 1:40 AM · Restricted Project

Nov 24 2020

samparker accepted D91724: [HardwareLoops] Change order of SCEV expression construction for InitLoopCount..

Intuitively, it sounds unlikely for getExitCount to return a pointer type but I'm not sure.

Indeed, thanks for checking. LGTM, thanks.

Nov 24 2020, 1:01 AM · Restricted Project

Nov 23 2020

samparker added a comment to D91481: [LoopUnroll] Discount uniform instructions in cost models.

This looks sensible to me and a step in the right direction. Have you ran any benchmarks? Would be good to know, at least, that this isn't causing any obvious regressions.

Nov 23 2020, 1:27 AM · Restricted Project

Nov 20 2020

samparker added a comment to D89800: [ARM][LowOverheadLoops] Don't generate a LOL if lr is redefined after the start.

I'd be happy with an assertion that DoLoopStart is connected properly, but WhileLoopStart is still my real concern with this logic.

Nov 20 2020, 7:21 AM · Restricted Project
samparker added inline comments to D91857: [ARM] Remove dead mov's in preheader of tail predicated loops.
Nov 20 2020, 7:06 AM · Restricted Project
samparker added a comment to D89800: [ARM][LowOverheadLoops] Don't generate a LOL if lr is redefined after the start.

So my thoughts:

  • For the case where LRDef is not null, we should be able to check that this same value is used by 'Dec'. That way we have proved explicitly the the use-def chain is correct instead of relying on the check of the live-out value.
  • For any the case where LRDef is null, I'm pretty sure we just need to try harder to find it - something needs to be setting LR for loop entry.
  • My issue with the isSafeToDefRegAt is that is a local check. If WLS is in the preheader predecessor, we could decide we that defining LR is safe (locally), but then LR is redefined in the preheader.
Nov 20 2020, 6:26 AM · Restricted Project

Nov 19 2020

samparker added inline comments to D91724: [HardwareLoops] Change order of SCEV expression construction for InitLoopCount..
Nov 19 2020, 7:01 AM · Restricted Project
samparker added a comment to D91724: [HardwareLoops] Change order of SCEV expression construction for InitLoopCount..

I wonder whether the zero-extend should also be moved to isHardwareLoopCandidate at this point.

Nice, sounds good to me.

Nov 19 2020, 6:50 AM · Restricted Project
samparker added a comment to D91724: [HardwareLoops] Change order of SCEV expression construction for InitLoopCount..

Out of curiosity, did you try performing the +1 in HardwareLoopInfo::isHardwareLoopCandidate instead? Looking back through the code, I think it would be more clear to rename 'ExitCount' to 'TripCount' and perform this addition outside of the pass.

Nov 19 2020, 12:58 AM · Restricted Project
samparker accepted D91724: [HardwareLoops] Change order of SCEV expression construction for InitLoopCount..

LGTM, thanks!

Nov 19 2020, 12:55 AM · Restricted Project

Nov 18 2020

samparker accepted D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

I'm happy if Dave has no other suggestions.

Nov 18 2020, 2:47 AM · Restricted Project
samparker accepted D91663: [ARM] Disable WLSTP loops.

Sounds good to me, I would have been in favour of just completely disabling wls for now :)

Nov 18 2020, 12:42 AM · Restricted Project

Nov 17 2020

samparker added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Nov 17 2020, 8:52 AM · Restricted Project
samparker added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Nov 17 2020, 2:35 AM · Restricted Project

Nov 16 2020

samparker added a comment to D91450: [BPI] Look through bitcasts in calcZeroHeuristic.

But now I understand that this isn't an NFC! So can you add a test please?

Nov 16 2020, 4:13 AM · Restricted Project
samparker accepted D91450: [BPI] Look through bitcasts in calcZeroHeuristic.

Thanks. It's okay to commit cleanup NFCs without review, so don't worry about these kind of patches up for review in the future.

Nov 16 2020, 4:11 AM · Restricted Project

Nov 12 2020

samparker accepted D91287: [ARM] Ensure CountReg definition dominates InsertPt when creating t2DoLoopStartTP.

Ok, cheers! LGTM

Nov 12 2020, 3:51 AM · Restricted Project
samparker added inline comments to D91287: [ARM] Ensure CountReg definition dominates InsertPt when creating t2DoLoopStartTP.
Nov 12 2020, 1:59 AM · Restricted Project

Nov 11 2020

samparker added a comment to D91267: [ARM] Remove copies from low overhead phi inductions..

Getting phis to use the right register class sounds like a good generic change to make. If creating them like that is a pain, how about just moving most of this logic into OptimizePhis?

Nov 11 2020, 8:02 AM · Restricted Project
samparker accepted D91257: [ARM] Deliberately prevent inline asm in low overhead loops. NFC.

Explicit sounds good.

Nov 11 2020, 6:53 AM · Restricted Project
samparker committed rG898a81dfc544: [NFC][ARM] Replace lambda with any_of (authored by samparker).
[NFC][ARM] Replace lambda with any_of
Nov 11 2020, 2:03 AM

Nov 10 2020

samparker added a comment to D90554: [CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation.

Yep, LGTM

Nov 10 2020, 12:37 AM · Restricted Project

Nov 9 2020

samparker accepted D90591: [ARM] Introduce t2DoLoopStartTP.
Nov 9 2020, 7:46 AM · Restricted Project
samparker added inline comments to D90591: [ARM] Introduce t2DoLoopStartTP.
Nov 9 2020, 4:03 AM · Restricted Project

Nov 6 2020

samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

I think that the code would still need a t2DoLoopStartTP instruction, so those change would all have to stay

Agreed, and I think this pseudo is very good idea.

Nov 6 2020, 4:33 AM · Restricted Project

Nov 5 2020

samparker added a comment to D90781: [ARM] remove cost-kind predicate for cmp/sel costs.

and the MVE costs should be our problem not yours :)

Indeed! Thanks both for the explanations.

Nov 5 2020, 8:49 AM · Restricted Project
samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

That's all about different patches though! Any comments on this one?

Sorry, it's just difficult to gauge this without a view on the wider implementation idea. For the other stuff that you're talking about, of course we need that in MI but, like I said at the start, we could currently achieve this patch with probably 10x less code at the IR level.

Nov 5 2020, 4:37 AM · Restricted Project
samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

With the MVE instructions taking LR, maybe you wouldn't need to make the terminator change, have you tried it yet? I guess I'm naively assuming that the register allocator will be much less likely to spill something that is used by most / all of the MVE instructions within the loop.

Nov 5 2020, 3:08 AM · Restricted Project
samparker added inline comments to D90781: [ARM] remove cost-kind predicate for cmp/sel costs.
Nov 5 2020, 12:38 AM · Restricted Project

Nov 4 2020

samparker accepted D90606: [llvm][AArch64] Allow TB(N)Z to drop signext for sign bit tests..

Thanks, LGTM

Nov 4 2020, 12:29 AM · Restricted Project
samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

I would like to put here start with combining t2LoopDec and t2LoopEnd

Ok, fair enough.

Nov 4 2020, 12:20 AM · Restricted Project

Nov 3 2020

samparker accepted D90681: [CostModel] fix cost calc bug for sadd/ssub with overflow.

Okay, fair enough! I was getting confused that this is still throughput only.

Nov 3 2020, 6:51 AM · Restricted Project
samparker added a comment to D90681: [CostModel] fix cost calc bug for sadd/ssub with overflow.

Thanks for doing this. But now I'm surprised to not see any Arm changes?

Nov 3 2020, 6:23 AM · Restricted Project
samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

Things like reverting calls is easier to do at this point.

What calls do you mean..?
What specifically do you need from the pre-RA MachineInstr form that means this shouldn't be done at IR level? The search here is certainly not more simple and manageable than doing this MVETailPredication, where the components are also found, and handling phis at this level is just an added pain.

Nov 3 2020, 2:06 AM · Restricted Project
samparker added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

This sounds like a good idea to me, but if we're going to do stuff earlier, to make our lives easier, maybe we should do this at the IR level and use a target-specific intrinsic and free ourselves completely from searching at the MI level?

Nov 3 2020, 1:27 AM · Restricted Project
samparker added a comment to D90554: [CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation.
Cost += 2 * thisT()->getCmpSelInstrCost(
                BinaryOperator::ICmp, OverflowTy, OverflowTy,
                CmpInst::BAD_ICMP_PREDICATE, CostKind);

It looks like this second ICmp should be a select/

Nov 3 2020, 1:09 AM · Restricted Project