SjoerdMeijer (Sjoerd Meijer)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 26 2016, 2:17 AM (125 w, 2 d)

Recent Activity

Today

SjoerdMeijer updated the diff for D48128: [ARM] Parallel DSP IR Pass.

Addressed Sam's comments:

  • set the alignment on the load instructions.
  • added alias analysis checks.
  • check that the accumulator is the right integer type.
  • bail out early when the operand chains of the muls are integers constants or more complicated patterns we don't yet support.

Tests have been modified and added for all these cases.

Thu, Jun 21, 2:46 AM

Yesterday

SjoerdMeijer committed rL335111: [SPIR] Prevent SPIR targets from using half conversion intrinsics.
[SPIR] Prevent SPIR targets from using half conversion intrinsics
Wed, Jun 20, 2:54 AM
SjoerdMeijer committed rC335111: [SPIR] Prevent SPIR targets from using half conversion intrinsics.
[SPIR] Prevent SPIR targets from using half conversion intrinsics
Wed, Jun 20, 2:54 AM
SjoerdMeijer closed D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics.
Wed, Jun 20, 2:54 AM
SjoerdMeijer added a comment to D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics.

No problem, will commit this today.

Wed, Jun 20, 12:51 AM
SjoerdMeijer committed rL335100: [PatternMatch] Add m_Store pattern match helper.
[PatternMatch] Add m_Store pattern match helper
Wed, Jun 20, 12:32 AM
SjoerdMeijer closed D48279: [PatternMatch] Add m_Store pattern match helper.
Wed, Jun 20, 12:32 AM

Tue, Jun 19

SjoerdMeijer added a comment to D48279: [PatternMatch] Add m_Store pattern match helper.

Thanks a lot for the reviews. I will wait a day before committing.

Tue, Jun 19, 8:38 AM
SjoerdMeijer updated the diff for D48279: [PatternMatch] Add m_Store pattern match helper.

Thanks, more tests added.

Tue, Jun 19, 8:29 AM
SjoerdMeijer updated the diff for D48279: [PatternMatch] Add m_Store pattern match helper.

Tweaked the unittests a bit more.

Tue, Jun 19, 5:56 AM
SjoerdMeijer added a comment to D41953: [LoopUnroll] Unroll and Jam.

I think you can commit this. And should there be more comments, then they can be addressed post commit.

Tue, Jun 19, 3:16 AM
SjoerdMeijer accepted D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics.

Looks OK to me.

Tue, Jun 19, 12:58 AM

Mon, Jun 18

SjoerdMeijer updated the diff for D48279: [PatternMatch] Add m_Store pattern match helper.

Thanks for the suggestions:

  • added unittests,
  • now storing both the pointer and value operand (looks most convenient to me).
Mon, Jun 18, 9:17 AM
SjoerdMeijer added a comment to D47715: [AArch64][SVE] Asm: Support for saturating INC/DEC (64bit scalar) instructions..

As discussed with @rengolin and @t.p.northover, simple SVE assembler patches can be directly committed. I think this would be a good first candidate for that :)

Mon, Jun 18, 7:43 AM
SjoerdMeijer added a comment to D48279: [PatternMatch] Add m_Store pattern match helper.

Ah, thanks! Will add tests there, and check if I want the pointer operand.

Mon, Jun 18, 7:05 AM
SjoerdMeijer created D48279: [PatternMatch] Add m_Store pattern match helper.
Mon, Jun 18, 6:43 AM
SjoerdMeijer updated the diff for D48128: [ARM] Parallel DSP IR Pass.

Hi John, thanks for the review! I have made the kernel that drives the transformations explicit regarding the inputs and outputs of the different functions; that indeed reads much nicer now.

Mon, Jun 18, 3:01 AM

Fri, Jun 15

SjoerdMeijer added a comment to D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics.

I know very little about SPIR, and Initially didn't understand this:

Fri, Jun 15, 2:47 AM
SjoerdMeijer accepted D47711: [AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates..

Looks OK to me.

Fri, Jun 15, 1:47 AM

Thu, Jun 14

SjoerdMeijer updated the diff for D48128: [ARM] Parallel DSP IR Pass.

This is addressing:

Thu, Jun 14, 9:29 AM
SjoerdMeijer added inline comments to D48128: [ARM] Parallel DSP IR Pass.
Thu, Jun 14, 3:11 AM
SjoerdMeijer updated the diff for D48128: [ARM] Parallel DSP IR Pass.

Hi Sam, thanks for the reviews! This is a tidy-up addressing the nits and minor comments in the 1st review. Not addressed yet is the check for unaligned accesses support, which I will look first into now.

Thu, Jun 14, 3:03 AM
SjoerdMeijer added a comment to D48128: [ARM] Parallel DSP IR Pass.

On targets with NEON, we should prefer NEON vectorization (vmlal etc.) over DSP instructions in almost all cases.

Thu, Jun 14, 12:41 AM

Wed, Jun 13

SjoerdMeijer created D48128: [ARM] Parallel DSP IR Pass.
Wed, Jun 13, 7:46 AM
SjoerdMeijer accepted D48119: [AArch64] Added support for the vcvta_u16_f16 instrinsic for FP16 Armv8.2-A.

Thanks. Looks like a straightforward fix to me.

Wed, Jun 13, 5:44 AM
SjoerdMeijer added a comment to D48119: [AArch64] Added support for the vcvta_u16_f16 instrinsic for FP16 Armv8.2-A.

Nits title:

  • Think you can simplify the title to something along the lines of: "[AArch64] Support the FP16 VCVTA_U16 intrinsic". No need to mention tests are added in the subject (tests should always be added).
Wed, Jun 13, 4:03 AM

Sat, Jun 9

SjoerdMeijer added a comment to D47921: [ARM] Allow CMPZ transforms even if the input has multiple uses..

Showed some decent improvements in a few benchmarks. Cheers.

Sat, Jun 9, 1:49 AM

Fri, Jun 8

SjoerdMeijer accepted D47713: [AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions..

Looks OK to me.

Fri, Jun 8, 6:17 AM
SjoerdMeijer added a comment to D47711: [AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates..

Thanks, this looks a lot better IMO.

Fri, Jun 8, 5:58 AM
SjoerdMeijer accepted D47447: [NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part).
Fri, Jun 8, 5:51 AM
SjoerdMeijer added a comment to D47447: [NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part).

Agreed: supported architectures are v7/A32/A64.
Thanks for fixing this.

Fri, Jun 8, 5:51 AM
SjoerdMeijer accepted D47446: [NEON] Support VST1xN intrinsics in AArch32 mode (Clang part).

Agreed: supported architectures are v7/A32/A64.

Fri, Jun 8, 5:48 AM · Restricted Project
SjoerdMeijer accepted D47921: [ARM] Allow CMPZ transforms even if the input has multiple uses..

The initial commit showed problem in testing, and my fix added the check that the AND has a single use. Looks like indeed that we just forgot to remove the other check as I also don't see why it would be necessary. Thanks for fixing.

Fri, Jun 8, 12:47 AM

Mon, Jun 4

SjoerdMeijer accepted D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests.

I think this looks ok now, just some nits inline.

Mon, Jun 4, 9:07 AM
SjoerdMeijer added a comment to D47711: [AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates..

Hi, I might be missing something, but why do we need this? Why is this not the same as the already existing "k_FPImm" case (with perhaps an additional check somewhere to see if it is one of these 4 expected values)?

Mon, Jun 4, 8:23 AM

Fri, Jun 1

SjoerdMeijer accepted D47570: [AArch64][SVE] Asm: Support for indexed DUP instructions..

Looks okay to me.

Fri, Jun 1, 9:02 AM
SjoerdMeijer accepted D41953: [LoopUnroll] Unroll and Jam.

Cleanup/simplification of the tests looks good to me. Just a nit inlined.

Fri, Jun 1, 3:29 AM
SjoerdMeijer added inline comments to D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests.
Fri, Jun 1, 2:33 AM
SjoerdMeijer added inline comments to D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests.
Fri, Jun 1, 2:29 AM

Thu, May 31

SjoerdMeijer accepted D47120: [NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part).

Like I said in D47121, I agree that these intrinsics are available in v7/A32/A64.
This change looks reasonable to me.

Thu, May 31, 6:46 AM
SjoerdMeijer accepted D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part).

I agree: these intrinsics are available in v7/A32/A64.

Thu, May 31, 3:48 AM · Restricted Project
SjoerdMeijer accepted D47328: [AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction..

Thanks, looks OK to me.

Thu, May 31, 3:02 AM
SjoerdMeijer added inline comments to D47328: [AArch64][SVE] Asm: Support for DUPM (masked immediate) instruction..
Thu, May 31, 2:05 AM

Wed, May 30

SjoerdMeijer added a comment to D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part).

Had only a first brief look; see some first drive by comments inline.

Wed, May 30, 9:15 AM · Restricted Project
SjoerdMeijer added a comment to D41953: [LoopUnroll] Unroll and Jam.

Any suggestions on what would be at the top of the list? :)

Wed, May 30, 1:27 AM

Tue, May 29

SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

Hi Junmo, many thanks for sharing this. Looks like we have some more work and further tuning to do!

Tue, May 29, 1:45 AM

Mon, May 28

SjoerdMeijer added a comment to D41953: [LoopUnroll] Unroll and Jam.

This looked very reasonable to me as an initial commit. And I thought so too, like Hal and Eric, that different enhancements can still be made. But as I wrote, we can iterate further on this once we've got something in. So I don't think this was a misstep, and once the buildbot failures are resolved, it looks like this can be recommitted. However, as I also mentioned, another reason for an initial commit is more exposure, potentially more people looking at it. So this discussion is great, the more feedback, the better; especially if there's something that needs to be addressed before a recommit.

Mon, May 28, 1:53 AM

Fri, May 25

SjoerdMeijer accepted D47365: [AArch64][SVE] Asm: Support for predicated LSL/LSR (vectors).

Looks OK to me.

Fri, May 25, 3:16 AM

Thu, May 24

SjoerdMeijer added a comment to D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

Just out of curiousity:

  • How do you plan to implement this? Are you going to generate from the pragma some sort of "script" that dictates the transformation order which is going to be fed to the pass manager?
  • About the stacking of pragmas, in your example you apply the bottom one first, but would a user perhaps expect the first to be applied? In other words, is the expected behaviour described somewhere (in a spec)?
Thu, May 24, 2:54 AM

May 16 2018

SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

Yes, excellent, thank you!

May 16 2018, 11:57 PM
SjoerdMeijer accepted D41953: [LoopUnroll] Unroll and Jam.

Hi Dave,

May 16 2018, 8:56 AM
SjoerdMeijer accepted D46680: [AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+scalar) store instructions..

Looks OK to me.

May 16 2018, 7:13 AM

May 15 2018

SjoerdMeijer added inline comments to D46884: [AArch64] Cortex-A55 scheduler model.
May 15 2018, 8:33 AM
SjoerdMeijer created D46884: [AArch64] Cortex-A55 scheduler model.
May 15 2018, 8:29 AM

May 14 2018

SjoerdMeijer accepted D46686: [AArch64][SVE] Asm: Support for gather PRF prefetch instructions.

Looks OK to me.

May 14 2018, 1:30 AM
SjoerdMeijer accepted D46682: [AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions..

Looks okay to me.

May 14 2018, 1:22 AM
SjoerdMeijer accepted D46311: [AArch64] added FP16 vcvth intrinsic support.

Thanks, I think this looks OK.

May 14 2018, 1:19 AM

May 10 2018

SjoerdMeijer accepted D46688: [AArch64][SVE] Improve diagnostics for vectors with incorrect element-size..
May 10 2018, 6:14 AM
SjoerdMeijer added inline comments to D46688: [AArch64][SVE] Improve diagnostics for vectors with incorrect element-size..
May 10 2018, 4:07 AM

May 9 2018

SjoerdMeijer added inline comments to D41953: [LoopUnroll] Unroll and Jam.
May 9 2018, 2:18 AM

May 4 2018

SjoerdMeijer added inline comments to D46311: [AArch64] added FP16 vcvth intrinsic support.
May 4 2018, 7:11 AM

May 3 2018

SjoerdMeijer accepted D46009: [AArch64] Custom Lower MULLH{S,U} for v16i8, v8i16, and v4i32.

Thanks. This looks OK to me.

May 3 2018, 1:07 AM

May 2 2018

SjoerdMeijer requested changes to D46311: [AArch64] added FP16 vcvth intrinsic support.

Sorry, I had one more look, see question inlined.

May 2 2018, 8:12 AM
SjoerdMeijer accepted D46311: [AArch64] added FP16 vcvth intrinsic support.

Thanks, looks good to me now. One more nit inlined, but no need for another review.

May 2 2018, 6:29 AM
SjoerdMeijer added inline comments to D46311: [AArch64] added FP16 vcvth intrinsic support.
May 2 2018, 3:35 AM
SjoerdMeijer accepted D46248: [AArch64][SVE] Asm: Support for scatter ST1 store instructions..

Looks OK to me.

May 2 2018, 2:28 AM

May 1 2018

SjoerdMeijer added inline comments to D46311: [AArch64] added FP16 vcvth intrinsic support.
May 1 2018, 8:42 AM
SjoerdMeijer added inline comments to D46311: [AArch64] added FP16 vcvth intrinsic support.
May 1 2018, 7:43 AM
SjoerdMeijer added a comment to D46311: [AArch64] added FP16 vcvth intrinsic support.

Hi Luke, thanks for fixing this, some first comments inlined.

May 1 2018, 6:45 AM
SjoerdMeijer added inline comments to D46250: [AArch64][SVE] Asm: Support for LD1RQ load-and-replicate quad-word vector instructions..
May 1 2018, 5:25 AM
SjoerdMeijer added a comment to D46009: [AArch64] Custom Lower MULLH{S,U} for v16i8, v8i16, and v4i32.

Sorry, I needed to get up to speed here, also with this hacker's delight trick.
I think it would be good if you add some comments to LowerMULH what we are trying to do here.

May 1 2018, 1:28 AM

Apr 30 2018

SjoerdMeijer accepted D46250: [AArch64][SVE] Asm: Support for LD1RQ load-and-replicate quad-word vector instructions..

Looks OK to me.

Apr 30 2018, 12:47 PM

Apr 27 2018

SjoerdMeijer accepted D46107: [AArch64] Codegen for v8.2A dot product intrinsics.

Looks good to me.

Apr 27 2018, 2:51 AM
SjoerdMeijer accepted D46120: [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (vector + imm) load instructions..

Looks okay to me.

Apr 27 2018, 2:15 AM
SjoerdMeijer added inline comments to D46009: [AArch64] Custom Lower MULLH{S,U} for v16i8, v8i16, and v4i32.
Apr 27 2018, 1:47 AM

Apr 26 2018

SjoerdMeijer accepted D46109: [ARM,AArch64] Add intrinsics for dot product instructions.

I think this looks OK.

Apr 26 2018, 3:53 AM
SjoerdMeijer accepted D46108: [ARM] Add __ARM_FEATURE_DOTPROD pre-defined macro.

Looks like a straight forward fix/addition to me.

Apr 26 2018, 3:29 AM
SjoerdMeijer accepted D46106: [ARM] Codegen for v8.2A dot product intrinsics.

Looks good to me.

Apr 26 2018, 2:58 AM

Apr 25 2018

SjoerdMeijer added inline comments to D45879: [AsmMatcher] Extend PredicateMethod with optional DiagnosticPredicate.
Apr 25 2018, 12:12 AM

Apr 24 2018

SjoerdMeijer added inline comments to D45879: [AsmMatcher] Extend PredicateMethod with optional DiagnosticPredicate.
Apr 24 2018, 3:55 AM
SjoerdMeijer added a comment to D45879: [AsmMatcher] Extend PredicateMethod with optional DiagnosticPredicate.

Looks good to me too, just some comments about typos/nits inlined from my side. I will let Oliver approve as he did most of the work in this area.

Apr 24 2018, 2:51 AM

Apr 19 2018

SjoerdMeijer committed rL330313: [ARM] Add some missing FP16 VSEL test cases.
[ARM] Add some missing FP16 VSEL test cases
Apr 19 2018, 1:25 AM
SjoerdMeijer closed D45724: [ARM] Add some missing FP16 VSEL test cases.
Apr 19 2018, 1:25 AM
SjoerdMeijer accepted D45689: [AArch64][SVE] Added GPR64shifted and GPR64NoXZRshifted register classes..

Looks good to me, just one nit inlined.

Apr 19 2018, 1:08 AM
SjoerdMeijer accepted D45688: [AArch64][AsmParser] Extend RegOp with integrated 'shift/extend'..

Looks reasonable to me.

Apr 19 2018, 12:42 AM

Apr 18 2018

SjoerdMeijer added a comment to D41953: [LoopUnroll] Unroll and Jam.

Hi Dave, thanks for making this into a pass, I think this looks a lot better now. I skimmed through the whole diff for first time, and just wrote down some things I noticed, mostly nitpicks, see the comments inlined. I will now study the different bits and pieces in more detail.

Apr 18 2018, 7:31 AM
SjoerdMeijer added a comment to D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Thanks James!

Apr 18 2018, 7:07 AM · Restricted Project
SjoerdMeijer added a comment to D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Thanks, and I am going to try to get some clarity on this doc issue. But looks like it should be "ARMv7, ARMv8", as it used to be. Make sense to comment on this in the commit message, if that's what you mean.

Apr 18 2018, 6:14 AM · Restricted Project
SjoerdMeijer added a comment to D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Sorry, I have second thoughts on this.
This seems more like a doc issue than anything else. There is no reason why this could not be supported in A32. GCC is also supporting this, and removing it is a bit user unfriendly.
Would you mind reverting this?

Apr 18 2018, 3:18 AM · Restricted Project

Apr 17 2018

SjoerdMeijer accepted D45670: [NEON] Define vfma_n_f32() and vfmaq_n_f32() intrinsics in AArch32 mode.

The corresponding LLVM tests seem be there already, so this looks all good to me.

Apr 17 2018, 11:42 AM · Restricted Project
SjoerdMeijer created D45724: [ARM] Add some missing FP16 VSEL test cases.
Apr 17 2018, 8:05 AM
SjoerdMeijer accepted D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Yep, agreed, also on the new shiny https://developer.arm.com/technologies/neon/intrinsics it is listed as A64 only.

Apr 17 2018, 7:28 AM · Restricted Project

Apr 16 2018

SjoerdMeijer added inline comments to D45670: [NEON] Define vfma_n_f32() and vfmaq_n_f32() intrinsics in AArch32 mode.
Apr 16 2018, 5:35 AM · Restricted Project
SjoerdMeijer accepted D45669: [NEON] Fix the architecture condition for the crypto intrinsics.

This looks correct to me, the ACLE indeed says that:

Apr 16 2018, 5:24 AM · Restricted Project
SjoerdMeijer added a comment to D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Not really familiar with these 2 intrinsics, I had a quick look at the ACLE:

Apr 16 2018, 5:14 AM · Restricted Project

Apr 13 2018

SjoerdMeijer committed rL330034: [ARM] FP16 vmaxnm/vminnm scalar instructions.
[ARM] FP16 vmaxnm/vminnm scalar instructions
Apr 13 2018, 8:37 AM
SjoerdMeijer closed D44675: [ARM] Codegen FP16 vmaxnm/vminnm scalar instructions.
Apr 13 2018, 8:37 AM
SjoerdMeijer updated the diff for D44675: [ARM] Codegen FP16 vmaxnm/vminnm scalar instructions.

Thanks, I've updated the tests with more checks.

Apr 13 2018, 7:13 AM
SjoerdMeijer accepted D45544: [AAch64] Add the __ARM_FEATURE_DOTPROD macro definition.

LGTM, thanks.

Apr 13 2018, 1:53 AM

Apr 12 2018

SjoerdMeijer added a comment to D44675: [ARM] Codegen FP16 vmaxnm/vminnm scalar instructions.

I forgot to mention, that due to your last comment, I had a closer look at the generated code and spotted a missed optimisation. Therefore I've also added a new pattern, as it was not always generating a vmin/vmax pattern. I've changed the "safe" tests to check the actual expected pattern; before the tests were not really great because they were not testing the full pattern.

Apr 12 2018, 8:04 AM