Page MenuHomePhabricator

SjoerdMeijer (Sjoerd Meijer)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 26 2016, 2:17 AM (194 w, 6 d)

Recent Activity

Yesterday

SjoerdMeijer added a comment to D69040: [TTI][LV] preferPredicateOverEpilogue.

Thanks, will take that on board, and I will pick this up after the llvm dev conference.

Sun, Oct 20, 4:34 PM · Restricted Project

Fri, Oct 18

SjoerdMeijer committed rG9c155985f17f: [Arm][libsanitizer] Fix arm libsanitizer failure with bleeding edge glibc (authored by SjoerdMeijer).
[Arm][libsanitizer] Fix arm libsanitizer failure with bleeding edge glibc
Fri, Oct 18, 4:02 AM
SjoerdMeijer closed D69104: [Arm][libsanitizer] Fix arm libsanitizer failure with bleeding edge glibc.
Fri, Oct 18, 4:02 AM · Restricted Project, Restricted Project
SjoerdMeijer committed rL375220: [Arm][libsanitizer] Fix arm libsanitizer failure with bleeding edge glibc.
[Arm][libsanitizer] Fix arm libsanitizer failure with bleeding edge glibc
Fri, Oct 18, 4:01 AM

Thu, Oct 17

SjoerdMeijer added a comment to D69040: [TTI][LV] preferPredicateOverEpilogue.

Thinking about this some more, I think it would be best to at least check some features of the loop for legality:

    • no vector widths greater than 128 bits.
  • all vector operations should have the same number of lanes.
Thu, Oct 17, 6:03 AM · Restricted Project

Wed, Oct 16

SjoerdMeijer updated the summary of D69040: [TTI][LV] preferPredicateOverEpilogue.
Wed, Oct 16, 7:21 AM · Restricted Project
SjoerdMeijer updated the summary of D69040: [TTI][LV] preferPredicateOverEpilogue.
Wed, Oct 16, 7:21 AM · Restricted Project
SjoerdMeijer created D69040: [TTI][LV] preferPredicateOverEpilogue.
Wed, Oct 16, 7:21 AM · Restricted Project
SjoerdMeijer committed rL374992: Revert "[HardwareLoops] Optimisation remarks".
Revert "[HardwareLoops] Optimisation remarks"
Wed, Oct 16, 3:55 AM
SjoerdMeijer committed rG5a131889665f: Revert "[HardwareLoops] Optimisation remarks" (authored by SjoerdMeijer).
Revert "[HardwareLoops] Optimisation remarks"
Wed, Oct 16, 3:55 AM
SjoerdMeijer added a reverting change for rGad763751565b: [HardwareLoops] Optimisation remarks: rG5a131889665f: Revert "[HardwareLoops] Optimisation remarks".
Wed, Oct 16, 3:55 AM
SjoerdMeijer accepted D68838: [ARM] Fix arm_neon.h with -flax-vector-conversions=none, part 3.

Yep, thanks again!

Wed, Oct 16, 3:08 AM · Restricted Project
SjoerdMeijer added a comment to D68579: [HardwareLoops] Optimisation remarks.

Thanks for reviewing!

Wed, Oct 16, 2:11 AM · Restricted Project
SjoerdMeijer committed rGad763751565b: [HardwareLoops] Optimisation remarks (authored by SjoerdMeijer).
[HardwareLoops] Optimisation remarks
Wed, Oct 16, 2:09 AM
SjoerdMeijer closed D68579: [HardwareLoops] Optimisation remarks.
Wed, Oct 16, 2:09 AM · Restricted Project
SjoerdMeijer committed rL374980: [HardwareLoops] Optimisation remarks.
[HardwareLoops] Optimisation remarks
Wed, Oct 16, 2:08 AM

Tue, Oct 15

SjoerdMeijer added a comment to D67392: [ARM][ParallelDSP] Change smlad insertion order.

Thanks for clarifying.

Tue, Oct 15, 7:39 AM · Restricted Project
SjoerdMeijer added inline comments to D67392: [ARM][ParallelDSP] Change smlad insertion order.
Tue, Oct 15, 6:31 AM · Restricted Project

Mon, Oct 14

SjoerdMeijer added a comment to D68579: [HardwareLoops] Optimisation remarks.

Just wanted to check if we are happy with this as an initial commit?

Mon, Oct 14, 12:56 AM · Restricted Project
SjoerdMeijer committed rG52bfa73af841: [docs] loop pragmas: options implying transformations (authored by SjoerdMeijer).
[docs] loop pragmas: options implying transformations
Mon, Oct 14, 12:47 AM
SjoerdMeijer closed D66199: [docs] loop pragmas.
Mon, Oct 14, 12:47 AM · Restricted Project
SjoerdMeijer committed rL374756: [docs] loop pragmas: options implying transformations.
[docs] loop pragmas: options implying transformations
Mon, Oct 14, 12:38 AM

Fri, Oct 11

SjoerdMeijer added inline comments to D68862: [ARM] Allocatable Global Register Variables for ARM.
Fri, Oct 11, 6:22 AM · Restricted Project, Restricted Project
SjoerdMeijer added a comment to D68862: [ARM] Allocatable Global Register Variables for ARM.

Bit of a drive-by comment, but I can't say I am big fan of all the string matching on the register names. Not sure if this is a fair comment, because I haven't looked closely at it yet, but could we use more the ARM::R[0-9] values more? Perhaps that's difficult from the Clang parts?

Fri, Oct 11, 3:22 AM · Restricted Project, Restricted Project

Thu, Oct 10

SjoerdMeijer added a comment to D66199: [docs] loop pragmas.

Sure, will do, thanks again for taking a look.

Thu, Oct 10, 11:31 AM · Restricted Project
SjoerdMeijer updated the diff for D66199: [docs] loop pragmas.

Thanks! Typo fixed.

Thu, Oct 10, 11:12 AM · Restricted Project
SjoerdMeijer accepted D68567: [ARM] VQSUB instruction.

Yep, looks good to me.

Thu, Oct 10, 9:18 AM · Restricted Project
SjoerdMeijer added a comment to D66199: [docs] loop pragmas.

I have commit all my pragma patches, so now back to the last bit, this doc update.
This doc change should now reflect our implementation. Are we happy for this to go in?

Thu, Oct 10, 1:44 AM · Restricted Project
SjoerdMeijer committed rG80371c74ae63: Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)" (authored by SjoerdMeijer).
Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
Thu, Oct 10, 1:34 AM
SjoerdMeijer committed rL374288: Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)".
Recommit "[Clang] Pragma vectorize_width() implies vectorize(enable)"
Thu, Oct 10, 1:25 AM
SjoerdMeijer accepted D68743: [ARM] Fix arm_neon.h with -flax-vector-conversions=none, part 2..

Thanks again, looks good.

Thu, Oct 10, 12:48 AM · Restricted Project

Wed, Oct 9

SjoerdMeijer committed rGd1170dbe5831: [LV] Emitting SCEV checks with OptForSize (authored by SjoerdMeijer).
[LV] Emitting SCEV checks with OptForSize
Wed, Oct 9, 6:23 AM
SjoerdMeijer closed D68082: [LV] Emitting SCEV checks with OptForSize.
Wed, Oct 9, 6:23 AM · Restricted Project
SjoerdMeijer committed rL374166: [LV] Emitting SCEV checks with OptForSize.
[LV] Emitting SCEV checks with OptForSize
Wed, Oct 9, 6:23 AM
SjoerdMeijer added a comment to D68082: [LV] Emitting SCEV checks with OptForSize.

Many thanks again for reviewing. I will add the check-not before committing, and as I said, I will follow up soon to address the other improvements opportunities.

Wed, Oct 9, 5:59 AM · Restricted Project
SjoerdMeijer updated the diff for D68082: [LV] Emitting SCEV checks with OptForSize.

Cheers, moved the test back to directory LoopVectorize where it once was instead of LoopVectorize\X86 (which was indeed a mistake, and then added the extra runline out of frustration as the skx core was unknown to me).

Wed, Oct 9, 4:00 AM · Restricted Project
SjoerdMeijer accepted D68683: ARM] Fix arm_neon.h with -flax-vector-conversions=none.

Nice one, thanks for fixing! I didn't have the bandwidth to look into this.

Wed, Oct 9, 12:20 AM · Restricted Project

Tue, Oct 8

SjoerdMeijer updated the diff for D68579: [HardwareLoops] Optimisation remarks.

Added two more remarks

Tue, Oct 8, 3:58 AM · Restricted Project
SjoerdMeijer updated the diff for D68579: [HardwareLoops] Optimisation remarks.

Thanks for taking a look!
Comments addressed.

Tue, Oct 8, 2:09 AM · Restricted Project
SjoerdMeijer accepted D67990: [aarch64] fix generation of fp16 fmls .

Cheers, lgtm

Tue, Oct 8, 1:35 AM · Restricted Project

Mon, Oct 7

SjoerdMeijer added inline comments to D67990: [aarch64] fix generation of fp16 fmls .
Mon, Oct 7, 8:28 AM · Restricted Project
SjoerdMeijer created D68579: [HardwareLoops] Optimisation remarks.
Mon, Oct 7, 7:59 AM · Restricted Project
SjoerdMeijer accepted D68566: [ARM] VQADD instructions.

Looks like a nice bit of isel to me.

Mon, Oct 7, 5:07 AM · Restricted Project
SjoerdMeijer updated the diff for D68082: [LV] Emitting SCEV checks with OptForSize.

Comments addressed:

  • cleaned up the test case a bit.
  • couldn't reuse an existing run-line, I guess because of -mcpu=skx, but a separate run line seems fine to me.
Mon, Oct 7, 2:58 AM · Restricted Project

Sat, Oct 5

SjoerdMeijer updated the diff for D68082: [LV] Emitting SCEV checks with OptForSize.

Thanks for your comments and thoughts.

Sat, Oct 5, 1:19 AM · Restricted Project

Fri, Oct 4

SjoerdMeijer added a comment to D68082: [LV] Emitting SCEV checks with OptForSize.

The earlier "fix at the core" thought was for LV to do something along these lines:

Fri, Oct 4, 8:35 AM · Restricted Project
SjoerdMeijer added a comment to D68082: [LV] Emitting SCEV checks with OptForSize.

Apologies for the early ping and for the impatience! But I am just really keen on getting this fixed. :-)

Fri, Oct 4, 5:14 AM · Restricted Project

Tue, Oct 1

SjoerdMeijer added a comment to D68101: [MC][ELF] Prevent globals with an explicit section from being mergeable.

Hi Ben, this is not really my area of expertise, but it all starts to make some sense to me. I was expecting this transformation to happen earlier, but this is where the magic happens, and this probably where it belongs.
Just to make sure I haven't missed anything, I would like to take this patch and run some numbers, for which I need a little bit of time. If in the mean time someone with some more experience in this area here has a look too, that would be great....

Tue, Oct 1, 6:07 AM · Restricted Project
SjoerdMeijer added a comment to D68101: [MC][ELF] Prevent globals with an explicit section from being mergeable.

I am getting up to speed with this... and created this reproducer:

Tue, Oct 1, 2:58 AM · Restricted Project

Mon, Sep 30

SjoerdMeijer retitled D68082: [LV] Emitting SCEV checks with OptForSize from [LV] Emitting SCEV checks with OptForSize to [SCEV] Don't add Predicates with OptForSize.
Mon, Sep 30, 9:21 AM · Restricted Project
SjoerdMeijer updated the diff for D68082: [LV] Emitting SCEV checks with OptForSize.
  • modified SCEV so that it doesn't add Predicates when OptForSize is enabled. This has indeed the consequence that vectorisation will happen, it will scalarise the stores.
  • moved the test case to an exisiting optsize file.
Mon, Sep 30, 9:16 AM · Restricted Project

Fri, Sep 27

SjoerdMeijer added a comment to D68127: [ARM] Add a vrinta.f16.f16 alias.

Just checking because you mentioned:

Fri, Sep 27, 3:50 AM · Restricted Project
SjoerdMeijer updated the diff for D68082: [LV] Emitting SCEV checks with OptForSize.

Thanks for all the pointers and info, that helped me getting back on track.
Hopefully things are actually as simple as the change in this patch.

Fri, Sep 27, 3:23 AM · Restricted Project
SjoerdMeijer added a comment to D68101: [MC][ELF] Prevent globals with an explicit section from being mergeable.

Hello James! Thanks for informing us. If this does what you say, it essentially disables global merging with -fdata-sections, then indeed we would get a little bit unhappy about that.
I could run some numbers with this patch, but I can guess what the outcome will be..... I haven't looked at this patch or the PR, but does this need to be default behaviour, or can this e.g. be done under an option?

Fri, Sep 27, 1:28 AM · Restricted Project
SjoerdMeijer accepted D67921: [ARM][MVE] Change VCTP operand.

looks good to me

Fri, Sep 27, 12:12 AM · Restricted Project

Thu, Sep 26

SjoerdMeijer added a reviewer for D68082: [LV] Emitting SCEV checks with OptForSize: uabelho.
Thu, Sep 26, 5:41 AM · Restricted Project
SjoerdMeijer created D68082: [LV] Emitting SCEV checks with OptForSize.
Thu, Sep 26, 5:39 AM · Restricted Project
SjoerdMeijer added a comment to D67752: [ARM] Loop unrolling preferences for LOB cores.

The added tests check exactly what's being changed/added here, so that's excellent.
I was just wondering if it would be good if we also add a more end-to-end/unit llc test, that shows actual loloop creation for these cases? That is, if these tests are not already there. I had only a very quick look, and am not sure, but you'll probably know.

Thu, Sep 26, 1:50 AM

Wed, Sep 25

SjoerdMeijer accepted D67957: [ARM] Cortex-M4 schedule additions.

Cheers, nice one, LGTM

Wed, Sep 25, 3:00 AM · Restricted Project
SjoerdMeijer added a comment to D67957: [ARM] Cortex-M4 schedule additions.

Looks all like very mechanical changes to me. The only thing I am not a big fan of is this UnsupportedFeatures, but I understood that there is not really a much better alternative.

Wed, Sep 25, 1:35 AM · Restricted Project

Tue, Sep 24

SjoerdMeijer added a comment to D67392: [ARM][ParallelDSP] Change smlad insertion order.

The current implementation of comparing loads is quadratic, yes, but you could use a different algorithm, like splitting loads into a base pointer plus an offset, and constructing a map from base pointers to load offsets.

Tue, Sep 24, 11:31 AM · Restricted Project
SjoerdMeijer added a comment to D67392: [ARM][ParallelDSP] Change smlad insertion order.

I also hadn't though much about complexity, but indeed, function RecordMemoryOps, for example, is a bit of an expensive hobby.
Looking at it again, the bookkeeping looks essential, I don't see an easy way to reduce complexity. Delaying it may help a bit, but fundamentally that won't change much I think.
The usual way to deal with expensive hobbies is to introduce a threshold, and bail if it exceeds that.

Tue, Sep 24, 9:31 AM · Restricted Project
SjoerdMeijer added inline comments to D67921: [ARM][MVE] Change VCTP operand.
Tue, Sep 24, 3:47 AM · Restricted Project
SjoerdMeijer committed rG0fcb3afb401c: [LV] Forced vectorization with runtime checks and OptForSize (authored by SjoerdMeijer).
[LV] Forced vectorization with runtime checks and OptForSize
Tue, Sep 24, 1:03 AM
SjoerdMeijer committed rL372694: [LV] Forced vectorization with runtime checks and OptForSize.
[LV] Forced vectorization with runtime checks and OptForSize
Tue, Sep 24, 1:03 AM
SjoerdMeijer closed D67764: [LV] Forced vectorization with runtime checks and OptForSize.
Tue, Sep 24, 1:02 AM · Restricted Project
SjoerdMeijer added a comment to D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Many thanks for reviewing!
Will now look at PR43371

Tue, Sep 24, 12:53 AM · Restricted Project

Mon, Sep 23

SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Added minsize func attr, and also added the check for the new vectorization remark, which I had also missed porting to the new test file... :-( Thanks for spotting. The fix for the other assert will be better than this...

Mon, Sep 23, 12:40 PM · Restricted Project
SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.

sorry, forgot to clean that up, keep the original order

Mon, Sep 23, 8:17 AM · Restricted Project
SjoerdMeijer added a comment to D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Another assert which presumably requires similar update is the "Cannot SCEV check stride or overflow when optimizing for size" one.

Mon, Sep 23, 7:05 AM · Restricted Project
SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.

All comments addressed.

Mon, Sep 23, 7:04 AM · Restricted Project
SjoerdMeijer accepted D67709: [ARM][MVE] Cleanup tail-predicated loop.

Thanks, looks good to me.

Mon, Sep 23, 2:11 AM · Restricted Project

Sun, Sep 22

SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Restored the assert and expanded it with the !forcedToVectorize check

Sun, Sep 22, 12:19 AM · Restricted Project

Sep 21 2019

SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.
Sep 21 2019, 3:10 PM · Restricted Project
SjoerdMeijer added a comment to D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Ah, I had completely missed that this is about the interaction of OptForSize and the loop.vectorize loop hint!
Vectorization is forced, which means that we vectorize even though OptForSize is set and we need to emit runtime checks.

Sep 21 2019, 3:07 PM · Restricted Project
SjoerdMeijer added a comment to D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Which commit does this trace back to, D65197?

Sep 21 2019, 1:28 PM · Restricted Project
SjoerdMeijer updated the diff for D67764: [LV] Forced vectorization with runtime checks and OptForSize.

Thanks. The runtime check is now done earlier.

Sep 21 2019, 12:07 AM · Restricted Project

Sep 20 2019

SjoerdMeijer added a comment to D67709: [ARM][MVE] Cleanup tail-predicated loop.

So the pass shouldn't attempt to convert a loop that contains any vector intrinsics, other than masked.load and masked.store. So, we will actually need to do extra work for this to operate with MVE intrinsics. Even once we've done that, the pass only tries to remove the old icmp predicates, which we pattern match. If the user defined predicates match those that the vectorizer outputs, then there's no reason why we won't perform this transform. I would say that I'd add a test, but we don't have intrinsic support yet...

Sep 20 2019, 3:49 AM · Restricted Project
SjoerdMeijer added a comment to D67709: [ARM][MVE] Cleanup tail-predicated loop.

Just curious, if the input is already a tail-predicated loop (e.g. coming from intrinsics), would it be possible we incorrectly start removing/inserting things?

Sep 20 2019, 3:19 AM · Restricted Project

Sep 19 2019

SjoerdMeijer accepted D67645: [aarch64] add def-pats for dot product.

Yes, looks reasonable to me.
Perhaps wait a day with committing just in case someone has concerns.

Sep 19 2019, 9:05 AM · Restricted Project
SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

Instruction selection and vectorizers are orthogonal.

Sep 19 2019, 8:05 AM · Restricted Project
SjoerdMeijer created D67764: [LV] Forced vectorization with runtime checks and OptForSize.
Sep 19 2019, 7:44 AM · Restricted Project

Sep 18 2019

SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

yep, I think we need to generate those reduction intrinsics, which we can then lower/instruction select. I don't think there's anything controversial about that, intrinsics gets generated in a lot of different cases.

Sep 18 2019, 12:09 PM · Restricted Project
SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

To catch more dot product cases, we need to fix the passes above instruction selection.

Sep 18 2019, 7:58 AM · Restricted Project
SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

ok, sounds all good.

Sep 18 2019, 7:07 AM · Restricted Project
SjoerdMeijer added a comment to D61717: Fix arm_neon.h to be clean under -fno-lax-vector-conversions..

Many thanks! I've passed on the message, hopefully we can do something about this.

Sep 18 2019, 1:39 AM · Restricted Project

Sep 17 2019

SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

Many thanks for addressing this!
This pattern matching looks very reasonable to me, just a round of nits:

  • was just curious about the AddedComplexity = 30
  • perhaps I am wrong, but was wondering if clang-format agrees with some of the indentation,
  • would some more test bring any benefits/additonal value? Like test with a loops, or tests with no aliasing attributes?
  • please ignore if you disagree, but thought that some names like ldop, K1, K2 could be slightly more informative/consistent.
Sep 17 2019, 12:37 PM · Restricted Project
SjoerdMeijer added a comment to D67645: [aarch64] add def-pats for dot product.

I've got a cheeky request, and I appreciate that should go in a separate patch, but while you're at at it would you mind repeating this exercise for the ARM backend and AArch32?

Sep 17 2019, 1:47 AM · Restricted Project
SjoerdMeijer committed rGe573a9c03566: [Clang] Pragma vectorize_width() implies vectorize(enable) (authored by SjoerdMeijer).
[Clang] Pragma vectorize_width() implies vectorize(enable)
Sep 17 2019, 1:43 AM
SjoerdMeijer committed rL372082: [Clang] Pragma vectorize_width() implies vectorize(enable).
[Clang] Pragma vectorize_width() implies vectorize(enable)
Sep 17 2019, 1:42 AM
SjoerdMeijer closed D66290: [clang] Pragma vectorize_width() implies vectorize(enable).
Sep 17 2019, 1:42 AM · Restricted Project
SjoerdMeijer updated the summary of D66290: [clang] Pragma vectorize_width() implies vectorize(enable).
Sep 17 2019, 1:42 AM · Restricted Project
SjoerdMeijer updated the diff for D66290: [clang] Pragma vectorize_width() implies vectorize(enable).

Just uploading new diff for completeness; I only had to change a test-case, and thus thought that committing this is okay.

Sep 17 2019, 1:37 AM · Restricted Project

Sep 16 2019

SjoerdMeijer committed rGc2bafadd7a33: [LV] Add ARM MVE tail-folding tests (authored by SjoerdMeijer).
[LV] Add ARM MVE tail-folding tests
Sep 16 2019, 7:56 AM
SjoerdMeijer committed rL371996: [LV] Add ARM MVE tail-folding tests.
[LV] Add ARM MVE tail-folding tests
Sep 16 2019, 7:55 AM
SjoerdMeijer accepted D67444: [ARM][MVE] Add invalidForTailPredication to TSFlags.

We're not generating any TP loops yet, so we're okay!

Sep 16 2019, 5:33 AM · Restricted Project
SjoerdMeijer added a comment to D67444: [ARM][MVE] Add invalidForTailPredication to TSFlags.

They are different things. I actually don't think that the specification prevents any instructions from being in a TP loop, we just need to be careful of these. We should be able to allow most (all?) of these once we implement some better checks in the backend to look at the operands.

Sep 16 2019, 5:03 AM · Restricted Project
SjoerdMeijer added a comment to D67444: [ARM][MVE] Add invalidForTailPredication to TSFlags.

Nice one.

Sep 16 2019, 4:01 AM · Restricted Project
SjoerdMeijer committed rG5f349d56a843: Added return statement to fix compile and build warning: (authored by SjoerdMeijer).
Added return statement to fix compile and build warning:
Sep 16 2019, 3:30 AM
SjoerdMeijer committed rL371972: Added return statement to fix compile and build warning:.
Added return statement to fix compile and build warning:
Sep 16 2019, 3:28 AM