
hsaito (Hideki Saito)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 25 2016, 9:02 AM (217 w, 6 d)

Recent Activity

Jan 14 2020

hsaito added a comment to D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..

We shouldn't make this either/or. The ability to runtime-check for unit stride is good, and the ability to use gather/scatter is also good. Depending on the target, I see the following situations:

  1. Don't vectorize the loop: unit-strided code alone isn't profitable, or the loop itself is profitable but not by enough to cover the runtime check cost.
  2. Vectorize with a runtime check, with scalar code to fall back to: when gather/scatter code is deemed not profitable.
  3. Vectorize with a runtime check, with gather/scatter code to fall back to.
  4. Vectorize with gather/scatter.
Jan 14 2020, 3:23 PM · Restricted Project
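
The four situations listed in the D71919 comment can be read as a cost comparison. The following is a minimal sketch of that reading, not the vectorizer's actual cost model: `LoopCosts`, `Strategy`, `guardedCost`, and `chooseStrategy` are hypothetical names, and the probability that the stride check passes is an assumed input.

```cpp
#include <cassert>

// Hypothetical per-loop cost summary; none of these names exist in LLVM.
struct LoopCosts {
  double scalar;         // cost of the original scalar loop
  double unitStrideVec;  // cost of vector code that assumes stride == 1
  double gatherScatter;  // cost of vector code that uses gather/scatter
  double runtimeCheck;   // one-time cost of the stride == 1 check
  double unitStrideProb; // assumed probability that the check passes
};

enum class Strategy {
  DontVectorize,           // situation 1
  CheckWithScalarFallback, // situation 2
  CheckWithGatherFallback, // situation 3
  GatherScatterOnly        // situation 4
};

// Expected cost of guarding the unit-stride vector loop with a runtime
// check and running `fallback` when the check fails.
inline double guardedCost(const LoopCosts &c, double fallback) {
  assert(c.unitStrideProb >= 0.0 && c.unitStrideProb <= 1.0);
  return c.runtimeCheck + c.unitStrideProb * c.unitStrideVec +
         (1.0 - c.unitStrideProb) * fallback;
}

// Pick the cheapest of the four situations; scalar code wins ties.
inline Strategy chooseStrategy(const LoopCosts &c) {
  Strategy best = Strategy::DontVectorize;
  double bestCost = c.scalar;
  auto consider = [&](Strategy s, double cost) {
    if (cost < bestCost) {
      best = s;
      bestCost = cost;
    }
  };
  consider(Strategy::CheckWithScalarFallback, guardedCost(c, c.scalar));
  consider(Strategy::CheckWithGatherFallback, guardedCost(c, c.gatherScatter));
  consider(Strategy::GatherScatterOnly, c.gatherScatter);
  return best;
}
```

With expensive gather/scatter and a likely unit stride, this sketch picks the runtime check with a scalar fallback; as gather/scatter gets cheaper, the choice moves toward situations 3 and 4.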

Dec 2 2019

hsaito accepted D70920: [VPlan] Add dump function to VPlan class..

LGTM. Thanks!

Dec 2 2019, 1:48 PM · Restricted Project

Nov 21 2019

hsaito added a comment to D69563: [LV] Strip wrap flags from vectorized reductions.

This patch looks good to me. I agree it's nicer if we could fix it during widenInstruction, but let's leave that fix to the VPlan centric code generation, where we expect to (eventually) have the necessary context.

Nov 21 2019, 10:50 AM · Restricted Project
hsaito added a comment to D69563: [LV] Strip wrap flags from vectorized reductions.

@hsaito: What do you think?

Nov 21 2019, 10:41 AM · Restricted Project

Nov 8 2019

hsaito added a comment to D69563: [LV] Strip wrap flags from vectorized reductions.

Are we essentially saying that any reassociation can't preserve NSW/NUW flags?

Nov 8 2019, 12:15 PM · Restricted Project
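
A small worked case may help with the question about reassociation and NSW flags. The sketch below models 32-bit signed adds in a wider type and shows one input where `(a + b) + c` never overflows but `a + (b + c)` does, so copying `nsw` onto the reassociated adds would not be safe for that input. The helper names are hypothetical and this is plain C++, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if a + b, computed exactly, does not fit in 32 bits, i.e. an
// `add nsw` on these operands would produce poison.
inline bool addOverflowsI32(std::int32_t a, std::int32_t b) {
  const std::int64_t wide =
      static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
  return wide < std::numeric_limits<std::int32_t>::min() ||
         wide > std::numeric_limits<std::int32_t>::max();
}

// Does any add in (a + b) + c overflow?
inline bool leftAssocOverflows(std::int32_t a, std::int32_t b, std::int32_t c) {
  if (addOverflowsI32(a, b))
    return true;
  return addOverflowsI32(a + b, c); // a + b is safe to compute here
}

// Does any add in a + (b + c) overflow?
inline bool rightAssocOverflows(std::int32_t a, std::int32_t b, std::int32_t c) {
  if (addOverflowsI32(b, c))
    return true;
  return addOverflowsI32(a, b + c); // b + c is safe to compute here
}
```

For `a = -1`, `b = INT32_MAX`, `c = 1`, the left-associated form stays in range at every step, while the right-associated form overflows at `b + c`.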

Nov 1 2019

hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

We have observed some performance regressions, presumably because the vectorized code now kicks in on loops with short estimated trip counts (as opposed to skipping the vector code and executing the scalar code). We'll try following up with cost model tuning. I wouldn't be surprised if others hit a similar issue. Overall, though, that's the right direction to head in.

Nov 1 2019, 5:11 PM · Restricted Project
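
The idea in the D67948 title, that interleaving should not exceed the estimated trip count, can be sketched as a clamp on the interleave count. This is an illustration under assumptions, not the code from the patch: `clampInterleaveCount` and `powerOf2Floor` are hypothetical helpers, and rounding down to a power of two is an assumed policy.

```cpp
#include <algorithm>
#include <cassert>

// Largest power of two that is <= x. Requires x > 0.
inline unsigned powerOf2Floor(unsigned x) {
  assert(x > 0);
  unsigned p = 1;
  while (p <= x / 2)
    p *= 2;
  return p;
}

// Limit the interleave count so that vf * ic does not exceed the
// estimated trip count. Returns 1 (no interleaving) when the loop is
// too short for even one full vector iteration.
inline unsigned clampInterleaveCount(unsigned requestedIC, unsigned vf,
                                     unsigned estimatedTripCount) {
  assert(requestedIC > 0 && vf > 0);
  const unsigned vectorIterations = estimatedTripCount / vf;
  if (vectorIterations == 0)
    return 1;
  return std::min(requestedIC, powerOf2Floor(vectorIterations));
}
```

With a vector factor of 4 and an estimated trip count of 20, a requested interleave count of 8 is reduced to 4, so one interleaved vector iteration covers at most 16 of the 20 scalar iterations.
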
hsaito added inline comments to D68577: [LV] Apply sink-after & interleave-groups as VPlan transformations (NFC).
Nov 1 2019, 2:44 PM · Restricted Project

Oct 28 2019

hsaito added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

+1 on what Simon said.

Oct 28 2019, 2:24 PM · Restricted Project
hsaito updated subscribers of D67948: [LV] Interleaving should not exceed estimated loop trip count..

Updated comment as agreed.

Got it. First time committing through git (SVN is read-only now). Expecting some learning curve. If you'd like this in sooner, you might want to ask someone familiar with the process. Will see if I hit any issues.

I needed to have completed this process before SVN went read-only. Please ask @reames to commit; he must be better prepared. Sorry about that.
https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access-to-the-github-repository

Oct 28 2019, 10:21 AM · Restricted Project

Oct 25 2019

hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

Updated comment as agreed.

Got it. First time committing through git (SVN is read-only now). Expecting some learning curve. If you'd like this in sooner, you might want to ask someone familiar with the process. Will see if I hit any issues.

Oct 25 2019, 6:37 PM · Restricted Project
hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

Updated comment as agreed.

Oct 25 2019, 5:27 PM · Restricted Project

Oct 18 2019

hsaito added inline comments to D69040: [TTI][LV] preferPredicateOverEpilogue.
Oct 18 2019, 3:46 PM · Restricted Project
hsaito added a comment to D69088: [Lex] #pragma clang transform.

I'd rather just enable them with a command-line switch, such as -fexperimental-transform.

Oct 18 2019, 3:18 PM · Restricted Project
hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

@hsaito, Please commit if you find this version acceptable.

@ebrevnov, I recommend you visit http://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access and obtain your own commit access. Your committed and under review patches deserve it.
Thank you very much for your contribution. Let me know if you still would like me to commit this one.

I was going to do that right after this patch lands. Please assist me (hopefully last time) in landing the patch.

Oct 18 2019, 2:50 PM · Restricted Project
hsaito added a comment to D69088: [Lex] #pragma clang transform.

Personally, I like the intent, but I don't foresee a clear (enough) path to get there. This makes me hesitant to add a new non-experimental pragma and present it to programmers. If you call it experimental, it's easier for me to swallow.

Is there anything more to do than mentioning it as being experimental in the (no-patch-available-yet) docs?

Oct 18 2019, 12:49 PM · Restricted Project

Oct 17 2019

hsaito added a comment to D69088: [Lex] #pragma clang transform.

Have we established a general consensus on wanting a loop optimization pass ordering flexible enough to accomplish the outcome of the new directive, and a shared vision for the path to get there? If we are making this a general clang directive, I'd like to see the vision to get there w/o depending on polly. If this has already been discussed and settled, a pointer to that would be appreciated so that I can learn.

Response to the RFCs was meager. However, I got positive feedback at various conferences, including last year's DevMtg where my version for loop transformations was a technical talk.

Oct 17 2019, 8:10 PM · Restricted Project
hsaito added a comment to D69088: [Lex] #pragma clang transform.

Have we established a general consensus on wanting a loop optimization pass ordering flexible enough to accomplish the outcome of the new directive, and a shared vision for the path to get there? If we are making this a general clang directive, I'd like to see the vision to get there w/o depending on polly. If this has already been discussed and settled, a pointer to that would be appreciated so that I can learn.

Oct 17 2019, 5:09 PM · Restricted Project
hsaito added a comment to D69088: [Lex] #pragma clang transform.

@Meinersbur, if I remember correctly, there was an RFC discussion on this topic, right? If yes, would you post the pointer to that? I need a refresher on what has been discussed/settled in the past.

Oct 17 2019, 2:18 PM · Restricted Project

Oct 16 2019

hsaito added inline comments to D67948: [LV] Interleaving should not exceed estimated loop trip count..
Oct 16 2019, 10:18 AM · Restricted Project
hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

@hsaito, Please commit if you find this version acceptable.

Oct 16 2019, 10:10 AM · Restricted Project

Oct 14 2019

hsaito added a comment to D68814: [LV] Allow assume calls in predicated blocks..

If conditional assumes are to be dropped, better do so on entry to VPlan, as in DeadInstructions, rather than representing them in ReplicateRecipe (as do unconditional assumes) and silencing their code generation.

To retain conditional assumes along with their control flow, they could be marked under isScalarWithPredication; but this complicates vectorization, plus what use are such assumes when all else is if-converted(?)

Conditional assumes under uniform control flow could be retained, along with the uniform control flow they depend upon; this may be mostly relevant for outerloop vectorization.

Oct 14 2019, 10:52 AM · Restricted Project

Oct 11 2019

hsaito added a comment to D68651: [InstCombine] Signed saturation patterns.

I'm in favor of treating signed saturation as canonical. The issue in delaying detection of such cases to instruction selection is the volatility of the IR: there is no guarantee that the IR will remain in the same form (expected by isel) from one day to the next. For example, some optimization may decide to just promote the operations to the wider type and only do the extension/truncate once, depending on how many saturating operations may be near one another. Handling this variability in isel is just not feasible.

Oct 11 2019, 5:00 PM · Restricted Project
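
To make the point about IR variability concrete, here are two scalar forms of the same signed saturating add. This is a sketch in plain C++ with hypothetical function names; which exact patterns InstCombine or isel recognize is not shown here. The first form widens, adds, clamps, and truncates; the second compares before adding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Widening form: extend to 32 bits, add, clamp to the 16-bit range,
// then truncate.
inline std::int16_t saddSatWiden(std::int16_t a, std::int16_t b) {
  const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
  std::int32_t wide = static_cast<std::int32_t>(a) + static_cast<std::int32_t>(b);
  wide = std::min<std::int32_t>(std::max<std::int32_t>(wide, lo), hi);
  return static_cast<std::int16_t>(wide);
}

// Compare-first form: decide whether the sum would leave the 16-bit
// range before adding, and return the limit if it would.
inline std::int16_t saddSatCompare(std::int16_t a, std::int16_t b) {
  const int lo = std::numeric_limits<std::int16_t>::min();
  const int hi = std::numeric_limits<std::int16_t>::max();
  if (b > 0 && a > hi - b)
    return static_cast<std::int16_t>(hi);
  if (b < 0 && a < lo - b)
    return static_cast<std::int16_t>(lo);
  return static_cast<std::int16_t>(a + b);
}
```

Both functions compute the same result for every input, yet they look very different as IR, which is the kind of variation a late pattern match would have to cope with.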

Oct 10 2019

hsaito accepted D66199: [docs] loop pragmas.

LGTM. Please wait for a few days in case others have more comments.

Oct 10 2019, 11:22 AM · Restricted Project
hsaito added inline comments to D66199: [docs] loop pragmas.
Oct 10 2019, 9:54 AM · Restricted Project

Oct 9 2019

hsaito added a comment to D68651: [InstCombine] Signed saturation patterns.

I haven't looked at the patch in detail, but as author of at least part of the prior art cited here, I agree with the direction*. I also participated in some of the vector idioms discussions from a few years ago. There's overlap with the vector idiom problems, but as noted, these are generic (scalar too) math ops, so it's not exactly the same. We invested significantly in IR analysis and codegen for the math intrinsics, so that may have changed the thinking. I don't remember the sequence of events or if there was a dedicated llvm-dev thread for this, but the general idea is that if we have a generic intrinsic for the math and can easily invert the transform in the backend for targets/types that are not supported, try to canonicalize to the intrinsic.

Oct 9 2019, 11:08 AM · Restricted Project

Oct 8 2019

hsaito added a comment to D68651: [InstCombine] Signed saturation patterns.

We do form uadd_sat as in rL357012 and usub_sat from selects.

I really just need some way to generate sadd_sats for vectorisation. If there's a better way than this, I'm all ears :)

Oct 8 2019, 2:33 PM · Restricted Project

Oct 7 2019

hsaito accepted D67948: [LV] Interleaving should not exceed estimated loop trip count..

The vectorizer code change looks fine to me. I'd like to see the comments updated, though. Are any more changes needed for the LIT tests?

Oct 7 2019, 11:32 AM · Restricted Project

Oct 4 2019

hsaito added inline comments to D67948: [LV] Interleaving should not exceed estimated loop trip count..
Oct 4 2019, 4:40 PM · Restricted Project
hsaito added a comment to D68082: [LV] Emitting SCEV checks with OptForSize.
>> In this regard, I do not like pass managers running Analyses for a Transformation pass w/o first letting the Transformation pass inspect the incoming IR, but that's a totally different discussion and I don't have a solution for that problem.

Maybe I'm misunderstanding, but SCEV is lazy so if a transform looks at the IR and decides not to ask SCEV for trip count or call getSCEV then SCEV will not do any actual work.

Oct 4 2019, 3:44 PM · Restricted Project
hsaito added a comment to D67948: [LV] Interleaving should not exceed estimated loop trip count..

The vectorizer code change looks fine to me. I'd like to see the comments updated, though. Are any more changes needed for the LIT tests?

Oct 4 2019, 2:21 PM · Restricted Project
hsaito added a comment to D68082: [LV] Emitting SCEV checks with OptForSize.

I agree with @Ayal about changing getAsAddRec for OptForSize. In general, it's better to not "Analyze" if we know we won't be using the result of analysis.

Oct 4 2019, 12:35 PM · Restricted Project

Sep 18 2019

hsaito accepted D67690: [LV][NFC] Factor out calculation of "best" estimated trip count..

LGTM

Sep 18 2019, 12:50 PM · Restricted Project

Sep 17 2019

hsaito accepted D67690: [LV][NFC] Factor out calculation of "best" estimated trip count..

Nice clean up of the code as well. LGTM.

Sep 17 2019, 11:18 PM · Restricted Project

Sep 9 2019

hsaito added a comment to D66796: [clang] Loop pragma vectorize(disable).

That's exactly the reason why I think vectorize(disable) should disable vectorisation for that loop. I just don't see what else a user would expect.

Sep 9 2019, 5:38 PM
hsaito added a comment to D66796: [clang] Loop pragma vectorize(disable).

There are two ways to think.

  1. vectorize(disable) as in disable the LoopVectorize pass itself.
  2. vectorize(disable) as in disabling the loop vectorization transformation
Sep 9 2019, 12:56 PM
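
For reference, this is what the pragma under discussion looks like on a loop. It is a minimal sketch with a hypothetical function name; the pragma marks a single loop, and the two readings above differ on what the compiler may still do with that loop. Compilers other than Clang are expected to ignore the pragma, possibly with a warning.

```cpp
#include <cassert>
#include <cstddef>

// Scale an array in place, asking Clang not to vectorize this loop.
inline void scaleNoVectorize(float *data, std::size_t n, float factor) {
  assert(data != nullptr || n == 0);
#pragma clang loop vectorize(disable)
  for (std::size_t i = 0; i < n; ++i)
    data[i] *= factor;
}
```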

Aug 6 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

Nope, Asan failures from the sanitizer commit.

Aug 6 2019, 9:43 AM · Restricted Project
hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

Buildbot failures reported so far. Just FYI. I don't think any action is needed for this commit.

Aug 6 2019, 9:33 AM · Restricted Project

Aug 5 2019

hsaito committed rGec818d7fb3c4: [LV][NFC] Share the LV illegality reporting with LoopVectorize. (authored by hsaito).
[LV][NFC] Share the LV illegality reporting with LoopVectorize.
Aug 5 2019, 11:09 PM
hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

@rengolin @hsaito I understand you were busy during the version 9 pre-releasing. But if you have time and the diff looks good for you, could you land it? Thank you.

I can land this today.

Aug 5 2019, 11:08 PM · Restricted Project
hsaito committed rL367980: [LV][NFC] Share the LV illegality reporting with LoopVectorize..
[LV][NFC] Share the LV illegality reporting with LoopVectorize.
Aug 5 2019, 11:08 PM
hsaito closed D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..
Aug 5 2019, 11:08 PM · Restricted Project
hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

@rengolin @hsaito I understand you were busy during the version 9 pre-releasing. But if you have time and the diff looks good for you, could you land it? Thank you.

Aug 5 2019, 1:33 AM · Restricted Project

Aug 2 2019

hsaito added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Reported buildbot fails for Evgueni to follow up as needed:

Aug 2 2019, 4:19 PM · Restricted Project
hsaito added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

LIT test has been moved to X86 subdir, in response to ps4-buildslave1 buildbot failure.

Aug 2 2019, 12:26 AM · Restricted Project
hsaito committed rG8871ac41a728: Moves the newly added test interleaved-accesses-waw-dependency.ll to X86… (authored by hsaito).
Moves the newly added test interleaved-accesses-waw-dependency.ll to X86…
Aug 2 2019, 12:25 AM
hsaito committed rL367659.
Aug 2 2019, 12:25 AM

Aug 1 2019

hsaito added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

r367654 | hsaito | 2019-08-01 23:31:50 -0700 (Thu, 01 Aug 2019) | 12 lines

Aug 1 2019, 11:41 PM · Restricted Project
hsaito committed rG09fac2450b19: [LV] Avoid building interleaved group in presence of WAW dependency (authored by hsaito).
[LV] Avoid building interleaved group in presence of WAW dependency
Aug 1 2019, 11:36 PM
hsaito committed rL367654.
Aug 1 2019, 11:31 PM
hsaito closed D63981: [LV] Avoid building interleaved group in presence of WAW dependency.
Aug 1 2019, 11:31 PM · Restricted Project
hsaito added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Thanks for quick response @hsaito ! May I ask you to commit it as well since I don't have committer rights yet.

Aug 1 2019, 5:29 PM · Restricted Project

Jul 31 2019

hsaito accepted D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Indeed, this is a much cleaner fix. LGTM.

Jul 31 2019, 10:14 AM · Restricted Project
hsaito added a comment to D63981: [LV] Avoid building interleaved group in presence of WAW dependency.

Hi @hsaito!

Would be nice if you find time to take a look.

Jul 31 2019, 9:56 AM · Restricted Project
hsaito accepted D65197: [LV] Tail-loop Folding.

LGTM, pending the discussion about the exact meaning of the newly introduced "vector predicate" pragma (I expect this to happen outside of this review). Please wait for another day to give others a last-minute opportunity to give feedback.

Jul 31 2019, 9:54 AM · Restricted Project

Jul 29 2019

hsaito added a comment to D65197: [LV] Tail-loop Folding.

Friendly ping :-)

Jul 29 2019, 12:53 PM · Restricted Project

Jul 24 2019

hsaito added a comment to D65197: [LV] Tail-loop Folding.

We probably need to discuss whether vectorize_predicate(enable) should implicitly turn on vectorize(enable). I guess the current behavior is that it does not, right? We don't have to discuss that in this review, but we still want to make a conscious decision one way or the other. Or did I miss that discussion?

Jul 24 2019, 12:42 PM · Restricted Project
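
The question of whether one pragma implies the other matters for code like the following. This sketch assumes a Clang version that accepts `vectorize_predicate`, and the function name is hypothetical. Spelling out both options sidesteps the question; writing only `vectorize_predicate(enable)` relies on whichever behavior is decided.

```cpp
#include <cassert>
#include <cstddef>

// Sum the first n elements. Both options are requested explicitly, so
// the result does not depend on vectorize_predicate(enable) implying
// vectorize(enable).
inline int sumFirstN(const int *data, std::size_t n) {
  assert(data != nullptr || n == 0);
  int sum = 0;
#pragma clang loop vectorize(enable) vectorize_predicate(enable)
  for (std::size_t i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```
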
hsaito accepted D64916: [LV] Scalar Epilogue Lowering. NFC..

LGTM

Jul 24 2019, 12:12 PM · Restricted Project
hsaito added a comment to D64916: [LV] Scalar Epilogue Lowering. NFC..

Just one comment from me.

Jul 24 2019, 10:14 AM · Restricted Project

Jul 19 2019

hsaito added a comment to D64916: [LV] Scalar Epilogue Lowering. NFC..

Looks to be a good change. Can we add one more improvement to this patch: making the ORE message crisper?

Jul 19 2019, 9:24 AM · Restricted Project

Jul 18 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

Aha, that was a "hacky" way to get "loop contains a switch statement" along with the warning. I see. I suppose we can't blindly use LV_NAME, then.
I then suppose some tests (like pr38800.ll) didn't even need -pass-remarks-missed flag (which is incorrectly used, I think).

Jul 18 2019, 11:43 AM · Restricted Project

Jul 17 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

The test no_switch.ll has been updated: the LV doesn't report remarks without -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize', so the test from the CHECK section is removed. I have no idea why, but the remarks created with lambdas throughout the code are emitted via OptimizationRemarkMissed, not OptimizationRemarkAnalysis (if I understand correctly, 'Missed' means the pass was not applied for some reason during optimization).

Jul 17 2019, 10:58 AM · Restricted Project

Jun 25 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

Looking at the expected output and the explanations on -Rpass* flags, it could be that those tests should be using -pass-remarks-analysis=loop-vectorize, instead of -pass-remarks-missed=loop-vectorize. Would you try?

Jun 25 2019, 10:38 AM · Restricted Project

Jun 21 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

The diff has been updated: private methods were removed, the function 'reportVectorizationFailure' was moved to LoopVectorize.cpp and declared in LoopVectorize.h (namespace llvm). I've removed the passName parameter and pass LV_NAME as pass name to ORE but this change breaks 3 regression tests:

I'm assuming that the pass name isn't just LV_NAME, but whichever pass is using that function, which now that it's higher level, can be anything.

@rengolin @hsaito I need your suggestions on how to deal with the Hints.vectorizeAnalysisPassName() method; it is very verbose to call it every time reportVectorizationFailure is invoked.

I don't have a better idea off the top of my head, but this sounds like a change that can be done later as a quick refactoring once someone comes up with a clever replacement.

I'd be ok having that for now... @hsaito may have a better idea, though.

--renato

Jun 21 2019, 12:26 PM · Restricted Project

Jun 14 2019

hsaito added a comment to D62997: [LV] Share the LV illegality reporting with LoopVectorize. NFC..

Sorry, I missed the original review request in the e-mail pile up. Yes, this is the direction I was suggesting. Thank you.

Jun 14 2019, 12:16 PM · Restricted Project

May 30 2019

Vladimir Lazarev <vladimir.lazarev@intel.com> committed rG45b17d6c63f7: [SYCL] add optimized vec class constructors (authored by hsaito).
[SYCL] add optimized vec class constructors
May 30 2019, 8:05 AM
vladimirlaz <vladimir.lazarev@intel.com> committed rG60befd246ea6: [SYCL] Change Indexer to use C++11 feature, from C++14. (authored by hsaito).
[SYCL] Change Indexer to use C++11 feature, from C++14.
May 30 2019, 8:02 AM
vladimirlaz <vladimir.lazarev@intel.com> committed rG8b1894b4f3a6: [SYCL] Improve SYCL vector implementation (authored by hsaito).
[SYCL] Improve SYCL vector implementation
May 30 2019, 8:01 AM

May 28 2019

hsaito added a comment to D62478: [LV] Wrap LV illegality reporting in a function. NFC..

@rengolin It is because the function uses OptimizationRemarkEmitter *ORE, a member of the LoopVectorizationLegality class.

I see. Given the function has a lot of arguments already and it really isn't used elsewhere, I'd rather just add ORE to the args list.

Unless @hsaito or @fhahn think this could be used elsewhere in the vectorizer, then it shouldn't be in that class anyway.

May 28 2019, 10:37 AM · Restricted Project

May 24 2019

hsaito added a comment to D62311: [LV] Inform about exactly reason of loop illegality.

@hsaito I agree, to have a function to report about an error is a good idea.

I'm new to the LLVM community, so what does NFC mean? Should I close this review and open a new one, or do you mean I should just upload a new diff for comments?

May 24 2019, 11:08 AM · Restricted Project
hsaito added a comment to D62311: [LV] Inform about exactly reason of loop illegality.

While we are looking at this, I'd like to discuss how to make these things easier. I think there is merit in using a utility function that takes three strings, something along the lines of the following pseudo code:

May 24 2019, 10:38 AM · Restricted Project

May 9 2019

hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

What's the status of this? It seems like discussion has died down a bit. I think Graham's idea to change from <scalable 2 x float> to <vscale x 2 x float> will make the IR more readable/understandable but it's not a show-stopper for me.

Are there any other outstanding issues to address before this lands?

May 9 2019, 10:45 AM · Restricted Project

Apr 30 2019

hsaito added inline comments to D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize..
Apr 30 2019, 9:27 AM · Restricted Project

Apr 29 2019

hsaito added a comment to D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize..

I verified that the two unroll flags propagate to the option set in the PassManagerBuilder. This change + the clang change in D61142 should not make any visible change for clang users.

Apr 29 2019, 6:07 PM · Restricted Project
hsaito added reviewers for D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.: hfinkel, rengolin, mkuper, fhahn, hsaito.
Apr 29 2019, 6:06 PM · Restricted Project

Apr 25 2019

hsaito added a comment to D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize..

You mean, clang setting EnableLoopInterleaving.

Actually I meant DisableUnrollLoops in the PassManagerBuilder. It's currently always set to false, not based on a flag, and I found a single instance where its value is changed in clang (see clang patch).

I'm happy to make whatever change needed to clang first, but I don't see what that change is. If you could point me to what I've missed, that would be great!

Apr 25 2019, 5:41 PM · Restricted Project
hsaito added a comment to D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize..

Thanks a lot for the suggestion! Sent: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131968.html

My intention was to not make any change visible to clang.
If clang currently sets the DisableUnrollLoops, and llvm will not use that for LoopVectorization, then have clang set LoopsInterleaved to the same value as the one used for unroll loops.

Apr 25 2019, 12:42 PM · Restricted Project
hsaito added a comment to D61030: [PassManagerBuilder] Add option for interleaved loops, for loop vectorize..

Update test.

Apr 25 2019, 11:30 AM · Restricted Project

Apr 24 2019

hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

Exactly. Non-constant values can become constant. Constant values can be guarded by vscale-dependent runtime guards (both hand-written and compiler generated). My preference is to leave this not restricted to vscale == 1 values, but rather allow all values that can be supported at runtime, and have it be UB if, at runtime, the relevant index is not available.

+1

Apr 24 2019, 10:23 AM · Restricted Project

Apr 23 2019

hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Good point. 100% agree. I was only considering the constant case.

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

Apr 23 2019, 10:39 AM · Restricted Project
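
The agreement proposed in this exchange can be sketched as two checks for a vector of type `<vscale x n x T>`. The function names are hypothetical and this models only the rule as stated in the comment: constant indices are limited to `0..n-1`, while a non-constant index is valid whenever it is below the runtime element count `n * vscale`.

```cpp
#include <cassert>

// Constant literal index: accepted only if valid for every possible
// vscale, i.e. below the minimum element count n.
inline bool constantIndexAccepted(unsigned n, unsigned index) {
  return index < n;
}

// Non-constant index: valid at runtime when it is below the actual
// element count n * vscale. Out-of-range use would be undefined
// behavior under the preference stated in the thread.
inline bool runtimeIndexInBounds(unsigned n, unsigned vscale, unsigned index) {
  assert(vscale >= 1);
  return index < n * vscale;
}
```

For `n = 4`, index 5 is rejected as a constant but is in bounds at runtime when `vscale` is 2.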

Apr 17 2019

hsaito added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

Do we really need both vp.fadd() and vp.constrained.fadd()? Can't we just use the latter with rmInvalid/ebInvalid? That should prevent vp.constrained.fadd from losing optimizations w/o good reasons.

According to the LLVM langref, "fpexcept.ignore" seems to be the right option for exceptions whereas there is no "round.permissive" option for the rounding behavior. Abusing rmInvalid/ebInvalid seems hacky.

Apr 17 2019, 9:54 AM · Restricted Project

Apr 16 2019

hsaito added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

Updates

  • added constrained fp intrinsics (IR level only).
  • initial support for mapping llvm.experimental.constrained.* intrinsics to llvm.vp.constrained.*.
Apr 16 2019, 2:27 PM · Restricted Project

Apr 15 2019

hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

The main difference is for 'shufflevector'. For a splat it's simple, since you just use a zeroinitializer mask. For anything else, though, you currently need a constant vector with immediate values; this obviously won't work if you don't know how many elements there are.

Apr 15 2019, 1:04 PM · Restricted Project
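
The remark about `shufflevector` can be illustrated with a fixed-width model. The sketch below is plain C++ with hypothetical names, not LLVM API: each mask entry selects one lane from the concatenation of the two inputs, so an all-zero mask (the analogue of a `zeroinitializer` mask) broadcasts lane 0. Any other mask needs one explicit entry per output lane, which is what cannot be written down when the lane count is unknown at compile time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-width model of a two-input shuffle: result[i] is lane mask[i]
// of the concatenation (v1, v2).
template <typename T, std::size_t N, std::size_t M>
std::array<T, M> shuffleVector(const std::array<T, N> &v1,
                               const std::array<T, N> &v2,
                               const std::array<std::size_t, M> &mask) {
  std::array<T, M> result{};
  for (std::size_t i = 0; i < M; ++i) {
    const std::size_t idx = mask[i];
    assert(idx < 2 * N);
    result[i] = idx < N ? v1[idx] : v2[idx - N];
  }
  return result;
}

// Splat: an all-zero mask selects lane 0 for every output lane.
template <typename T, std::size_t N>
std::array<T, N> splatLane0(const std::array<T, N> &v) {
  const std::array<std::size_t, N> zeroMask{}; // value-initialized to zeros
  return shuffleVector(v, v, zeroMask);
}
```
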
hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

  1. Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.

Given the current state, this is the easiest path.

I agree, although this is an important part of the model, so we should start having this discussion in parallel (sooner rather than later). I had been under the impression that a set of intrinsics were being proposed for this, but extending shufflevector is also an option worth considering. If these are first-class types, then having first-class instruction support is probably the right path. This deserves its own RFC.

  1. Introduce additional intrinsics for the other shuffle variants as needed
  2. Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

This warrants a larger discussion, which would hinder the current progress.

I agree. We should have a separate RFC on this.

Apr 15 2019, 10:11 AM · Restricted Project

Apr 12 2019

hsaito added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

Apr 12 2019, 11:27 AM · Restricted Project

Apr 5 2019

hsaito added inline comments to D32530: [SVE][IR] Scalable Vector IR Type.
Apr 5 2019, 4:02 PM · Restricted Project

Mar 29 2019

hsaito added inline comments to D59723: [NewPassManager] Adding pass tuning options: loop vectorize..
Mar 29 2019, 12:34 PM · Restricted Project

Mar 28 2019

hsaito accepted D59952: [VPLAN] Minor improvement to testing and debug messages..

LGTM

Mar 28 2019, 2:57 PM · Restricted Project
hsaito added inline comments to D59952: [VPLAN] Minor improvement to testing and debug messages..
Mar 28 2019, 12:25 PM · Restricted Project
hsaito added inline comments to D59952: [VPLAN] Minor improvement to testing and debug messages..
Mar 28 2019, 12:19 PM · Restricted Project
hsaito added a comment to D57598: [VPLAN] Determine Vector Width programmatically..

Thanks Francesco!

Mar 28 2019, 11:03 AM · Restricted Project

Mar 27 2019

hsaito added a comment to D57598: [VPLAN] Determine Vector Width programmatically..

Thanks Francesco! I'll commit the change tomorrow, unless @hsaito does it today :)

Mar 27 2019, 2:36 PM · Restricted Project

Mar 25 2019

hsaito added a comment to D57598: [VPLAN] Determine Vector Width programmatically..

@hsaito, I don't have commit access, could you commit this change for me?

Mar 25 2019, 7:58 PM · Restricted Project

Mar 22 2019

hsaito added a comment to D57598: [VPLAN] Determine Vector Width programmatically..

@npanchen , @fhahn , gentle ping :)

Francesco

Mar 22 2019, 3:37 PM · Restricted Project
hsaito added inline comments to D59149: [LV] move useEmulatedMaskMemRefHack() functionality to TTI..
Mar 22 2019, 2:26 PM · Restricted Project

Mar 19 2019

hsaito removed a reviewer for D57978: [CodeGen] Generate follow-up metadata for loops with more than one transformation.: hsaito.
Mar 19 2019, 2:54 PM · Restricted Project, Restricted Project
hsaito added a comment to D57978: [CodeGen] Generate follow-up metadata for loops with more than one transformation..

ping

Mar 19 2019, 2:51 PM · Restricted Project, Restricted Project
hsaito added inline comments to D59149: [LV] move useEmulatedMaskMemRefHack() functionality to TTI..
Mar 19 2019, 10:06 AM · Restricted Project

Mar 14 2019

hsaito accepted D57598: [VPLAN] Determine Vector Width programmatically..

LGTM. Please wait for a few days to give others a chance to go over your updated patch.

Mar 14 2019, 12:21 PM · Restricted Project
hsaito added inline comments to D57598: [VPLAN] Determine Vector Width programmatically..
Mar 14 2019, 10:26 AM · Restricted Project
hsaito added inline comments to D57598: [VPLAN] Determine Vector Width programmatically..
Mar 14 2019, 9:54 AM · Restricted Project