SjoerdMeijer (Sjoerd Meijer)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 26 2016, 2:17 AM (207 w, 4 d)

Recent Activity

Fri, Jan 17

SjoerdMeijer updated the diff for D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..

Thanks for looking! Comments addressed, the main ones being:

  • restored the statistics, and
  • the code that I forgot to delete.
Fri, Jan 17, 5:22 AM · Restricted Project
SjoerdMeijer accepted D71837: [ARM][MVE] Tail Predicate IsSafeToRemove.

Thanks, LGTM

Fri, Jan 17, 5:13 AM · Restricted Project
SjoerdMeijer added inline comments to D71837: [ARM][MVE] Tail Predicate IsSafeToRemove.
Fri, Jan 17, 2:06 AM · Restricted Project
SjoerdMeijer added a comment to D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

Thanks guys, will address this before committing.

Fri, Jan 17, 1:28 AM · Restricted Project
SjoerdMeijer updated the diff for D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

I think this should cover it now, and address the comments:

  • RematerializeIterCount() is only called when we've cloned the VCTP instruction in an exit block; this makes the unrelated test case go away.
  • RematerializeIterCount() has become really simple, with just a call to formLCSSARecursively() (and of course rewriteLoopExitValues()). When there's nothing to do, formLCSSARecursively() is a cheap operation, so we indeed don't need to check first whether the loop is in LCSSA form.
  • One of the motivating examples for this work and optimisation is a reduction in vector-reduce-mve-tail.ll. I decided to use formLCSSARecursively() instead of manually inserting a phi, because there are a few values live-out: the ones that feed the final reduction. We would need to insert 3 phi nodes, so I don't think it is worth optimising, and calling formLCSSARecursively() is best.
  • This is tested by existing tests. For example, for the reductions in vector-reduce-mve-tail.ll the loop is brought into LCSSA form first, so that test covers this. Simpler tests elsewhere check the case where this doesn't need to happen first.
Fri, Jan 17, 12:55 AM · Restricted Project

Wed, Jan 15

SjoerdMeijer updated the diff for D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

I have removed WIP and TODO from the patch subject/description, as I think this revision now demonstrates what this patch should be doing:

  • I have removed LoopPass from this, as this indeed causes a few more passes to run, which we don't need, and causes (unrelated) changes elsewhere. Now, the only test changes are in the LowOverheadLoop tests, which is what we want.
  • I use formLCSSA() to transform the loop into LCSSA form if it isn't.
Wed, Jan 15, 7:16 AM · Restricted Project
SjoerdMeijer added a comment to D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

Thanks for sharing your thoughts!

Wed, Jan 15, 2:56 AM · Restricted Project

Tue, Jan 14

SjoerdMeijer added a child revision for D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC.: D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.
Tue, Jan 14, 9:27 AM · Restricted Project
SjoerdMeijer added a parent revision for D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks: D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..
Tue, Jan 14, 9:27 AM · Restricted Project
SjoerdMeijer created D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.
Tue, Jan 14, 9:08 AM · Restricted Project
SjoerdMeijer committed rGa08c0adee072: [ARM][MVE] VTP Block Pass fix (authored by SjoerdMeijer).
[ARM][MVE] VTP Block Pass fix
Tue, Jan 14, 8:12 AM
SjoerdMeijer closed D72699: [ARM][MVE] VTP Block fix.
Tue, Jan 14, 8:12 AM · Restricted Project
SjoerdMeijer updated the diff for D72699: [ARM][MVE] VTP Block fix.

Just added a comment for clarity saying that the IR function was intentionally left empty in the MIR test (the offending MIR sequence was extracted from a larger reproducer, and there's no need to copy the corresponding IR).

Tue, Jan 14, 6:35 AM · Restricted Project
SjoerdMeijer created D72699: [ARM][MVE] VTP Block fix.
Tue, Jan 14, 5:56 AM · Restricted Project

Mon, Jan 13

SjoerdMeijer added a comment to D72633: [ARM][MVE] Fix a corner case of checking for MVE-I with -mfpu=none.

I guess this probably makes sense, but just checking: why should this have the effect of enabling MVE-I? Is there prior art where -mfpu=none has a similar effect? Has this been synchronised with the GCC community?
Do we need more tests, e.g. target parser unit tests, and some more negative tests, e.g. when fp16 isn't set or with the hard float ABI?

Mon, Jan 13, 11:31 AM
SjoerdMeijer added a comment to D71837: [ARM][MVE] Tail Predicate IsSafeToRemove.

A round of nits:

Mon, Jan 13, 7:06 AM · Restricted Project
SjoerdMeijer accepted D72504: [ARM][LowOverheadLoops] Change predicate inspection.

LGTM, with just one nit:

Mon, Jan 13, 5:57 AM · Restricted Project
SjoerdMeijer accepted D72509: [ARM][LowOverheadLoops] Allow all MVE instrs..

LGTM, with just a nit inlined.

Mon, Jan 13, 5:38 AM · Restricted Project
SjoerdMeijer created D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..
Mon, Jan 13, 5:20 AM · Restricted Project
SjoerdMeijer committed rGadd04b965384: ARMLowOverheadLoops: return earlier to avoid printing irrelevant dbg msg. NFC (authored by SjoerdMeijer).
ARMLowOverheadLoops: return earlier to avoid printing irrelevant dbg msg. NFC
Mon, Jan 13, 2:25 AM
SjoerdMeijer added a comment to D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

Thanks, and comment addressed in rG07028b5a8780.

Mon, Jan 13, 1:14 AM · Restricted Project
SjoerdMeijer committed rG07028b5a8780: [SCEV] Follow up of D71563: addressing post commit comment. NFC. (authored by SjoerdMeijer).
[SCEV] Follow up of D71563: addressing post commit comment. NFC.
Mon, Jan 13, 1:02 AM

Fri, Jan 10

SjoerdMeijer committed rG4569f63ae1cb: ARMLowOverheadLoops: a few more dbg msgs to better trace rejected TP loops. NFC. (authored by SjoerdMeijer).
ARMLowOverheadLoops: a few more dbg msgs to better trace rejected TP loops. NFC.
Fri, Jan 10, 6:17 AM
SjoerdMeijer committed rG356685a1d897: Follow up of 67bf9a6154d4b82c, minor fix in test case, removed duplicate option (authored by SjoerdMeijer).
Follow up of 67bf9a6154d4b82c, minor fix in test case, removed duplicate option
Fri, Jan 10, 1:48 AM
SjoerdMeijer committed rG67bf9a6154d4: [SVEV] Recognise hardware-loop intrinsic loop.decrement.reg (authored by SjoerdMeijer).
[SVEV] Recognise hardware-loop intrinsic loop.decrement.reg
Fri, Jan 10, 1:36 AM
SjoerdMeijer closed D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.
Fri, Jan 10, 1:36 AM · Restricted Project
SjoerdMeijer added a comment to D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

Thanks, and I will add that before committing.

Fri, Jan 10, 1:24 AM · Restricted Project

Thu, Jan 9

SjoerdMeijer updated the diff for D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

I have added test cases for LFTR and loop unrolling showing that these transformations don't trigger, as int_loop_decrement_reg is described as IntrNoDuplicate.

Thu, Jan 9, 6:56 AM · Restricted Project
SjoerdMeijer added inline comments to D68203: Add support for (expressing) vscale..
Thu, Jan 9, 2:08 AM · Restricted Project
SjoerdMeijer committed rG8f1887456ab4: [LV] Still vectorise when tail-folding can't find a primary inducation variable (authored by SjoerdMeijer).
[LV] Still vectorise when tail-folding can't find a primary inducation variable
Thu, Jan 9, 1:21 AM
SjoerdMeijer closed D72324: [LV] Still vectorise when tail-folding can't find a primary inducation variable.
Thu, Jan 9, 1:21 AM · Restricted Project

Wed, Jan 8

SjoerdMeijer updated subscribers of D57054: [MachineOutliner][ARM][RFC] Add Machine Outliner support for ARM.

Just a quick message, linking in @samparker, and I guess moving Low Overhead Loops to run before ConstantIslands could be problematic, but we can/should have a proper look tomorrow.

Wed, Jan 8, 1:09 PM · Restricted Project, Restricted Project
SjoerdMeijer added inline comments to D72324: [LV] Still vectorise when tail-folding can't find a primary inducation variable.
Wed, Jan 8, 8:18 AM · Restricted Project
SjoerdMeijer added inline comments to D68203: Add support for (expressing) vscale..
Wed, Jan 8, 7:59 AM · Restricted Project
SjoerdMeijer added inline comments to D68203: Add support for (expressing) vscale..
Wed, Jan 8, 6:16 AM · Restricted Project
SjoerdMeijer updated the diff for D72324: [LV] Still vectorise when tail-folding can't find a primary inducation variable.

Thanks for looking at this! And also for encouraging me to look at this (my own) spaghetti logic again. But to be fair, we have quite a few factors that play a role here: optimising for minsize takes precedence over the prefer predicate options, which take precedence over the loop hints, which take precedence over the TTI hook. I have explained this in the comments, and have reshuffled the logic accordingly. I am now bailing earlier on PredicateOptDisabled; as you suggested, loop hints need to be checked last. I think this addresses your comments.
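
The precedence chain described above (minsize, then the prefer predicate options, then the loop hints, then the TTI hook) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in the patch: `Inputs`, `DecidedBy` and `decidingSource` are hypothetical names invented for illustration, and the sketch only reports which source gets to decide, not what it decides.

```cpp
#include <optional>

// Hypothetical, simplified model of the precedence described above.
// None of these names are taken from LoopVectorize or from D72324.
enum class DecidedBy { MinSize, PredicateOption, LoopHint, TTIHook };

struct Inputs {
  bool OptForMinSize;                     // function is optimised for minsize
  std::optional<bool> PreferPredicateOpt; // command-line option, if given
  std::optional<bool> LoopHint;           // per-loop hint, if present
  bool TTIPrefersPredication;             // target hook, always available
};

// Each source is consulted only when every higher-priority source is silent.
DecidedBy decidingSource(const Inputs &In) {
  if (In.OptForMinSize)
    return DecidedBy::MinSize;
  if (In.PreferPredicateOpt.has_value())
    return DecidedBy::PredicateOption;
  if (In.LoopHint.has_value())
    return DecidedBy::LoopHint;
  return DecidedBy::TTIHook; // fallback: ask the target
}
```

Writing the checks as a chain of early returns keeps the ordering explicit, so each source is only reached once all the sources above it have been ruled out.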

Wed, Jan 8, 5:23 AM · Restricted Project

Tue, Jan 7

SjoerdMeijer added a comment to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.

Thanks, LGTM again

Tue, Jan 7, 8:39 AM · Restricted Project
SjoerdMeijer committed rGee811808a9a0: [ARM][MVE] Renamed VPT Block tests and files to something more informative. NFC (authored by SjoerdMeijer).
[ARM][MVE] Renamed VPT Block tests and files to something more informative. NFC
Tue, Jan 7, 8:21 AM
SjoerdMeijer added a comment to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.

A last question/request, I guess we need 2 more tests? For completeness probably best to have a test with {{-enable-arm-maskedgatscat=false}} and another one with {{-mattr=-mve}}?

Tue, Jan 7, 6:55 AM · Restricted Project
SjoerdMeijer committed rGe34801c8e6df: [ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS (authored by SjoerdMeijer).
[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS
Tue, Jan 7, 6:00 AM
SjoerdMeijer closed D71470: Recommit "[ARM][MVE] findVCMPToFoldIntoVPS".
Tue, Jan 7, 6:00 AM · Restricted Project
SjoerdMeijer created D72324: [LV] Still vectorise when tail-folding can't find a primary inducation variable.
Tue, Jan 7, 5:13 AM · Restricted Project
SjoerdMeijer added a comment to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.

Looks good to me too. Other than the irrelevant nits, I have one question though; see inlined.

Tue, Jan 7, 3:51 AM · Restricted Project

Mon, Jan 6

SjoerdMeijer updated the diff for D71470: Recommit "[ARM][MVE] findVCMPToFoldIntoVPS".

Thanks for taking a look, I have added test case mve-vpt-block-fold-vcmp.mir.

Mon, Jan 6, 7:15 AM · Restricted Project
SjoerdMeijer added inline comments to D71837: [ARM][MVE] Tail Predicate IsSafeToRemove.
Mon, Jan 6, 3:03 AM · Restricted Project
SjoerdMeijer committed rG0efc9e5a8cc1: [ARM][MVE] More MVETailPredication debug messages. NFC. (authored by SjoerdMeijer).
[ARM][MVE] More MVETailPredication debug messages. NFC.
Mon, Jan 6, 1:59 AM
SjoerdMeijer closed D71549: [ARM][MVE] Some more dbg messages for MVETailPredication. NFC..
Mon, Jan 6, 1:58 AM · Restricted Project

Fri, Dec 20

SjoerdMeijer added a comment to D71696: [CodeGen] WIP MachinePostRAUpdater.

Good stuff!

Fri, Dec 20, 11:11 AM · Restricted Project

Dec 18 2019

SjoerdMeijer accepted D71107: [ARM][MVE] Tail predicate in the presence of vcmp.

Cheers, nice one, LGTM

Dec 18 2019, 1:14 AM · Restricted Project

Dec 17 2019

SjoerdMeijer added a comment to D71609: [ARM][MVE] Fixes for tail predication..

Can you upload the patch with context?

Dec 17 2019, 7:49 AM · Restricted Project
SjoerdMeijer added a comment to D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

Thanks again for sharing your thoughts and comments. I've done a bit of my homework:

Dec 17 2019, 3:47 AM · Restricted Project

Dec 16 2019

SjoerdMeijer added a comment to D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

Thanks for taking a look!

Dec 16 2019, 1:14 PM · Restricted Project
SjoerdMeijer created D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.
Dec 16 2019, 12:07 PM · Restricted Project
SjoerdMeijer created D71549: [ARM][MVE] Some more dbg messages for MVETailPredication. NFC..
Dec 16 2019, 7:01 AM · Restricted Project
SjoerdMeijer added a comment to D71107: [ARM][MVE] Tail predicate in the presence of vcmp.

One more nit about adding comments. This is a nice summary / description:

Dec 16 2019, 2:39 AM · Restricted Project
SjoerdMeijer accepted D71465: [ARM][MVE] Tail predicate bottom/top muls..

Looks reasonable

Dec 16 2019, 2:21 AM · Restricted Project
SjoerdMeijer closed D71426: [ARM] Move MVEOpcodes helpers to ARMBaseInstrInfo. NFC..

Committed in 049f9672d8566f0d0a115f11e2a53018ea502b10
(I had a typo in the tag Diferential Revision, so this didn't get closed automatically)

Dec 16 2019, 1:54 AM · Restricted Project
SjoerdMeijer updated the diff for D71470: Recommit "[ARM][MVE] findVCMPToFoldIntoVPS".

Delayed deleting the VCMP, in order not to invalidate the RDA analysis.

Dec 16 2019, 1:45 AM · Restricted Project
SjoerdMeijer committed rG049f9672d856: [ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC. (authored by SjoerdMeijer).
[ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.
Dec 16 2019, 1:18 AM

Dec 13 2019

SjoerdMeijer added inline comments to D71470: Recommit "[ARM][MVE] findVCMPToFoldIntoVPS".
Dec 13 2019, 9:38 AM · Restricted Project
SjoerdMeijer created D71470: Recommit "[ARM][MVE] findVCMPToFoldIntoVPS".
Dec 13 2019, 8:24 AM · Restricted Project
SjoerdMeijer committed rGe91420e17da3: Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC." (authored by SjoerdMeijer).
Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC."
Dec 13 2019, 3:59 AM
SjoerdMeijer added a reverting change for rG9468e3334ba5: [ARM][MVE] findVCMPToFoldIntoVPS. NFC.: rGe91420e17da3: Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC.".
Dec 13 2019, 3:59 AM
SjoerdMeijer updated the diff for D71426: [ARM] Move MVEOpcodes helpers to ARMBaseInstrInfo. NFC..

Ah yes, thanks, I forgot about ARMBaseInstrInfo, but that's definitely the place where this should live.

Dec 13 2019, 3:31 AM · Restricted Project

Dec 12 2019

SjoerdMeijer retitled D71426: [ARM] Move MVEOpcodes helpers to ARMBaseInstrInfo. NFC. from [ARM] Create utility MVEOpcodes to [ARM] Utility functions MVEOpcodes. NFC..
Dec 12 2019, 9:37 AM · Restricted Project
SjoerdMeijer created D71426: [ARM] Move MVEOpcodes helpers to ARMBaseInstrInfo. NFC..
Dec 12 2019, 9:37 AM · Restricted Project
SjoerdMeijer committed rG9468e3334ba5: [ARM][MVE] findVCMPToFoldIntoVPS. NFC. (authored by SjoerdMeijer).
[ARM][MVE] findVCMPToFoldIntoVPS. NFC.
Dec 12 2019, 7:45 AM
SjoerdMeijer closed D71330: [ARM][MVE] findVCMPToFoldIntoVPS. NFC..
Dec 12 2019, 7:45 AM · Restricted Project
SjoerdMeijer accepted D71410: [ARM][MVE] Make VPT invalid for tail predication.

Most of this is a test change!

Dec 12 2019, 7:36 AM · Restricted Project
SjoerdMeijer added a comment to D71410: [ARM][MVE] Make VPT invalid for tail predication.

Can we test this?

Dec 12 2019, 5:10 AM · Restricted Project
SjoerdMeijer updated the diff for D71330: [ARM][MVE] findVCMPToFoldIntoVPS. NFC..

Yep, more RDA

Dec 12 2019, 5:03 AM · Restricted Project
SjoerdMeijer updated the diff for D71330: [ARM][MVE] findVCMPToFoldIntoVPS. NFC..

Now using RDA

Dec 12 2019, 3:13 AM · Restricted Project
SjoerdMeijer added a comment to D71330: [ARM][MVE] findVCMPToFoldIntoVPS. NFC..

It's not that this is important, but the code would benefit from a rewrite. This was an easier, more straightforward change, but agreed that RDA would be even better, so will see what I can do.

Dec 12 2019, 12:16 AM · Restricted Project

Dec 11 2019

SjoerdMeijer added a comment to D71107: [ARM][MVE] Tail predicate in the presence of vcmp.

Yep, that's a lot of changes... :-)
I haven't read or digested everything yet, so this is a first round of nits.

Dec 11 2019, 9:33 AM · Restricted Project
SjoerdMeijer committed rG021685491727: [Clang] Pragma vectorize_width() implies vectorize(enable) (authored by SjoerdMeijer).
[Clang] Pragma vectorize_width() implies vectorize(enable)
Dec 11 2019, 2:47 AM
SjoerdMeijer closed D69628: [Clang] Pragma vectorize_width() implies vectorize(enable), take 3.
Dec 11 2019, 2:47 AM · Restricted Project
SjoerdMeijer committed rGd97cf1f88902: [ARM][LowOverheadLoops] Remove dead loop update instructions. (authored by SjoerdMeijer).
[ARM][LowOverheadLoops] Remove dead loop update instructions.
Dec 11 2019, 2:26 AM
SjoerdMeijer closed D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.
Dec 11 2019, 2:25 AM · Restricted Project
SjoerdMeijer added a comment to D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.

Thanks for the reviews.

Dec 11 2019, 2:11 AM · Restricted Project
SjoerdMeijer created D71330: [ARM][MVE] findVCMPToFoldIntoVPS. NFC..
Dec 11 2019, 1:57 AM · Restricted Project

Dec 10 2019

SjoerdMeijer updated the diff for D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.

Thanks, comments addressed.

Dec 10 2019, 8:46 AM · Restricted Project
SjoerdMeijer added a comment to D71200: [TypePromotion] Query target register width.

Ah, didn't see the LGTM until I pressed the button myself.

Dec 10 2019, 3:45 AM · Restricted Project
SjoerdMeijer accepted D71200: [TypePromotion] Query target register width.

Looks like a good fix to me.

Dec 10 2019, 3:44 AM · Restricted Project
SjoerdMeijer updated the diff for D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.

I'm not sure this is doing what you want.. wouldn't the vctp likely be a use before def..?

Dec 10 2019, 3:26 AM · Restricted Project

Dec 6 2019

SjoerdMeijer accepted D71109: [ARM] Disable VLD4 under MVE.

Currently this prevents us from compiling quite a few codes, so this LGTM while we sort things out.

Dec 6 2019, 8:29 AM · Restricted Project
SjoerdMeijer added inline comments to D71109: [ARM] Disable VLD4 under MVE.
Dec 6 2019, 8:11 AM · Restricted Project
SjoerdMeijer updated the diff for D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.

Comments addressed and thanks for catching that problem, a test case has been added.

Dec 6 2019, 7:43 AM · Restricted Project
SjoerdMeijer added inline comments to D71109: [ARM] Disable VLD4 under MVE.
Dec 6 2019, 4:11 AM · Restricted Project

Dec 4 2019

SjoerdMeijer accepted D70998: [ARM] Enable TypePromotion by default.

We've had this enabled by default for almost 1 year downstream now, which is a decent amount of time to shake out some codegen bugs. I agree that flipping the switch now makes sense, and hopefully similar non-X86 targets enjoy decent performance improvements too.

Dec 4 2019, 5:29 AM · Restricted Project
SjoerdMeijer created D71007: [ARM][LowOverheadLoops] Remove dead loop update instructions.
Dec 4 2019, 5:11 AM · Restricted Project

Nov 29 2019

SjoerdMeijer accepted D69556: [CodeGen] Move ARMCodegenPrepare to TypePromotion.

Hi Sam, about the AArch64 results: I assume you targeted the A32 ISA, can you confirm that? Because that would mean we have 2 data points, i.e. at least 2 ISAs for which this is beneficial: T32 and A32.

Nov 29 2019, 7:25 AM · Restricted Project
SjoerdMeijer accepted D70841: [ARM][MVE] Sink vector shift operand.

LGTM

Nov 29 2019, 4:29 AM · Restricted Project
SjoerdMeijer accepted D70822: [ARM] Add some VCMP folding and canonicalisation.

Yep, nice combine

Nov 29 2019, 2:08 AM · Restricted Project
SjoerdMeijer accepted D70790: [ARM] Favour post inc for MVE loops.

LGTM

Nov 29 2019, 1:33 AM · Restricted Project

Nov 28 2019

SjoerdMeijer accepted D70824: [ARM] Add ARMCC constants to tablegen. NFC.

Oh yes! Nice one!

Nov 28 2019, 7:09 AM · Restricted Project

Nov 27 2019

SjoerdMeijer added a comment to D69350: [ARM] Replace arm_neon_vqadds with sadd_sat.

Still LGTM

Nov 27 2019, 3:44 AM · Restricted Project, Restricted Project
SjoerdMeijer accepted D70669: [AArch64TTI] Compute imm materialization cost for AArch64 intrinsics.

Thanks, and looks reasonable to me.

Nov 27 2019, 1:28 AM · Restricted Project

Nov 26 2019

SjoerdMeijer added a comment to D70669: [AArch64TTI] Compute imm materialization cost for AArch64 intrinsics.

I wanted to look into this to make a less hand-wavy suggestion, but I got a little bit lost, which always happens when I look in that jungle that is TTI and TTIImpl etc., so will just ask a question instead.

Nov 26 2019, 1:39 AM · Restricted Project

Nov 25 2019

SjoerdMeijer added inline comments to D70669: [AArch64TTI] Compute imm materialization cost for AArch64 intrinsics.
Nov 25 2019, 11:50 AM · Restricted Project

Nov 23 2019

SjoerdMeijer added a comment to rG825235c140e7: Revert "[Sema] Use the canonical type in function isVector".

I don't think I have full context of the problem.... but in this diff I see type __fp16, which is the storage-only type. But if we are talking about v8.2-A and FP16, then we are talking about the native type (e.g. source-language type _Float16), and for that you need to enable the +fp16 architecture extension. I would have to refresh my memory, but I thought -fnative-half-type was used to enable OpenCL fp16 types? Either way, it looks like there's interaction between __fp16 and native FP16 types, but I don't think adding -fnative-half-type is the right approach. But as I said, I don't know exactly what the problem is, and would need to have a proper look when I'm back at my desk.

Nov 23 2019, 1:55 AM