rengolin (Renato Golin)
Toolchain Engineer

Projects

User does not belong to any projects.

User Details

User Since
Oct 19 2012, 12:57 AM (309 w, 8 h)

Recent Activity

Fri, Sep 14

rengolin added a comment to D18086: Fix default processor name for armv6k..

Corrections are always welcome. Please submit an update to this thread (or a new patch) to fix the changes you propose.

Fri, Sep 14, 6:07 AM
rengolin added a comment to D18086: Fix default processor name for armv6k..

If j-s doesn't exist (remember, most of that list came from the ancient days of llvm, so could very easily be completely wrong, but "works"), I'm all for removing it and making a new (correct) CPU name as the default for armv6k.

Fri, Sep 14, 4:45 AM

Thu, Sep 13

rengolin added inline comments to D18086: Fix default processor name for armv6k..
Thu, Sep 13, 12:50 PM
rengolin added a comment to D18086: Fix default processor name for armv6k..

I seem to have a related issue: I am using -march=armv6k and -no-integrated-as, this generates the following output:
/tmp/empty-bc2ea3.s:4: Error: unknown cpu `arm1176j-s'

Thu, Sep 13, 11:40 AM

Tue, Sep 11

rengolin accepted D49488: [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC).

Thanks Florian! LGTM too.

Tue, Sep 11, 9:02 AM

Thu, Aug 30

rengolin added a comment to D51465: Revamp test-suite documentation.

Very detailed, using the modern infrastructure and helpful even to those not using clang or wanting to run external benchmarks.

Thu, Aug 30, 2:59 AM

Jul 19 2018

rengolin added reviewers for D49563: [ARM] Add new target feature to fuse literal generation: efriedma, SjoerdMeijer, peter.smith, thopre, kristof.beyls, aadg.

I shall, at least for Exynos, but I'd like to hear from our friends at ARM too.

Jul 19 2018, 1:22 PM
rengolin added reviewers for D49168: [LV] Add a new reduction pattern match: fhahn, RKSimon, dcaballe, hsaito.

Hi Takahiro,

Jul 19 2018, 1:13 PM
rengolin added a comment to D49563: [ARM] Add new target feature to fuse literal generation.

Hi Evandro, looks great, short and simple!

Jul 19 2018, 1:11 PM
rengolin added a comment to D49488: [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC).

Moved getMemInstAlignment and getMemInstAddressSpace to IR/Instructions.h which already contains similar helpers. Should I rename them to getLoadStoreAlignment and getLoadStoreAddressSpace to be more in line with the existing getLoadStorePointerOperand?

Jul 19 2018, 12:38 PM
rengolin added a comment to D49488: [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC).

Thanks for having a look so quickly! Maybe there is a better place to put it, maybe we should keep it local to lib/Transforms/Vectorize/? I initially put it in VectorUtils.h, because I wanted to avoid creating unnecessary new files (and splitting things up unnecessarily), but given that VectorUtils.h is used in quite a few places, I am happy to put it wherever it would fit best :)

Jul 19 2018, 6:27 AM
rengolin added a comment to D49491: [RFC][VPlan, SLP] Add simple SLP analysis on top of VPlan..

I'll let Florian/Hideki reply about timeframes and strategies, and will just focus on specific items you list.

Jul 19 2018, 5:17 AM

Jul 18 2018

rengolin added a comment to D49491: [RFC][VPlan, SLP] Add simple SLP analysis on top of VPlan..

The fact of the matter is that the loop vectorization has a need to understand SLP and SLP vectorizer needs to understand Loop. As such, unless we want to build/maintain separate LoopVectorize+SLP and SLPVectorize+Loop, consolidation of LoopVectorization and SLPVectorization will inevitably happen sooner or later. From that perspective, ensuring that VPlan is the right infrastructure for such consolidation is a very important thing for us to do.

Jul 18 2018, 12:57 PM
rengolin added a comment to D49491: [RFC][VPlan, SLP] Add simple SLP analysis on top of VPlan..

My tuppence...

Jul 18 2018, 11:58 AM
rengolin added a comment to D49488: [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC).

I like the idea of the interleave analysis being at a higher ground, but we have to be careful with LV-specific logic and only hoist what is truly generic.

Jul 18 2018, 8:41 AM

Jun 21 2018

rengolin added a comment to D48332: [AArch64] Add custom lowering for v4i8 trunc store.

Thanks! I'm happy if Eli is happy. :)

Jun 21 2018, 6:12 AM

Jun 20 2018

rengolin added inline comments to D48332: [AArch64] Add custom lowering for v4i8 trunc store.
Jun 20 2018, 1:11 PM

Jun 19 2018

rengolin added inline comments to D48332: [AArch64] Add custom lowering for v4i8 trunc store.
Jun 19 2018, 1:34 PM

Jun 8 2018

rengolin added reviewers for D47943: Sample code for porting MachinePipeliner to AArch64+SVE: rengolin, t.p.northover, huntergr, sdesmalen, fhahn, qcolombet, MatzeB, sebpop.

Adding some reviewers + folks on the original review:

Jun 8 2018, 7:26 AM

Jun 1 2018

rengolin added a comment to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.
401.bzip2-2.04

I will check whether the slight drop in 401.bzip2 is just noise or something related to this patch, but regardless I do think this change should yield better performance in most scenarios.

Jun 1 2018, 6:53 AM

May 31 2018

rengolin added a comment to D47575: [ASAN] Sanitize testsuite for ARM..

I'm guessing this is a fix for: http://lab.llvm.org:8011/builders/clang-cmake-thumbv8-full-sh/builds/185

May 31 2018, 1:58 AM

May 23 2018

rengolin added a comment to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.

SPEC06 results look too noisy to conclude anything, especially bzip, xalan and povray. Can you find a more stable machine?

May 23 2018, 6:45 AM

May 18 2018

rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 7:52 AM
rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 6:59 AM
rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 5:25 AM

May 15 2018

rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

It does seem like a wiki would be nice to maintain this kind of information. In the absence of that, I think that a file in the test-suite repository, or a page in www are about equally easy/hard to maintain: it requires commit access to make any changes.
A file in www in theory could be more visible as it becomes part of the llvm.org web pages. That being said, source code is also viewable online, so it's easy to browse this text too.

May 15 2018, 2:20 AM

May 10 2018

rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

I should have clarified: Regarding SPEC, I meant adding CMakeLists in the External directory.

May 10 2018, 1:06 PM
rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

It's odd to have this in the repository, but admittedly we don't really have a wiki or similar in LLVM so I may be ok.

May 10 2018, 1:04 PM
rengolin added a reviewer for D46714: [test-suite] Add list of programs we might add.: maxim-kuvyrkov.

We can't add SPEC, as it's commercial. I'm not sure about others, but please make sure they are open source.

May 10 2018, 12:54 PM

May 4 2018

rengolin accepted D46010: [AArch64] Improve cost of vector division by constant.

LGTM with the line removed. :)

May 4 2018, 11:39 AM
rengolin added a comment to D46010: [AArch64] Improve cost of vector division by constant.

I should not change other architectures than AArch64 because 'isArithmeticDivFast' is a new method (only used by this patch).

May 4 2018, 8:17 AM
rengolin accepted D46302: [LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block.

Nice catch! Sorry for the delay, LGTM.

May 4 2018, 1:42 AM

May 1 2018

rengolin accepted D45875: [zorg] Throttle down parallelism of AArch64 and AArch32 libcxx bots.
May 1 2018, 3:25 AM

Apr 30 2018

rengolin added a comment to D46278: [AArch64] Fold B = csel A, A into B = COPY A.

It's not really csel vs. mov; the COPY likely gets coalesced away, and it might allow erasing the condition which feeds the select, which might allow erasing more code, etc.

Apr 30 2018, 2:22 PM
rengolin requested changes to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.

[1] [llvm] r305960 - Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r305990 - Revert "Enable vectorizer-maximize-bandwidth by default." (Diana Picus)
[llvm] r306336 - Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r306344 - revert r306336 for breaking ppc test. (Dehao)
[llvm] r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r306792 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." (Daniel)
[llvm] r306933 - Enable vectorizer-maximize-bandwidth by default.
[llvm] r306934 - revert r306336 for breaking ppc test.
[llvm] r306935 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
[llvm] r306936 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."

Apr 30 2018, 2:17 PM
rengolin added a comment to D46278: [AArch64] Fold B = csel A, A into B = COPY A.

This does look like a heavy hammer for a small fix. From the optimisation guides, CSEL and MOV have the same latency/bandwidth and the condition is probably pipelined in anyway.

Apr 30 2018, 1:50 PM
rengolin added a reviewer for D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true: mcrosier.

This seems like an improvement, but we have to be careful with wide variations and little gain. The geomean is almost null but the standard deviation is higher than 75% of the results.

Apr 30 2018, 1:47 PM
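
The concern raised on D46283 above (a near-zero geomean alongside a large spread) can be illustrated with a small sketch. The function names `geomean` and `stddevPercent` are hypothetical, and the inputs are assumed to be per-benchmark ratios (new time divided by old time); none of this is taken from the review itself.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of positive per-benchmark ratios (new_time / old_time).
// A result close to 1.0 means "no overall change".
double geomean(const std::vector<double> &ratios) {
  assert(!ratios.empty());
  double logSum = 0.0;
  for (double r : ratios)
    logSum += std::log(r);
  return std::exp(logSum / ratios.size());
}

// Population standard deviation of the percentage changes, (ratio - 1) * 100.
double stddevPercent(const std::vector<double> &ratios) {
  assert(!ratios.empty());
  double mean = 0.0;
  for (double r : ratios)
    mean += (r - 1.0) * 100.0;
  mean /= ratios.size();

  double var = 0.0;
  for (double r : ratios) {
    double d = (r - 1.0) * 100.0 - mean;
    var += d * d;
  }
  return std::sqrt(var / ratios.size());
}
```

If the geomean change is much smaller than the standard deviation, most individual results sit inside the noise band, which is the situation the comment warns about.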
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

There seems to be a disconnect.

Apr 30 2018, 1:27 PM
rengolin added a comment to D46254: [LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC)..

We need to consciously try using (or even creating) generically reusable code like this, instead of each file having its own "skip this, skip that".

Apr 30 2018, 1:13 PM · debug-info
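
As a rough illustration of the point made on D46254 above, one shared helper can replace the ad hoc "skip debug instructions" loops scattered across files. The `Inst` struct and `forEachWithoutDebug` below are hypothetical simplifications, not LLVM's `Instruction` class or its `instructionsWithoutDebug` API.

```cpp
#include <vector>

// Hypothetical, simplified instruction record (not LLVM's Instruction).
struct Inst {
  int opcode;
  bool isDebug;
};

// Calls fn on every non-debug instruction in the block, in order.
// Callers share this one filter instead of each writing its own skip logic.
template <typename Fn>
void forEachWithoutDebug(const std::vector<Inst> &block, Fn fn) {
  for (const Inst &I : block)
    if (!I.isDebug)
      fn(I);
}
```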
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

For example, in loop fusion scenarios, doing this is not rocket science.

Apr 30 2018, 10:03 AM
rengolin accepted D46254: [LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC)..

makes sense. thanks!

Apr 30 2018, 3:29 AM · debug-info
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

In this case, the most immediate client for us is actually our internal fast-track VPlan vectorizer (it's outside of LV so that we don't mess up LV accidentally),

Apr 30 2018, 3:28 AM

Apr 29 2018

rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

A public header has zero impact in compile time for the files that are not including it, for obvious reasons.

Apr 29 2018, 10:45 AM

Apr 25 2018

rengolin accepted D45558: [test-suite] Save stats for LTO step too..

LGTM, thanks!

Apr 25 2018, 6:16 AM
rengolin added a comment to D46010: [AArch64] Improve cost of vector division by constant.

Do we have similar cost tests for x86, Arm and others? It would be nice to make sure the change to BasicTTIImplBase didn't break other targets' costs.

Apr 25 2018, 3:59 AM

Apr 24 2018

rengolin added inline comments to D45880: [AArch64][SVE] Enable DiagnosticPredicates for SVE LD1 instructions..
Apr 24 2018, 2:04 AM
rengolin accepted D42447: [LV][VPlan] Detect outer loops for explicit vectorization..

Thank you! LGTM!

Apr 24 2018, 1:50 AM

Apr 23 2018

rengolin added a comment to D45879: [AsmMatcher] Extend PredicateMethod with optional DiagnosticPredicate.

Can't remember if it was @olista01 or @SjoerdMeijer who was doing the partial match in the assembler, but the idea sounds good to me. I'll let them (or someone else closer to the changes) approve.

Apr 23 2018, 2:28 PM
rengolin added a comment to D45952: [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector (32bit elts, unscaled)) load instructions..

These patches all look very similar. I think they could all be in the same commit? Even if it is large, it may be easier to see how they are similar and do a quick review.

Apr 23 2018, 2:22 PM
rengolin accepted D45946: [AArch64][SVE] Asm: Support for contiguous, first-faulting LDFF1 (scalar+scalar) load instructions..
Apr 23 2018, 2:18 PM

Apr 22 2018

rengolin added a comment to D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree.

Side effect of that is D45552 will have to land in Transform first and then move to Analysis when the descriptors move.

Apr 22 2018, 8:30 AM
rengolin accepted D45684: [AArch64][SVE] Asm: Support for contiguous, non-faulting LDNF1 (scalar+imm) load instructions.
Apr 22 2018, 8:23 AM
rengolin accepted D45681: [AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+imm) store instructions..
Apr 22 2018, 8:23 AM

Apr 19 2018

rengolin added a reviewer for D45818: Fix BNF nits in TableGen language reference.: stoklund.

Adding TableGen's code owner.

Apr 19 2018, 8:16 AM

Apr 17 2018

rengolin added reviewers for D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree: dcaballe, fhahn.

Adding Diego and Florian. We just had a good chat about this and maybe keeping the headers inside Transform for now would be better for the changes to come. We can move it later when it becomes clearer what the purpose is.

Apr 17 2018, 5:40 AM

Apr 15 2018

rengolin added reviewers for D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree: hfinkel, mkuper, chandlerc.

So, I never liked LoopUtils and I agree a lot of it should go to Analysis but having two copies of LoopUtils is bound to confuse people when including the headers.

Apr 15 2018, 6:42 AM
rengolin accepted D45623: [AArch64][SVE] Asm: Support for structured LD3 (scalar+imm) load instructions..

Minus the whitespace change, LGTM. Thanks!

Apr 15 2018, 6:25 AM
rengolin accepted D45624: [AArch64][SVE] Asm: Support for structured LD4 (scalar+imm) load instructions..

Minus the whitespace change, LGTM. Thanks!

Apr 15 2018, 6:24 AM
rengolin accepted D45622: [AArch64][SVE] Asm: Support for structured LD2 (scalar+imm) load instructions..

Mechanical changes, LGTM. Thanks!

Apr 15 2018, 6:23 AM

Apr 13 2018

rengolin added inline comments to D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree.
Apr 13 2018, 8:23 AM
rengolin accepted D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

Right, pure code movement, and a good clean up at that, LGTM.

Apr 13 2018, 8:17 AM
rengolin added inline comments to D45618: [AArch64][SVE] Asm: Support for contiguous LD1 (scalar+imm) load instructions.
Apr 13 2018, 7:06 AM
rengolin accepted D45618: [AArch64][SVE] Asm: Support for contiguous LD1 (scalar+imm) load instructions.

LGTM, thanks!

Apr 13 2018, 5:08 AM
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

Hi Hideki,

Apr 13 2018, 3:39 AM
rengolin accepted D45432: [AArch64][SVE] Asm: Support for contiguous ST1 (scalar+imm) store instructions..

LGTM, thanks!

Apr 13 2018, 3:22 AM

Apr 12 2018

rengolin added inline comments to D45432: [AArch64][SVE] Asm: Support for contiguous ST1 (scalar+imm) store instructions..
Apr 12 2018, 9:21 AM
rengolin added inline comments to D45429: [AArch64][AsmParser] Make parse function for VectorLists generic to other vector types..
Apr 12 2018, 4:26 AM

Apr 11 2018

rengolin accepted D45429: [AArch64][AsmParser] Make parse function for VectorLists generic to other vector types..

Mechanical update, looks good, thanks!

Apr 11 2018, 6:22 AM

Apr 10 2018

rengolin accepted D45428: [AArch64][AsmParser] Split index parsing from vector list..

LGTM.

Apr 10 2018, 10:29 AM

Apr 9 2018

rengolin accepted D45430: [AArch64][AsmParser] Unify 'addVectorListOperands' functions..

Simple and NFC. LGTM. Thanks!

Apr 9 2018, 12:16 PM
rengolin added inline comments to D45428: [AArch64][AsmParser] Split index parsing from vector list..
Apr 9 2018, 12:10 PM
rengolin added a comment to D45427: [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors..

Indeed, nice cleanup. I think we discussed this earlier and it's a much welcome change, to merge NEON and SVE parsing.

Apr 9 2018, 11:53 AM
rengolin added inline comments to D44338: [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops..
Apr 9 2018, 7:31 AM
rengolin accepted D45370: [AArch64][SVE] Asm: Add support for SVE INDEX instructions..
Apr 9 2018, 2:07 AM
rengolin accepted D45371: [AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions..
Apr 9 2018, 2:07 AM

Apr 8 2018

rengolin accepted D45072: [NFC][LV] Move InterleaveInfo from Legal to CostModel.

Hi Hideki,

Apr 8 2018, 3:47 PM

Apr 6 2018

rengolin added a reviewer for D45371: [AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions.: evandro.
Apr 6 2018, 8:06 AM
rengolin added a reviewer for D45370: [AArch64][SVE] Asm: Add support for SVE INDEX instructions.: evandro.
Apr 6 2018, 8:06 AM
rengolin added a comment to D45366: Support generic expansion of ordered vector reduction (PR36732).

For now, given that the shuffle reduction is in loop utils, this patch is fine. But this really ought to be in target transform info.

Apr 6 2018, 8:02 AM
rengolin updated subscribers of D45240: [ARM] Compute a target feature which corresponds to the ARM version..

LTO is something I never considered before in the context of the target parser, but I understand the issues are similar to what the build attributes were trying to solve, so adding more info shouldn't make it worse.

Apr 6 2018, 5:04 AM

Apr 4 2018

rengolin accepted D44812: Remove MachineLoopInfo dependency from AsmPrinter..

I had seen the first version of this patch and thought it confusing, so refrained from reviewing (too broad context), but this version is much clearer now, and I can see it as a straightforward benefit.

Apr 4 2018, 11:14 PM
rengolin added inline comments to D44812: Remove MachineLoopInfo dependency from AsmPrinter..
Apr 4 2018, 4:27 PM

Apr 3 2018

rengolin accepted D44678: [ARM] Do not convert some vmov instructions.
Apr 3 2018, 1:07 PM
rengolin accepted D44853: [AArch64] Change std::sort to llvm::sort in response to r327219.

This patch is one of a series of patches to replace *all* std::sort with llvm::sort in llvm/clang/polly (and other tools). I have a patch (https://reviews.llvm.org/D44363) which replaces all std::sort with llvm::sort. But as per review comments, it was suggested to break down that patch target-wise into smaller chunks. Hence I am now pushing smaller target-specific patches.

Apr 3 2018, 1:03 PM
rengolin added inline comments to D44338: [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops..
Apr 3 2018, 1:00 PM
rengolin added a comment to D42447: [LV][VPlan] Detect outer loops for explicit vectorization..

Is -vplan-build-stress-test flag in Patch #3 (D44338) aligned with what you had in mind? :)

Apr 3 2018, 12:47 PM

Mar 28 2018

rengolin added a comment to D44982: [zorg] Adding two new builders for armv7 and aarch64.

Nice! Great for bisecting! I'll let @maxim-kuvyrkov approve, as he's the one taking care of the Buildbots now.

Mar 28 2018, 11:37 PM

Mar 23 2018

rengolin requested changes to D44853: [AArch64] Change std::sort to llvm::sort in response to r327219.

I don't like this being on by default. I get that it uses EXPENSIVE_CHECKS, but still, most developers use Debug/Asserts builds.

Mar 23 2018, 8:39 PM
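
For context on the objection to D44853 above: to the best of our understanding, llvm::sort shuffles the range before sorting when LLVM is built with EXPENSIVE_CHECKS, so that code depending on the relative order of equal elements misbehaves visibly instead of silently. Below is a minimal standalone sketch of that idea. The name `checkedSort` and the macro `SHUFFLE_BEFORE_SORT` are hypothetical stand-ins, not LLVM's actual implementation.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical stand-in for a build-time switch such as EXPENSIVE_CHECKS.
#define SHUFFLE_BEFORE_SORT 1

template <typename It, typename Cmp>
void checkedSort(It first, It last, Cmp cmp) {
#if SHUFFLE_BEFORE_SORT
  // Randomize the input order first, so callers cannot rely on where
  // elements that compare equal happen to end up after an unstable sort.
  static std::mt19937 rng(std::random_device{}());
  std::shuffle(first, last, rng);
#endif
  std::sort(first, last, cmp);
}
```

The extra shuffle costs time on every call, which is why it matters whether the check is tied to a rarely used build mode or to the Debug/Asserts builds most developers run.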

Mar 16 2018

rengolin accepted D44460: [ARM] Fix a check in vmov/vmvn immediate parsing.

LGTM, thanks!

Mar 16 2018, 5:29 AM

Mar 15 2018

rengolin accepted D44489: [TTI, AArch64] Allow the cost model analysis to test vector reduce intrinsics.

LGTM, thanks!

Mar 15 2018, 8:05 AM
rengolin accepted D44467: [ARM] Convert more invalid NEON immediate loads.

LGTM, thanks!

Mar 15 2018, 5:15 AM

Mar 14 2018

rengolin added a comment to D44467: [ARM] Convert more invalid NEON immediate loads.

Nice cleanup!

Mar 14 2018, 3:40 PM
rengolin accepted D44495: [CleanUp] Remove NumInstructions field from LoopVectorizer's RegisterUsage struct..

The original code was calculating the unroll factor by NumInstructions like:

Mar 14 2018, 3:19 PM
rengolin added a comment to D44489: [TTI, AArch64] Allow the cost model analysis to test vector reduce intrinsics.

Hi Mathew,

Mar 14 2018, 3:07 PM

Mar 10 2018

rengolin added a comment to D44355: [AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str.

Can't this be done in the DAGCombine?

Mar 10 2018, 4:23 PM
rengolin accepted D43971: [AArch64] Implement native TLS for Windows.

Right, now it's clear what it's doing, thanks! I'm happy with it, LGTM.

Mar 10 2018, 5:52 AM

Mar 9 2018

rengolin added a comment to D43971: [AArch64] Implement native TLS for Windows.

Right, this makes a lot more sense! :)

Mar 9 2018, 2:55 PM
rengolin closed D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.
Mar 9 2018, 1:09 PM
rengolin added a comment to D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.

r327173

Mar 9 2018, 1:09 PM
rengolin committed rL327173: [NFC] Consolidate six getPointerOperand() utility functions into one place.
Mar 9 2018, 1:09 PM