rengolin (Renato Golin)
Toolchain Engineer

Projects

User does not belong to any projects.

User Details

User Since
Oct 19 2012, 12:57 AM (295 w, 3 d)

Recent Activity

Fri, Jun 8

rengolin added reviewers for D47943: Sample code for porting MachinePipeliner to AArch64+SVE: rengolin, t.p.northover, huntergr, sdesmalen, fhahn, qcolombet, MatzeB, sebpop.

Adding some reviewers + folks on the original review:

Fri, Jun 8, 7:26 AM

Fri, Jun 1

rengolin added a comment to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.
401.bzip2-2.04

I will check if 401.bzip2 slight drop is just noise or something related to this patch, but regardless I do think this change should yield better performance in most scenarios.

Fri, Jun 1, 6:53 AM

Thu, May 31

rengolin added a comment to D47575: [ASAN] Sanitize testsuite for ARM..

I'm guessing this is a fix for: http://lab.llvm.org:8011/builders/clang-cmake-thumbv8-full-sh/builds/185

Thu, May 31, 1:58 AM

Wed, May 23

rengolin added a comment to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.

SPEC06 results look too noisy to conclude anything, especially bzip, xalan and povray. Can you find a more stable machine?

Wed, May 23, 6:45 AM

May 18 2018

rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 7:52 AM
rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 6:59 AM
rengolin added inline comments to D46695: [RFC] [Patch 1/3] Add a new class of predicates for variant scheduling classes..
May 18 2018, 5:25 AM

May 15 2018

rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

It does seem like a wiki would be nice to maintain this kind of information. In the absence of that, I think that a file in the test-suite repository, or a page in www are about equally easy/hard to maintain: it requires commit access to make any changes.
A file in www in theory could be more visible as it becomes part of the llvm.org web pages. That being said, source code is also viewable online, so it's easy to browse this text too.

May 15 2018, 2:20 AM

May 10 2018

rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

I should have clarified: Regarding SPEC, I meant adding CMakeLists in the External directory.

May 10 2018, 1:06 PM
rengolin added a comment to D46714: [test-suite] Add list of programs we might add..

It's odd to have this in the repository, but admittedly we don't really have a wiki or similar in LLVM so I may be ok.

May 10 2018, 1:04 PM
rengolin added a reviewer for D46714: [test-suite] Add list of programs we might add.: maxim-kuvyrkov.

We can't add SPEC, as it's commercial. I'm not sure about others, but please make sure they are open source.

May 10 2018, 12:54 PM

May 4 2018

rengolin accepted D46010: [AArch64] Improve cost of vector division by constant.

LGTM with the line removed. :)

May 4 2018, 11:39 AM
rengolin added a comment to D46010: [AArch64] Improve cost of vector division by constant.

I should not change other architectures than AArch64 because 'isArithmeticDivFast' is a new method (only used by this patch).

May 4 2018, 8:17 AM
rengolin accepted D46302: [LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block.

Nice catch! Sorry for the delay, LGTM.

May 4 2018, 1:42 AM

May 1 2018

rengolin accepted D45875: [zorg] Throttle down parallelism of AArch64 and AArch32 libcxx bots.
May 1 2018, 3:25 AM

Apr 30 2018

rengolin added a comment to D46278: [AArch64] Fold B = csel A, A into B = COPY A.

It's not really csel vs. mov; the COPY likely gets coalesced away, and it might allow erasing the condition which feeds the select, which might allow erasing more code, etc.
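
The knock-on effect described here can be illustrated with a toy model. The sketch below is not LLVM code: the tuple-based instruction format and the helper names `fold_trivial_csel` and `remove_dead_defs` are hypothetical, and it assumes the compare has no side effects beyond defining its result. It rewrites `B = csel A, A` to `B = COPY A` and shows that the compare feeding the select then becomes dead and can be erased.

```python
# Toy model only: instructions are (dest, opcode, operands) tuples,
# not LLVM MachineInstrs. All names here are hypothetical.

def fold_trivial_csel(instrs):
    """Rewrite `B = csel A, A, flags` into `B = COPY A`."""
    out = []
    for dest, op, operands in instrs:
        if op == "csel" and operands[0] == operands[1]:
            # Both select arms are the same value, so the condition is irrelevant.
            out.append((dest, "COPY", (operands[0],)))
        else:
            out.append((dest, op, operands))
    return out


def remove_dead_defs(instrs, live_out):
    """Drop instructions whose result is never used (assumes no side effects)."""
    live = set(live_out)
    kept = []
    # Walk backwards so that uses are seen before the definitions feeding them.
    for dest, op, operands in reversed(instrs):
        if dest in live:
            kept.append((dest, op, operands))
            live.discard(dest)
            live.update(operands)
    kept.reverse()
    return kept


program = [
    ("flags", "cmp", ("x", "y")),
    ("b", "csel", ("a", "a", "flags")),
]
folded = fold_trivial_csel(program)
cleaned = remove_dead_defs(folded, live_out={"b"})
print(cleaned)
```

Without the fold, the same dead-code pass has to keep the compare, because the select still reads the flags.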

Apr 30 2018, 2:22 PM
rengolin requested changes to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.

[1] [llvm] r305960 - Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r305990 - Revert "Enable vectorizer-maximize-bandwidth by default." (Diana Picus)
[llvm] r306336 - Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r306344 - revert r306336 for breaking ppc test. (Dehao)
[llvm] r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default. (Dehao)
[llvm] r306792 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." (Daniel)
[llvm] r306933 - Enable vectorizer-maximize-bandwidth by default.
[llvm] r306934 - revert r306336 for breaking ppc test.
[llvm] r306935 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default.
[llvm] r306936 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."

Apr 30 2018, 2:17 PM
rengolin added a comment to D46278: [AArch64] Fold B = csel A, A into B = COPY A.

This does look like a heavy hammer for a small fix. From the optimisation guides, CSEL and MOV have the same latency/bandwidth and the condition is probably pipelined in anyway.

Apr 30 2018, 1:50 PM
rengolin added a reviewer for D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true: mcrosier.

This seems like an improvement, but we have to be careful with wide variations and little gain. The geomean is almost null but the standard deviation is higher than 75% of the results.
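
A minimal sketch of the kind of check behind this remark, assuming per-benchmark results are expressed as ratios of new to old run time. The numbers are made up for illustration and are not the D46283 data, and the helper name `summarize` is hypothetical. It computes the geometric mean of the ratios, then counts how many individual changes are smaller in magnitude than the standard deviation, which is one way to read "the standard deviation is higher than 75% of the results".

```python
import math
import statistics


def summarize(ratios):
    """Return (geomean, stdev of % changes, number of changes smaller than the stdev)."""
    geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    # Express each result as a percentage change relative to the baseline.
    changes = [(r - 1.0) * 100.0 for r in ratios]
    spread = statistics.stdev(changes)
    within_noise = sum(1 for c in changes if abs(c) < spread)
    return geomean, spread, within_noise


# Made-up ratios (new time / old time), purely illustrative.
ratios = [0.97, 1.02, 1.00, 0.99, 1.04, 0.98, 1.01, 1.00]
geomean, spread, within_noise = summarize(ratios)
print(f"geomean ratio: {geomean:.4f}")
print(f"stdev of changes: {spread:.2f}%")
print(f"{within_noise} of {len(ratios)} results are smaller than one stdev")
```

When most individual changes fall inside one standard deviation and the geomean sits close to 1.0, the data cannot separate a real gain from noise.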

Apr 30 2018, 1:47 PM
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

There seems to be a disconnect.

Apr 30 2018, 1:27 PM
rengolin added a comment to D46254: [LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC)..

We need to consciously try using (or even creating) generically reusable code like this, instead of each file having its own "skip this, skip that".

Apr 30 2018, 1:13 PM · debug-info
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

For example, in loop fusion scenarios, doing this is not rocket science.

Apr 30 2018, 10:03 AM
rengolin accepted D46254: [LV] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC)..

makes sense. thanks!

Apr 30 2018, 3:29 AM · debug-info
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

In this case, the most immediate client for us is actually our internal fast-track VPlan vectorizer (it's outside of LV so that we don't mess up LV accidentally),

Apr 30 2018, 3:28 AM

Apr 29 2018

rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

A public header has zero impact in compile time for the files that are not including it, for obvious reasons.

Apr 29 2018, 10:45 AM

Apr 25 2018

rengolin accepted D45558: [test-suite] Save stats for LTO step too..

LGTM, thanks!

Apr 25 2018, 6:16 AM
rengolin added a comment to D46010: [AArch64] Improve cost of vector division by constant.

Do we have similar cost tests for x86, Arm and others? It would be nice to make sure the change to BasicTTIImplBase didn't break other targets' costs.

Apr 25 2018, 3:59 AM

Apr 24 2018

rengolin added inline comments to D45880: [AArch64][SVE] Enable DiagnosticPredicates for SVE LD1 instructions..
Apr 24 2018, 2:04 AM
rengolin accepted D42447: [LV][VPlan] Detect outer loops for explicit vectorization..

Thank you! LGTM!

Apr 24 2018, 1:50 AM

Apr 23 2018

rengolin added a comment to D45879: [AsmMatcher] Extend PredicateMethod with optional DiagnosticPredicate.

Can't remember if it was @olista01 or @SjoerdMeijer who was doing the partial match in the assembler, but the idea sounds good to me. I'll let them (or someone else closer to the changes) approve.

Apr 23 2018, 2:28 PM
rengolin added a comment to D45952: [AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector (32bit elts, unscaled)) load instructions..

These patches all look very similar. Could they all go in the same commit? Even if it is large, it may be easier to see how they are similar and do a quick review.

Apr 23 2018, 2:22 PM
rengolin accepted D45946: [AArch64][SVE] Asm: Support for contiguous, first-faulting LDFF1 (scalar+scalar) load instructions..
Apr 23 2018, 2:18 PM

Apr 22 2018

rengolin added a comment to D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree.

Side effect of that is D45552 will have to land in Transform first and then move to Analysis when the descriptors move.

Apr 22 2018, 8:30 AM
rengolin accepted D45684: [AArch64][SVE] Asm: Support for contiguous, non-faulting LDNF1 (scalar+imm) load instructions.
Apr 22 2018, 8:23 AM
rengolin accepted D45681: [AArch64][SVE] Asm: Support for structured ST2, ST3 and ST4 (scalar+imm) store instructions..
Apr 22 2018, 8:23 AM

Apr 19 2018

rengolin added a reviewer for D45818: Fix BNF nits in TableGen language reference.: stoklund.

Adding TableGen's code owner.

Apr 19 2018, 8:16 AM

Apr 17 2018

rengolin added reviewers for D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree: dcaballe, fhahn.

Adding Diego and Florian. We just had a good chat about this and maybe keeping the headers inside Transform for now would be better for the changes to come. We can move it later when it becomes clearer what the purpose is.

Apr 17 2018, 5:40 AM

Apr 15 2018

rengolin added reviewers for D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree: hfinkel, mkuper, chandlerc.

So, I never liked LoopUtils and I agree a lot of it should go to Analysis but having two copies of LoopUtils is bound to confuse people when including the headers.

Apr 15 2018, 6:42 AM
rengolin accepted D45623: [AArch64][SVE] Asm: Support for structured LD3 (scalar+imm) load instructions..

Minus the whitespace change, LGTM. Thanks!

Apr 15 2018, 6:25 AM
rengolin accepted D45624: [AArch64][SVE] Asm: Support for structured LD4 (scalar+imm) load instructions..

Minus the whitespace change, LGTM. Thanks!

Apr 15 2018, 6:24 AM
rengolin accepted D45622: [AArch64][SVE] Asm: Support for structured LD2 (scalar+imm) load instructions..

Mechanical changes, LGTM. Thanks!

Apr 15 2018, 6:23 AM

Apr 13 2018

rengolin added inline comments to D45420: [NFC] [LoopUtil] Moved RecurrenceDescriptor/LoopDescriptor from Transform/Utils/LoopUtils.* to Analysis tree.
Apr 13 2018, 8:23 AM
rengolin accepted D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

Right, pure code movement, and a good clean up at that, LGTM.

Apr 13 2018, 8:17 AM
rengolin added inline comments to D45618: [AArch64][SVE] Asm: Support for contiguous LD1 (scalar+imm) load instructions.
Apr 13 2018, 7:06 AM
rengolin accepted D45618: [AArch64][SVE] Asm: Support for contiguous LD1 (scalar+imm) load instructions.

LGTM, thanks!

Apr 13 2018, 5:08 AM
rengolin added a comment to D45552: [NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file.

Hi Hideki,

Apr 13 2018, 3:39 AM
rengolin accepted D45432: [AArch64][SVE] Asm: Support for contiguous ST1 (scalar+imm) store instructions..

LGTM, thanks!

Apr 13 2018, 3:22 AM

Apr 12 2018

rengolin added inline comments to D45432: [AArch64][SVE] Asm: Support for contiguous ST1 (scalar+imm) store instructions..
Apr 12 2018, 9:21 AM
rengolin added inline comments to D45429: [AArch64][AsmParser] Make parse function for VectorLists generic to other vector types..
Apr 12 2018, 4:26 AM

Apr 11 2018

rengolin accepted D45429: [AArch64][AsmParser] Make parse function for VectorLists generic to other vector types..

Mechanical update, looks good, thanks!

Apr 11 2018, 6:22 AM

Apr 10 2018

rengolin accepted D45428: [AArch64][AsmParser] Split index parsing from vector list..

LGTM.

Apr 10 2018, 10:29 AM

Apr 9 2018

rengolin accepted D45430: [AArch64][AsmParser] Unify 'addVectorListOperands' functions..

Simple and NFC. LGTM. Thanks!

Apr 9 2018, 12:16 PM
rengolin added inline comments to D45428: [AArch64][AsmParser] Split index parsing from vector list..
Apr 9 2018, 12:10 PM
rengolin added a comment to D45427: [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors..

Indeed, nice cleanup. I think we discussed this earlier, and merging NEON and SVE parsing is a very welcome change.

Apr 9 2018, 11:53 AM
rengolin added inline comments to D44338: [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops..
Apr 9 2018, 7:31 AM
rengolin accepted D45370: [AArch64][SVE] Asm: Add support for SVE INDEX instructions..
Apr 9 2018, 2:07 AM
rengolin accepted D45371: [AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions..
Apr 9 2018, 2:07 AM

Apr 8 2018

rengolin accepted D45072: [NFC][LV] Move InterleaveInfo from Legal to CostModel.

Hi Hideki,

Apr 8 2018, 3:47 PM

Apr 6 2018

rengolin added a reviewer for D45371: [AArch64][SVE] Asm: Add support for unpredicated LSL/LSR (shift by immediate) instructions.: evandro.
Apr 6 2018, 8:06 AM
rengolin added a reviewer for D45370: [AArch64][SVE] Asm: Add support for SVE INDEX instructions.: evandro.
Apr 6 2018, 8:06 AM
rengolin added a comment to D45366: Support generic expansion of ordered vector reduction (PR36732).

For now, given that the shuffle reduction is in loop utils, this patch is fine. But this really ought to be in target transform info.

Apr 6 2018, 8:02 AM
rengolin updated subscribers of D45240: [ARM] Compute a target feature which corresponds to the ARM version..

LTO is something I never considered before in the context of the target parser, but I understand the issues are similar to what the build attributes were trying to solve, so adding more info shouldn't make it worse.

Apr 6 2018, 5:04 AM

Apr 4 2018

rengolin accepted D44812: Remove MachineLoopInfo dependency from AsmPrinter..

I had seen the first version of this patch and thought it confusing, so refrained from reviewing (too broad context), but this version is much clearer now, and I can see it as a straightforward benefit.

Apr 4 2018, 11:14 PM
rengolin added inline comments to D44812: Remove MachineLoopInfo dependency from AsmPrinter..
Apr 4 2018, 4:27 PM

Apr 3 2018

rengolin accepted D44678: [ARM] Do not convert some vmov instructions.
Apr 3 2018, 1:07 PM
rengolin accepted D44853: [AArch64] Change std::sort to llvm::sort in response to r327219.

This patch is one of a series of patches to replace *all* std::sort to llvm::sort in llvm/clang/polly (and other tools). I have a patch (https://reviews.llvm.org/D44363) which replaces all std::sort to llvm::sort. But as per review comments, it was suggested to break down that patch target-wise to smaller chunks. Hence I am now pushing smaller target-specific patches.

Apr 3 2018, 1:03 PM
rengolin added inline comments to D44338: [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops..
Apr 3 2018, 1:00 PM
rengolin added a comment to D42447: [LV][VPlan] Detect outer loops for explicit vectorization..

Is -vplan-build-stress-test flag in Patch #3 (D44338) aligned with what you had in mind? :)

Apr 3 2018, 12:47 PM

Mar 28 2018

rengolin added a comment to D44982: [zorg] Adding two new builders for armv7 and aarch64.

Nice! Great for bisecting! I'll let @maxim-kuvyrkov approve, as he's the one taking care of the Buildbots now.

Mar 28 2018, 11:37 PM

Mar 23 2018

rengolin requested changes to D44853: [AArch64] Change std::sort to llvm::sort in response to r327219.

I don't like this being on by default. I get that it uses EXPENSIVE_CHECKS, but still, most developers use Debug/Asserts builds.

Mar 23 2018, 8:39 PM

Mar 16 2018

rengolin accepted D44460: [ARM] Fix a check in vmov/vmvn immediate parsing.

LGTM, thanks!

Mar 16 2018, 5:29 AM

Mar 15 2018

rengolin accepted D44489: [TTI, AArch64] Allow the cost model analysis to test vector reduce intrinsics.

LGTM, thanks!

Mar 15 2018, 8:05 AM
rengolin accepted D44467: [ARM] Convert more invalid NEON immediate loads.

LGTM, thanks!

Mar 15 2018, 5:15 AM

Mar 14 2018

rengolin added a comment to D44467: [ARM] Convert more invalid NEON immediate loads.

Nice cleanup!

Mar 14 2018, 3:40 PM
rengolin accepted D44495: [CleanUp] Remove NumInstructions field from LoopVectorizer's RegisterUsage struct..

The original code was calculating the unroll factor by NumInstructions like:

Mar 14 2018, 3:19 PM
rengolin added a comment to D44489: [TTI, AArch64] Allow the cost model analysis to test vector reduce intrinsics.

Hi Mathew,

Mar 14 2018, 3:07 PM

Mar 10 2018

rengolin added a comment to D44355: [AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str.

Can't this be done in the DAGCombine?

Mar 10 2018, 4:23 PM
rengolin accepted D43971: [AArch64] Implement native TLS for Windows.

Right, now it's clear what it's doing, thanks! I'm happy with it, LGTM.

Mar 10 2018, 5:52 AM

Mar 9 2018

rengolin added a comment to D43971: [AArch64] Implement native TLS for Windows.

Right, this makes a lot more sense! :)

Mar 9 2018, 2:55 PM
rengolin closed D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.
Mar 9 2018, 1:09 PM
rengolin added a comment to D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.

r327173

Mar 9 2018, 1:09 PM
rengolin committed rL327173: [NFC] Consolidate six getPointerOperand() utility functions into one place.
Mar 9 2018, 1:09 PM
rengolin committed rL327155: [LV] Adding test for r327109.
Mar 9 2018, 10:06 AM
rengolin added a comment to D43971: [AArch64] Implement native TLS for Windows.

No, that's probably because aarch64-windows didn't exist at all back then (there's even no publicly available hardware yet), so this just did something else (looks like ELF relocations).

Mar 9 2018, 6:10 AM
rengolin accepted D43316: [test-suite] Update litsupport/module/microbenchmark.py to report individual timing results from 1 test..

LGTM after D43314 goes in. Thanks!

Mar 9 2018, 5:26 AM
rengolin accepted D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

LGTM with the sorting comment. Thanks!

Mar 9 2018, 5:26 AM
rengolin added a comment to D43971: [AArch64] Implement native TLS for Windows.

First: Currently there's the pseudoinstruction LOADgot which lowers into an "adrp + ldr" instruction pair, but the ldr always loads a full X-register. Here I needed to do that, but load a W-register instead, to only read 32 bits. I tried to do this by adding a separate LOADgot32 pseudoinstruction, but I wasn't able to make it return a 32-bit value on the SelectionDAG level.

Mar 9 2018, 5:20 AM
rengolin added a comment to D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.

Hi Hideki,

Mar 9 2018, 3:26 AM
rengolin committed rL327109: [LV] Fix vectorizer's isUniform() abuse triggers assert in SCEV.
Mar 9 2018, 2:34 AM
rengolin closed D43536: [LV] Fix for PR36311, vectorizer's isUniform() abuse triggers assert in SCEV.

r327109

Mar 9 2018, 2:34 AM
rengolin added inline comments to D42447: [LV][VPlan] Detect outer loops for explicit vectorization..
Mar 9 2018, 1:43 AM
rengolin accepted D43323: [NFC] Consolidate six getPointerOperand() utility functions into one place.

I agree this is a nice cleanup, too.

Mar 9 2018, 1:13 AM

Mar 8 2018

rengolin added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

I think it should be fine if it's not in the list of patterns recognised as tests...

Mar 8 2018, 11:44 AM
rengolin accepted D43536: [LV] Fix for PR36311, vectorizer's isUniform() abuse triggers assert in SCEV.

LGTM, thanks!

Mar 8 2018, 4:06 AM

Mar 7 2018

rengolin added a comment to D43971: [AArch64] Implement native TLS for Windows.

I don't know much about COFF TLS to tell why you need the special treatment, but the AArch64 asm looks correct (though very inefficient, but you know that :).

Mar 7 2018, 5:57 PM
rengolin added inline comments to D42447: [LV][VPlan] Detect outer loops for explicit vectorization..
Mar 7 2018, 5:46 PM
rengolin added a comment to D43536: [LV] Fix for PR36311, vectorizer's isUniform() abuse triggers assert in SCEV.

Fair enough, I can't see what case this is trying to cover anyway. I'm happy with the line going out without a big comment, commit message is usually ok.

Mar 7 2018, 5:12 PM
rengolin accepted D44234: [AArch64] Fix UB about shift amount exceeds data bit-width.

Nice! LGTM, thanks!

Mar 7 2018, 4:06 PM

Mar 5 2018

rengolin accepted D44084: [ARM][Asm] VMOVSRR and VMOVRRS need sequential S registers.
Mar 5 2018, 3:12 AM

Mar 2 2018

rengolin added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

Tried some things and was able to get it working using /dev/stdout.

Mar 2 2018, 5:11 AM