rengolin (Renato Golin)
Toolchain Engineer

Projects

User does not belong to any projects.

User Details

User Since
Oct 19 2012, 12:57 AM (244 w, 21 h)

Recent Activity

Tue, Jun 6

rengolin accepted D33786: [ARM] Fix Neon vector type alignment to 64-bit.

Thanks @srhines @javed.absar, LGTM!

Tue, Jun 6, 9:45 AM
rengolin added a comment to D33786: [ARM] Fix Neon vector type alignment to 64-bit.

@srhines: Do you need Android to *always* be 64-bit, even if someone forces aapcs with it?

Tue, Jun 6, 9:36 AM
rengolin added a comment to D33410: Correct test for ARM targets.

I don't see why we should force the target, given that this is a CodeGen test, should worr (and be tested) on all targets.

Tue, Jun 6, 8:36 AM
rengolin added inline comments to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.
Tue, Jun 6, 3:16 AM
rengolin added a comment to D33410: Correct test for ARM targets.

Just FYI, when you submit a patch, use "git diff -U999" so that we can see the context in the web interface. Makes it easier to review. :)

Tue, Jun 6, 2:48 AM
rengolin added a comment to D33410: Correct test for ARM targets.

Where is this failing?

Tue, Jun 6, 2:46 AM

Mon, Jun 5

rengolin added inline comments to D33786: [ARM] Fix Neon vector type alignment to 64-bit.
Mon, Jun 5, 1:35 AM
rengolin committed rL304699: Revert "[sanitizer-coverage] test for -fsanitize-coverage=inline-8bit-counters".
Revert "[sanitizer-coverage] test for -fsanitize-coverage=inline-8bit-counters"
Mon, Jun 5, 12:36 AM
rengolin committed rL304698: Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize….
Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize…
Mon, Jun 5, 12:36 AM
rengolin committed rL304697: Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize….
Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize…
Mon, Jun 5, 12:36 AM

Fri, Jun 2

rengolin added inline comments to D33836: [AArch64] Enable FeatureFuseAES for the generic processor model..
Fri, Jun 2, 3:47 PM
rengolin added a comment to D33836: [AArch64] Enable FeatureFuseAES for the generic processor model..

Makes sense to me. But now that "generic" is the default, it'll impact all cores equally, so I'll let other people comment before approving.

Fri, Jun 2, 8:53 AM

Wed, May 31

rengolin added reviewers for D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour: delena, qcolombet, sanjoy, craig.topper.

I have just looked at performance. It was a while since I tried to evaluate the compile time impact of a single pass. How would you do this?

Wed, May 31, 2:42 AM
rengolin added a comment to D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour.

I tried this - adding LSR just before loop vectorizer, and found that it did indeed affect a few benchmarks noticeably, with mixed results.

Wed, May 31, 2:18 AM

Tue, May 30

rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

A few more comments... :)

Tue, May 30, 5:09 AM

Fri, May 26

rengolin committed rL304010: [LSAN-ARM] Marking new test unsupported on ARMHF due to bot failures.
[LSAN-ARM] Marking new test unsupported on ARMHF due to bot failures
Fri, May 26, 10:31 AM
rengolin committed rL303996: Revert "[OpenCL] An error shall occur if any scalar operand has greater rank….
Revert "[OpenCL] An error shall occur if any scalar operand has greater rank…
Fri, May 26, 8:33 AM
rengolin added a comment to D33586: ARMv8.1 support for LLVM AArch64.

Hi, can you please re-share the patch with more context?

Fri, May 26, 3:49 AM

Thu, May 25

rengolin accepted D33558: [AArch64]: add 'a' inline asm operand modifier..

Silly inline comment, but otherwise, LGTM. Thanks!

Thu, May 25, 11:49 AM

May 24 2017

rengolin accepted D33495: [ARM] Add tests for 6-M memcpy/memset code generation.
May 24 2017, 7:16 AM

May 23 2017

rengolin accepted D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Ok, I think it's good as it is. Looking forward to more patches in this area. :)

May 23 2017, 9:08 AM
rengolin accepted D33205: ARM] Fix Neon vector type alignment to 64-bit.

Approving, so that when Steve does the same, the review is free to commit.

May 23 2017, 9:07 AM
rengolin added a comment to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Hi all:
I have made the changes (excluding Android) as recommended. Please have a look if it is now fine.

May 23 2017, 9:06 AM
rengolin added a comment to D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Hi Javed,

May 23 2017, 5:44 AM

May 22 2017

rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Some initial comments...

May 22 2017, 10:44 AM
rengolin added reviewers for D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour: mssimpso, mkuper.

In this sense, isn't the loop vectorizer run too early?

May 22 2017, 10:39 AM
rengolin added inline comments to D33205: ARM] Fix Neon vector type alignment to 64-bit.
May 22 2017, 9:30 AM
rengolin added a comment to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Ah, I just found the email thread, looks like it never reached phabricator. Sorry for the confusion.

May 22 2017, 9:28 AM
rengolin requested changes to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Wait, Stephen had a concern, we need his feedback on my reply before approving.

May 22 2017, 9:20 AM
rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Hi Ayal, sorry for the delay. Can you commit the plan doc separately?

May 22 2017, 2:42 AM

May 21 2017

rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

Put differently, this patch makes sense *once* we clearly have consensus on llvm-dev. So far, the only thread I can find did not reach any meaningful consensus. Notably, none of the code generator people were heavily contributing to that thread, and there remain large unmentioned technical concerns (stack spills, alloca size, etc)

May 21 2017, 3:41 PM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Which comment? FWIW, I didn't see a particular response.

May 21 2017, 1:17 PM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

Just to be clear, I'm not *against* the idea of an intrinsic, nor I'm pushing this patch for any personal/professional agenda. I hope I have made that perfectly clear on my previous reviews on the same patches before.

May 21 2017, 11:02 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

This is much more up for debate. Details matter. Compilers are a set of engineering tradeoffs. You need to explain and defend your position carefully, not just state it as though it were "obviously true".

May 21 2017, 10:51 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)?

May 21 2017, 4:52 AM
rengolin added inline comments to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.
May 21 2017, 4:38 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

I agree with Chandler on this. Making something a first class type in LLVM has wide reaching effects, including requiring the ability to load/store/phi the value, pass it by argument, etc.

May 21 2017, 4:32 AM

May 20 2017

rengolin added reviewers for D32737: [Constants][SVE] Represent the runtime length of a scalable vector: jmolloy, kristof.beyls, aadg, hfinkel, aschwaighofer, mssimpso, Ayal, mkuper.
May 20 2017, 3:17 PM
rengolin added reviewers for D32737: [Constants][SVE] Represent the runtime length of a scalable vector: delena, asb.

Uh, where was it discussed? It's entirely possible I missed it, but I can't find any consensus on any of the threads that we actually want to support runtime vector width in LLVM's IR.

May 20 2017, 3:08 PM
rengolin requested changes to D32833: [Triple] Add method for triple canonicalization.

Right, I think this is a useful thing to have, but it needs a better plan.

May 20 2017, 1:10 PM

May 19 2017

rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

@echristo @chandlerc @lattner @majnemer Ping.

May 19 2017, 5:09 AM

May 18 2017

rengolin added a comment to D33128: [AsmParser] Mnemonic Spell Corrector.

Hi Sjoerd,

May 18 2017, 6:34 AM

May 17 2017

rengolin added a comment to D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Hi Javed,

May 17 2017, 8:37 AM
rengolin added a comment to D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI).

Right, I see what you mean about the "address computation" to be forced into scalar registers. Does that mean that you have to use the same GR to load into all lanes of the vector, so supposedly load+increment+load+increment...?

It doesn't have to be the same GR for all the lanes, but it must be a GR (since we don't use the slow "element gather/scatter")

Regardless, between iterations of the loop, you want to keep the addresses in the GRs throughout the execution of the loop. Is that correct?

Yes, that's better than having to extract elements. The extracts are relatively expensive.

May 17 2017, 8:33 AM
rengolin added reviewers for D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI): delena, jmolloy.

I don't quite follow how this has to do with vector shuffles...? On SystemZ, all addresses residing in vectors must be extracted before use. (There are slow vector element gather/scatter, which are not used by the backend, so they are irrelevant).

May 17 2017, 7:50 AM
rengolin added a comment to D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI).

Hi Jonas,

May 17 2017, 7:02 AM

May 16 2017

rengolin committed rL303193: Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove".
Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
May 16 2017, 11:12 AM
rengolin committed rL303146: Merging rr302639:.
Merging rr302639:
May 16 2017, 12:10 AM

May 13 2017

rengolin accepted D32605: Recognize CTLZ builtin.

Perfect, thanks! LGTM.

May 13 2017, 3:32 AM

May 12 2017

rengolin added inline comments to D32605: Recognize CTLZ builtin.
May 12 2017, 1:42 PM
rengolin added a comment to D32605: Recognize CTLZ builtin.

Right, just to make sure it's doing what we expect it to, did you try to run this with targets "armv7a" and "armv4t"? They should have different results.

May 12 2017, 1:13 AM

May 11 2017

rengolin added a comment to D30324: [ARM] Thumb2: favor R4-R7 over R12/LR in allocation order when opt for minsize.

Hi Weiming,

May 11 2017, 11:52 AM
rengolin committed rL302786: [MSAN] test failed randomly on ARM when XFAILED for MIPS.
[MSAN] test failed randomly on ARM when XFAILED for MIPS
May 11 2017, 4:19 AM

May 10 2017

rengolin accepted D32792: [LLVM][inline-asm] Altmacro string escape character '!'.

As parsers go, simplicity and elegance are usually mutually exclusive.

May 10 2017, 3:06 AM
rengolin added a comment to D32605: Recognize CTLZ builtin.

That's not true. ARMv4 for example has no clz, it's a V5T feature.

May 10 2017, 1:22 AM

May 9 2017

rengolin added a comment to D32605: Recognize CTLZ builtin.

I agree. If the CPU has it, it will be beneficial. If it doesn't, it is only a useful transformation if the intrinsic can be constant folded.

May 9 2017, 1:07 PM
rengolin added a comment to D31801: Performance enhancements for Cavium ThunderX2 T99.

Ping2!! :-)

May 9 2017, 12:36 PM
rengolin added a comment to D32605: Recognize CTLZ builtin.

Wait, we also need the TTI callbacks. I thought you were going to introduce them.

May 9 2017, 12:32 PM
rengolin committed rL302520: [Dwarf] Disable reference verification for now (PR32972).
[Dwarf] Disable reference verification for now (PR32972)
May 9 2017, 5:50 AM

May 8 2017

rengolin added a comment to D30086: Add generic IR vector reductions.

When people add new intrinsics they don't really add tests to check the IR since you get assertion failures during the FunctionType creation if something goes wrong.

May 8 2017, 6:04 AM
rengolin added a comment to D30086: Add generic IR vector reductions.

Right, this is looking much better. Now, what about tests?

May 8 2017, 5:43 AM
rengolin added a comment to D30086: Add generic IR vector reductions.

Hi Amara,

May 8 2017, 5:27 AM

May 5 2017

rengolin added a comment to D32833: [Triple] Add method for triple canonicalization.

Also, where would this be used?

May 5 2017, 7:39 AM
rengolin added inline comments to D32833: [Triple] Add method for triple canonicalization.
May 5 2017, 7:29 AM
rengolin accepted D32534: [ARM] Add support for ORR and ORN instruction substitutions.

Sorry for the delay, LGTM. Thanks!

May 5 2017, 3:26 AM
rengolin added a comment to D27620: [Assembler] Report multiple near misses for invalid instructions.

@echristo, any comments on this?

May 5 2017, 3:24 AM
rengolin added inline comments to D32792: [LLVM][inline-asm] Altmacro string escape character '!'.
May 5 2017, 3:22 AM

May 4 2017

rengolin committed rL302165: [test-release] Status update *before* long gzip.
[test-release] Status update *before* long gzip
May 4 2017, 9:34 AM
rengolin accepted D32850: [ArgPromotion] Fix a truncated variable.

Right, makes sense. This is an obvious fix, so I feel comfortable approving this one. LGTM. Thanks!

May 4 2017, 3:15 AM
rengolin added inline comments to D32200: [LV] Refactor ILV.vectorize[Loop]() by introducing LVP.executePlan().
May 4 2017, 2:39 AM
rengolin added a comment to D32850: [ArgPromotion] Fix a truncated variable.

Wouldn't it be better to set the type manually to something like uint32_t or uint64_t? To make sure they're the same across all arches?

May 4 2017, 2:25 AM
rengolin accepted D32701: [LLVM][inline-asm][Altmacor] Altmacro string delimiter '<..>'.

LGTM. Thanks!

May 4 2017, 2:09 AM
rengolin added a comment to D31801: Performance enhancements for Cavium ThunderX2 T99.

I'm ok with the change, as long as @javed.absar is happy. :)

May 4 2017, 1:36 AM

May 3 2017

rengolin accepted D32780: [AArch64] Make the TargetParser add CPU exts provided by the arch..

I really thought it was there already! Thanks for the patch! LGTM.

May 3 2017, 10:45 AM
rengolin added a comment to D32701: [LLVM][inline-asm][Altmacor] Altmacro string delimiter '<..>'.

You seem to have lost the negative test.

May 3 2017, 7:31 AM
rengolin added a comment to D32701: [LLVM][inline-asm][Altmacor] Altmacro string delimiter '<..>'.

With that in mind, I think you should then go back to your original implementation (ignore quotes) and let the expression evaluation crash if there's any single quote in between (or whatever is the expected behaviour).

May 3 2017, 7:14 AM
rengolin added inline comments to D32701: [LLVM][inline-asm][Altmacor] Altmacro string delimiter '<..>'.
May 3 2017, 4:07 AM
rengolin added reviewers for D32737: [Constants][SVE] Represent the runtime length of a scalable vector: chandlerc, echristo, lattner, majnemer.

Hi Graham,

May 3 2017, 3:46 AM

May 2 2017

rengolin added a comment to D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour.

Hi Jonas,

May 2 2017, 12:36 PM
rengolin added a comment to D30086: Add generic IR vector reductions.

Ok, so I've restructured the two functions a bit so that the simple (non minmax) reductions are generated from the createSimpleTargetReduction() function and the recurrence descriptor uses that for the simple cases, passing in the opcode.

May 2 2017, 11:12 AM
rengolin added inline comments to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.
May 2 2017, 11:05 AM
rengolin added inline comments to D30086: Add generic IR vector reductions.
May 2 2017, 4:41 AM
rengolin added inline comments to D32701: [LLVM][inline-asm][Altmacor] Altmacro string delimiter '<..>'.
May 2 2017, 4:25 AM
rengolin added inline comments to D30086: Add generic IR vector reductions.
May 2 2017, 4:06 AM

May 1 2017

rengolin accepted D32526: [LLVM][inline-asm] Altmacro absolute expression '%' feature.

Thanks! LGTM with the test change.

May 1 2017, 4:05 AM
rengolin added a comment to D32526: [LLVM][inline-asm] Altmacro absolute expression '%' feature.

Can you merge D32523 with this one?

May 1 2017, 3:06 AM

Apr 28 2017

rengolin added a comment to D32605: Recognize CTLZ builtin.

Yes.

Apr 28 2017, 9:12 AM
rengolin added a comment to D31801: Performance enhancements for Cavium ThunderX2 T99.

So, then, this would be a case of "some micro-architectures can do this for any floating-point type, some others need special treatment".

Apr 28 2017, 7:09 AM
rengolin added a comment to D31801: Performance enhancements for Cavium ThunderX2 T99.

Right, I clicked the link and it took me to the old code. :)

Apr 28 2017, 6:51 AM
rengolin added a comment to D32200: [LV] Refactor ILV.vectorize[Loop]() by introducing LVP.executePlan().

Hi Ayal,

Apr 28 2017, 3:28 AM
rengolin added inline comments to D31801: Performance enhancements for Cavium ThunderX2 T99.
Apr 28 2017, 2:44 AM
rengolin added a comment to D32460: ADT: handle special case of ARM environment for SuSE.

Yes, the idea is that the textual IR uses the correct triple, unless it is overidden at the command line.

Why not just handle this in clang?

Apr 28 2017, 2:17 AM
rengolin added a comment to D32605: Recognize CTLZ builtin.

Guarding is easier. Let's do this only if someone complains about performance.

Apr 28 2017, 2:02 AM

Apr 27 2017

rengolin added a comment to D32605: Recognize CTLZ builtin.

In this case builtin should be expanded at codegen in the most profitable way.

Apr 27 2017, 12:13 PM
rengolin added a comment to D28975: [LV] Introducing VPlan to model the vectorized code and drive its transformation.

So, I think the document is so good and can be very helpful in further communications, that I think it should be committed separately.

Apr 27 2017, 12:10 PM
rengolin added a comment to D32605: Recognize CTLZ builtin.

Hi Evgeny,

Apr 27 2017, 11:59 AM

Apr 26 2017

rengolin added inline comments to D32526: [LLVM][inline-asm] Altmacro absolute expression '%' feature.
Apr 26 2017, 8:17 AM
rengolin added reviewers for D32530: [SVE] Scalable Vector IR Type: echristo, chandlerc, hfinkel.

Hi Graham, thanks for this work.

Apr 26 2017, 5:24 AM
rengolin added a comment to D32526: [LLVM][inline-asm] Altmacro absolute expression '%' feature.

I think this patch and D32523 are small and simple enough that it could be merged into one...

Apr 26 2017, 4:07 AM
rengolin added a reviewer for D32296: [ARM] Modify lowering of setjmp: t.p.northover.

Adding Tim, as this is likely going to affect Darwin more than Linux.

Apr 26 2017, 3:26 AM