rengolin (Renato Golin)
Toolchain Engineer

Projects

User does not belong to any projects.

User Details

User Since
Oct 19 2012, 12:57 AM (252 w, 2 d)

Recent Activity

Yesterday

rengolin accepted D36899: [ARM] Check the right order for halves of VZIP/VUZP if both parts are used.

Hi Martin,

Sat, Aug 19, 11:30 AM

Fri, Aug 18

rengolin added a comment to D36731: [ARM][AArch64] Cortex-A75 and Cortex-A55 tests.

I'm happy with the patch, but I'll let @SjoerdMeijer approve.

Fri, Aug 18, 6:50 AM
rengolin added a comment to D36731: [ARM][AArch64] Cortex-A75 and Cortex-A55 tests.

This change seems to have nothing to do with adding core support but exposing features from CPU names.

Fri, Aug 18, 4:11 AM
rengolin committed rL311153: [Triple] Define OS Check for Haiku.
[Triple] Define OS Check for Haiku
Fri, Aug 18, 3:38 AM
rengolin closed D36814: [Triple] Define OS Check for Haiku.

This patch is needed in-tree for upstreaming my Swift port for Haiku, which is part of my Google Summer of Code project.

The swift-llvm fork regularly pulls changes from LLVM master into the upstream-with-swift branch, so adding this patch here should make sense.

Fri, Aug 18, 3:37 AM

Thu, Aug 17

rengolin accepted D36814: [Triple] Define OS Check for Haiku.

Curiosity: Do you have a use for this in-tree or just off-tree?

Thu, Aug 17, 10:57 AM

Tue, Aug 15

rengolin accepted D36732: [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs.

Ok, LGTM. Thanks!

Tue, Aug 15, 5:04 AM
rengolin added inline comments to D36732: [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs.
Tue, Aug 15, 4:21 AM
rengolin added a reviewer for D36705: Avoid creating duplicate ANDs in SelectionDAG.: john.brawn.
Tue, Aug 15, 2:23 AM

Mon, Aug 14

rengolin added a reviewer for D31878: [Asm] Add debug tracing in table-generated assembly matcher: echristo.

Adding Eric, for consistency.

Mon, Aug 14, 11:58 AM
rengolin accepted D31537: [ARM, Asm] Use correct source location for register tokens.
Mon, Aug 14, 11:57 AM
rengolin accepted D36689: [ARM, Asm] Change grammar of immediate operand diagnostics.

NFC. LGTM. Thanks!

Mon, Aug 14, 11:56 AM
rengolin accepted D36692: [ARM, Asm] Add diagnostics for general-purpose register operands.

Other than the one comment, LGTM. Thanks!

Mon, Aug 14, 11:54 AM
rengolin accepted D36693: [ARM, Asm] Add diagnostics for floating-point register operands.
Mon, Aug 14, 11:52 AM
rengolin added a comment to D35821: Have ARM call setBooleanContents(ZeroOrOneBooleanContent)..

Submit as a different revision and we can discuss the merits there.

Mon, Aug 14, 11:46 AM
rengolin accepted D35821: Have ARM call setBooleanContents(ZeroOrOneBooleanContent)..

LGTM, thanks!

Mon, Aug 14, 11:16 AM
rengolin added a comment to D27620: [Assembler] Report multiple near misses for invalid instructions.

Hi Oliver,

Mon, Aug 14, 9:05 AM

Sat, Aug 12

rengolin accepted D34682: [Triple] Add isThumb and isARM functions..

Yup. LGTM. Thanks!

Sat, Aug 12, 8:24 AM
rengolin added a comment to D35821: Have ARM call setBooleanContents(ZeroOrOneBooleanContent)..

There are two patches here, please split them.

Sat, Aug 12, 6:29 AM

Fri, Aug 11

rengolin added a comment to D31220: [builtins][ARM] Select correct code fragments when compiling for Thumb1/Thum2/ARM ISA.

ping @compnerd

Fri, Aug 11, 3:05 PM
rengolin added inline comments to D34682: [Triple] Add isThumb and isARM functions..
Fri, Aug 11, 6:06 AM
rengolin accepted D36609: [AArch64] Remove dotprod from base extension list.
Fri, Aug 11, 5:51 AM

Thu, Aug 3

rengolin accepted D35569: [ARM] Remove FeatureNoARM implies ModeThumb..

Sorry, long holidays... :)

Thu, Aug 3, 11:34 PM
rengolin accepted D35826: [Driver] Error if ARM mode was selected explicitly for M-profile CPUs..

This makes sense to me. LGTM. Thanks!

Thu, Aug 3, 11:30 PM

Fri, Jul 28

rengolin accepted D35568: [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53..

LGTM too. Thanks!

Fri, Jul 28, 3:23 PM
rengolin added inline comments to D34682: [Triple] Add isThumb and isARM functions..
Fri, Jul 28, 3:22 PM

Tue, Jul 25

rengolin added a reviewer for D35826: [Driver] Error if ARM mode was selected explicitly for M-profile CPUs.: charles.baylis.
Tue, Jul 25, 7:20 AM

Mon, Jul 24

rengolin added a comment to D35568: [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53..

Using 8 byte alignment gives a 0.25% speedup on execution time (was 0.23% with 16 bytes), a 0.82% improvement
in benchmark scores (was 0.93% with 16 bytes) and a 0.20% increase in binary size (was 0.55%). So for the score related benchmarks, the 8 byte alignment makes things worse quite a bit, but the impact on size is much smaller. Should we use 8 byte alignment, to keep the binary size down?

Mon, Jul 24, 5:31 AM

Sun, Jul 23

rengolin accepted D35620: [AArch64] Add test for function alignment for a optsize function (NFC). .
Sun, Jul 23, 11:06 AM

Fri, Jul 21

rengolin added a comment to D35568: [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53..

So, it seems it was sphinx, but that was loop alignment, 4 bytes on A53, 8 bytes on A57, to do with the fetch alignment. Maybe this is a related issue. Why 16, though?

Fri, Jul 21, 4:32 PM
rengolin added a comment to D35568: [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53..

We found similar results on spec2k6 for aarch64 that we attributed to function alignment. Have you tried that? I need to dig the one culprit...

Fri, Jul 21, 4:24 PM

Jul 20 2017

rengolin added reviewers for D35696: Fix compiler-rt build with aarch64 using -march=armv8-a: zatrazz, peter.smith.
Jul 20 2017, 1:55 PM
rengolin added reviewers for D35697: Fix libcxx build with glibc 2.26+: zatrazz, peter.smith.
Jul 20 2017, 1:55 PM
rengolin added a comment to D35569: [ARM] Remove FeatureNoARM implies ModeThumb..

With FeatureARM, the generic CPU would have to have this feature, but when the features of the architecture (v6m) are applied, FeatureARM cannot be removed, so the behavior will be different to the current one (Architecture features are applied after CPU features I think). We could rename FeatureARM to FeatureThumbOnly, which is a bit clearer in my opinion.

Jul 20 2017, 9:15 AM
rengolin accepted D33410: Correct test for ARM targets.

Apologies, you had already sent all I needed. :)

Jul 20 2017, 4:12 AM

Jul 19 2017

rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Hi Ayal,

Jul 19 2017, 6:37 AM
rengolin added inline comments to D35498: [LoopVectorizer] Use two step casting for float to pointer type..
Jul 19 2017, 3:48 AM

Jul 18 2017

rengolin added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer type..
typedef struct CvNode1D
{
    float val;
    struct CvNode1D *next;
}
CvNode1D;

And we're trying to vectorize code that loads these structs. Since the pointer and the float have the same width, we can load four of the structs as, e.g. 2 * <4 x float>, and then use shuffles to get a vector of 4 floats and a vectors of 4 pointers.

Jul 18 2017, 9:13 AM

Jul 17 2017

rengolin added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer type..

Setting aside from the fact that no one should ever be casting floating point to pointers, perhaps we shouldn't be even trying to do this transformation here, as I'm not sure this could have a guaranteed semantics in this case, and probably better to just bail the vectorisation?

Jul 17 2017, 12:50 PM

Jul 14 2017

rengolin committed rL308006: [RelTest] Diana is doing both releases now.
[RelTest] Diana is doing both releases now
Jul 14 2017, 1:34 AM

Jul 13 2017

rengolin added a comment to D35307: [AArch64] Initial SVE register definitions.

This seems a very raw change, without any further description, comments or proper usage, other than a few changes on random places.

I don't agree. The additions are in line with the level of documentation in the rest of the file.

Jul 13 2017, 10:38 AM
rengolin accepted D35076: [AArch64] Add an SVE target feature to the backend and TargetParser.

The Clang counterpart was approved, this looks good, too. Thanks!

Jul 13 2017, 3:55 AM
rengolin accepted D35118: [AArch64] Add support for handling the +sve target feature.

Thanks. LGTM.

Jul 13 2017, 3:55 AM · Restricted Project
rengolin added a comment to D35118: [AArch64] Add support for handling the +sve target feature.

@jmolloy Can you check this change, please?

I'm not really removing FPUMode, I'm just converting an enum to a bit field. FPUMode, i.e. no NEON, after this change is now represented by simply having all bits be 0. See the equivalent implementation for ARM which does the same thing.

Jul 13 2017, 2:41 AM · Restricted Project

Jul 12 2017

rengolin added a reviewer for D35118: [AArch64] Add support for handling the +sve target feature: jmolloy.

@jmolloy Can you check this change, please?

Jul 12 2017, 9:17 AM · Restricted Project
rengolin added a comment to D35307: [AArch64] Initial SVE register definitions.

Hi Amara,

Jul 12 2017, 9:10 AM

Jul 7 2017

rengolin added a reviewer for D35118: [AArch64] Add support for handling the +sve target feature: t.p.northover.
Jul 7 2017, 4:46 AM · Restricted Project
rengolin added a comment to D35076: [AArch64] Add an SVE target feature to the backend and TargetParser.

Yes I have, it depends on this patch to go in first. I'll send that out for review tomorrow.

Jul 7 2017, 3:28 AM

Jul 6 2017

rengolin added a comment to D35076: [AArch64] Add an SVE target feature to the backend and TargetParser.

Hi Amara,

Jul 6 2017, 12:36 PM

Jul 5 2017

rengolin accepted D33128: [AsmParser] Mnemonic Spell Corrector.

Hi Sjoerd,

Jul 5 2017, 3:43 AM

Jul 1 2017

rengolin added a comment to D34910: Make LLVM_TARGETS_TO_BUILD=all build all targets.

Changing the default to stable removes any concerns about zorg. I quite like this change as I do agree that all should be all. I also don't like the -W distinction, and this is where this change makes sense.

Jul 1 2017, 5:09 AM

Jun 30 2017

rengolin added reviewers for D34875: ARM: Report error for invalid use of AAPCS_VFP calling convention: compnerd, joerg.
Jun 30 2017, 5:36 AM

Jun 26 2017

rengolin added a comment to D33410: Correct test for ARM targets.

Yes using current HEAD (rev 306279) we are still getting the same failure, for both Linux and Windows.

Jun 26 2017, 5:57 AM

Jun 6 2017

rengolin accepted D33786: [ARM] Fix Neon vector type alignment to 64-bit.

Thanks @srhines @javed.absar, LGTM!

Jun 6 2017, 9:45 AM
rengolin added a comment to D33786: [ARM] Fix Neon vector type alignment to 64-bit.

@srhines: Do you need Android to *always* be 64-bit, even if someone forces aapcs with it?

Jun 6 2017, 9:36 AM
rengolin added a comment to D33410: Correct test for ARM targets.

I don't see why we should force the target, given that this is a CodeGen test, should worr (and be tested) on all targets.

Jun 6 2017, 8:36 AM
rengolin added inline comments to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.
Jun 6 2017, 3:16 AM
rengolin added a comment to D33410: Correct test for ARM targets.

Just FYI, when you submit a patch, use "git diff -U999" so that we can see the context in the web interface. Makes it easier to review. :)

Jun 6 2017, 2:48 AM
rengolin added a comment to D33410: Correct test for ARM targets.

Where is this failing?

Jun 6 2017, 2:46 AM

Jun 5 2017

rengolin added inline comments to D33786: [ARM] Fix Neon vector type alignment to 64-bit.
Jun 5 2017, 1:35 AM
rengolin committed rL304699: Revert "[sanitizer-coverage] test for -fsanitize-coverage=inline-8bit-counters".
Revert "[sanitizer-coverage] test for -fsanitize-coverage=inline-8bit-counters"
Jun 5 2017, 12:36 AM
rengolin committed rL304698: Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize….
Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize…
Jun 5 2017, 12:36 AM
rengolin committed rL304697: Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize….
Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize…
Jun 5 2017, 12:36 AM

Jun 2 2017

rengolin added inline comments to D33836: [AArch64] Enable FeatureFuseAES for the generic processor model..
Jun 2 2017, 3:47 PM
rengolin added a comment to D33836: [AArch64] Enable FeatureFuseAES for the generic processor model..

Makes sense to me. But now that "generic" is the default, it'll impact all cores equally, so I'll let other people comment before approving.

Jun 2 2017, 8:53 AM

May 31 2017

rengolin added reviewers for D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour: delena, qcolombet, sanjoy, craig.topper.

I have just looked at performance. It was a while since I tried to evaluate the compile time impact of a single pass. How would you do this?

May 31 2017, 2:42 AM
rengolin added a comment to D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour.

I tried this - adding LSR just before loop vectorizer, and found that it did indeed affect a few benchmarks noticeably, with mixed results.

May 31 2017, 2:18 AM

May 30 2017

rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

A few more comments... :)

May 30 2017, 5:09 AM

May 26 2017

rengolin committed rL304010: [LSAN-ARM] Marking new test unsupported on ARMHF due to bot failures.
[LSAN-ARM] Marking new test unsupported on ARMHF due to bot failures
May 26 2017, 10:31 AM
rengolin committed rL303996: Revert "[OpenCL] An error shall occur if any scalar operand has greater rank….
Revert "[OpenCL] An error shall occur if any scalar operand has greater rank…
May 26 2017, 8:33 AM
rengolin added a comment to D33586: ARMv8.1 support for LLVM AArch64.

Hi, can you please re-share the patch with more context?

May 26 2017, 3:49 AM

May 25 2017

rengolin accepted D33558: [AArch64]: add 'a' inline asm operand modifier..

Silly inline comment, but otherwise, LGTM. Thanks!

May 25 2017, 11:49 AM

May 24 2017

rengolin accepted D33495: [ARM] Add tests for 6-M memcpy/memset code generation.
May 24 2017, 7:16 AM

May 23 2017

rengolin accepted D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Ok, I think it's good as it is. Looking forward to more patches in this area. :)

May 23 2017, 9:08 AM
rengolin accepted D33205: ARM] Fix Neon vector type alignment to 64-bit.

Approving, so that when Steve does the same, the review is free to commit.

May 23 2017, 9:07 AM
rengolin added a comment to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Hi all:
I have made the changes (excluding Android) as recommended. Please have a look if it is now fine.

May 23 2017, 9:06 AM
rengolin added a comment to D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Hi Javed,

May 23 2017, 5:44 AM

May 22 2017

rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Some initial comments...

May 22 2017, 10:44 AM
rengolin added reviewers for D32166: Improve LoopVectorizers estimation of scalar loops by predicting LSR behaviour: mssimpso, mkuper.

In this sense, isn't the loop vectorizer run too early?

May 22 2017, 10:39 AM
rengolin added inline comments to D33205: ARM] Fix Neon vector type alignment to 64-bit.
May 22 2017, 9:30 AM
rengolin added a comment to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Ah, I just found the email thread, looks like it never reached phabricator. Sorry for the confusion.

May 22 2017, 9:28 AM
rengolin requested changes to D33205: ARM] Fix Neon vector type alignment to 64-bit.

Wait, Stephen had a concern, we need his feedback on my reply before approving.

May 22 2017, 9:20 AM
rengolin added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Hi Ayal, sorry for the delay. Can you commit the plan doc separately?

May 22 2017, 2:42 AM

May 21 2017

rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

Put differently, this patch makes sense *once* we clearly have consensus on llvm-dev. So far, the only thread I can find did not reach any meaningful consensus. Notably, none of the code generator people were heavily contributing to that thread, and there remain large unmentioned technical concerns (stack spills, alloca size, etc)

May 21 2017, 3:41 PM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Which comment? FWIW, I didn't see a particular response.

May 21 2017, 1:17 PM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

Just to be clear, I'm not *against* the idea of an intrinsic, nor I'm pushing this patch for any personal/professional agenda. I hope I have made that perfectly clear on my previous reviews on the same patches before.

May 21 2017, 11:02 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

This is much more up for debate. Details matter. Compilers are a set of engineering tradeoffs. You need to explain and defend your position carefully, not just state it as though it were "obviously true".

May 21 2017, 10:51 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)?

May 21 2017, 4:52 AM
rengolin added inline comments to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.
May 21 2017, 4:38 AM
rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

I agree with Chandler on this. Making something a first class type in LLVM has wide reaching effects, including requiring the ability to load/store/phi the value, pass it by argument, etc.

May 21 2017, 4:32 AM

May 20 2017

rengolin added reviewers for D32737: [Constants][SVE] Represent the runtime length of a scalable vector: jmolloy, kristof.beyls, aadg, hfinkel, aschwaighofer, mssimpso, Ayal, mkuper.
May 20 2017, 3:17 PM
rengolin added reviewers for D32737: [Constants][SVE] Represent the runtime length of a scalable vector: delena, asb.

Uh, where was it discussed? It's entirely possible I missed it, but I can't find any consensus on any of the threads that we actually want to support runtime vector width in LLVM's IR.

May 20 2017, 3:08 PM
rengolin requested changes to D32833: [Triple] Add method for triple canonicalization.

Right, I think this is a useful thing to have, but it needs a better plan.

May 20 2017, 1:10 PM

May 19 2017

rengolin added a comment to D32737: [Constants][SVE] Represent the runtime length of a scalable vector.

@echristo @chandlerc @lattner @majnemer Ping.

May 19 2017, 5:09 AM

May 18 2017

rengolin added a comment to D33128: [AsmParser] Mnemonic Spell Corrector.

Hi Sjoerd,

May 18 2017, 6:34 AM

May 17 2017

rengolin added a comment to D33120: [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..

Hi Javed,

May 17 2017, 8:37 AM
rengolin added a comment to D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI).

Right, I see what you mean about the "address computation" to be forced into scalar registers. Does that mean that you have to use the same GR to load into all lanes of the vector, so supposedly load+increment+load+increment...?

It doesn't have to be the same GR for all the lanes, but it must be a GR (since we don't use the slow "element gather/scatter")

Regardless, between iterations of the loop, you want to keep the addresses in the GRs throughout the execution of the loop. Is that correct?

Yes, that's better than having to extract elements. The extracts are relatively expensive.

May 17 2017, 8:33 AM
rengolin added reviewers for D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI): delena, jmolloy.

I don't quite follow how this has to do with vector shuffles...? On SystemZ, all addresses residing in vectors must be extracted before use. (There are slow vector element gather/scatter, which are not used by the backend, so they are irrelevant).

May 17 2017, 7:50 AM
rengolin added a comment to D32422: LoopVectorizer: let target prefer scalar addressing computations (+ minor improvements in SystemZTTI).

Hi Jonas,

May 17 2017, 7:02 AM

May 16 2017

rengolin committed rL303193: Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove".
Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove"
May 16 2017, 11:12 AM