aemerson (Amara Emerson)
Asian George Costanza

Projects

User does not belong to any projects.

User Details

User Since
Sep 9 2013, 3:45 AM (241 w, 3 d)

Compilers at a fruit company

Recent Activity

Today

aemerson accepted D46095: [GlobalISel] Reporting rules covered as part of the InstructionSelect's debug-only printing.
Thu, Apr 26, 8:11 AM

Yesterday

aemerson committed rL330831: [AArch64][GlobalISel] Implement selection for the llvm.trap intrinsic.
Wed, Apr 25, 7:47 AM

Tue, Apr 24

aemerson updated the summary of D46018: [GlobalISel][IRTranslator] Split aggregates during IR translation.
Tue, Apr 24, 9:00 AM
aemerson created D46018: [GlobalISel][IRTranslator] Split aggregates during IR translation.
Tue, Apr 24, 8:55 AM

Mon, Apr 23

aemerson accepted D45543: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64.

Maybe we could have all the tests in one file called prelegalize-combine-extloads.mir

I went with three files to match how we've been organizing the tests for the other passes but you raise a good point here. There's a good argument for the combiner being tested with one file per CombinerHelper::try*() function. I think that's probably a better organization.

I'm assuming this is in Target/AArch64 because other targets haven't been updated to use the new opcodes yet? We do eventually want to use these representations for every target though right?

That's right. As you say, we'll want to support it in every target that supports the extending loads (which is most if not all of the in-tree targets). However, until the new opcodes are legal for those targets, there's not much point in combining them only to revert them back to load+extend in the legalizer.

One other thing to mention is that we don't have a target-independent combiner in GlobalISel at the moment. Each target implements its own combiner(s) and makes use of code in CombinerHelper (where appropriate) to share code. I expect these combines to be used by multiple targets so I've put the bulk of the code in CombinerHelper but each target will need to add a pass and call to it.

Mon, Apr 23, 8:34 AM

Thu, Apr 19

aemerson committed rL330384: Move a dump() implementation out of line.
Thu, Apr 19, 5:48 PM

Wed, Apr 18

aemerson added a comment to D45543: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64.

I'm assuming this is in Target/AArch64 because other targets haven't been updated to use the new opcodes yet? We do eventually want to use these representations for every target though right?

Wed, Apr 18, 5:38 PM
aemerson accepted D45540: [globalisel][legalizerinfo] Introduce dedicated extending loads and add lowerings for them.

LGTM, with addition of -verify-machineinstrs to the tests.

Wed, Apr 18, 5:30 PM
aemerson accepted D45466: [globalisel][legalizerinfo] Add support for legalization based on the MachineMemOperand.

LGTM.

Wed, Apr 18, 5:14 PM
aemerson committed rL330276: [AArch64] Add isel pattern for v8i8->v2f32 NVCASTs.
Wed, Apr 18, 10:14 AM

Wed, Apr 11

aemerson abandoned D45023: [GlobalOpt] Implement static evaluation of memcpy intrinsics for const i8 arrays.

The motivating use case for this was actually fixed in clang, r325478. This may still be useful in the future for other front-ends/clients of LLVM, so it can be revived later if needed.

Wed, Apr 11, 8:37 AM

Tue, Apr 10

aemerson committed rL329743: [AArch64] Fix isel failure when BUILD_PAIR nodes are left over.
Tue, Apr 10, 12:05 PM

Mon, Apr 9

aemerson accepted D45067: [GISel] Refactor MachineIRBuilder so we can optionally do constant folding/other transformations during building.

LGTM, thanks.

Mon, Apr 9, 3:09 AM
aemerson accepted D45366: Support generic expansion of ordered vector reduction (PR36732).
Mon, Apr 9, 2:41 AM

Fri, Apr 6

aemerson added a comment to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).

The other issue is that while these intrinsics were experimental (it was on my todo list for later this year to change that), AArch64 has been using them with their original intended semantics for a while now. If we change that, IR generated from a released compiler will be miscompiled since it now becomes legal to fold away undef accumulator reductions to undef, unless we do some IR auto-upgrading based on the bitcode version.

If they are flagged as experimental surely we can't be held responsible for changes in behaviour?

AArch64 was the test guinea pig for this new representation, and it's proved to be a smooth transition. If you change the semantics now, even if the intrinsics are still experimental in name, IR from LLVM 6.0 may be silently miscompiled if someone implements a valid optimization based on your new proposal. That's a fact, and I don't think this patch review is the right place to discuss it. If you want to do this, I suggest you send a new RFC or revive the old one.

Fri, Apr 6, 1:27 PM
aemerson added a comment to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).

Simon asked this on the PR, let's continue the discussion in one place:

Do we really want to completely ignore an intrinsic argument depending on the fast flags? There might be valid reasons to want to include an accumulation value.

The raison d'être for the argument is for ordered reductions, and the intrinsics were designed to allow the expression of the reduction idiom only, in light of newer vector architectures where the previous representation was inadequate. The use of an accumulator argument for fast reductions wasn't necessary, so the semantics were supposed to be defined in the most minimal form. The accumulator can easily be expressed as an extractelement->op->insertelement sequence. The question here I think becomes our good old friend: what should the canonical form be?
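
The extractelement->op->insertelement point can be illustrated with a small sketch. This is plain Python standing in for vector IR, not LLVM code; the function names (`ordered_reduce_with_acc`, `ordered_reduce`, `fold_acc_into_lane0`) are hypothetical and chosen only for this illustration. It assumes a strictly ordered, left-to-right reduction.

```python
import operator

def ordered_reduce_with_acc(op, acc, vec):
    # Ordered reduction that starts from a scalar accumulator.
    result = acc
    for x in vec:
        result = op(result, x)
    return result

def ordered_reduce(op, vec):
    # Ordered reduction with no accumulator: start from lane 0.
    result = vec[0]
    for x in vec[1:]:
        result = op(result, x)
    return result

def fold_acc_into_lane0(op, acc, vec):
    # Models extractelement -> op -> insertelement on lane 0.
    lane0 = vec[0]                  # extractelement
    lane0 = op(acc, lane0)          # op
    return [lane0] + list(vec[1:])  # insertelement

vec = [0.1, 0.2, 0.3, 0.4]
acc = 1.0
with_acc = ordered_reduce_with_acc(operator.add, acc, vec)
folded = ordered_reduce(operator.add, fold_acc_into_lane0(operator.add, acc, vec))
# Both forms perform the same operations in the same order.
assert with_acc == folded
```

Because both forms apply the same operations in the same order, the results match exactly even for floating point, which is why the choice between them is a question of canonical form rather than expressiveness.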

Fri, Apr 6, 11:29 AM
aemerson added a comment to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).

@aemerson

I think I missed out a detail when I wrote the LangRef: the original motivation for the scalar accumulator argument was for use in strictly ordered FP reductions only. I.e. when the intrinsic call has no FMF flags attached, the accumulator argument is used; otherwise, if there are FMF flags, the argument is meant to be ignored.

Why do we need the accumulator for this case? That is, why can't we just do:

result = vector[0];
for i in [1, vector.len) {
    result = binary_op(result, vector[i]);
}
return result;

I also wonder whether requiring fast-math to allow tree reductions is overkill. Tree reductions can be implemented reasonably efficiently in many architectures, while linearly ordered reductions appear to me to be more of a niche. Therefore, I wonder if it wouldn't make more sense to add llvm.experimental.vector.reduce.tree.{add,mul} that perform tree reductions without requiring fast math, and to just call those from here if fast-math is enabled.
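
To make the distinction between linear and tree reductions concrete, here is a minimal Python sketch. It is an illustration only, not LLVM code: `linear_reduce` mirrors the loop quoted above, while `tree_reduce` is a hypothetical pairwise scheme that assumes a power-of-two vector length and combines the low half with the high half at each step. Real targets may pair lanes differently.

```python
import operator

def linear_reduce(op, vec):
    # Strictly ordered: ((v0 op v1) op v2) op ...
    result = vec[0]
    for i in range(1, len(vec)):
        result = op(result, vec[i])
    return result

def tree_reduce(op, vec):
    # Pairwise reduction; assumes len(vec) is a power of two.
    lanes = list(vec)
    while len(lanes) > 1:
        half = len(lanes) // 2
        # Combine lane i with lane i + half, halving the vector.
        lanes = [op(lanes[i], lanes[i + half]) for i in range(half)]
    return lanes[0]

vec = [1e20, 1.0, -1e20, 1.0]
print(linear_reduce(operator.add, vec))  # 1.0: the first 1.0 is absorbed by 1e20
print(tree_reduce(operator.add, vec))    # 2.0: the large terms cancel first
```

The two orders agree for integer addition but can differ for floating point, as the example input shows. That difference is why the evaluation order has to be part of the intrinsic's defined semantics, whichever form is chosen.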

Fri, Apr 6, 9:08 AM
aemerson added inline comments to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).
Fri, Apr 6, 7:11 AM

Thu, Mar 29

aemerson created D45023: [GlobalOpt] Implement static evaluation of memcpy intrinsics for const i8 arrays.
Thu, Mar 29, 6:13 AM

Mar 26 2018

aemerson added a comment to D43962: [GlobalISel][utils] Adding the init version of Instruction Select Testgen.

Hi Roman,

Mar 26 2018, 4:04 PM

Mar 23 2018

aemerson committed rL328311: [GlobalISel] Fix legalizer combine to not use illegal input G_EXTRACT.
Mar 23 2018, 5:51 AM

Mar 22 2018

aemerson accepted D44762: [GISel]: Fix incorrect IRTranslation while translating null pointer types.

LGTM.

Mar 22 2018, 10:17 AM

Mar 19 2018

aemerson accepted D44291: [ARM,AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes.
Mar 19 2018, 11:08 AM

Mar 16 2018

aemerson added inline comments to D44291: [ARM,AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes.
Mar 16 2018, 7:19 AM

Mar 15 2018

aemerson added inline comments to D44291: [ARM,AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes.
Mar 15 2018, 6:17 AM

Mar 13 2018

aemerson added a comment to D42512: [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs.
In D42512#1026016, @rnk wrote:

This doesn't fix the crash, though. We just assert later now.

Mar 13 2018, 10:44 AM

Mar 8 2018

aemerson accepted D44245: Propagate flags to SDValue in SplitVecOp_VECREDUCE.

LGTM, can you also make the test name a little more specific, e.g. 'vecreduce-propagate-sd-flags.ll'?

Mar 8 2018, 2:32 PM
aemerson added a comment to D44245: Propagate flags to SDValue in SplitVecOp_VECREDUCE.

Please add a test.

Mar 8 2018, 8:23 AM

Mar 1 2018

aemerson added a comment to D43108: Support for the mno-stack-arg-probe flag.

By default, stack probes are enabled (i.e., -mstack-arg-probe is the default behavior) and have the size of 4K in x86.

This is the part I wanted to clarify: -mstack-arg-probe enables stack probes only if the ABI requires them, not for other reasons like security.

Mar 1 2018, 3:39 AM

Feb 28 2018

aemerson added a comment to D43108: Support for the mno-stack-arg-probe flag.

Can we clarify the meaning of this option a bit? The doc you've added here is saying that -mno-stack-arg-probe disables stack probes. Then what does -mstack-arg-probe mean specifically? Does it mean that only stack probes for ABI required reasons are enabled, or probes are done even in cases where the ABI doesn't require them? Either way, the doc needs to be clearer on the exact purpose.

Feb 28 2018, 4:17 PM

Feb 27 2018

aemerson accepted D43796: [GISel]: Print useful remarks when GISelAbort = 1 .

Seems reasonable.

Feb 27 2018, 5:33 AM

Feb 26 2018

aemerson updated the diff for D42512: [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs.

Changed to use errorUnsupported() so we get some diagnostics first.

Feb 26 2018, 8:08 AM

Feb 21 2018

aemerson added inline comments to D43444: [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.
Feb 21 2018, 11:00 PM
aemerson closed D43444: [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.

r325550.

Feb 21 2018, 4:02 AM

Feb 19 2018

aemerson committed rL325550: [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to….
Feb 19 2018, 9:16 PM
aemerson accepted D43444: [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.

As stated in PR36345 I'm committing this now to get it into 6.0.

Feb 19 2018, 8:25 PM

Feb 18 2018

aemerson created D43444: [AArch64][GlobalISel] When copying from a gpr32 to an fpr16 reg, convert to fpr32 first.
Feb 18 2018, 9:36 AM
aemerson committed rL325464: Fix unused assertion variable warning.
Feb 18 2018, 9:30 AM
aemerson committed rL325463: [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied.
Feb 18 2018, 9:13 AM
aemerson closed D43310: [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied to GPR register banks.
Feb 18 2018, 9:12 AM
aemerson committed rL325462: [AArch64][GlobalISel] Support G_INSERT/G_EXTRACT of types < s32 bits.
Feb 18 2018, 9:05 AM
aemerson added a comment to D43310: [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied to GPR register banks.

Thanks, I'll commit this with those changes and put up another patch shortly for the copies in the other direction.

Feb 18 2018, 9:05 AM

Feb 16 2018

aemerson accepted D42356: [AArch64] Implement dynamic stack probing for windows.

Thanks for refactoring it. LGTM with a minor comment addition.

Feb 16 2018, 10:50 PM

Feb 14 2018

aemerson created D43310: [AArch64][GlobalISel] Fix an assert fail/miscompile when fp16 types are copied to GPR register banks.
Feb 14 2018, 11:40 AM

Feb 9 2018

aemerson closed D38128: Handle COPYs of physregs better (regalloc hints).
Feb 9 2018, 2:41 AM
aemerson accepted D38128: Handle COPYs of physregs better (regalloc hints).

Sorry I didn't see this, I need to fix my email filters.

Feb 9 2018, 2:41 AM

Feb 2 2018

aemerson added a comment to D42860: [ReleaseNotes] Add note for the new -fexperimental-isel flag.

@hans if you're happy with this could you commit to the branch?

Feb 2 2018, 1:17 PM
aemerson added a comment to D42861: [ReleaseNotes] Add note for enabling GlobalISel for AArch64 -O0.

@hans if you're happy with this could you commit to the branch?

Feb 2 2018, 1:17 PM
aemerson created D42860: [ReleaseNotes] Add note for the new -fexperimental-isel flag.
Feb 2 2018, 11:34 AM
aemerson created D42861: [ReleaseNotes] Add note for enabling GlobalISel for AArch64 -O0.
Feb 2 2018, 11:34 AM
aemerson committed rL324110: [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Feb 2 2018, 10:07 AM
aemerson closed D42832: [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Feb 2 2018, 10:07 AM
aemerson added inline comments to D42832: [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Feb 2 2018, 9:47 AM

Feb 1 2018

aemerson created D42832: [AArch64][GlobalISel] Use getRegClassForTypeOnBank() in selectCopy.
Feb 1 2018, 6:38 PM
aemerson committed rL324051: [AArch64][GlobalISel] Fix old use of % sigil in test.
Feb 1 2018, 6:18 PM
aemerson committed rL324048: Fix debug spelling in ResetMachineFunction pass.
Feb 1 2018, 5:51 PM
aemerson committed rL324047: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
Feb 1 2018, 5:48 PM
aemerson closed D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
Feb 1 2018, 5:48 PM
aemerson added a comment to D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.

Thanks, I'll commit this first and follow up on that. At first glance they're using slightly different register classes so I'm not sure how correct it is to use that.

Feb 1 2018, 5:30 PM
aemerson committed rL324028: [GlobalISel][Legalizer] Relax a legalization loop detecting assert.
Feb 1 2018, 3:14 PM
aemerson committed rL324001: [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Feb 1 2018, 12:50 PM
aemerson added a comment to D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.

Ping.

Feb 1 2018, 10:50 AM

Jan 31 2018

aemerson committed rL323933: [GlobalOpt] Improve common case efficiency of static global initializer….
Jan 31 2018, 4:00 PM
This revision was not accepted when it landed; it landed in state Needs Review.
Jan 31 2018, 4:00 PM
aemerson added inline comments to D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.
Jan 31 2018, 2:30 PM
aemerson updated the diff for D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.

Simplified the logic a bit and added an example in the function documentation of complex and simple addresses.

Jan 31 2018, 11:12 AM

Jan 30 2018

aemerson added a comment to D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.

This is very close now. Could you add explicit examples (e.g. show the IR) showing which initializations remain slow (Complex) and which are fast now? This should also address the spirit of Adrian's question I think.

Thanks
Gerolf

Jan 30 2018, 9:22 PM
aemerson updated the diff for D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.

Addressed feedback.

Jan 30 2018, 8:39 PM
aemerson updated the diff for D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.

New fix, I think the issue was that IMPLICIT_DEFs weren't constraining their dest register, while COPY selection didn't constrain the source.

Jan 30 2018, 1:17 PM
aemerson added inline comments to D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
Jan 30 2018, 11:15 AM
aemerson accepted D42567: [AArch64] Properly handle dllimport of variables when using fast-isel.

Looks fine.

Jan 30 2018, 11:10 AM
aemerson added a comment to D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.

Meant to say, I'm *not* familiar with this part of the codebase.

Jan 30 2018, 10:32 AM
aemerson created D42697: [GlobalISel] Constrain the dest reg of IMPLICT_DEF.
Jan 30 2018, 10:31 AM

Jan 29 2018

aemerson added a comment to D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.

Thank you for drilling into this! I have a few questions below. Also, could you comment on the time savings you measured for your implementation?

-Gerolf

With this change, the test case I was using completed compiling in about 45 seconds, a significant portion of which was spent in the front-end/elsewhere in the compiler.

Jan 29 2018, 8:57 PM
aemerson updated the diff for D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.

Updated with some more comments.

Jan 29 2018, 10:37 AM

Jan 26 2018

aemerson committed rL323582: [GlobalISel][Legalizer] Convert the FP constants to the right APFloat type for….
Jan 26 2018, 11:08 PM
aemerson created D42612: [GlobalOpt] Improve common case efficiency of static global initializer evaluation.
Jan 26 2018, 7:12 PM
aemerson accepted D42568: [GlobalISel] Bail out on calls to dllimported functions.

LGTM from a GISel point of view, I can't say I understand Windows enough to say the tests are right.

Jan 26 2018, 8:01 AM

Jan 25 2018

aemerson committed rL323485: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 25 2018, 4:29 PM
aemerson committed rC323485: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 25 2018, 4:29 PM
aemerson closed D42276: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 25 2018, 4:29 PM
aemerson added inline comments to D42356: [AArch64] Implement dynamic stack probing for windows.
Jan 25 2018, 6:15 AM

Jan 24 2018

aemerson added a comment to D42512: [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs.

Right, it's not clear to me how we'd avoid crashing later on since in this case it happens in a later pass, not in isel. If report_fatal_error does indeed always trigger crash diagnostics in clang then this is no better than assertion failures. Perhaps emitting the diagnostic first with errorUnsupported might be better than nothing.

Jan 24 2018, 5:50 PM
aemerson created D42512: [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs.
Jan 24 2018, 4:12 PM
aemerson accepted D41373: [GISel][RFC]: GlobalISel Combiner prototype.

LGTM with some minor comment fixes.

Jan 24 2018, 2:51 PM
aemerson committed rL323384: [GlobalISel] Add a requires: asserts to a test.
Jan 24 2018, 2:43 PM
aemerson committed rL323371: [AArch64][GlobalISel] Fall back during AArch64 isel if we have a volatile load.
Jan 24 2018, 12:38 PM
aemerson committed rL323369: [GlobalISel] Don't fall back to FastISel.
Jan 24 2018, 12:01 PM
aemerson added a comment to D41373: [GISel][RFC]: GlobalISel Combiner prototype.

I think this is ok to progress now. We're happy with the overall design, although the pre-legalize pass is probably unnecessary at this moment, so leave that for another patch?

Jan 24 2018, 8:39 AM
aemerson updated subscribers of D42356: [AArch64] Implement dynamic stack probing for windows.

For GlobalISel we haven't had to have very target-specific code in the IRTranslator for this before. @qcolombet any opinion on whether we want to include target/ABI-specific code in the IRTranslator? My feeling is that we want to avoid this; the alternative I can see is to introduce a new opcode for G_DYNALLOCA and lower that at isel.

Jan 24 2018, 8:36 AM

Jan 23 2018

aemerson updated the diff for D42276: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.

I've added two kinds of warnings, one for targets which have incomplete GISel support, and another for unsupported optimisation levels (for ARM64 -O{1,2,3,s,z}).

Jan 23 2018, 7:59 PM

Jan 22 2018

aemerson added inline comments to D42276: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 22 2018, 3:18 AM

Jan 18 2018

aemerson updated the diff for D42276: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 18 2018, 5:48 PM
aemerson created D42276: [Driver] Add an -fexperimental-isel driver option to enable/disable GlobalISel.
Jan 18 2018, 5:44 PM
aemerson committed rL322878: [AArch64][GlobalISel] Add isel support for global values in the large code….
Jan 18 2018, 11:23 AM
aemerson closed D42175: [AArch64][GlobalISel] Add isel support for global values in the large code model.
Jan 18 2018, 11:23 AM

Jan 17 2018

aemerson added a comment to D42175: [AArch64][GlobalISel] Add isel support for global values in the large code model.

@kristof.beyls @rogfer01 Could someone from ARM do a quick test of this? I don't have an ELF target handy to test it myself and I'd like to get this fixed in the release branch.

Jan 17 2018, 7:20 PM
aemerson added a reviewer for D42175: [AArch64][GlobalISel] Add isel support for global values in the large code model: ab.
Jan 17 2018, 9:38 AM
aemerson created D42175: [AArch64][GlobalISel] Add isel support for global values in the large code model.
Jan 17 2018, 6:17 AM

Jan 14 2018

aemerson committed rL322466: [GlobalISel][Legalizer] Convert some typedefs to using. NFC.
Jan 14 2018, 4:45 PM