
aemerson (Amara Emerson)
Asian George Costanza

Projects

User does not belong to any projects.

User Details

User Since
Sep 9 2013, 3:45 AM (459 w, 2 d)

The sea was angry that day, my friends - like an old man trying to send back soup in a deli.

Recent Activity

Mon, Jun 27

aemerson accepted D127956: [AArch64][SME] Add SME outer product intrinsics.

LGTM.

Mon, Jun 27, 10:17 AM · Restricted Project, Restricted Project
aemerson added a comment to D128332: [AArch64][SME] Add SVE2 psel, uclamp, sclamp and revd IR intrinsics.

I'm curious, did we not already have support for these as part of the SVE2 ACLE implementation?

Mon, Jun 27, 9:50 AM · Restricted Project, Restricted Project
aemerson accepted D128332: [AArch64][SME] Add SVE2 psel, uclamp, sclamp and revd IR intrinsics.

I'm curious, did we not already have support for these as part of the SVE2 ACLE implementation?

Mon, Jun 27, 9:40 AM · Restricted Project, Restricted Project

Tue, Jun 21

aemerson accepted D127957: [AArch64][SME] Add some SME PSTATE setting/query intrinsics.
Tue, Jun 21, 8:13 AM · Restricted Project, Restricted Project

Fri, Jun 17

aemerson added inline comments to D127957: [AArch64][SME] Add some SME PSTATE setting/query intrinsics.
Fri, Jun 17, 4:13 PM · Restricted Project, Restricted Project

Thu, Jun 16

aemerson added inline comments to D127957: [AArch64][SME] Add some SME PSTATE setting/query intrinsics.
Thu, Jun 16, 8:15 PM · Restricted Project, Restricted Project

Wed, Jun 15

aemerson accepted D127853: [AArch64][SME] Add SME cntsb/h/w/d intrinsics.

Can't the rdsvl immediate be used to do the scaling instead of a shift?

I think the rdsvl immediate specifies the multiplier, which equates to a shift left. Whereas I want to divide the length by the element size, i.e. a shift right.
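To illustrate the distinction drawn in this reply, here is a minimal arithmetic sketch. It assumes the streaming vector length is available as a byte count; the function names and the 64-byte example length are hypothetical, and this models only the arithmetic, not the lowering used in the patch.

```python
# Hypothetical model: svl_bytes stands in for the streaming vector length in bytes.

def rdsvl(svl_bytes: int, imm: int) -> int:
    """Model of rdsvl: the streaming vector length in bytes times an immediate."""
    return svl_bytes * imm


def count_elements(svl_bytes: int, elt_size_bytes: int) -> int:
    """Elements per streaming vector: divide by a power-of-two element size,
    which is a shift right rather than a multiply."""
    shift = elt_size_bytes.bit_length() - 1  # log2 for power-of-two sizes
    return svl_bytes >> shift


svl = rdsvl(64, 1)  # e.g. a 512-bit streaming vector length
# Byte, halfword, word and doubleword element counts.
counts = {size: count_elements(svl, size) for size in (1, 2, 4, 8)}
```

The immediate can only scale the length up, so obtaining the halfword, word, or doubleword counts still needs the separate right shift.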

Wed, Jun 15, 10:57 AM · Restricted Project, Restricted Project
aemerson added a comment to D127853: [AArch64][SME] Add SME cntsb/h/w/d intrinsics.

Can't the rdsvl immediate be used to do the scaling instead of a shift?

Wed, Jun 15, 9:08 AM · Restricted Project, Restricted Project

Tue, Jun 14

aemerson accepted D127757: [NFC][AArch64] Minor refactor of AArch64InstPrinter::printMatrixTileList.
Tue, Jun 14, 3:26 PM · Restricted Project, Restricted Project
aemerson accepted D127414: [AArch64][SME] Add SME read/write intrinsics that map to the mova instruction.

No further comments from me, LGTM if Cullen approves.

Tue, Jun 14, 10:24 AM · Restricted Project, Restricted Project

Mon, Jun 13

aemerson added a reviewer for D127488: [GlobalISel][DebugInfo] Remove debug info with zero line from constants inserted at entry block: aprantl.

+ @aprantl

Mon, Jun 13, 4:22 PM · debug-info, Restricted Project, Restricted Project
aemerson added a comment to D127551: [GISel] Fix unmerging of constants for big endian target.

Could you upload with more context (-U9999).

Mon, Jun 13, 4:20 PM · Restricted Project, Restricted Project
aemerson added inline comments to D127414: [AArch64][SME] Add SME read/write intrinsics that map to the mova instruction.
Mon, Jun 13, 4:16 PM · Restricted Project, Restricted Project
aemerson accepted D127210: [AArch64][SME] Add load/store intrinsics.

We can revisit the opaque pointers issue later. LGTM otherwise with a nit.

Mon, Jun 13, 11:23 AM · Restricted Project, Restricted Project

Thu, Jun 9

aemerson accepted D127317: [AArch64][SME] Add ldr/str (fill/spill) intrinsics.

LGTM.

Thu, Jun 9, 10:45 AM · Restricted Project, Restricted Project

Wed, Jun 8

aemerson added inline comments to D127317: [AArch64][SME] Add ldr/str (fill/spill) intrinsics.
Wed, Jun 8, 11:39 PM · Restricted Project, Restricted Project
aemerson updated subscribers of D127210: [AArch64][SME] Add load/store intrinsics.
Wed, Jun 8, 9:56 AM · Restricted Project, Restricted Project

May 27 2022

aemerson added inline comments to D126411: update_mir_test_checks: Better handling of common prefixes.
May 27 2022, 2:56 PM · Restricted Project, Restricted Project

May 19 2022

aemerson added a comment to rG86f7d7074a01: [RISCV] Use selectShiftMaskXLen ComplexPattern for isel of rotates..

Hi @craig.topper, this looks like it broke under EXPENSIVE_CHECKS? https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/22475/

May 19 2022, 2:24 PM · Restricted Project, Restricted Project

May 17 2022

aemerson added a comment to D125846: [StackProtector] Allow targets to specify an instruction is part of terminator sequence.

Can the test be reduced? Do you really need a main()?

May 17 2022, 9:49 PM · Restricted Project, Restricted Project

May 13 2022

aemerson committed rG41fef1044956: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef. (authored by aemerson).
[GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef.
May 13 2022, 12:21 PM · Restricted Project, Restricted Project
aemerson closed D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..
May 13 2022, 12:20 PM · Restricted Project, Restricted Project

May 9 2022

aemerson updated the diff for D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..

Only fold undef shifts, not LHS.

May 9 2022, 11:29 AM · Restricted Project, Restricted Project
aemerson added a comment to D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..

Actually G_SHL with undef LHS needs to be optimized to zero, not undef.

At the IR level, InstSimplify does this for all shifts (in the absence of exact/nsw/nuw flags). See SimplifyRightShift and SimplifyShlInst.

May 9 2022, 11:06 AM · Restricted Project, Restricted Project

May 6 2022

aemerson updated the diff for D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..
May 6 2022, 1:55 PM · Restricted Project, Restricted Project
aemerson added a comment to D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..

Actually G_SHL with undef LHS needs to be optimized to zero, not undef.

I don't understand why that would be true for SHL but not for the other shifts. What are the rules for undef in MIR? Is it like undef in IR, or like poison in IR, or neither?

May 6 2022, 12:37 PM · Restricted Project, Restricted Project

May 5 2022

aemerson updated the diff for D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..

Actually G_SHL with undef LHS needs to be optimized to zero, not undef.

May 5 2022, 4:12 PM · Restricted Project, Restricted Project
aemerson committed rG586802eb7290: [GlobalISel] Re-generate some tests. (authored by aemerson).
[GlobalISel] Re-generate some tests.
May 5 2022, 2:15 PM · Restricted Project, Restricted Project
aemerson retitled D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef. from [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts undef. to [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..
May 5 2022, 1:58 PM · Restricted Project, Restricted Project
aemerson requested review of D125041: [GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef..
May 5 2022, 1:58 PM · Restricted Project, Restricted Project
aemerson committed rG87e3646a1f8a: [AArch64][GlobalISel] Add undef combines to postlegalizer combiner. (authored by aemerson).
[AArch64][GlobalISel] Add undef combines to postlegalizer combiner.
May 5 2022, 9:22 AM · Restricted Project, Restricted Project

Mar 14 2022

aemerson committed rG8cbf18cb049d: [GlobalISel] Fix store merging incorrectly merging volatile stores. (authored by aemerson).
[GlobalISel] Fix store merging incorrectly merging volatile stores.
Mar 14 2022, 1:49 PM · Restricted Project

Mar 12 2022

aemerson added a comment to D121085: [AArch64][GlobalISel] Implement G_SELECT translate to min/max/abs.

Thanks for the patch. Unfortunately I see the code that this was ported from in SelectionDAGBuilder does the optimization there because it relies on ValueTracking's matchSelectPattern() infrastructure. @arsenm @foad do you think it's worth it here to replicate that functionality for generic MIR so we can avoid adding this to the IRTranslator?

Shouldn't the select -> min/max/abs be done in IR, so there would be no need for this patch? See D98152 for example.

Our development branch does not have that patch yet. I have tried the patch and it works well, so it is more appropriate to handle this in IR and there is no need for this patch. But I have a question: why can't GISel rely on the ValueTracking infrastructure? Is it because GISel is based on MIR, while ValueTracking is an IR analysis? @aemerson

There's nothing wrong with using the ValueTracking infrastructure. The problem is that the only place we can use IR analyses is during the IRTranslator phase, and we try to avoid doing optimizations there as much as possible. Sometimes that might not be practical, but where we can, we like to perform optimizations on generic MIR in combine passes instead.

Mar 12 2022, 9:10 PM · Restricted Project, Restricted Project

Mar 7 2022

aemerson added reviewers for D121085: [AArch64][GlobalISel] Implement G_SELECT translate to min/max/abs: arsenm, paquette, foad.

Thanks for the patch. Unfortunately I see the code that this was ported from in SelectionDAGBuilder does the optimization there because it relies on ValueTracking's matchSelectPattern() infrastructure. @arsenm @foad do you think it's worth it here to replicate that functionality for generic MIR so we can avoid adding this to the IRTranslator?

Mar 7 2022, 4:43 PM · Restricted Project, Restricted Project

Feb 22 2022

aemerson added a reverting change for rG55c181a6c786: Revert "[AArch64][GlobalISel] Optimize conjunctions of compares to conditional…: rGb661470bce14: Revert "Revert "[AArch64][GlobalISel] Optimize conjunctions of compares to….
Feb 22 2022, 5:22 PM
aemerson committed rGb661470bce14: Revert "Revert "[AArch64][GlobalISel] Optimize conjunctions of compares to… (authored by aemerson).
Revert "Revert "[AArch64][GlobalISel] Optimize conjunctions of compares to…
Feb 22 2022, 5:22 PM

Feb 20 2022

aemerson committed rG2a46450849de: [AArch64][GlobalISel] Optimize conjunctions of compares to conditional compares. (authored by aemerson).
[AArch64][GlobalISel] Optimize conjunctions of compares to conditional compares.
Feb 20 2022, 1:21 AM
aemerson closed D117166: [AArch64][GlobalISel] Optimize conjunctions of compares to conditional conmpares..
Feb 20 2022, 1:21 AM · Restricted Project
aemerson committed rGb09e63bad1e5: [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops. (authored by aemerson).
[AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops.
Feb 20 2022, 1:12 AM
aemerson closed D117160: [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops..
Feb 20 2022, 1:12 AM · Restricted Project

Feb 16 2022

aemerson committed rGc8b8c8e989e5: [AArch64][GlobalISel] Implement support for clang.arc.attachedcall call operand… (authored by aemerson).
[AArch64][GlobalISel] Implement support for clang.arc.attachedcall call operand…
Feb 16 2022, 5:35 PM
aemerson closed D119983: [AArch64][GlobalISel] Implement support for clang.arc.attachedcall call operand bundles..
Feb 16 2022, 5:35 PM · Restricted Project
aemerson accepted D119292: GlobalISel: Merge different versions of isConstantOrConstantVector.

LGTM.

Feb 16 2022, 3:25 PM · Restricted Project
aemerson requested review of D119983: [AArch64][GlobalISel] Implement support for clang.arc.attachedcall call operand bundles..
Feb 16 2022, 3:13 PM · Restricted Project

Jan 12 2022

aemerson added inline comments to D117166: [AArch64][GlobalISel] Optimize conjunctions of compares to conditional conmpares..
Jan 12 2022, 8:56 PM · Restricted Project
aemerson updated the diff for D117160: [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops..
Jan 12 2022, 3:29 PM · Restricted Project
aemerson requested review of D117166: [AArch64][GlobalISel] Optimize conjunctions of compares to conditional conmpares..
Jan 12 2022, 3:27 PM · Restricted Project
aemerson committed rGb9499e14d24f: [AArch64][GlobalISel] Re-generate checks for a test. (authored by aemerson).
[AArch64][GlobalISel] Re-generate checks for a test.
Jan 12 2022, 3:13 PM
aemerson accepted D117026: GlobalISel: Fix CSEMIRBuilder mishandling constant folds of vectors.

I have no strong preference of whether we emit a copy or not FWIW.

Jan 12 2022, 2:43 PM · Restricted Project
aemerson accepted D117143: GlobalISel: Always enable GISelKnownBits for InstructionSelect.

LGTM.

Jan 12 2022, 2:25 PM · Restricted Project
aemerson requested review of D117160: [AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops..
Jan 12 2022, 2:24 PM · Restricted Project

Jan 10 2022

aemerson added inline comments to D116702: [GlobalISel] Combine select + fcmp to fminnum/fmaxnum/fminimum/fmaximum.
Jan 10 2022, 4:44 PM · Restricted Project, Restricted Project

Dec 17 2021

aemerson accepted D99814: [JumpThreading] Change asserts for WantInteger into actual checks.

LGTM with the test nit unless anyone else has objections.

Dec 17 2021, 10:48 AM · Restricted Project
aemerson added inline comments to D99814: [JumpThreading] Change asserts for WantInteger into actual checks.
Dec 17 2021, 10:45 AM · Restricted Project

Dec 9 2021

aemerson committed rG98095afbcb43: [AArch64][GlobalISel] Split vector stores of zero. (authored by aemerson).
[AArch64][GlobalISel] Split vector stores of zero.
Dec 9 2021, 7:05 PM
aemerson closed D115479: [AArch64][GlobalISel] Split vector stores of zero..
Dec 9 2021, 7:04 PM · Restricted Project
aemerson requested review of D115479: [AArch64][GlobalISel] Split vector stores of zero..
Dec 9 2021, 4:07 PM · Restricted Project
aemerson committed rG2717f62c97cf: [GlobalISel] Make G_PTR_ADD pattern matcher non-commutative. (authored by mysterymath).
[GlobalISel] Make G_PTR_ADD pattern matcher non-commutative.
Dec 9 2021, 12:38 PM
aemerson closed D114655: Make G_PTR_ADD pattern matcher non-commutative..
Dec 9 2021, 12:38 PM · Restricted Project

Dec 6 2021

aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

Thanks for the feedback folks. To be honest I don't have the time right now to discuss and redesign the whole thing with David (have some parental leave coming up as well). If anyone else wants to pick this up and continue it feel free to do so, I published the patches to help other people with their debugging problems, but unless someone else picks this up and works to reach a consensus on design, it will have to lie unmaintained as a patch for Q1/Q2 next year at least.

Dec 6 2021, 4:18 PM · Restricted Project

Nov 30 2021

aemerson accepted D114655: Make G_PTR_ADD pattern matcher non-commutative..

LGTM, thanks.

Nov 30 2021, 12:29 AM · Restricted Project
aemerson accepted D114389: AArch64 GIsel: legalize lshr operands, even if it is poison.

LGTM. Can you also rewrite the patch description for the commit message to explain more clearly the actual problem this is solving?

Nov 30 2021, 12:27 AM · Restricted Project, Restricted Project

Nov 22 2021

aemerson added a comment to D112852: [GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues.

Requesting changes for the missing MIR serialization of the new property.

The issue here is determining where the barrier for disallowing these undefined vregs should be. LiveDebugVariables naturally discards these particular uses, so this is sufficient based on my knowledge.

Well to me this feels like we are making the intermediate representation more complicated (I consider it more complicated because DBG_VALUE instructions are special with lifetime rules now) for a very small gain.

That said I don't feel strongly about this. If others think this is beneficial then we can go ahead.

We haven’t changed the meaning of the MIR, we just clarified the existing semantics on the RFC thread. There’s no middle ground here as far as I can see, either it’s valid to have undefined uses or not. If it is, then we’re free to do anything that leaves them around.

Nov 22 2021, 5:28 PM · Restricted Project

Nov 16 2021

aemerson committed rGdcd8728d8394: Remove unnecessary <any> include. (authored by aemerson).
Remove unnecessary <any> include.
Nov 16 2021, 12:59 AM
aemerson added inline comments to D109131: [GlobalISel] Add a store-merging optimization pass and enable for AArch64..
Nov 16 2021, 12:37 AM · Restricted Project

Nov 15 2021

aemerson committed rGdc84770d559b: [GlobalISel] Add a store-merging optimization pass and enable for AArch64. (authored by aemerson).
[GlobalISel] Add a store-merging optimization pass and enable for AArch64.
Nov 15 2021, 9:11 PM
aemerson closed D109131: [GlobalISel] Add a store-merging optimization pass and enable for AArch64..
Nov 15 2021, 9:10 PM · Restricted Project

Nov 11 2021

aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

I would like to note that there's some rudimentary support for a previous generation of this scattered throughout the codebase, namely DebugCounters for bugpoint.

I don't really have an opinion on the proposal at large, but I think it may be important to not just introduce yet another variant of dealing with the same issue, but only have a single good modern way.

Yes, I've seen DebugCounter. It's useful when you've already identified the file where something is going wrong, and you can enable the counters and pass counter values to opt. However, it doesn't work across multiple TUs or support parallel debugging, so it's not solving the same problem.

Perhaps it'd be worth considering the overlap here, and maybe avoiding it: What would it be like if this new thing /only/ bisected at the granularity of a whole compilation action (ie: one invocation of clang), rather than specific optimizations? (possibly even only at the "whole compiler" granularity - using two compilers - and choosing between one or the other for each compilation)

Then once the specific file has been identified, use the existing sub-file granularity tools (a wrapper script could help cover both of these for the user so it wasn't a bunch more manual work).

I think that might help prevent overlap of functionality/re-implementing similar/the same functionality through different mechanisms for reducing specific optimization applications?

I think that restricting it to only allow bisection across translation units would be artificially constraining the functionality to not step on the toes of other features. That in itself doesn't seem the right approach, because the simplest thing right now is to support arbitrary bisection granularities. If we constrained it to only allow bisection down to TUs, then we haven't really simplified or re-used any code, all we've done is to make the user experience worse.

Nov 11 2021, 3:55 PM · Restricted Project

Nov 9 2021

aemerson committed rGaf4dc633f86f: [AArch64][GlobalISel] Fix atomic truncating stores from generating invalid… (authored by aemerson).
[AArch64][GlobalISel] Fix atomic truncating stores from generating invalid…
Nov 9 2021, 8:48 PM

Nov 3 2021

aemerson accepted D93154: GlobalISel: remove assert that memcpy Src and Dst addrspace must be identical.

LGTM.

Nov 3 2021, 3:15 PM · Restricted Project

Nov 2 2021

aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

I would like to note that there's some rudimentary support for a previous generation of this scattered throughout the codebase, namely DebugCounters for bugpoint.

I don't really have an opinion on the proposal at large, but I think it may be important to not just introduce yet another variant of dealing with the same issue, but only have a single good modern way.

Nov 2 2021, 1:09 PM · Restricted Project
aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

Let me just step back a little bit and say that now that I think about what we did, having something that answers "should I run in this instance" is desirable, the implementation doesn't really matter. We did it with function attributes, but having a bisect client API like you're introducing is fine.
My only complaint is that the client interface should not have remote in the name :P.

From an abstraction level, we need two things:

  1. Something that tells if an optimization needs to run (the remote bisect client here)
  2. Something that drives the on/off of the optimizations based on the previous state (here your daemon)

The way we did that was:
For #1 we added annotations in the IR
For #2 we implemented the driver directly in our JIT daemon

Essentially, that boils down to something that formulates a plan and something that executes the plan. At one point I was thinking that formulating the plan could be changing the pass pipeline (don't insert what you don't want to run), but that looks like too much work :).

How does that work when you have parallel builds?

Each module is assigned an ID and the bisect plan and previous state are mapped to this ID.
The ID is saved in the module metadata, but for now we didn't use it since all we needed was added to the IR via annotations (i.e., we didn't need to come up with a key to ask for information about a specific pass). In that regard, your approach is more general.
For the ID, we used a hash of the module before the bisect annotations were added, i.e., as long as you don't change the front-end the ID are stable between runs.

To summarize:
Compute the module ID -> add annotation based on past information -> run the backend (at this point, the backend runs by itself.)

When multiple clang processes are running simultaneously, and you want to bisect to a specific translation unit, and then within that TU to a specific point in the module, don't you need some co-ordination?

At a high level here is what the driver was doing:

  • Bisect optnone on each module
  • Find the module(s) that create the problem (the minimal set of modules that need optimizations turned on)
  • Do the same on each function (the minimal set may involve more than one function)
  • Try to "outline" the basic blocks of each problematic function and do the same process on the newly created functions
  • Split the problematic basic block to make them smaller and continue
  • When you're happy with the size of the basic blocks, start bisecting the optimizations on the problematic functions (possibly the extracted basic blocks). Right now we were only bisecting a handful of optimizations because the final diff with the basic block splitting usually made the faulty optimization easy to find by hand.

The way it worked is all that state was saved in a file <shaderID>-bisect-info. You could bootstrap the process by populating the file by hand, i.e., by telling the JIT process which module you want to bisect.

As far as bisecting to a specific point in the TU, we were always going all the way down to the executable, then you had to supply a script that tells whether or not the program is working. That's similar to what git bisect does (if the script returns 0, the program works; if it returns 1, it doesn't). In your script you could check for whatever you like (a specific sequence of asm, the executable producing some results, etc.).
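The script convention described above can be sketched as a generic bisection loop. This is a minimal sketch under stated assumptions, not the tool discussed in this thread: `first_bad` and `works_with` are hypothetical names, and the predicate stands in for building the program with the first n steps enabled and running the user's check script (exit code 0 meaning it works).

```python
from typing import Callable


def first_bad(num_steps: int, works_with: Callable[[int], bool]) -> int:
    """Return the smallest n in 1..num_steps for which enabling the first n steps fails.

    Assumes works_with(0) is True, works_with(num_steps) is False, and that
    once a failing step is included every larger prefix also fails.
    """
    good, bad = 0, num_steps
    while bad - good > 1:
        mid = (good + bad) // 2
        if works_with(mid):
            good = mid  # the first `mid` steps are fine; look later
        else:
            bad = mid   # the failure is within the first `mid` steps
    return bad


# Stand-in for running the check script: here step 37 is the culprit.
culprit = first_bad(100, lambda n: n < 37)
```

In a real setup the predicate would rebuild and run the check script on each probe, so the number of probes (logarithmic in the number of steps) is what matters.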

Note: The pass I was talking about in my previous reply that we insert in the LLVM pipeline, is generating all the information to start the bisect process (e.g., the shader ID, the list of all the functions, the list of all basic blocks). Then the driver was using this information to tell that pass to add some annotation on some function (e.g., optnone, noinline, etc.), but also to split some basic block and outline them (and attach some annotation on them).

Cheers,
-Quentin

Nov 2 2021, 12:20 PM · Restricted Project
aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

Hi Amara,

I am kind of repeating what I said in https://reviews.llvm.org/D113031, but putting it here for better visibility.

I think the thing that drives the bisection shouldn't trickle down into the optimizations themselves; instead I would rather this information be encoded in the IR itself (like a generalization of optnone).

For instance, for us a daemon based approach doesn't work at all because we are running a JIT compiler that runs in its own sandbox and we cannot query an external process from it.

Internally, we developed a bisect tool that annotates the IR upfront (to make an analogy with clang, this is as if clang would add a bunch of "bisect" attributes to the IR) before sending it for compilation, instead of having the backend query a bisection client.
Note: technically we didn't modify clang, we instead had a specific pass at the beginning of the LLVM pipeline that would make all the bisection decisions, but also perform some transformation to isolate the bugs (like creating functions out of basic blocks, splitting the blocks to make the functions smaller, and blocking inlining.)

Cheers,
-Quentin

Nov 2 2021, 11:12 AM · Restricted Project
aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

D113031 contains an example of a client for this in GlobalISel.

Nov 2 2021, 10:23 AM · Restricted Project
aemerson requested review of D113031: [GlobalISel] Add a bisection point after instruction selection..
Nov 2 2021, 10:18 AM · Restricted Project
aemerson requested review of D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..
Nov 2 2021, 10:17 AM · Restricted Project

Oct 31 2021

aemerson added reviewers for D112852: [GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues: dsanders, aditya_nandakumar, tstellar, nhaehnle, dblaikie.
Oct 31 2021, 6:38 PM · Restricted Project

Oct 29 2021

aemerson committed rG5dd9e019ddb4: [AArch64][GlobalISel] Fix an crash in RBS due to a new regclass being added. (authored by aemerson).
[AArch64][GlobalISel] Fix an crash in RBS due to a new regclass being added.
Oct 29 2021, 11:47 AM

Oct 19 2021

aemerson added a comment to D111132: [GlobalISel] Better verification of G_UNMERGE_VALUES.
  1. Splitting a vector into its elements (the converse of G_BUILD_VECTOR).

Is there any appetite for using a new G_SPLIT_VECTOR opcode for this case?

I don't see a reason to have a scalar and vector version of the same thing.

You mean, splitting a vector into vectors vs splitting a vector into scalars? Then why do we have both G_BUILD_VECTOR and G_CONCAT_VECTORS? The asymmetry annoys me!

Oct 19 2021, 11:15 AM · Restricted Project

Oct 18 2021

aemerson added a comment to D111777: [GlobalISel] Refactor CSEMIRBuilder's handling of unary op constant folding. NFC..

No particular objection from me, though it would be more consistent if ConstantFoldUnaryOp built the G_CONSTANT, just like ConstantFoldVectorUnaryOp builds the G_BUILD_VECTOR.

That then raises the question of how you would call these functions from CombinerHelper match* functions, which don't want to build the IR until you get to the corresponding apply* function.

That then also raises the question of whether we really want to be doing constant folding both in the builder and in combiners.

I think the answer is yes, but for different reasons. I think constant folding in the builder only makes sense for a handful of artifact-like operations, but general constant folding is always going to be needed in the combiners as other operations fold to constants.

Oct 18 2021, 2:29 PM · Restricted Project
aemerson added a comment to D111970: [GlobalISel][Legalizer] Restore eraseFromParentAndMarkDBGValuesForRemoval() for CallLowering artifacts..

Please see the discussion here: https://groups.google.com/g/llvm-dev/c/R9lSoAqh5e4

Oct 18 2021, 2:28 PM · Restricted Project

Oct 15 2021

aemerson added inline comments to D111888: [AArch64][GISel] Optimize 8 and 16 bit variants of uaddo..
Oct 15 2021, 10:38 AM · Restricted Project

Oct 13 2021

aemerson added inline comments to D111777: [GlobalISel] Refactor CSEMIRBuilder's handling of unary op constant folding. NFC..
Oct 13 2021, 11:19 PM · Restricted Project
aemerson requested review of D111777: [GlobalISel] Refactor CSEMIRBuilder's handling of unary op constant folding. NFC..
Oct 13 2021, 11:17 PM · Restricted Project

Oct 12 2021

aemerson committed rG5abce56edbee: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder. (authored by aemerson).
[GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder.
Oct 12 2021, 11:31 AM
aemerson closed D111524: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder..
Oct 12 2021, 11:31 AM · Restricted Project
aemerson added inline comments to D111524: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder..
Oct 12 2021, 10:00 AM · Restricted Project

Oct 11 2021

aemerson committed rG53ebfa7c5d1b: [AArch64][GlobalISel] Fix combiner assertion in matchConstantOp(). (authored by aemerson).
[AArch64][GlobalISel] Fix combiner assertion in matchConstantOp().
Oct 11 2021, 3:55 PM
aemerson updated the diff for D111524: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder..

Hoist end iterator eval in loop. Rebase on test check regeneration.

Oct 11 2021, 3:16 PM · Restricted Project
aemerson committed rGda904719e9a7: [GlobalISel] Regenerate some MIR tests with CHECK-NEXT for another patch. (authored by aemerson).
[GlobalISel] Regenerate some MIR tests with CHECK-NEXT for another patch.
Oct 11 2021, 2:40 PM
aemerson added a comment to D111524: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder..

No objections from me, but I do wonder if there's a way to apply this more consistently to all the constant folds, not just the binops.

Oct 11 2021, 9:35 AM · Restricted Project
aemerson requested review of D111524: [GlobalISel] Add support for constant vector folding of binops in CSEMIRBuilder..
Oct 11 2021, 12:14 AM · Restricted Project
aemerson accepted D107160: [AArch64] Do not emit an extra zero-extend for i1 argument.
Oct 11 2021, 12:03 AM · Restricted Project

Oct 10 2021

aemerson committed rGf1e9ecea442a: [AArch64][GlobalISel] Legalize G_VECREDUCE_XOR. Treated same as other bitwise… (authored by aemerson).
[AArch64][GlobalISel] Legalize G_VECREDUCE_XOR. Treated same as other bitwise…
Oct 10 2021, 5:02 PM

Oct 9 2021

aemerson committed rGf95d9c95bbf4: [GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly… (authored by aemerson).
[GlobalISel] Fix the stores of truncates -> wide store combine for non-evenly…
Oct 9 2021, 9:19 PM

Oct 8 2021

aemerson added inline comments to D111036: [GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c).
Oct 8 2021, 11:30 AM · Restricted Project
aemerson committed rG17b89f9daad5: [GlobalISel] Improve G_UMHULH -> LSHR combine to accept non-uniform constant… (authored by aemerson).
[GlobalISel] Improve G_UMHULH -> LSHR combine to accept non-uniform constant…
Oct 8 2021, 11:25 AM

Oct 7 2021

aemerson added a comment to D111036: [GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c).

Oops sorry about that, I didn't realize this was already on my branch when I committed 72ce310bf0de so this went in by mistake.

Oct 7 2021, 11:55 PM · Restricted Project
aemerson committed rG08b3c0d995d8: [GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c) (authored by aemerson).
[GlobalISel] Combine G_UMULH x, (1 << c)) -> x >> (bitwidth - c)
Oct 7 2021, 11:52 PM
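
The identity named in this commit title can be checked exhaustively for a small bit width. This is a minimal sketch assuming unsigned values and 0 < c < bitwidth; `umulh` here is a hypothetical Python model of the high-half multiply, not LLVM code.

```python
def umulh(x: int, y: int, bits: int) -> int:
    """High `bits` bits of the unsigned 2*bits-wide product of x and y."""
    mask = (1 << bits) - 1
    return ((x & mask) * (y & mask)) >> bits


BITS = 8
# Collect every (x, c) where umulh(x, 1 << c) differs from x >> (BITS - c).
mismatches = [
    (x, c)
    for x in range(1 << BITS)
    for c in range(1, BITS)
    if umulh(x, 1 << c, BITS) != x >> (BITS - c)
]
```

Multiplying by 1 << c shifts x left by c, and taking the high half shifts right by the bit width, which nets out to a right shift by bitwidth - c.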
aemerson committed rG72ce310bf0de: [GlobalISel][IRTranslator] Fix a use-after-free bug when translating trap-func… (authored by aemerson).
[GlobalISel][IRTranslator] Fix a use-after-free bug when translating trap-func…
Oct 7 2021, 11:52 PM