Page MenuHomePhabricator

aemerson (Amara Emerson)
Asian George Costanza

Projects

User does not belong to any projects.

User Details

User Since
Sep 9 2013, 3:45 AM (498 w, 5 d)

The sea was angry that day, my friends - like an old man trying to send back soup in a deli.

Recent Activity

Fri, Mar 24

aemerson updated subscribers of D146531: [AArch64][GlobalISel] Add support for some across-vector NEON intrinsics.

@efriedma @spatel @craig.topper what do you think of adding the f8 ValueType here? My SDAG knowledge is rusty, any potential problems?

Fri, Mar 24, 10:42 AM · Restricted Project, Restricted Project

Mon, Mar 20

aemerson committed rG41e9c4b88c28: [NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction. (authored by aemerson).
[NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction.
Mon, Mar 20, 11:17 AM · Restricted Project, Restricted Project
aemerson closed D146375: [NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction..
Mon, Mar 20, 11:17 AM · Restricted Project, Restricted Project

Sat, Mar 18

aemerson requested review of D146375: [NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction..
Sat, Mar 18, 11:21 PM · Restricted Project, Restricted Project

Fri, Mar 3

aemerson added a comment to D143977: [RFC][GlobalISel] Allow match against iPTR operand in leaf node.

Also, any tests?

Fri, Mar 3, 9:59 AM · Restricted Project, Restricted Project
aemerson updated the diff for D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

Fix build.

Fri, Mar 3, 1:45 AM · Restricted Project, Restricted Project
aemerson updated the diff for D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

Address comments.

Fri, Mar 3, 1:43 AM · Restricted Project, Restricted Project

Thu, Mar 2

aemerson committed rG1e1f1195a8bf: [AArch64] Fix crash in LowerBUILD_VECTOR trying to create invalid… (authored by aemerson).
[AArch64] Fix crash in LowerBUILD_VECTOR trying to create invalid…
Thu, Mar 2, 2:10 PM · Restricted Project, Restricted Project
aemerson closed D145185: [AArch64] Fix crash in LowerBUILD_VECTOR trying to create invalid EXTRACT_SUBVECTOR..
Thu, Mar 2, 2:10 PM · Restricted Project, Restricted Project
aemerson requested review of D145185: [AArch64] Fix crash in LowerBUILD_VECTOR trying to create invalid EXTRACT_SUBVECTOR..
Thu, Mar 2, 1:55 PM · Restricted Project, Restricted Project

Feb 28 2023

aemerson added a comment to D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

I’ve thought about adding a MachineValue that acts more like an SDValue which would track the type and index

Feb 28 2023, 10:55 PM · Restricted Project, Restricted Project
aemerson added a comment to D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

I’m also wondering for performance reasons whether we should have the types be returned via a single call. With the current patch we call getOperand() twice for each register and I’m not sure if they’ll get optimized to a single call. Maybe:
auto [DstReg, DstTy, Src1Reg, Src1Ty] = MI.getFirst2RegsAndTys();

Feb 28 2023, 7:28 AM · Restricted Project, Restricted Project

Feb 27 2023

aemerson committed rG31d6a572579a: [AArch64][GlobalISel] Reorder stack up-adjustment and register copies (authored by aemerson).
[AArch64][GlobalISel] Reorder stack up-adjustment and register copies
Feb 27 2023, 11:24 AM · Restricted Project, Restricted Project
aemerson closed D144791: Reorder stack up-adjustment and register copies.
Feb 27 2023, 11:24 AM · Restricted Project, Restricted Project
aemerson accepted D144791: Reorder stack up-adjustment and register copies.

Thanks, LGTM.

Feb 27 2023, 9:53 AM · Restricted Project, Restricted Project

Feb 26 2023

aemerson added a comment to D144791: Reorder stack up-adjustment and register copies.

Thanks for the fix. Could you also add a test specific to the bug you're seeing as well? The test case from the GitHub issue is fine.

Feb 26 2023, 5:08 PM · Restricted Project, Restricted Project
aemerson committed rG4bc643462422: [GlobalISel] Fix an assertion failure in matchHoistLogicOpWithSameOpcodeHands(). (authored by aemerson).
[GlobalISel] Fix an assertion failure in matchHoistLogicOpWithSameOpcodeHands().
Feb 26 2023, 4:06 PM · Restricted Project, Restricted Project

Feb 24 2023

aemerson accepted D144670: [AArch64][GlobalISel] Legalize G_SHUFFLE_VECTOR with smaller dest size.
Feb 24 2023, 11:09 PM · Restricted Project, Restricted Project
aemerson updated the diff for D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

Use structured bindings with helpers in MachineInstr.

Feb 24 2023, 3:20 PM · Restricted Project, Restricted Project
aemerson added a comment to D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

Could you achieve the same with structured bindings?

auto [Dst, Src1, Src2, Src3] = MI.getOperandsRegs(0, 1, 2, 3);

Yes that does seem much cleaner :)

Feb 24 2023, 3:26 AM · Restricted Project, Restricted Project

Feb 23 2023

aemerson updated the diff for D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..

Add some variants that also declare the corresponding types for each register.

Feb 23 2023, 11:05 PM · Restricted Project, Restricted Project
aemerson requested review of D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..
Feb 23 2023, 4:40 PM · Restricted Project, Restricted Project

Feb 21 2023

aemerson updated the diff for D144336: [GlobalISel] Fix DIVREM combine from inserting a divrem before its operands' defs..

Check all defs.

Feb 21 2023, 12:52 AM · Restricted Project, Restricted Project

Feb 19 2023

aemerson added a comment to rG66d64aac36a6: [GlobalISel] Fix a store-merging bug due to use of >= instead of >..

Just realized I wrote the commit message the wrong way around. This bug is due to using > instead of >=.

Feb 19 2023, 5:41 PM · Restricted Project, Restricted Project
aemerson committed rG66d64aac36a6: [GlobalISel] Fix a store-merging bug due to use of >= instead of >. (authored by aemerson).
[GlobalISel] Fix a store-merging bug due to use of >= instead of >.
Feb 19 2023, 3:56 PM · Restricted Project, Restricted Project

Feb 18 2023

aemerson requested review of D144336: [GlobalISel] Fix DIVREM combine from inserting a divrem before its operands' defs..
Feb 18 2023, 2:26 PM · Restricted Project, Restricted Project
aemerson committed rGddf167c44201: [GlobalISel] Fix G_ZEXTLOAD being converted to G_SEXTLOAD incorrectly. (authored by aemerson).
[GlobalISel] Fix G_ZEXTLOAD being converted to G_SEXTLOAD incorrectly.
Feb 18 2023, 10:05 AM · Restricted Project, Restricted Project
aemerson committed rG556657c0fd6a: [NFC][GlobalISel] Regenerate test checks for extending-loads test. (authored by aemerson).
[NFC][GlobalISel] Regenerate test checks for extending-loads test.
Feb 18 2023, 2:09 AM · Restricted Project, Restricted Project

Feb 17 2023

aemerson committed rGb309bc04eebc: [GlobalISel] Combine out-of-range shifts to undef. (authored by aemerson).
[GlobalISel] Combine out-of-range shifts to undef.
Feb 17 2023, 3:05 PM · Restricted Project, Restricted Project
aemerson closed D144303: [GlobalISel] Combine out-of-range shifts to undef.
Feb 17 2023, 3:05 PM · Restricted Project, Restricted Project
aemerson added inline comments to D144303: [GlobalISel] Combine out-of-range shifts to undef.
Feb 17 2023, 2:51 PM · Restricted Project, Restricted Project
aemerson requested review of D144303: [GlobalISel] Combine out-of-range shifts to undef.
Feb 17 2023, 2:38 PM · Restricted Project, Restricted Project

Feb 15 2023

aemerson accepted D144014: [LSR] Improve filtered uses in NarrowSearchSpaceByPickingWinnerRegs.
Feb 15 2023, 1:53 PM · Restricted Project, Restricted Project
aemerson added a comment to D144014: [LSR] Improve filtered uses in NarrowSearchSpaceByPickingWinnerRegs.

LGTM. Does this unblock the inliner change?

Feb 15 2023, 1:52 PM · Restricted Project, Restricted Project

Feb 10 2023

aemerson added a comment to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

It’s not clear from the original commit message why the test is related to inlining order? It seems entirely testing vectorization cost model which should be insensitive to these kind of changes, right?

It's a phase ordering test - it's testing the entire pipeline including all the inlining and simplification that needs to happen :)

You can run update_test_checks of the file to see the differences. I believe the inlining causes differences in the code that then cause different vector factors to be chosen. I can try to add a similar test for the other case that got worse, if they are similar.

Feb 10 2023, 8:58 AM · Restricted Project, Restricted Project, Restricted Project
aemerson added a comment to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

Hello - I had to revert this because of some large regressions we got from routines in CMSIS-DSP.

The llvm/test/Transforms/PhaseOrdering/ARM/arm_mult_q15.ll test shows the problem - that's why that test exists to ensure that any pipeline changes don't negatively affect these routines. Unfortunately you just changed the test as opposed to showing the problems that this causes. They might be fixable with some other tweaks elsewhere, but the ordering of inlining seems important for getting the correct code that can be vectorized nicely.

There are some other cases around inlining this thing on v6m cores: https://github.com/ARM-software/CMSIS-DSP/blob/809202bf185280a322efc2e2c850a544747f9d79/Include/arm_math_memory.h#L76, but I'm not sure about the details yet. The mult examples were the really large regressions.

Feb 10 2023, 7:29 AM · Restricted Project, Restricted Project, Restricted Project

Feb 9 2023

aemerson committed rG8e33c41e72ad: Inliner: Address missed review comments for D143624 (authored by aemerson).
Inliner: Address missed review comments for D143624
Feb 9 2023, 10:13 PM · Restricted Project, Restricted Project
aemerson added inline comments to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..
Feb 9 2023, 9:53 PM · Restricted Project, Restricted Project, Restricted Project
aemerson committed rGcae033dcf227: Inlining: Run the legacy AlwaysInliner before the regular inliner. (authored by aemerson).
Inlining: Run the legacy AlwaysInliner before the regular inliner.
Feb 9 2023, 4:50 PM · Restricted Project, Restricted Project, Restricted Project
aemerson closed D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..
Feb 9 2023, 4:49 PM · Restricted Project, Restricted Project, Restricted Project
aemerson updated the diff for D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

Address comments. Disable -mandatory-inlining-first by default and insert lifetime intrinsics if not at -O0.

Feb 9 2023, 4:42 PM · Restricted Project, Restricted Project, Restricted Project
aemerson added a comment to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

Had a chat offline with @mtrofin, wanted to be clear for future purposes that we do need the separate AlwaysInliner pass because it's used in -O0 and constructing a call graph there is non-trivial in terms of compile time. Originally the mandatory mode of the normal inliner was added to maybe remove the separate AlwaysInliner pass in the future, but that's not going to happen because of what I just said. Given that, we can eventually remove the mandatory mode of the normal inliner after this patch goes through. So this patch should also make mandatory-inlining-first false by default, then we remove it in a separate patch.

Feb 9 2023, 3:06 PM · Restricted Project, Restricted Project, Restricted Project
aemerson added a comment to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

__clang_hip_math.hip is annoying...

We'll need to remove the MandatoryFirst inliner in ModuleInlinerWrapperPass, although not sure if @mtrofin has any issues with that or not

This isn't quite what I had initially thought, but this might be better. (I was thinking that we sort the calls in the inliner to visit alwaysinline calls first, but that might cause more compile time issues since we have to update the call graph after visiting all the calls in a function, but we might be visiting every function twice if we first batch process the alwaysinline calls then all other calls)

I think that doesn't actually do the same thing as this, since the Calls vector is populated by visiting the functions in the current SCC. What we're trying to do with this patch is to ensure that all always-inline calls globally are processed first.

That's true, but the legacy pass manager where the inliner explosion didn't happen in your case didn't process always-inline calls before other calls. So I don't think it's necessary to process alwaysinline calls globally first to fix your case. However, given that we still do two more rounds of inlining in the inliner pipeline after the alwaysinliner pass you added and your case still doesn't blow up, this solution does seem robust.

Sure, the exponential compile time case is actually just a side benefit here. The motivating reason for this change is actually to improve code size when building codebases that make heavy use of always_inline.

Feb 9 2023, 1:11 PM · Restricted Project, Restricted Project, Restricted Project
aemerson added a comment to D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..

__clang_hip_math.hip is annoying...

We'll need to remove the MandatoryFirst inliner in ModuleInlinerWrapperPass, although not sure if @mtrofin has any issues with that or not

This isn't quite what I had initially thought, but this might be better. (I was thinking that we sort the calls in the inliner to visit alwaysinline calls first, but that might cause more compile time issues since we have to update the call graph after visiting all the calls in a function, but we might be visiting every function twice if we first batch process the alwaysinline calls then all other calls)

Feb 9 2023, 11:19 AM · Restricted Project, Restricted Project, Restricted Project

Feb 8 2023

aemerson requested review of D143624: Inlining: Run the legacy AlwaysInliner before the regular inliner..
Feb 8 2023, 8:04 PM · Restricted Project, Restricted Project, Restricted Project

Feb 7 2023

aemerson accepted D143526: [GlobalISel] Handle ptr size != index size in IRTranslator.
Feb 7 2023, 1:55 PM · Restricted Project, Restricted Project
aemerson accepted D143517: [AArch64][GlobalISel] Legalize wide s8/s16 vectors G_ADD/G_MUL/G_OR/....

LGTM, thanks!

Feb 7 2023, 10:26 AM · Restricted Project, Restricted Project

Feb 6 2023

aemerson accepted D132536: [AArch64] Consider tiny code model in emitLoadFromConstantPool..
Feb 6 2023, 2:33 PM · Restricted Project, Restricted Project
aemerson accepted D143100: [AArch64][GlobalISel] Widen G_ADD/G_MUL/G_OR/... element types if size < 8b.

LGTM. It would be nice if we could refactor the legalizer to allow better re-use of these rule patterns but that's for another patch.

Feb 6 2023, 2:32 PM · Restricted Project, Restricted Project
aemerson added a comment to D141135: [RFC][GlobalISel] Replace the current GlobalISel matcher with a bottom-up matcher.

From my reading of this review, it seems no one has a fundamental disagreement with the overall direction. In that case, I think it's ready to go into a conventional implementation review.

Feb 6 2023, 2:28 PM · Restricted Project, Restricted Project

Feb 3 2023

aemerson accepted D143002: [AArch64][GlobalISel] Selection for i8 buildvectors.

LGTM, could you add a MIR test? We should already have some for the existing build_vector support for other types.

Feb 3 2023, 3:22 PM · Restricted Project, Restricted Project

Feb 2 2023

aemerson added a comment to D127762: [Clang][AArch64] Add ACLE attributes for SME..

Gentle ping on this discussion. @aaron.ballman

Feb 2 2023, 2:52 PM · Restricted Project, Restricted Project

Jan 27 2023

aemerson added a comment to D138602: [WIP] Alwaysinliner time explosion with new pass manager.

Another ping

Jan 27 2023, 10:30 AM · Restricted Project, Restricted Project

Jan 20 2023

aemerson added a comment to D98213: [InlineCost] Enable the cost benefit analysis on FDO.

Note the size here is not static code size (it excludes cold code). It is actually a proxy to model the runtime cost due to increased instruction footprint (icache pressure). The multiplier's role is to make the savings and 'size' cost comparable in terms of unit. The cycle name here is also counted at IR level, so not at low level.

I understand that, but it still doesn't make the comparison of different units any better. Introducing a scaling factor is irrelevant, it's just an arbitrary scalar.

Jan 20 2023, 3:46 PM · Restricted Project, Restricted Project
aemerson added a comment to D98213: [InlineCost] Enable the cost benefit analysis on FDO.

The main one being that the heuristics seem to completely override all sensible thresholds for code size. The feature caims to compare the performance gains by measuring the cycles saved from a PGO run, and comparing against some threshold. This doesn't make sense since the length of the runs for PGO are arbitrary, and you shouldn't be using them in absolute terms. The only way it makes sense is if you have a very specific workload and the PGO profile run is exactly the same as the real world runs.

If you look at llvm/lib/Analysis/InlineCost.cpp:892, you'll see:

//  CycleSavings      PSI->getOrCompHotCountThreshold()                                                                                       
// -------------- >= -----------------------------------                                                                                      
//       Size              InlineSavingsMultiplier

If you rearrange this inequality, we have:

//            CycleSavings                          Size                                                                                      
// ----------------------------------- >= -------------------------                                                                           
//  PSI->getOrCompHotCountThreshold()      InlineSavingsMultiplier

Notice that the LHS divides CycleSavings by PSI->getOrCompHotCountThreshold(), so I am not using cycle savings in absolute terms. If I do a PGO run twice as long, both the numerator and denominator would double, yielding the same ratio.

The original change gave absolutely no justification, formal or intuitive, for the use of the equation to determine profitability. Giving positive runtime results is no justification for completely overriding any reasonable thresholds and potentially impacting code size & compile time by several orders of magnitude as we've seen internally at Apple.

Do you have any test case that you can share, possibly reduced and/or with sensitive code removed? I realize it may be difficult to come up with one from a large piece of software, but I am happy to take a look at a test case that demonstrates at least one problem -- code size or compile time.

Jan 20 2023, 3:02 PM · Restricted Project, Restricted Project
aemerson added a comment to D98213: [InlineCost] Enable the cost benefit analysis on FDO.

I want to resurrect this change that enabled these cost-benefit heuristics originally developed in https://reviews.llvm.org/D92780

There are a number of issues with the original change that I can see, and this has caused all manner of problems for us internally as we moved to the new pass manager (and therefore started using these heuristics).

The main one being that the heuristics seem to completely override all sensible thresholds for code size. The feature caims to compare the performance gains by measuring the cycles saved from a PGO run, and comparing against some threshold. This doesn't make sense since the length of the runs for PGO are arbitrary, and you shouldn't be using them in absolute terms. The only way it makes sense is if you have a very specific workload and the PGO profile run is exactly the same as the real world runs. The original change gave absolutely no justification, formal or intuitive, for the use of the equation to determine profitability. Giving positive runtime results is no justification for completely overriding any reasonable thresholds and potentially impacting code size & compile time by several orders of magnitude as we've seen internally at Apple.

Even worse: D92780 was accepted into mainline LLVM with absolutely no tests, and then enabled for all users in this change. I didn't even see any discussion about tests, or explanation for why they were impractical. We've disabled these heuristics downstream but I absolutely believe that this must be reverted upstream too until it can be re-worked and re-reviewed in detail.

FYI: the cycle savings are normalized by dividing the function entry count, so it is not used in absolute sense.

Right, that's fair enough. See the following comment from the code:

//  CycleSavings      PSI->getOrCompHotCountThreshold()
// -------------- >= -----------------------------------
//       Size              InlineSavingsMultiplier

Cycles are being compared against code size here. How is that metric supposed to be interpreted? What does (cycles/entry)/(bytes) even mean as a unit of measurement? The feature itself seems to be trying to optimize too fine details in machine cycles in a part of the compiler that shouldn't be dealing with such low level specifics.

Jan 20 2023, 2:56 PM · Restricted Project, Restricted Project
Herald added a project to D92780: [InlineCost] Implement cost-benefit-based inliner: Restricted Project.

FYI I left a comment on https://reviews.llvm.org/D98213 about why I think this feature should not have been enabled.

Jan 20 2023, 2:22 PM · Restricted Project, Restricted Project
Herald added a project to D98213: [InlineCost] Enable the cost benefit analysis on FDO: Restricted Project.

I want to resurrect this change that enabled these cost-benefit heuristics originally developed in https://reviews.llvm.org/D92780

Jan 20 2023, 2:21 PM · Restricted Project, Restricted Project

Jan 17 2023

aemerson added a comment to D138602: [WIP] Alwaysinliner time explosion with new pass manager.

ping

Jan 17 2023, 11:47 AM · Restricted Project, Restricted Project

Jan 13 2023

aemerson added a comment to D141135: [RFC][GlobalISel] Replace the current GlobalISel matcher with a bottom-up matcher.
  • Visit of changed instructions in the Combiner is now done iteratively.
  • The array with the rules to execute in the generated source is a bit more compact.

Running LLVM test suite with -fglobal-isel on an AArch64 EC2 instance (2 CPUs, 4GB):
Current implementation: 22m43.224s
Bottom-up implementation: 22m47.165s

Size of the binaries:
Current implementation: llc 127.280.024, clang 217.540.416
Bottom-up implementation: llc 127.026.960, clang 217.291.456

I haven't had time to properly look at the implementation, but a question on this. When you say "running the test suite", are you talking about building the test suite with -fglobal-isel? Not running the generated code right?

Yes, for the time measurement I referred to building the test suite, as this shows the impact of my code on the compile time. (I also ran the test suite to make sure that there are no regressions.)

Secondly, you mentioned that this is a drop on replacement for the existing matcher, yet we're seeing some code size differences. Do you know why?

Yes. The matcher my code generates has a simpler structure. It is basically:

  • Calculate the match set number for the current instruction, which is either a constant assignment or a table lookup.
  • Translate the match set number to a list of rules to execute, which is another table lookup.
  • Run the rules

The current matcher has a couple of if and switch statements before the rules are run. I noted that the generated tables are much more compact than the code generated from if and switch. The overall price my implementation pays is a greater construction time in llvm-tblgen.

Jan 13 2023, 3:37 PM · Restricted Project, Restricted Project

Jan 11 2023

aemerson added a comment to D141135: [RFC][GlobalISel] Replace the current GlobalISel matcher with a bottom-up matcher.
  • Visit of changed instructions in the Combiner is now done iteratively.
  • The array with the rules to execute in the generated source is a bit more compact.

Running LLVM test suite with -fglobal-isel on an AArch64 EC2 instance (2 CPUs, 4GB):
Current implementation: 22m43.224s
Bottom-up implementation: 22m47.165s

Size of the binaries:
Current implementation: llc 127.280.024, clang 217.540.416
Bottom-up implementation: llc 127.026.960, clang 217.291.456

I haven't had time to properly look at the implementation, but a question on this. When you say "running the test suite", are you talking about building the test suite with -fglobal-isel? Not running the generated code right?

Jan 11 2023, 10:24 PM · Restricted Project, Restricted Project
aemerson added a comment to D141372: MachineIRBuilder: Rename buildMerge. NFC.

Tiny driveby bikeshed (feel free to ignore): buildMergeLikeInstr? It is longer, but Op seems to more commonly mean "Operand" here (save for some stuff like buildAssertOp).

I'm not a huge fan of the "Op" part either, but there is a precedent for it in the GMergeLikeOp wrapper class.

There could be an argument made that both GMergeLikeOp and buildAssertOp should use Instr instead of Op, too... Should I just rename all of them?

As a separate commit yes that seems fine to me.

Jan 11 2023, 11:20 AM · Restricted Project, Restricted Project

Jan 10 2023

aemerson accepted D141372: MachineIRBuilder: Rename buildMerge. NFC.

Seems perfectly reasonable to me.

Jan 10 2023, 10:04 AM · Restricted Project, Restricted Project

Jan 9 2023

aemerson accepted D139419: [AArch64] lower abs intrinsic to new ABS instruction in GIsel.

LGTM, thanks!

Jan 9 2023, 10:11 AM · Restricted Project, Restricted Project

Jan 6 2023

aemerson added inline comments to D139419: [AArch64] lower abs intrinsic to new ABS instruction in GIsel.
Jan 6 2023, 2:55 PM · Restricted Project, Restricted Project

Jan 5 2023

aemerson added inline comments to D139419: [AArch64] lower abs intrinsic to new ABS instruction in GIsel.
Jan 5 2023, 10:15 AM · Restricted Project, Restricted Project
aemerson accepted D139417: [AArch64] add GlobalIsel support for scalar CNT instruction.

LGTM, thanks!

Jan 5 2023, 10:09 AM · Restricted Project, Restricted Project
aemerson added inline comments to D139419: [AArch64] lower abs intrinsic to new ABS instruction in GIsel.
Jan 5 2023, 12:52 AM · Restricted Project, Restricted Project
aemerson accepted D139417: [AArch64] add GlobalIsel support for scalar CNT instruction.

LGTM with some simplifications.

Jan 5 2023, 12:35 AM · Restricted Project, Restricted Project
aemerson accepted D139420: [AArch64][GlobalISel] implement GPR (U/S)(MIN/MAX) instr support.
Jan 5 2023, 12:32 AM · Restricted Project, Restricted Project

Dec 7 2022

aemerson committed rG53445f5b1cfc: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug. (authored by aemerson).
[GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Dec 7 2022, 10:29 AM · Restricted Project, Restricted Project
aemerson closed D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Dec 7 2022, 10:29 AM · Restricted Project, Restricted Project

Dec 5 2022

aemerson added inline comments to D138082: GlobalISel: ComputeNumSignBits from load range metadata.
Dec 5 2022, 11:22 AM · Restricted Project, Restricted Project

Nov 23 2022

aemerson updated the diff for D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

Remove G_INVOKE_REGION_END

Nov 23 2022, 8:37 PM · Restricted Project, Restricted Project
aemerson added a comment to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

Forgot to remove the END instruction, I’ll do that and update.

Nov 23 2022, 3:55 PM · Restricted Project, Restricted Project
aemerson updated the diff for D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

Address comments.

Nov 23 2022, 10:31 AM · Restricted Project, Restricted Project

Nov 17 2022

aemerson added a comment to D113030: Add a new tool for parallel safe bisection, "llvm-bisectd"..

How does this differ from opt-bisect?

It's cross-process/whole-build. Imagine opt-bisect, but the action tracking is for a whole build. My hope was that maybe we could do something more coarse-grained on the build level (without the need for the compiler itself to be communicating with a service) in a wrapper - eg: good/bad compiler (or flag set) and each action gets one or the other, bisect down to the fewest actions that need the bad compiler, then use opt-bisect within a single action, holding others constant, etc. So we didn't have two different ways of doing fine-grained bisection.

Nov 17 2022, 12:10 PM · Restricted Project, Restricted Project
aemerson added a comment to D138187: [GlobalISel] Fix crash in applyShiftOfShiftedLogic caused by CSEMIRBuilder reuse instruction.

Can you link to the bug report? Issue 58432 seems to be unrelated to GlobalISel.

Nov 17 2022, 7:27 AM · Restricted Project, Restricted Project

Nov 16 2022

aemerson added inline comments to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Nov 16 2022, 3:13 PM · Restricted Project, Restricted Project
aemerson added a comment to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

Does G_INVOKE_REGION_START work with the terminators() iterator?

Nov 16 2022, 3:11 PM · Restricted Project, Restricted Project
aemerson added inline comments to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Nov 16 2022, 3:08 PM · Restricted Project, Restricted Project
aemerson added a comment to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

I don't know anything about exceptions, but looking at translateInvoke, I'm confused by several things.

Why do you need this new thing, instead of just looking for EH_LABEL? Why is this specially handling inlineasm such that it may turn off the label if the called operand can't throw as some kind of optimization? If it can't throw, wouldn't it optimize to use a regular call in the first place?

I’m not sure about the inline asm thing since I’m new to this code too, but semantically EH_LABEL isn’t a terminator, it’s a label used for computing exception table information. It seems cleaner to have a dedicated barrier instruction. The reason the legalizations insert at the first terminator is because that guarantees the instruction will execute. If you specifically look for an EH label then it’s leaking implementation details about instruction placement to the API client. Granted, having to use a forward walking ‘getFirstIteratorForward()’ isn’t ideal either.

Nov 16 2022, 1:43 PM · Restricted Project, Restricted Project

Nov 15 2022

aemerson added inline comments to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Nov 15 2022, 9:37 PM · Restricted Project, Restricted Project
aemerson added a comment to D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.

Should the invoke just stay as a terminator instead? I'd hate to have more code placement mechanisms

What do you mean stay as a terminator? Invoke is an IR only instruction that we no longer have after translation.

Nov 15 2022, 9:34 PM · Restricted Project, Restricted Project

Nov 13 2022

aemerson requested review of D137905: [GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
Nov 13 2022, 1:59 AM · Restricted Project, Restricted Project

Nov 12 2022

aemerson accepted D137778: [GlobalISel] Correct constant type in matchReassocConstantInnerLHS.

LGTM, thanks!

Nov 12 2022, 11:29 PM · Restricted Project, Restricted Project

Nov 11 2022

aemerson committed rG2179f5133561: [AArch64][GlobalISel] Select TBZ for icmp sge x, 0. (authored by aemerson).
[AArch64][GlobalISel] Select TBZ for icmp sge x, 0.
Nov 11 2022, 8:16 PM · Restricted Project, Restricted Project
aemerson added a comment to D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..

Sorry, for some reason I hadn’t received any emails about this so I only noticed now. I’ll look into the issues and come back once the issues are resolved.

Nov 11 2022, 4:59 PM · Restricted Project, Restricted Project, Restricted Project

Nov 7 2022

aemerson committed rGf64802e8d3e9: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64… (authored by aemerson).
[Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64…
Nov 7 2022, 3:04 PM · Restricted Project, Restricted Project
aemerson closed D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..
Nov 7 2022, 3:04 PM · Restricted Project, Restricted Project, Restricted Project

Nov 2 2022

aemerson added a comment to D137274: AMDGPU/GlobalISel: Fix combine crash because LI is not set in prelegalizer.

To my understanding prelegalizer now also requires legalizer info?

Nov 2 2022, 8:18 PM · Restricted Project, Restricted Project
aemerson added a comment to D109240: GlobalISel: Artifact combine merge-like and unmerge into copy.

Sorry I missed this @, was on parental leave at the time. This is a very nice improvement in code quality, thanks!

Nov 2 2022, 8:16 PM · Restricted Project, Restricted Project
aemerson updated the diff for D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..

Add a clang release note.

Nov 2 2022, 2:03 PM · Restricted Project, Restricted Project, Restricted Project
aemerson updated the summary of D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..
Nov 2 2022, 10:48 AM · Restricted Project, Restricted Project, Restricted Project
aemerson added inline comments to D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..
Nov 2 2022, 10:48 AM · Restricted Project, Restricted Project, Restricted Project
aemerson requested review of D137269: [Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms..
Nov 2 2022, 10:07 AM · Restricted Project, Restricted Project, Restricted Project

Nov 1 2022

aemerson committed rG974cf7164915: [AArch64][GlobalISel] Add a simple cross-regclass copy optimization post… (authored by aemerson).
[AArch64][GlobalISel] Add a simple cross-regclass copy optimization post…
Nov 1 2022, 4:09 PM · Restricted Project, Restricted Project
aemerson closed D136793: [AArch64][GlobalISel] Add some minor post-selection optimizations..
Nov 1 2022, 4:09 PM · Restricted Project, Restricted Project

Oct 31 2022

aemerson accepted D136937: [GlobalISel] Compute debug location when merging stores more accurately.

Thanks. LGTM.

Oct 31 2022, 10:21 PM · Restricted Project, Restricted Project

Oct 28 2022

aemerson added a comment to D136937: [GlobalISel] Compute debug location when merging stores more accurately.

Thanks for fixing this, could you add a small test as well?

Oct 28 2022, 9:23 AM · Restricted Project, Restricted Project